Why is the Tensor variation more practical to implement?
In the tensor variant, as described in Accidental Computer, you do a tensor encoding, which in practice means encoding the columns first with G and then encoding the resulting rows with G’. This can be done using existing encoders as black boxes, and therefore as a drop-in replacement. Specifically, writing the data as a matrix $X$ and the encoded matrix as $\tilde{X}$, it looks like:
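$$\tilde{X} \;=\; G\,X\,{G'}^{\top}$$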
which, when encoding, is done by:
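A minimal sketch of that black-box usage, assuming a toy prime field and generator matrices as plain arrays (all names here are illustrative, not from the paper or any real codebase):

```python
import numpy as np

P = 257  # toy prime standing in for the base field

def encode(data: np.ndarray, G: np.ndarray) -> np.ndarray:
    """Black-box linear encoder: multiply by a generator matrix over F_P."""
    return (G @ data) % P

def tensor_encode(X: np.ndarray, G: np.ndarray, G_prime: np.ndarray) -> np.ndarray:
    """Encode the columns of X with G, then the rows of the result with G'."""
    col_encoded = encode(X, G)               # G @ X: each column is a G-codeword
    return encode(col_encoded.T, G_prime).T  # apply G' across the rows; equals G X G'^T
```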
You then commit to the rows and the columns of that matrix, receive randomness, and then take random linear combinations of the rows and the columns of the original data, which you send as part of the proof. In the paper it looks roughly like the following (reconstructed here in the notation above, with $g_r$ and $g_c$ the received random vectors):
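$$y_r = X^{\top}\!\left(G^{\top} g_r\right), \qquad y_c = X\left({G'}^{\top} g_c\right)$$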
which translates to the following sketch (continuing the toy code above; names are illustrative):
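```python
def rlc_responses(X, G, G_prime, g_r, g_c):
    """Random linear combinations of the original data X, sent in the proof.

    The combination coefficients are the encodings of the received
    randomness under G and G' respectively.
    """
    y_r = (X.T @ ((G.T @ g_r) % P)) % P      # RLC of the rows of X
    y_c = (X @ ((G_prime.T @ g_c) % P)) % P  # RLC of the columns of X
    return y_r, y_c
```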
where the RLC over the columns of the original data is $y_c$, and over the rows $y_r$. These are what make sampling checkable: a sampler holding, say, column $j$ of $\tilde{X}$ can verify it against the randomness via $g_r^{\top}\tilde{X}_{:,j} = (G'\,y_r)_j$, and symmetrically for sampled rows.
A big benefit of the tensor variation is that you sample rows and columns from an encoding of the original matrix, which means you only ever sample elements of the base field. Vanilla ZODA assumes you work over a field big enough to sample randomness from, which is what makes it zero-overhead. If you instead sample randomness from an extension field and otherwise use a smaller base field, one of the matrices you sample from has that randomness embedded within it, making its entries extension field elements. This implies the samples are much larger in the extension field variant of ZODA.
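For a rough sense of scale (hypothetical numbers): with a 32-bit base field and a degree-4 extension, every entry of the randomness-bearing matrix is 128 bits instead of 32, so each row or column sampled from it is 4× larger than a base-field sample of the same length.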
What does it mean to receive a row/column/entry?
It means that when a full node sends you rows, columns, and other elements, these refer to existing commitments. This, in turn, means that you also receive opening proofs against those commitments, so you can verify that the sent data is valid. Commonly, you would use Merkle trees and Merkle proofs for this.
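For instance, a minimal sketch of the opening check a sampler might run, assuming a standard binary SHA-256 Merkle tree (helper names are illustrative):

```python
import hashlib

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def verify_opening(root: bytes, leaf: bytes, index: int, path: list[bytes]) -> bool:
    """Check a Merkle opening proof: leaf sits at position `index` under `root`."""
    node = H(leaf)
    for sibling in path:
        # The low bit of the index says whether we are the left or right child.
        node = H(node + sibling) if index % 2 == 0 else H(sibling + node)
        index //= 2
    return node == root
```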
Why would wrapping the ZODA sampling algorithm in a succinct proof result in less sampling per-node?
The guarantees ZODA gives you are unique encoding, correct encoding, and adversarial reconstruction. Unlike the other guarantees, correct encoding is the one that requires sampling over the rows and columns. This means that if you instead guarantee correct encoding by running the sampler inside a succinct proof - generated once and cheaply verified by every node - you remove that per-node sampling need.
What does it mean that the Hadamard variant doesn’t support distributed reconstruction?
A really nice property to have is local decoding, meaning that you can decode specific elements of the matrix in sublinear space - importantly, without decoding the whole matrix. The Hadamard variant fundamentally lacks this because it is not a tensor encoding: the right matrix is just a small number of random columns, so essentially only the columns are encoded. For the ZODA proof, if you encode the columns you have to sample over the rows, and only the rows are committed to. This means that a client missing an element, when no one will give them an entire row containing it, would essentially have to decode the column containing it to recover the element. And since columns aren’t committed to, the only way to get a column is to collect enough rows to reconstruct the entire matrix. In other variants you have commitments for both rows and columns, so you can easily access specific elements. This is where the notion of distributed reconstruction comes in - clients can reconstruct missing data bit by bit, without having to collect the entire data matrix themselves.
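To make the contrast concrete, here is a sketch of what local decoding looks like when columns are individually committed, as in the tensor variant (every helper here is a hypothetical stand-in):

```python
def recover_entry(i, j, col_root, fetch_symbol, verify_opening, decode_column, k):
    """Recover entry (i, j) by decoding only column j of the encoded matrix.

    fetch_symbol(j, r) asks peers for symbol r of column j plus an opening
    proof against the column commitment col_root; any k verified symbols
    suffice to erasure-decode the column (k = dimension of the column code).
    """
    symbols, r = {}, 0
    while len(symbols) < k:
        value, proof = fetch_symbol(j, r)
        if value is not None and verify_opening(col_root, value, r, proof):
            symbols[r] = value
        r += 1
    return decode_column(symbols)[i]  # decode one column, read one entry
```

In the Hadamard variant there is no per-column commitment to check against, which is exactly why the column would have to be rebuilt from full rows instead.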
Side note - even in the Hadamard variant, if you were given the entire row containing the element, you could verify it’s valid using the ZODA proof, just as you do for any sampled row.