Blockchain State Representation should be Abstract and not part of Consensus

in #etheruem8 years ago (edited)

In my last article I argued that blockchains should be designed like multiplayer games where all that is necessary is to synchronize user inputs in order to reach consensus on game state. Today I would like to make the argument that the exact form the game state takes is irrelevant so long as all players agree on the validity of new game inputs. Stated another way, I am going to argue that there is no need to reach consensus on a canonical global state that can be represented by a hash.

The reason I am writing this is because many people in the blockchain ecosystem consider this controversial. Ethereum is the primary example of a message-based (input based) protocol that doesn't use UTXO but does require all block validators to agree on the precise hash of the canonical state. I consider this requirement superfluous.

Any developer will tell you that the more constraints (requirements) you place on a solution, the more difficult it becomes to optimize the solution. If you over-specify your problem then you force yourself into design patterns that are less flexible and more computationally intensive. In the worst case, an over constrained problem is not solvable. This is why it is important to make sure every requirement actually adds value to the resulting product.

Why Does Ethereum maintain a global state hash?

The primary reason for maintaining a global state hash is for the purpose of generating proofs. If you have the hash of the global state and the state is organized in a merkle tree-like structure, then with a 1kb proof you can show that at a particular point in time a certain property had a particular value. For example, you could prove that last Monday your balance was 100 ETH or that 15 seconds ago your balance was 20 ETH.

At first glance that sounds wonderful, but what you cannot do is prove that right now your balance is greater than 10 ETH. The value of past states is meaningless when every pending transaction has the power to change the current state. Sophisticated developers would design around this by implementing time-locks into the state such that if you prove that your balance was 20 ETH and 30 seconds ago and that it was locked in a state that would prevent it from changing for 1 minute, then you can prove that "right now" it is greater than 10 ETH.

The problem with the time-lock design is it makes the assumption that the code will behave in such a way that will enforce that constraint on possible future values of the balance. This means that in addition to proving state, you must also prove the behavior of the code.

The secondary reason for maintaining a global state hash is to detect software bugs. In this context, a software bug is defined as non-deterministic behavior. A global state hash will not detect logic bugs of the kind that resulted in the failure of the DAO. In the event non-deterministic behavior is detected, the network will split into two or more fragments depending upon the nature of the non-determinism. In the best case, there exists a clear majority and the network continues advancing while the minority attempt to figure out why their machines do not match the majority. In the worst case there is no clear majority, just a largest minority. In no case does having a state hash enable a minority node to automatically correct itself and join the majority fork.

What happens if you don't have a global state hash?

Blockchains like Steem and BitShares opt to not generate nor require all nodes to agree on a global state hash. This means that it isn't possible to prove anything about blockchain state from block headers and a merkle proof alone. All Steem and BitShares can prove is that at a particular time a particular transaction was valid. Any transaction that was invalid would not be part of the blockchain.

If two peers suffer from a determinism bug then it will go undetected until one of the peers allows a transaction that is rejected by another peer. This could happen days, months, or years after the original divergence in the state. When it does happen the result is the same as what happens on Ethereum; minority peers are forced to figure out why they disagree with the majority and then fix the bug.

What we can conclude from this is that the benefit of a state hash is to fail-early in the event of non-deterministic behavior.

Sources of Nondeterministic Behavior

If we want to enforce a design constraint around detecting non-deterministic behavior, then it is important that we understand how these things happen in the first place. Generally they boil down to the following broad categories:

  1. Reference to external state such as time, uninitialized memory, or a random number generator
  2. Differences in compiler optimization
  3. Differences in CPU architecture
  4. Differences in dynamically linked libraries
  5. Cosmic Rays and Hardware Bugs

Assuming all machines run the software in an identical environment and that cosmic rays or hardware bugs only impact individual nodes, then the only remaining sources for non-determinism are structural programming bugs.

Alternative means of Detecting Programming Bugs

Just because the hash of the state isn't part of the official consensus does not mean that peers cannot calculate a hash of their state for the purpose of detecting bugs early. In theory two peers running the same code should produce the same state and therefore if they hash their state it should produce the same value. If they do this periodically, then they would be able to catch a bug in determinism before it results in different peers drawing different conclusions on the validity of a particular transaction.

The key thing that makes performing checks outside consensus different from requiring these same checks as part of consensus is that it allows peers to completely change how they represent state without invalidating the global consensus. Two different peers can have completely different representations of state so long as they agree on which transactions are valid and invalid.

Debugging Code should not be part of Production

Developers everywhere know that before you ship code you strip out all of your performance monitoring and debugging tools. This developer instrumentation hurts performance and ultimately can interfere with meeting the real-time performance requirements of the product. In the case including hashes as part of the global consensus, it forces developers to exactly reproduce all historic bugs even if those bugs had no visible impact on the behavior of the platform. An example of a "bug" that does not impact behavior is switching from an array to a linked list. In this case the meaning of the data structure is the same, but the performance characteristics are vastly different.

Mathematical Generalization of Blockchain State

I normally avoid using math to prove things because I find that the vast majority of people cannot follow the logic. Today I am going to provide some math to generate a proof that blockchain behavior can be fully abstracted with different representations of the state while being mathematically equivalent.

Imagine there exists a function over all possible inputs f(i) such that it returned true if the input was valid and false if the input was invalid. Let . I be the set of all possible inputs and let i represent a specific input from that set. A blockchain such as Steem or BitShares can be defined such that f(i) returns f(i) if the input is invalid and returns f2(i) if the input is valid. f2(i) is a new function that behaves in a similar way to f(i) and which should be passed the next input. We could say that if the retuned function is the same as the original function then it returned false, else it returned true.

Two functions are considered identical if for each i in I they produce the same output. From a logical perspective, the internal implementation of f(i) is a black box. Therefore, the most abstract possible representation of state is as a mapping from i to true or false. With every input a new function is returned with a new mapping for every possible input.

It should be obvious that representing state this way is not practical, even though it is mathematically accurate. It isn't practical because the size of the input set I is effectively infinite. To overcome this problem we can use algorithmic compression techniques. If for example all ODD numbers are invalid and all EVEN numbers are valid, then we could express f(i) as i % 2 == 0. We could also express it as i&1 == 0. In both cases the implementation of f(i) is different, but both implementations are mathematically equivalent.

We can therefore conclude that it doesn't matter how the function is implemented so long as all parties have an implementation that produces the same true/false output for every possible input. We can also say that if blockchain validation is performed by checking the result of f(i) for each transaction, then all peers with equivalent functions will remain in consensus.

Impossible to Prove two functions Equal

The critics will quickly point out that the mathematical proof is meaningless in practice because it is impossible to prove that two different implementations of f(i) produce the same output for all input i. Anything beyond simple mathematical functions becomes impossible to prove. Ethereum attempts to prove they are equal by generating a hash of the state contained within f(i) and then asserting that given identical states and identical logic the two functions are equal.

Here is the rub, Etheruem this still requires proof that the logic is equal and free of determinism bugs. It is a circular proof! If we assert the code is free of determinism bugs then we can assert the states will be equal; however, if we cannot prove the code is free of determinism bugs then proving the state is equal does not actually help us prove that the two functions are equal. You can prove the presence of a bug, but not the absence of one. A difference in state hash may not matter if it is combined with an offsetting difference in logic.

Given that proving states to be equal by hash comparison does not actually prove consensus where consensus is defined as all parties agreeing on the validity or invalidity of all possible inputs, then what good is it? This is especially true in the case of multiple implementations in different languages. Proving you produced the same state is a good start, but doesn't actually prove anything as can be demonstrated by past Ethereum forks.

Why does any of this matter?

This matters because in an attempt to prove the impossible, blockchain engineers are over constraining the consensus process in ways that prevent blockchains from scaling in ways required by real world applications. Developers are forced into canonical representations and validators are forced into performing computations that are redundant for deterministic code.

This matters because proofing historical states is far less useful than proving the current state. It matters because building scalable blockchains that are not overly brittle is critical to the general adoption of blockchain technology.

Conclusion

Blockchains should not make any particular expression of state an explicit part of their consensus algorithm. Doing so will only hurt performance while making software brittle and bug fixes more complex to retroactively apply. If developers wish to check for bugs, they can still run a version of the code that generates and compares hashes. If anyone cares about historical proofs, they can still be generated and used by the parties running the developer-build of the software.

Given the benefits of having a state hash can still be achieved without including it in consensus, and that including a state hash in consensus comes at a high price, there is no compelling reason for blockchain developers to make state hashes part of consensus.

Sort:  

I'll be honest, I am not one of the fanboys that followed you to steem and I honestly didn't have any idea what your stances were on blockchain development. That being said, I tend to agree with you. I have felt for a while that the actual blockchain aspect of cryptocurrencies is redundant and useless. It's not like we need to physically go out and mine gold to make gold valuable. Witness and staking serve the same function to decentralize the network if upheld properly. I'm not sure if you know or are aware of Douglas Pike or his stances on development, but I feel like you two would get along well. I have my own personal concerns about Vericoin and Verium, but his core beliefs and philosophy are what keep me watching the development process. I also really like Stratis Platform's approach to the business and enterprise focused aspects of blockchain development. I guess I shall now be following your future projects as well. Thanks for the interesting read.

That was a lot of learning for me and definitely followed the logic. Thanks for pointing out the way the language coding is used and ow it greatly influences the resulting effects on the blockchain as a whole. Namaste :)

Thank you for posting @dantheman.

This in depth material regarding the Blockchain delineates the details we are in need of. Appreciate your attention to detail and walking us through it.

All the best to you. Cheers.

Hi @dantheman,

You mispelled ethereum in the tags. Not sure if you can edit it now given that it's the main tag.

i got into steem by misspelling dao. funny how we do that

f1(i) =f2(i,j) as long as j=0.

j could be equivalent to partitions in the ledger .

It clicked for me at:

Debugging Code should not be part of Production

I didn't realize how much of this technology was about proving mistakes weren't made to such a degree. I wonder if these early days of blockchain technology are too conservative or if that's required for trust when dealing with important financial transactions.

Thanks again for continuing to post here. I always learn something new.

My chimp brain doesn't understand this, but I think Dan is warming us up for his next blockchain. (some sort of ethereum killer maybe)

Way above my blockchain head, but regardless, an interesting read.

Nice posting. I am dummy in cryptocureency but. I know your idea have purpose to make steemit better. Nice ! :)

Some food for thought. The recent surge in Eth price made me think whether if the crowd has priced in the eventual success of casper-pos, which I think would take a much longer time to research and develop than expected. Vlad has been halfway here and there, it's hard to tell.

Reading this comment after two years. From the way things are looks like the crowds did price in these features. Now that these will take time, I think Ethereum is due for one more dump. What are your thoughts?

Can this model be used for compartmentalizing transactions? For example, an organization could use the system for their own internal accounting without revealing for the outer world what is happening, but at the same time the outer world can be sure that whatever transactions happen inside that organization, all the transactions are valid. When the organization makes transactions with the outer world, everybody can trust that it's real money and not counterfeited inside the company.

This might be a good way to make blockchain privacy better: as long as all the transactions are saved to the blockchain, they can be tracked afterwards. But if the early transactions are not known, only the current state, the privacy of users is much better. Only thing that is known are the balances of accounts. Nobody can know if there have been only one transaction or millions (except the account owner).

For a Steemit-like system this could be used for group editing of posts. A small group collaboratively writes a post little by little, making "internal" transactions (not shown for outsiders). When the post is ready, it's published for the outer world and it pops up in a Steemit-like website. The advantage is that the system stores early drafts (as long as somebody is maintaining the compartmentalized part) so there is no need for backups. The same system is used for making early drafts and to publish the finished text which is great for the end-user.

BTW, does this system resemble Urbit? It seems to me that you have a similar kind of approach here.

That's why different, non-blockchain ideas emerge. Like Bitlattice that I'm part of.

TLDR / ELI5 :

What data should a blockchain keep track of:

1st Block: Alice receive 10 from the blockchain
2nd Block: Alice send 10 to Bob

What data should a blockchain NOT keep track of:

3rd Block (State): Alice has 0 and Bob has 10

Why:

Scalability / Performance
 

Please correct me if I'm wrong.

Some of your ideas are very interesting, almost like you're testing the water on certain ideas. For example, I finally understand your article on not keeping state. Basically this means that bitcoin would order its transactions somehow etc, making it deterministic.

Are many games really doing this stateless thing completely? There is a game where it creates the universe as far as you want go based on a formula. But wow, determining state from millions of actions gets really really hard the more interactions happen or intersect in one place. I suppose each 'location' or however you prune the tree, could independantly watch only those events effecting that location.

But these theories are useful and interesting. I like image something then try to find a way to do it, but knowing what is possible helps us see the possibilities. Surely though, you are writing about something specific you're trying to accomplish.

Let me think ideas through, here. What if we could determine state independantly as nodes simply by recieving transactions. After 1 or 2 second of waiting, you have all transactions from various nodes, thus a deterministic ordered list. Then you make a tiny signed summary of what you have, send it out, and nodes reply: yes we and others have the same thing.

"...the exact form the game state takes is irrelevant so long as all players agree on the validity of new game inputs"

From what I understood, trying to generate a canonical state doesn't make sense because all participants had already agreed with the state of the transaction in the first place when it was included in the ledger.

I would say that it makes sense if you have the ability to make the participants agree in the transaction very fast, which I assume is the case of EOS, because you basically have very few of them.
If not, making all participants agree with every transaction would take longer, that's why I assume you have to pack those transaction in a block and validate it with hashes.