
Simplicity

This post is a response to Vitalik's most recent post about simplicity. I think simplicity is one of the most important properties Ethereum (and any other system) should strive for. Ironically, in the same article, Vitalik's proposals contradict the very direction he encourages others to follow.

I know a few things about simplicity, since it's the key design rule of Blueprints. In simple systems, bugs are difficult to introduce and easy to find, and others can understand the system quickly. As Vitalik said, this yields a lot of positive results.

The best decisions

In that post, Vitalik admits that some of the complexity exists "because of his own decisions", "often in pursuit of benefits that have proven illusory".

Why were decisions made whose benefits were simply not there? Couldn't Vitalik and the Ethereum community have thought harder when making them, and seen that they don't make sense?

Maybe. Maybe not. But I argue that even if it were possible to think harder, merely knowing this won't make us inconceivably smart, such that we make only the best decisions from now on. The world evolves, and so do our directions of thinking. This is why people sometimes change their opinion over time even without hearing any new arguments: why does someone who used to prefer tabs for indentation start using spaces instead?

At any point in time, we may change what we think the best decision is. I think this is the only general lesson from Vitalik's observation. We have to accept that solutions we now consider optimal may turn out to provide less value than expected when they were created, or that our thinking, or the world, will have evolved by then, making them inferior to some other solution.

But it seems Vitalik failed to see this lesson. Let's take a look at his VM migration proposal (don't worry, this won't get technical).

VM migration thoughts by Vitalik

In the post describing the road to simplicity, Vitalik sketches loose thoughts on what a migration from the EVM to a different VM could look like. It is composed of five stages (the fifth only mentioned, not shown in his illustration), meaning that, counting the current one, we'll turn a single state transition function into six, all of which historical nodes will have to support. This is not simplicity.

I may be getting a little ahead of myself, since Vitalik argued against this concern earlier. He stated that it's impossible to decrease the size of all the code and specs needed to understand history, and that we should therefore optimize for the simplicity of consensus, not for the size of the historical spec or the amount of code needed in specialized nodes. But does the impossibility of decreasing the amount of historical code mean we shouldn't care about it? No. The fact that we can't decrease it doesn't mean we can't make the increases smaller – that is still optimizing for size. And I doubt that optimizing for historical nodes holds zero weight in decision-making.

Either way, there is a much more severe issue here. In stages 1, 2, and 3, nodes will have to understand both the EVM and RISC-V. We just learned that goals may change on the journey towards them, yet Vitalik suggests making the system more complicated in order to simplify it. If we have to change direction mid-course, we'll end up with a system of immense complexity.

I think migrating the VM may be a good idea, but it must happen all at once. There may be just one additional stage between the initial and final states, in which new EVM code deployed via the current transaction format is automatically translated to RISC-V on deployment, so that current development processes aren't interrupted. This requires nodes to carry the EVM-to-RISC-V translation code, but they would need that for the migration anyway, and it will never have to change. So it's not a large sacrifice.
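To make the shape of that single extra stage concrete, here is a minimal sketch in Rust. Everything in it is invented for illustration – the type and function names, the legacy-format flag – and it only shows the control flow: legacy deployments get translated once, at deployment time, and only RISC-V code ever enters the state.

```rust
// Hypothetical sketch of the single-stage migration described above.
// All names are invented for illustration; this is not a real client API.

enum Code {
    // After the fork, only RISC-V code is ever stored in the state.
    RiscV(Vec<u8>),
}

/// Assumed one-time translation pass. It would be the same EVM-to-RISC-V
/// translator the migration itself relies on, frozen forever afterwards.
fn translate_evm_to_riscv(_evm_bytecode: &[u8]) -> Vec<u8> {
    unimplemented!("frozen EVM-to-RISC-V translation")
}

/// Deployment path: legacy tooling keeps emitting EVM bytecode, but nodes
/// translate it once, on deployment, so the state only ever holds RISC-V.
fn deploy(payload: &[u8], is_legacy_evm_format: bool) -> Code {
    if is_legacy_evm_format {
        Code::RiscV(translate_evm_to_riscv(payload))
    } else {
        Code::RiscV(payload.to_vec())
    }
}
```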

Measure of simplicity

I think a very important aspect of optimizing for simplicity is defining what simplicity is. Is it the number of lines of code? Or maybe the number of characters? Should we also take the complexity of expressions into account?

The number of lines of code (provided things aren't artificially packed into single lines) is a fairly good heuristic. But at times it may be better to write many more lines of code, and I'd still say the system is simpler. I'll show this with an example.

Let's say I'm writing a piece of resilient infrastructure and I'll use counters in a few places. So, I decide to make a single implementation of a counter. I'll call it Counter.

I'm using a 32-bit machine and storing the counter's state in a single word. I realize I may have an issue with overflows. Luckily, none of the use cases will involve such large numbers, so I can just ignore it, or throw an error whenever the counter is about to overflow. I'll call this behavior an oddity: knowing only the name Counter, I would expect it to increment/decrement some internal value I can treat as a black box, but it has the atypical behavior of not being incrementable past $2^{32} - 1$.

So, I could make a Counter. It would take me 10 lines of very simple code.
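A minimal sketch of those lines in Rust (the choice of language and the method names are illustrative, not prescribed anywhere). The oddity is concentrated in the single `expect`:

```rust
/// A minimal counter on a 32-bit word. The oddity: it cannot go past
/// u32::MAX, so it errors out when an increment would overflow.
pub struct Counter32 {
    value: u32,
}

impl Counter32 {
    pub fn new() -> Self {
        Counter32 { value: 0 }
    }

    pub fn increment(&mut self) {
        // The oddity lives here: every caller must remember this can panic.
        self.value = self.value.checked_add(1).expect("Counter32 overflow");
    }

    pub fn get(&self) -> u32 {
        self.value
    }
}
```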

As I write more code, I realize that I could use the Counter in one more place – and I may forget that this place may try to use values larger than the supported maximum. Maybe it should silently overflow there, or maybe I should handle it in some specific way, but I could have forgotten to consider the case at all. The same can happen in future changes to the system, especially ones not made by me. The most dangerous scenario is when some restriction is lifted and suddenly one of the existing use cases needs to handle an overflowing number differently than it originally did.

So, there are quite a few things that can go wrong. In real implementations, when factored into their simplest parts, each part could carry 1-3 oddities on average.

Oddities are very dangerous not because they are wrong, but because they aren't – yet they have the potential to become wrong whenever we forget about them.

Whenever we use basic components to build something higher-level, we might combine, say, 2-3 primitives. With each primitive carrying 2-3 oddities, that's 4-9 oddities to remember in total, and the potential interactions between oddities grow exponentially – even this small example can reach dozens of cases (path explosion). That's a lot to consider to make sure the system works as expected in every situation.
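To put rough numbers on that, take the high end of the range: nine oddities across three primitives. Counting only pairwise interactions already gives

$$\binom{9}{2} = \frac{9 \cdot 8}{2} = 36$$

cases – dozens, as claimed – and if we count every subset of oddities that could jointly matter, there are $2^9 = 512$ combinations.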

So, what can be done to prevent oddities?

  1. Don't leave assumptions implicit or hidden. Documenting them is important, but I don't think any programmer checks the documentation of every function every time they use it – coders would have to keep every language's docs and specs in front of them at all times. Instead, make the assumption explicit in the name, for example by calling the counter Counter32. This strategy doesn't scale to slightly more complex types, though: someActionRoundingUpFloat128PrecisionAndOneMoreDetail is not a great function name.

  2. Simplify the properties of code segments, removing oddities at the cost of making those segments internally more complex.

So, I could make my counter never overflow, which takes 40 lines of code instead of 10. But I believe it's a net simplification, since I can now ignore overflows when using the counter – and if they do matter somewhere, I'll remember to handle that explicitly.
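A sketch of what those lines might look like, again in Rust, storing the value as a growable vector of 32-bit limbs (one possible representation among several; the carry loop is where the extra complexity lives):

```rust
/// A counter that never overflows: the value is stored as a vector of
/// 32-bit limbs (least significant first) that grows when a carry spills.
pub struct Counter {
    limbs: Vec<u32>,
}

impl Counter {
    pub fn new() -> Self {
        Counter { limbs: vec![0] }
    }

    /// Increment, propagating the carry across limbs; allocate a new
    /// limb instead of ever wrapping around.
    pub fn increment(&mut self) {
        for limb in self.limbs.iter_mut() {
            let (next, carried) = limb.overflowing_add(1);
            *limb = next;
            if !carried {
                return; // no carry left, done
            }
        }
        // Every limb wrapped to zero: the count needs one more limb.
        self.limbs.push(1);
    }

    /// Least significant limb first. A real version would also need
    /// decrement, comparison, and display, which is where most of the
    /// remaining lines of the ~40 would go.
    pub fn limbs(&self) -> &[u32] {
        &self.limbs
    }
}
```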

Which of the two strategies fits better depends heavily on the case. For a simple counter, the first may make more sense.

In the end, we made the code itself more complex: more lines, more to verify. But it became easier to understand, maintain, and use (if it's a library meant for others). I think it was a net simplification, because there is one less thing to remember when developing the codebase.

The real measure of simplicity is how complex our mental model has to be, in total, when considering everything about the system. We can think of it as the size of the specification together with the ease of reasoning about it. It usually correlates with the number of lines of code written, but not always.

If the specification is large, the system is complex, and it's easy to forget some fragment when considering the system's properties, whether for reasoning or for coding.

If the specification is difficult to reason about, then again we may lack understanding of the system's properties when coding.

Examples in Blueprints

I've considered many tradeoffs while designing Blueprints. The goal is simplicity, both of Blueprints itself and of the smart contracts built on top. Whenever applications would otherwise be forced to consider edge cases, it's worth complicating the Blueprints infrastructure so that everything on top can be even more radically simple.

A good example is the flash accounting module. An implementation analogous to Uniswap v4's could have been 10-20 lines of code, but such a minimal version carries a lot of edge cases.
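For readers unfamiliar with the pattern, here is a minimal sketch of a Uniswap-v4-style flash accounting ledger, in Rust for consistency (this is the general shape of the minimal design, not the Blueprints API): every operation adjusts a per-currency delta, and all deltas must net out to zero by the end, which is exactly the kind of obligation that spawns edge cases for callers.

```rust
use std::collections::HashMap;

/// Minimal flash accounting ledger: each take/settle adjusts a
/// per-currency delta, and the transaction must end with all deltas
/// at zero. Names are illustrative.
struct FlashLedger {
    deltas: HashMap<String, i128>, // currency -> outstanding delta
}

impl FlashLedger {
    fn new() -> Self {
        FlashLedger { deltas: HashMap::new() }
    }

    /// Record that `amount` of `currency` was taken (positive) or
    /// settled (negative) during this transaction.
    fn adjust(&mut self, currency: &str, amount: i128) {
        *self.deltas.entry(currency.to_string()).or_insert(0) += amount;
    }

    /// The oddity: every caller, on every code path, must remember to
    /// drive all deltas back to zero before this runs, or everything
    /// reverts.
    fn finalize(self) -> Result<(), String> {
        for (currency, delta) in &self.deltas {
            if *delta != 0 {
                return Err(format!("unsettled delta for {currency}: {delta}"));
            }
        }
        Ok(())
    }
}
```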

In Blueprints, we took these oddities very seriously: an implementer building on top of the Blueprints infrastructure doesn't have to care about any of them, because they're solved. Unfortunately, instead of 10-20 lines, flash accounting took 200-300 lines of code. Despite that, I believe it enabled a net simplification of our system.

This example doesn't map perfectly onto Ethereum, where non-fundamental limitations that affect developers building on top are rather rare. But I hope I've managed to illustrate the depth of the idea of simplicity.

While simplicity is extremely important for Ethereum, and it's great to see Vitalik's interest in it, I hope it will be more present in Ethereum Improvement Proposals than it is in Vitalik's post.