Ethereum's Roadmap

945G...fRid

26 May 2024

Two and a half years ago, in an article discussing the “endgame” for Ethereum, I noted that the possible future development paths for a blockchain look remarkably similar, at least technically. In both scenarios (“L1 sharding” and a rollup-centric world), there’s a high volume of transactions on the chain, and processing them requires significant computational power and extensive data bandwidth.
Regular Ethereum nodes cannot directly validate such vast amounts of data and computation, even with excellent software engineering work and Verkle trees.

In both “L1 sharding” and rollup-centric worlds, ZK-SNARKs are used to verify computations, and Data Availability Sampling (DAS) is used to verify data availability. The DAS in both cases is the same. The ZK-SNARK technology is also the same in both cases; the difference lies in whether it lives in smart contract code or is an embedded feature of the protocol. Technically, Ethereum is indeed undergoing sharding, and rollups are part of that sharding.

This raises a natural question: what’s the difference between these two worlds? One answer is that the consequences of code errors differ: in the rollup world, tokens can be lost, while in the sharded chain world, there can be consensus failures. However, I anticipate that the significance of such errors will diminish as the protocol solidifies and formal verification technology improves. So, what long-term differences can we expect between these two visions?

Diversity of Execution Environments
An idea we briefly experimented with on Ethereum in 2019 was execution environments. Essentially, Ethereum would have different “zones,” each with different rules for how accounts work (including completely different approaches, like UTXOs), how the virtual machine works, and other characteristics. This would enable approaches that are difficult for Ethereum to implement on its own.

Ultimately, we abandoned some of the more ambitious plans and retained only the EVM. However, Ethereum’s L2s (including rollups, validiums, and plasmas) serve as execution environments to some extent. Today, we usually focus on EVM-equivalent L2s, but this overlooks the diversity of many alternative approaches:

Arbitrum Stylus, which adds a second virtual machine based on WASM in addition to the EVM.
Fuel, which uses a UTXO architecture similar to Bitcoin (but more comprehensive).
Aztec, which introduces a new language and programming paradigm designed around privacy-protecting smart contracts with ZK-SNARKs.

We could try to make the EVM a super virtual machine that covers all possible paradigms, but this would lead to suboptimal implementations for each concept compared to platforms focusing on their respective areas.

Security Trade-offs: Scale and Speed
Ethereum’s L1 provides very strong security guarantees. If some data is confirmed in a block on L1, the entire consensus (including social consensus in extreme cases) ensures the data won’t be edited in a way that violates application rules, any execution triggered by the data won’t be reversed, and the data will remain accessible. To achieve these guarantees, Ethereum’s L1 is willing to accept high costs. At the time of writing this article, transaction fees are relatively low: less than one cent per transaction on L2s, and even basic ETH transfer fees on L1 are less than one dollar. If technological progress is fast enough that available block space can keep up with demand, these costs may remain low, but they might not. And even $0.01 per transaction is too high for many non-financial applications (like social media or games).

But social media and games don’t need the same security model as L1. If someone spends a million dollars to reverse the record of a chess game they lost, or to make your tweet appear as if it was posted three days after it actually was, that’s acceptable. Therefore, these applications shouldn’t pay the same security costs. A rollup-centric approach makes this possible by supporting various data availability methods from rollups to plasmas to validiums.

Another security trade-off arises when transferring assets from one L2 to another. It’s expected that in 5-10 years, all rollups will be ZK rollups, and super-efficient proof systems like Binius and Circle STARKs, combined with lookups and proof aggregation layers, will enable L2s to provide finalized state roots at every slot. Currently, we have a complex mix of optimistic rollups and ZK rollups with various proof time windows. If we had implemented execution sharding in 2021, the security model to keep shards honest would have been optimistic, not ZK, so L1 would have had to manage complex fraud proof logic and a week-long waiting period for assets to move from shard to shard. But I believe this issue is also temporary.

A third, more enduring dimension of security trade-offs is transaction speed. Ethereum generates a block every 12 seconds and is unwilling to go faster because that would centralize the network too much. However, many L2s are exploring block times of a few hundred milliseconds. 12 seconds isn’t too bad: on average, users have to wait about 6-7 seconds for their transaction to be included in a block (not just 6 seconds, because the next block might not include them). This is about the same as the wait time when I pay with a credit card. But many applications need higher speeds, and L2s provide that.
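The 6-7 second figure can be checked with a little arithmetic. The sketch below assumes that a transaction arrives at a uniformly random moment within a 12-second slot and that each successive block independently includes a pending transaction with probability p. The independence assumption, the function name, and the example values of p are illustrative, not measurements.

```python
SLOT_SECONDS = 12.0  # Ethereum L1 block interval, as stated in the text


def expected_wait(p_include: float, slot: float = SLOT_SECONDS) -> float:
    """Expected seconds until a transaction is included in a block.

    Assumes (hypothetically) a uniformly random arrival time within a slot
    and an independent inclusion probability `p_include` for every block.
    """
    if not 0.0 < p_include <= 1.0:
        raise ValueError("p_include must be in (0, 1]")
    # Average time until the next block boundary for a uniform arrival.
    wait_for_next_block = slot / 2.0
    # Blocks missed before the first success: geometric with mean (1 - p) / p.
    expected_missed_blocks = (1.0 - p_include) / p_include
    return wait_for_next_block + slot * expected_missed_blocks


if __name__ == "__main__":
    for p in (1.0, 0.95):
        print(f"p={p:.2f}: {expected_wait(p):.2f} s")
```

With p = 1 the result is exactly 6 seconds; any p below 1 pushes the average higher, which is why the estimate is a range rather than a flat 6 seconds.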

To offer higher speeds, L2s rely on pre-confirmation mechanisms: L2’s own validators digitally sign commitments to include transactions at a specific time, and they may be penalized if the transactions aren’t included. A mechanism called StakeSure further generalizes this.
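As a rough illustration of the pre-confirmation idea, here is a toy model: a validator signs a promise to include a transaction by a given block and loses part of its stake if the promise is broken. Everything here is hypothetical (the class and function names, the block-number deadline, the penalty amounts), and HMAC stands in for a real public-key signature scheme. It is not how any particular L2, or StakeSure, implements this.

```python
import hashlib
import hmac
from dataclasses import dataclass


@dataclass(frozen=True)
class Preconfirmation:
    tx_hash: str         # transaction the validator promises to include
    deadline_block: int  # latest L2 block in which it must appear
    signature: str       # validator's commitment over the two fields above


class Validator:
    """Hypothetical L2 validator with a bonded stake that can be slashed."""

    def __init__(self, secret_key: bytes, stake: int):
        self._secret_key = secret_key
        self.stake = stake

    def _sign(self, tx_hash: str, deadline_block: int) -> str:
        # Assumption: HMAC replaces a real digital signature in this toy model.
        message = f"{tx_hash}:{deadline_block}".encode()
        return hmac.new(self._secret_key, message, hashlib.sha256).hexdigest()

    def preconfirm(self, tx_hash: str, deadline_block: int) -> Preconfirmation:
        signature = self._sign(tx_hash, deadline_block)
        return Preconfirmation(tx_hash, deadline_block, signature)

    def is_authentic(self, pc: Preconfirmation) -> bool:
        expected = self._sign(pc.tx_hash, pc.deadline_block)
        return hmac.compare_digest(expected, pc.signature)


def settle(validator: Validator, pc: Preconfirmation,
           included_at: dict[str, int], penalty: int) -> bool:
    """Return True if the promise was kept; otherwise slash and return False.

    `included_at` maps transaction hashes to the block that included them.
    """
    if not validator.is_authentic(pc):
        raise ValueError("commitment was not signed by this validator")
    block = included_at.get(pc.tx_hash)
    kept = block is not None and block <= pc.deadline_block
    if not kept:
        # Slash at most the remaining stake.
        validator.stake -= min(penalty, validator.stake)
    return kept
```

The user gets a fast, signed promise immediately, and the economic penalty is what makes that promise credible before the transaction is actually included.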

We could try to do all this on L1. L1 could combine “fast pre-confirmation” and “slow final confirmation” systems. It could combine shards with different security levels. However, this would add a lot of complexity to the protocol. Moreover, doing everything on L1 risks overloading consensus, as many higher-scale or higher-throughput approaches carry higher centralization risks or require stronger forms of “governance,” and these stronger requirements would affect other parts of the protocol if done on L1. By offering these trade-offs through L2, Ethereum can largely avoid these risks.

Advantages of L2 in Organization and Culture
Imagine a country split in half, one half becoming capitalist and the other becoming highly government-led (unlike in real life, assume in this thought experiment it’s not the result of any kind of traumatic war; it’s just that one day a border magically appears, and that’s it). In the capitalist part, restaurants are run by a mix of independent owners, chains, and franchises. In the government-led part, they’re all branches of the government, like police stations. On the first day, there wouldn’t be much change. People largely follow existing habits, and what works and what doesn’t depends on technical realities like labor skills and infrastructure. A year later, however, you’d expect to see big changes, as different incentives and control structures lead to different behavior, affecting who comes, who stays, who leaves, what’s built, what’s maintained, and what’s abandoned.

Industrial organization theory covers many of these distinctions: it talks not just about the difference between a government-run economy and a capitalist economy, but also about the difference between an economy dominated by large franchises and, for example, an economy where every supermarket is run by an independent entrepreneur. I believe the difference between an L1-centric ecosystem and an L2-centric ecosystem is similar.

The key benefit of Ethereum being an L2-centric ecosystem can be stated as follows:

a) Because Ethereum is an L2-centric ecosystem, you can freely and independently build your own sub-ecosystem with its own unique characteristics, while still being part of the larger Ethereum.

b) If you’re just building an Ethereum client, you’re part of the larger Ethereum, but you have less creative space than an L2. If you’re building a completely independent chain, you have the most creative space, but you lose the benefits of shared security and shared network effects. L2s form a happy middle ground.

L2s not only create a technical opportunity to experiment with new execution environments and security trade-offs to achieve scale, flexibility, and speed: they also create incentives for developers to build and maintain them, and for communities to form and support them.

The fact that each L2 is isolated means that deploying new methods is permissionless: you don’t need to convince all core developers that your new method is “safe” for the rest of the chain. If your L2 fails, that’s your responsibility. Anyone can commit to completely strange ideas (like Intmax’s approach to Plasma), and they can continue to build and eventually deploy even if they’re completely ignored by Ethereum core developers. L1 features and precompiles aren’t like that. Even in Ethereum, whether an L1 proposal succeeds or fails often depends on more political factors than we’d like. Regardless of what can theoretically be built, the different incentives created by an L1-centric ecosystem and an L2-centric ecosystem will ultimately greatly affect what is actually built, at what quality, and in what order.

Challenges Faced by Ethereum’s L2-Centric Ecosystem
Cross-L2 interoperability presents several challenges that need to be addressed to ensure seamless integration and communication between different Layer 2 solutions. Here are some of the key challenges:

Interoperability Issues: Integrating functionalities across different Layer 2 solutions requires seamless interoperability between various blockchain networks. This can be complex due to the diverse architectures and consensus mechanisms employed by different L2 solutions.

Security Risks: Cross-L2 interactions increase the attack surface, potentially exposing vulnerabilities that can be exploited. Ensuring the security of bridges and other interoperability protocols is crucial to prevent loss of funds or data integrity issues.

Standardization: A lack of standard protocols for cross-L2 communication can lead to fragmentation and incompatibility issues. Developing common standards and practices is essential for facilitating interoperability.

User Experience: The complexity of moving assets or data across L2s can be a barrier for users. Simplifying the user experience without compromising on security is a significant challenge.

Technical Complexity: Building and maintaining the infrastructure required for cross-L2 interoperability involves significant technical complexity. This includes developing robust APIs and ensuring that different L2 solutions can effectively communicate and understand each other’s transactions and state changes.

Economic Incentives: Individual L2 solutions may lack the economic incentive to build and maintain interoperability infrastructure, as they might benefit more from strengthening their own network effects.

Addressing these challenges is critical for the development of a cohesive and efficient blockchain ecosystem where Layer 2 solutions can interact without friction. Efforts are ongoing to improve all these aspects, with emerging standards and new technologies aiming to streamline cross-L2 interoperability.