Solana Labs is taking steps to improve network upgrades for a faster, more reliable, and scalable decentralized web.
Early on Saturday, Solana suffered a major technical issue that brought on-chain activity on the network to a near-complete halt. A forking incident produced conflicting versions of the transaction history, and validators responded by downgrading to the previous version of the code in hopes of restoring Solana’s throughput.
The downgrade did little to resolve the still-unidentified problem, even after a supermajority of validators had reverted to the old software. Validators therefore opted for a more drastic solution: restarting the chain from the point immediately prior to the fork, which required a complete shutdown of the chain. The first restart attempt failed because validators restarted from the wrong point in the chain, leading to an extended delay. Validators then planned a second restart attempt, hoping it would restore service to the blockchain’s users.
Solana Labs Engineers Working to Improve Network Upgrades
Solana Labs, the company behind the Solana blockchain, has announced plans to improve network upgrades in order to deliver a faster, more reliable, and more scalable network. The goal is a better, decentralized web, which the company calls a top priority.
The recent 1.14 network update revealed some challenges in maintaining stability during major updates, which affected the network’s recent speed and usability. Solana Labs is currently investigating the issues and will provide more details soon. Meanwhile, the company is sharing its plans to address the balance between reliability and scalability, and where to go from here.
Solana Labs’ core engineers had been fixing live problems that impacted the network’s speed and usability before the 1.14 release. These issues included invalid gas metering, lack of flow control for transactions, lack of fee markets, spiraling RAM, storage, and restart overhead. Solana Labs prioritized addressing these issues to improve the user experience on the network.
Following the latest release, Solana Labs’ core engineers plan to improve the rollout process for software releases. They intend to bring in additional external developers and auditors to test the software and find exploits, and to continue supporting external core engineers, including the Firedancer team, which is building a second validator client.
Improving the Upgrade Process
To improve the software release process, Solana Labs’ core engineers will work with validators. The 1.14 release followed the same pattern as previous releases. First, the Mainnet-beta validators ran 1.13 while the Testnet and Devnet validators ran 1.14. Next, the Mainnet-beta validators began running 1.14 on master canary nodes (i.e. test nodes), and validators, RPC operators, and teams deploying dApps on the network provided feedback on 1.14. Finally, the Mainnet-beta validators began a full deployment of 1.14, initiating the upgrade process. Even with nodes of mixed versions running against Mainnet-beta beforehand, however, the behavior of the network changes once the supermajority changes versions.
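To make the supermajority point concrete, here is a minimal sketch of how one might check which software version, if any, is backed by a supermajority of stake. It assumes a supermajority means more than two-thirds of total stake; the function name, the input shape, and the stake figures are hypothetical and not taken from the Solana codebase.

```python
from collections import defaultdict
from fractions import Fraction

# Assumption: a supermajority is strictly more than 2/3 of total stake.
SUPERMAJORITY = Fraction(2, 3)


def supermajority_version(validators):
    """Return the version run by more than 2/3 of stake, or None.

    `validators` is an iterable of (version, stake) pairs; this shape is
    illustrative only.
    """
    stake_by_version = defaultdict(int)
    for version, stake in validators:
        stake_by_version[version] += stake

    total = sum(stake_by_version.values())
    if total == 0:
        return None

    for version, stake in stake_by_version.items():
        # Exact rational comparison avoids floating-point edge cases.
        if Fraction(stake, total) > SUPERMAJORITY:
            return version
    return None


# A few canary nodes on 1.14 do not move the supermajority off 1.13.
canary_phase = [("1.13", 95), ("1.14", 5)]
# Midway through a rollout, neither version holds a supermajority.
mid_rollout = [("1.13", 50), ("1.14", 50)]
print(supermajority_version(canary_phase))
print(supermajority_version(mid_rollout))
```

In this toy model the canary phase never exercises the case where 1.14 holds the supermajority, which is the gap the article describes.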
Solana Labs’ core engineers plan to improve the process by rehearsing the upgrade on the Testnet before the Mainnet-beta upgrade: downgrade the Testnet to the current Mainnet-beta version and feature set, upgrade it to the release candidate of the new version, observe how the migration goes in real time, and downgrade it back to the current Mainnet-beta version. This cycle would be repeated while stress-testing the Testnet. The first downgrade would require a regenesis of the Testnet image, and part of the simulation should include changing the stake distribution to mirror Mainnet-beta.
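The rehearsal cycle can be written down as an ordered checklist. The sketch below only generates the sequence of steps described in this article; it does not talk to any cluster, and the function name, version strings, and round count are hypothetical.

```python
def rehearsal_plan(mainnet_version, release_candidate, rounds):
    """Build the ordered list of Testnet rehearsal steps.

    The first downgrade is modeled as a regenesis with a stake
    distribution mirroring Mainnet-beta; each later round upgrades,
    observes under stress, and downgrades again.
    """
    steps = [
        f"regenesis testnet on {mainnet_version} "
        "with stake distribution mirroring mainnet-beta"
    ]
    for i in range(1, rounds + 1):
        steps.append(f"round {i}: upgrade testnet to {release_candidate}")
        steps.append(f"round {i}: observe migration while stress-testing")
        steps.append(f"round {i}: downgrade testnet to {mainnet_version}")
    return steps


for step in rehearsal_plan("1.13", "1.14-rc", rounds=2):
    print(step)
```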
Core Engineers Continue to Focus on Stability
Solana Labs has formed an adversarial team made up of nearly one-third of its core engineering team. The team will build additional hooks and instrumentation into the validator code to help find exploits across the underlying protocols, and hardware will be provided to run medium to large clusters for adversarial simulation.
The company is also working to improve the restart process. Although fully automating the process is difficult, different kinds of failures can be handled with simpler procedures. Nodes should automatically discover the latest optimistically confirmed slot and share the ledger with each other if it is missing.
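As a rough illustration of what discovering the latest optimistically confirmed slot involves, the sketch below uses a deliberately simplified model: a single fork, each validator reporting only the highest slot it has voted on, and a slot counting as optimistically confirmed once validators holding more than two-thirds of stake have voted on it or on a later slot. The function name and data shape are hypothetical, and a real restart procedure would have to handle competing forks.

```python
from fractions import Fraction


def latest_confirmed_slot(last_votes):
    """Return the highest slot backed by more than 2/3 of stake.

    `last_votes` is a list of (slot, stake) pairs, one per validator,
    where `slot` is the highest slot that validator voted on. Assumes a
    single fork and non-negative stakes; returns None when there is no
    stake at all.
    """
    total = sum(stake for _, stake in last_votes)
    if total <= 0:
        return None

    backing = 0
    # Walk from the newest vote to the oldest, accumulating the stake
    # that has voted on this slot or a later one.
    for slot, stake in sorted(last_votes, reverse=True):
        backing += stake
        if Fraction(backing, total) > Fraction(2, 3):
            return slot


votes = [(100, 40), (98, 30), (95, 30)]
print(latest_confirmed_slot(votes))
```

With these example votes, slot 100 is backed by only 40% of stake, while slot 98 is backed by 70%, so 98 is the restart candidate in this model.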
Over the last 12 months, Solana Labs and third-party core engineering teams have been working to improve the network and will continue to do so with a focus on stability. For instance, Jump Crypto’s Firedancer team is building a second validator client focused on increasing the network’s throughput, efficiency, and resiliency. Mango DAO developers are working on the tooling needed to build on Solana. The network’s communication layer has transitioned to QUIC, a more advanced networking protocol. Local fee markets have been implemented, and stake-weighted QoS has been incorporated to improve the ability to land transactions. Jito’s MEV client is providing alternative paths for landing transactions, and improvements have been made to RPC infrastructure to reduce its load.
In summary, Solana Labs is taking steps to balance reliability against building a scalable and fast network. The company plans to bring in additional external developers and auditors, work with validators to improve the software release process, form an adversarial team to find exploits, and improve the restart process. Solana Labs and third-party core engineering teams will continue to focus on stability and welcome the community’s input and support in moving the network toward a more decentralized future.