While we'd rather the incident hadn't happened, such events do show off the strength of the ecosystem. I think people use these incidents to gauge the degree of antifragility within a system. They are demonstrations of its ability to grow stronger from volatility. Good communications can highlight this aspect of the system.
The engagement and interest you mention are, IMO, the secret sauce of any such system. So many projects out there think that the solution to all of their problems is algorithmic: if they can just design the right algorithms, everything else will follow. In reality, networks of people solve problems. The most elegantly designed "on-chain governance" will crumble if no one uses the software, and it will likely collapse when people do use it, because then it is forced to interact with unpredictable and irrational (not in a bad way) human actors. The only way to develop the right algorithms, build the right software, and establish functional governance mechanisms is to build a network of engaged and interested people who are constantly kicking the tires, providing feedback, and influencing change in a positive direction. We don't just have that; the Steem community is years ahead of the competition. People often make the mistake of thinking that Steem should be defined by what it has been in the past, when the real power of Steem is what we will work together to make it in the future.
That’s well said.
Also see:
How to always win on Steemit!
Bitcoin has also had one major outage which needed manual intervention from the miners - https://github.com/bitcoin/bips/blob/master/bip-0050.mediawiki
Bitcoin used to have a centralized alert system, where urgent messages could be broadcast over the Bitcoin network by anyone holding the private keys for the alert system. This alert system was later removed from the Bitcoin Core client. I think it might make sense to set up a similar alert system where any of the top 20 active witnesses could send an alert. I don't believe false alerts would be a big problem, since a false alert would (hopefully) cost the witness votes.
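To make the top-20 idea a bit more concrete, here is a minimal sketch of the acceptance check a node could run on an incoming alert. Everything in it is an assumption for illustration: the `Alert` fields, the `accept_alert` function, and the `verify` callback (a stand-in for whatever signature check the chain's key scheme would actually use) are hypothetical names, not part of any existing Steem or Bitcoin API.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Alert:
    # Hypothetical alert format, not an existing protocol message.
    witness: str      # account name of the witness sending the alert
    message: str      # human-readable alert text
    signature: bytes  # signature over the message by the witness's key


def accept_alert(
    alert: Alert,
    top_witnesses: Iterable[str],
    verify: Callable[[str, bytes, bytes], bool],
) -> bool:
    """Accept an alert only if it comes from a current top witness
    and its signature checks out.

    `verify(witness, payload, signature)` is an assumed callback that
    returns True when the signature is valid for that witness's key.
    """
    if alert.witness not in set(top_witnesses):
        return False  # sender is not in the current top-witness set
    return verify(alert.witness, alert.message.encode("utf-8"), alert.signature)
```

Because the set of top witnesses is passed in on every call, a witness that loses votes after a false alert would also lose the ability to send further alerts.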
Thanks for the feedback. I don't think witness responsiveness has ever been an issue, including in this case, but an alert system certainly might be worth considering. It's important to remember that any time spent developing one solution is time not spent developing another. As Steve Jobs famously said, "People think focus means saying yes to the thing you've got to focus on. But that's not what it means at all. It means saying no to the hundred other good ideas that there are."
I wish the platform success! Hoping you release more YouTube videos, Andrew; you're a great spokesman.
Yes. @andrarchy
Why not create a video explaining how this problem was fixed?
Thanks! I'll work on it!
True. The alert system is probably not worth pursuing, but I do believe it's rather important that nodes stay up and serve read-only requests even if block production is halted.
Many witnesses have alarms set up that notify them when they miss blocks.
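For anyone curious what such an alarm boils down to, here is a rough sketch. It assumes you can poll a cumulative missed-block counter for your witness (as far as I know, Steem's witness object exposes one as `total_missed`) and compares it between polls. The function names and the `threshold` parameter are made up for illustration, and how you fetch the counter or deliver the notification is left out.

```python
from typing import Optional


def newly_missed(previous_total: int, current_total: int) -> int:
    """Number of blocks missed since the last poll (never negative)."""
    return max(0, current_total - previous_total)


def check_witness(
    name: str,
    previous_total: int,
    current_total: int,
    threshold: int = 1,
) -> Optional[str]:
    """Return an alarm message if the witness missed at least
    `threshold` blocks since the previous poll, otherwise None."""
    missed = newly_missed(previous_total, current_total)
    if missed >= threshold:
        return f"ALERT: witness {name} missed {missed} block(s) since last check"
    return None
```

In practice you would call `check_witness` on a timer, store the latest counter value after each poll, and forward any returned message to email, SMS, or chat.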