BP Infrastructure Group Update

in #eos, 7 years ago (edited)


This article is a collaboration between HKEOS (https://www.hkeos.com/), EOS Rio (https://eosrio.io/) and EOS Tribe (https://eostribe.io/)

Hello EOS Community,

A couple of weeks ago, Syed (@eoscafe) and Jae (@hkeos) were discussing the current hardware standards on testnets and trying to plan what BPs would need going into the main net. They noticed that the community had many questions about block producing infrastructure, and that a lot of aspects remained unclear.
To find out exactly what would be needed, it seemed necessary to run collaborative performance tests with other BP candidates and to find ways to increase network security together.
HKEOS decided to start a Telegram group called EOS BP Infrastructure (https://t.me/BPInfrastructure) to open up the discussion about developing the highly available, reliable, and resilient architecture all BPs are aiming for.

The group has grown rapidly since then, with @xebb, @eluzgin and @jemxpat leading great discussions about how we could build highly secure yet performant setups. We are still working on these solutions and looking forward to testing them during the month.
Some of the points that came out of this discussion were:

  1. Never expose producing nodes to the public Internet
  2. Have a full node exposed to the Internet with all available APIs
  3. The producer node connects to the full node over a trusted connection
  4. A VPN (e.g. WireGuard) is useful for establishing a trusted peer-to-peer network among top BP nodes
  5. Load balancing and reverse proxies alone are not sufficient remedies for DDoS attacks
  6. Third party solutions like CloudFlare are good but may be prohibitively expensive, we still have to analyze those in-depth.
  7. We need to come up with community solutions for DDoS attacks that are both effective and affordable.
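
Points 1 to 3 above can be turned into a quick sanity check on a planned deployment. The following is only a minimal sketch in Python: the `Node` structure, the role names and the `check_topology` helper are hypothetical and not part of any EOSIO tooling, and "public" is approximated with the standard `ipaddress` module.

```python
import ipaddress
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    role: str        # hypothetical labels: "producer" or "full"
    listen_ip: str   # address the node's p2p port is bound to
    peers: list = field(default_factory=list)  # names of nodes it connects to


def is_public(ip_str):
    """Treat wildcard binds (0.0.0.0) and globally routable addresses as public."""
    ip = ipaddress.ip_address(ip_str)
    return ip.is_unspecified or ip.is_global


def check_topology(nodes):
    """Return a list of violations of points 1-3 (empty list means OK)."""
    by_name = {n.name: n for n in nodes}
    problems = []
    for n in nodes:
        if n.role != "producer":
            continue
        # Point 1: producers must not listen on a public address.
        if is_public(n.listen_ip):
            problems.append(f"{n.name}: producer listens on public address {n.listen_ip}")
        # Point 3: producers should only connect to known full nodes.
        for p in n.peers:
            peer = by_name.get(p)
            if peer is None or peer.role != "full":
                problems.append(f"{n.name}: peer {p} is not a known full node")
    # Point 2: at least one full node should be reachable from the Internet.
    if not any(n.role == "full" and is_public(n.listen_ip) for n in nodes):
        problems.append("no full node is exposed to the public Internet")
    return problems
```

For example, a producer bound to `10.0.0.2` that peers only with a full node bound to a routable address yields an empty list, while a producer bound to a routable address is reported.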

At the same time, Eugene (@eostribe) introduced us to WireGuard, a fast and secure VPN tunnel that can be used as an additional security layer when launching a chain. Eugene and Igor (@eosrio) organized a new testnet for us in which all BP nodes used the VPN to mask their IP addresses from potential DDoS attacks during the boot process.

We managed to set up a trusted peer-to-peer network using the WireGuard software and collaborated on writing a guide to help newly joining BP candidates set it up:

https://busy.org/@eluzgin/how-to-configure-wireguard-vpn-network
https://steemit.com/eos/@eosrio/how-to-setup-a-secure-vpn-based-eos-io-network
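
The guides above have the actual setup steps. As a rough illustration of what each node ends up with, here is a minimal Python sketch that renders a `wg-quick` style configuration file for one node and its trusted peers. The `render_wg_config` helper, the dictionary keys, and the keepalive value of 25 seconds are assumptions made for this example; the keys and addresses you pass in are placeholders and must come from your own `wg genkey` / `wg pubkey` output.

```python
def render_wg_config(private_key, address, listen_port, peers):
    """Render a wg-quick style config for one node.

    peers: list of dicts with hypothetical keys
           'public_key', 'vpn_ip' and 'endpoint' (host:port).
    """
    lines = [
        "[Interface]",
        f"PrivateKey = {private_key}",
        f"Address = {address}",        # this node's address inside the VPN
        f"ListenPort = {listen_port}",
    ]
    for peer in peers:
        lines += [
            "",
            "[Peer]",
            f"PublicKey = {peer['public_key']}",
            # Only accept traffic from this peer's single VPN address.
            f"AllowedIPs = {peer['vpn_ip']}/32",
            f"Endpoint = {peer['endpoint']}",
            # Assumed value; keeps NAT mappings alive between peers.
            "PersistentKeepalive = 25",
        ]
    return "\n".join(lines) + "\n"
```

Restricting `AllowedIPs` to a single `/32` per peer is a design choice for this sketch: each trusted BP is reachable only at its own VPN address.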

After the VPN network was established, we followed the same process as for setting up a testnet. However, we used the VPN IP addresses for peers and, as an additional layer of security, restricted who can join the network using the peer key settings (native to eos.io).
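
As a rough illustration of that combination, the Python sketch below renders the peer-related lines of a `config.ini`. It is a sketch under stated assumptions: the option names (`p2p-listen-endpoint`, `p2p-peer-address`, `allowed-connection`, `peer-key`, `peer-private-key`) are the net plugin options as we understand them, and both the names and the exact quoting should be checked against the nodeos version you run. The `render_p2p_config` helper and its arguments are hypothetical.

```python
import json


def render_p2p_config(vpn_ip, port, own_public_key, own_private_key, peers):
    """Render the peer-related part of a config.ini (assumed option names).

    peers: list of dicts with hypothetical keys 'vpn_ip', 'port', 'public_key'.
    """
    lines = [
        # Listen on the VPN address only, not on a public interface.
        f"p2p-listen-endpoint = {vpn_ip}:{port}",
        # Accept only peers whose keys are listed below.
        "allowed-connection = specified",
        # Key pair this node uses to authenticate itself to its peers.
        f"peer-private-key = {json.dumps([own_public_key, own_private_key])}",
    ]
    for peer in peers:
        lines.append(f"p2p-peer-address = {peer['vpn_ip']}:{peer['port']}")
        lines.append(f"peer-key = {json.dumps(peer['public_key'])}")
    return "\n".join(lines) + "\n"
```

Every key passed in is a placeholder; the real values are the ones exchanged privately among the trusted peers.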

As previously discussed, we started the network without a BIOS boot node in order to establish a pre-launch mesh network of nodes. Then, once enough nodes were up on the network, the BIOS node joined and bootstrapped block production for all nodes.
While the VPN network does not provide full DDoS protection, it does reduce the overall attack surface and establishes secure communication channels between peer BP nodes.
DDoS protection needs to be added as an additional tier in the deployment architecture that shields servers from those attacks. We as a community will continue to work on the recommended approach.

HKEOS also hosted a call on May 4th (link at the bottom) with several other BP candidates (EOS Rio, EOS Sydney, EOS Africa, EOS Nodeone, EOSLaoMao, EOS Argentina) to further discuss hardware, network security, and failover node setups.

First, all participants gave a brief introduction to the current state of their infrastructure development and shared their experiences with different types of nodes (bare metal, cloud, Docker, etc.). There was some discussion about failover for the BP nodes, during which EOS Africa explained their custom producer plugin. Among other topics, participants talked about the importance of training, as all BPs should be able to perform each and every step of the launch procedure manually. They also compared this manual approach with a fully automated network boot process.

Looking forward, we plan to start a new chain where we can closely simulate the launch of the actual main net. This means that the BP candidates will be exchanging information through a secure platform such as Keybase with proper data encryption. We will put our ideas into action by launching with multi-node setups, and attempt to simulate potential DDoS attacks with our own “hackers”. Finally, with the release of DAWN 4.0, we will be able to test new features, especially voting.

This is a first draft of the proposed steps to securely launch the main net:

  1. All BPs should prepare at least 4 nodes: one producer (hidden configuration) and 3 standard nodes (preferably on different networks)
  2. Every BP should choose its own web of trust (6 to 10 other trusted entities)
  3. Every peer will publish their connection data, encrypted so that only the trusted peers have access to it. This is where a solution like Keybase can be useful.
  4. VPN connections will be established
  5. The BIOS entity is selected randomly, but without the other peers knowing who it is, to protect its connection. It then publishes a genesis file somewhere secure and accessible to the other BPs, while staying hidden behind another full node.
  6. Standard nodes prepare their config.ini files with trusted peer keys and use the genesis file to start a non-producing chain
  7. At some predetermined time, the BIOS node will secretly bootstrap the chain
  8. Appointed BPs (ABPs) are defined and production switches from eosio (BIOS) to the ABPs
  9. The BIOS node verifies that all producers have successfully joined and sets a deadline for removing from production those that have not
  10. BIOS sets the system contract
  11. BIOS deploys the EOS ERC-20 token distribution
  12. Some nodes on the network perform an independent verification of the distribution
  13. BIOS secretly publishes its private key somewhere without disclosing its identity, so other nodes can verify its steps.
  14. BP candidates register themselves on the eosio.system contract
  15. Voting starts
  16. After 15% of the token holders place their votes, the EOS blockchain is activated!
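
Step 2 only works if the individual webs of trust join up into a single network. The Python sketch below shows one way this could be checked before launch. It rests on assumptions that are not part of the draft: trust is treated as usable only when it is mutual, and the `check_web_of_trust` helper and its input format (a mapping from each BP name to the names it trusts) are hypothetical.

```python
from collections import deque


def mutual_trust_graph(trust):
    """Keep only links where both BPs list each other (assumption of this sketch)."""
    return {
        bp: {p for p in peers if bp in trust.get(p, ())}
        for bp, peers in trust.items()
    }


def check_web_of_trust(trust, min_peers=6, max_peers=10):
    """Return a list of problems; an empty list means the plan looks consistent."""
    problems = []
    # Step 2: every BP should trust between 6 and 10 other entities.
    for bp, peers in trust.items():
        if not min_peers <= len(peers) <= max_peers:
            problems.append(f"{bp}: trusts {len(peers)} peers, expected {min_peers}-{max_peers}")
    # Breadth-first search over mutual links to find isolated groups.
    graph = mutual_trust_graph(trust)
    if graph:
        start = next(iter(graph))
        seen = {start}
        queue = deque([start])
        while queue:
            for nxt in graph[queue.popleft()]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        unreachable = sorted(set(graph) - seen)
        if unreachable:
            problems.append(f"not reachable from {start}: {', '.join(unreachable)}")
    return problems
```

If two groups of BPs trust only each other internally, the check reports the second group as unreachable, which would mean blocks from one group could never propagate to the other over trusted links.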

We are so grateful to have such a collaborative and hardworking community behind these efforts.
Hopefully, we will be able to come back soon with more results.
Thanks for reading!

Authors:

Jae (HKEOS): https://www.hkeos.com/
Igor (EOS Rio): https://eosrio.io/
Eugene (EOS Tribe): https://eostribe.io/

Links:

EOS BP Infrastructure Telegram: https://t.me/BPInfrastructure

BP Infrastructure Call:


Great post! Thank you for summarizing the discussions about the important security issues around the launch process!

Third party solutions like CloudFlare are good but may be prohibitively expensive, we still have to analyze those in-depth.

I do not believe this is true. The most expensive Cloudflare tier is $5000/mo. That is peanuts compared to the other infrastructure costs of running a BP.

At scale, the cost of Cloudflare's enterprise tier services may not be tremendous compared to server costs, but 5,000 USD/month is quite significant for a testing phase.

A great post! I am very interested in the verification step after the snapshot is installed. Any ideas on this?

We're still working on it with the rest of the BP group, and will implement it on the next chain we are launching.