Sparkster Development Update. WE 14th January 2019.


On December 31st, 2018, we launched our Alpha Main Net. Since then, we have been optimizing the system and resolving issues discovered since its deployment, some of which we will detail here. These enhancements will bring us closer to a stable product.

[Image: nodes.jpg]

As mentioned on our GitHub and in various communication channels, we use an automated voting process to determine, every hour, the list of active Master Nodes (commonly called the M21). To accomplish this automated voting, a node sends its vote to all other Master Nodes and simultaneously accepts votes from them, provided open voting is in session. The perennial challenge in distributed computing is network latency, and we saw it manifest among our Master Nodes: a node would only accept votes while it was inside its own open voting interval. This meant that if a node received a vote just before open voting started for that node, the node would drop the vote.

We have made the voting process resilient to network latency by allowing nodes to accept ballots at any time between the close of the last open voting interval and the start of the next voting interval. Ballots remain in memory until the receiving node considers them at the close of the next open voting interval. This solution works because ballots are produced using the timestamp of the left edge of the next voting interval; therefore, no matter when a node sends its vote, as long as it signs the ballot using the proper timestamp, that vote will be valid.
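
To make the timing rule concrete, here is a minimal sketch of how a receiving node might validate a ballot against the left edge of the next voting interval and buffer it until tallying. The names (`Ballot`, `nextIntervalStart`, `acceptBallot`) and the hourly interval length are illustrative assumptions, not our actual implementation.

```cpp
#include <cstdint>
#include <vector>

// Illustrative sketch of the relaxed acceptance rule, assuming hourly voting
// and Unix-second timestamps. All names here are hypothetical.
struct Ballot {
    int64_t intervalStart; // left edge of the voting interval the ballot targets
    // signature, voter id, etc. elided
};

constexpr int64_t votingIntervalSeconds = 3600; // one M21 election per hour

// Left edge of the next voting interval, derived from the current time.
int64_t nextIntervalStart(int64_t nowSeconds) {
    return (nowSeconds / votingIntervalSeconds + 1) * votingIntervalSeconds;
}

// Accept a ballot stamped for the upcoming interval whenever it arrives and
// buffer it; buffered ballots are tallied when that interval closes.
bool acceptBallot(const Ballot& ballot, int64_t nowSeconds,
                  std::vector<Ballot>& pendingBallots) {
    if (ballot.intervalStart != nextIntervalStart(nowSeconds))
        return false;                 // stale or mistimed ballot
    pendingBallots.push_back(ballot); // held in memory until the interval closes
    return true;
}
```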

Another challenge in distributed computing is parallelism. We saw this when blocks were not being produced consistently: we were getting both duplicate and missing block numbers; in other words, some nodes were ahead of others. By design, each Master Node produces only a subset of blocks, with the spacing between its block numbers equal to the number of Master Nodes producing blocks. For example, if there were four Master Nodes in existence, one node would produce blocks according to the sequence {1, 5, 9, …}.
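
As a rough illustration of this striping (the function and parameter names are hypothetical):

```cpp
#include <cstdint>

// With nodeCount Master Nodes, the node at (0-based) position nodeIndex
// produces only every nodeCount-th block.
bool shouldProduceBlock(uint64_t blockNumber, uint64_t nodeIndex, uint64_t nodeCount) {
    return blockNumber % nodeCount == nodeIndex;
}
// e.g. with nodeCount == 4 and nodeIndex == 1, this selects blocks {1, 5, 9, ...}
```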

These issues were caused by a data race when a new node joined the network. If we relied on every node to publish the current block number to the DHT, there was no way to be sure whether the currently published information was current or stale. We chose not to apply a global lock while the new block number was being written to the DHT because of the performance penalty; moreover, global locks are not reliable over a distributed network due to latency.

Instead of recording the current block number, our solution was to record a genesis timestamp in the DHT. This timestamp represents the time at which the first Master Node started up, thus creating the network. Since blocks are produced every second, all a node has to do to find the current block number is calculate the number of seconds between the genesis timestamp and the current timestamp. This solution also spares us from having to fetch from and push to the DHT every second, reducing load on the network. Once a node fetches the genesis timestamp, it never needs to query the DHT for it again during its lifetime, since it can keep track of block numbers internally.
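
A minimal sketch of that calculation, assuming one block per second and a genesis timestamp (a Unix timestamp, here `genesisSeconds`) fetched once from the DHT at start-up:

```cpp
#include <chrono>
#include <cstdint>

// The current block number is simply the number of whole seconds elapsed
// since genesis, so the node never has to read a block counter from the DHT.
uint64_t currentBlockNumber(int64_t genesisSeconds) {
    const int64_t nowSeconds = std::chrono::duration_cast<std::chrono::seconds>(
        std::chrono::system_clock::now().time_since_epoch()).count();
    return static_cast<uint64_t>(nowSeconds - genesisSeconds);
}
```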

The final optimization we have been working on concerns memory. Since the Master Nodes are pushing to the DHT every second and also generating new blocks, their memory footprints become large over time if left unmanaged. Typically, when a program’s memory footprint grows rapidly and uncontrollably over time, the cause is a memory leak: the program loses access to memory it has allocated and is therefore unable to relinquish that memory back to the operating system.

In our case, what we were seeing was not a memory leak, although at first glance it certainly looked like one. Throughout our design, we were careful to avoid pointers as much as possible and to delete them when we were done with them. Heavy use of pointers is discouraged even when memory is managed properly, because heap allocations are expensive; even with smart pointers such as std::unique_ptr relieving us from explicitly calling delete, we still prefer stack-allocated objects over heap-allocated ones simply for performance.
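
The trade-off looks roughly like this (a simplified sketch; `Block` and `produceBlock` are placeholders, not our real types):

```cpp
#include <memory>

struct Block { /* fields elided */ };

void produceBlock() {
    // Heap allocation: safe with std::unique_ptr (no explicit delete), but it
    // still pays for an allocator call and a pointer indirection on every use.
    auto heapBlock = std::make_unique<Block>();

    // Stack allocation: no allocator involved at all; this is the form we
    // prefer whenever the object's lifetime allows it.
    Block stackBlock;

    (void)heapBlock;
    (void)stackBlock;
}
```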

The Master Node’s large memory consumption was caused by two things: first, the data structure we used to store blocks was growing without bound; and second, the memory used by OpenDHT was also growing without bound.

We made our first enhancement to the memory footprint by “force-deallocating” the data structure where we store blocks and by changing its data type. Previously it used std::unordered_map, C++’s hash map implementation. The problem with std::unordered_map is that it keeps a sparse bucket array in order to offer constant-time operations, so the container is, by nature, very large. We therefore switched the data structure to std::map. The map structure only allocates memory for the elements it actually holds and in some cases outperforms the hash map because of its smaller size. More importantly, we determined that for block storage we simply did not need a hash map: block fetches (done by the client) are not time-critical, so using a hash map here was an unnecessary waste of resources for little to no realized performance benefit.

Once we switched the data structure to std::map, we cleared the map’s memory not only by calling clear after a sufficiently large number of blocks had been written, but also by swapping the map’s memory with that of an empty map. We had been doing this with the unordered map as well, but saw no benefit, since std::unordered_map has such a large baseline memory footprint. Swapping with an empty map is necessary because calling clear doesn’t necessarily de-allocate the memory held by the data structure; it only destroys the individual items in the map. By swapping the map’s memory with that of an empty map, we can be sure we are starting with a truly empty map.
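
A minimal sketch of the swap idiom, with placeholder names for the block type and its container:

```cpp
#include <cstdint>
#include <map>
#include <string>

struct Block { std::string data; }; // placeholder for the real block type

std::map<uint64_t, Block> blockStore;

// Illustrative "force-deallocation": clear() destroys the stored elements,
// while swapping with a freshly constructed map guarantees the container
// itself releases all of the memory it was holding.
void flushBlockStore() {
    blockStore.clear();
    std::map<uint64_t, Block>().swap(blockStore);
}
```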

The second contributor to the Master Node’s large memory footprint was OpenDHT. Every node writes the block hash to the DHT so that a node that has just joined the network knows where to start from. Since every node was writing to the DHT every second, the accumulated values began to occupy a large amount of memory, which only grew as more values were written. Considering this, we set the expiration time of messages to no more than thirty seconds (as opposed to the default ten minutes). This configuration allows stale values to be dropped rather than unnecessarily retained in memory, since most of the time we only care about the latest value. Our analysis of OpenDHT showed that, as expected, the library resized its data structures appropriately and handled deallocation itself as values fell off the expiration interval, keeping the overall memory footprint small and tightly managed.
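
As a sketch of that expiration setting, assuming OpenDHT’s dht::ValueType is used to describe the records we publish: the type id and name below are invented for illustration, only the 30-second expiration reflects the configuration described above, and how the type is wired into the running node depends on the OpenDHT version in use.

```cpp
#include <opendht.h>
#include <chrono>

// Hypothetical value type for block-hash records: values of this type expire
// after 30 seconds instead of OpenDHT's default 10 minutes, so stale entries
// are dropped from memory quickly.
static const dht::ValueType BLOCK_HASH_TYPE(
    /*id=*/1001,
    /*name=*/"sparkster:block-hash",
    /*expiration=*/std::chrono::seconds(30));
```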

Along with these performance enhancements, we have also made adjustments (mostly to the Master Node) to make sure it performs quickly enough to keep up with the block production rate. These changes include removing places where we were unnecessarily acquiring locks and eliminating expensive operations that do not need to run on every block production, such as calculating the base-64 encoded string of the node’s public key. Now, we calculate the string only once and store it for the lifetime of the program.
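
The caching itself can be as simple as the sketch below; the helper names stand in for the node’s real key accessor and base-64 encoder and are assumptions for illustration.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helpers standing in for the node's real key accessor and
// base-64 encoder.
std::string base64Encode(const std::vector<uint8_t>& bytes);
std::vector<uint8_t> nodePublicKey();

// The encoded key is computed on first use and then reused for the lifetime
// of the process, instead of being re-encoded on every block production.
const std::string& encodedPublicKey() {
    static const std::string cached = base64Encode(nodePublicKey());
    return cached;
}
```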

We are still working through a few more issues and hope to have the second release out soon. In the meantime, we appreciate your feedback on Sparkster’s Decentralized Cloud.
