In the previous post here (please do read this first!), I talked about two key issues that affect dApps: the bootstrapping problem, and the issue of outsourced verification. I partly discussed how the first problem can be addressed: when bootstrapping, we need to be careful that we are not the victim of a man-in-the-middle attack. Once bootstrapped, we have an initial point of verification; but that brings us to the second, arguably harder, question: “how do we maintain verification without a trusted party doing that for us, or without doing it ourselves?” To come up with solutions to this problem, we first need to take a step back and look at two things: why we should truly and fully decentralize dApps, and, within our environment, how we can do so.
Why should we decentralize Apps / dApps?
In my previous post, I posed the question “What advantages does a dApp offer us if it is only ever used exactly as an App?” Indeed, when you only ever use a dApp as an App, you also throw away many (if not all) of the properties that differentiate them. If you only ever use a trusted third party to interface with an application that does not require a trusted third party, you might as well require one. Doing so would give you many benefits: less complexity, improved functionality, scalability, and more. If the Steemit.com interface replaced its working internals (STEEM) with a centralized system overnight, it could continue to operate extraordinarily well as a centralized system. Rewards could be distributed on time and content could be posted and displayed to users, but none of it would ever actually need to pass through to the STEEM blockchain. Very few people would notice the sleight of hand -- only those running their own validation and noticing that the content on the site does not match the data they are validating.
Yet, if you are here on Steemit, and if you are using cryptocurrencies -- you probably already know the answer to the question of “why” we should decentralize Apps. We want to cut out the middle-man and remove trust from the equation. We are sick of middle-men taking a cut, controlling what we see, and telling us what we can or cannot do. Now, I’m not saying Steemit does any of these things, but when we develop these tools of the future, we should not design them in the same way as the past -- requiring a third party -- or we will repeat the mistakes of history. Fresh, new centralized systems are rarely designed to be corrupt, but we later see the centralization of power become the catalyst for corruption. When we design our fresh, new dApps once again with centralization, the first iteration does not appear corrupt -- but through slow evolution the opportunity arises.
Thus, when I refer to truly decentralizing dApps, what I refer to is the idea of removing the outsourcing of verification. It is, in fact, the last step towards removing the middle-men from the equation. Only when a dApp no longer requires nor de-facto uses a trusted third party can it become truly decentralized.
To do this, first, we need to understand our environment itself: Cryptocurrencies, and Blockchains, and the opportunities they grant us to do this.
What differentiates Cryptocurrencies from Blockchains?
When we talk about “Cryptocurrencies”, the term “Blockchain” is often used interchangeably as a description. However, if you ask me (and many other academics), the differentiation between Cryptocurrencies and Blockchains is quite important. A Blockchain is a type of Cryptocurrency, but not all Cryptocurrencies are Blockchains. Blockchain, a term introduced by Bitcoin, referred to the idea of keeping a timeline in a linked list of blocks with an ever-moving “timestamp” of proof of work being needed to keep the list secure.
However, not all Cryptocurrencies are deployed this way. STEEM in fact, in the purist definition, is not a Blockchain, it is only acting like one. There. I said it. And, as we will see, perhaps this is a good thing! The data-structure that STEEM deploys -- a linked list in the form of a batch of transactions linked to a previous batch, with total ordering -- certainly seems to resemble a “Blockchain”, though. So what is the difference?
The key differentiator is PoS vs PoW. While most “Blockchains” would have you believe these are two different sides to a similar coin, these two systems are fundamentally different at their core, so much so that it feels like comparing the Sorting problem to the Travelling Salesman problem. While PoW systems with their ever-moving work-time-stamps require a linearization of data for their properties (Bitcoin makes the correct reference to Markov Chains), PoS, while it certainly could use the same structure, does not specifically require it to function. In PoW systems, the world is governed by probabilities: the likelihood of orphaned blocks in the chain, of 51% attacks creating longer chains, and of a transaction being confirmed. That last probability grows ever closer to 1, until, for all intents and purposes, it might as well be 1. But -- it’s not.
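To make the "close to 1, but never 1" point concrete, here is a minimal sketch of the attacker catch-up calculation from the Bitcoin paper (section 11), ported to Python. The function names are illustrative, and the formula assumes the attacker controls a share q of the hashpower that is below one half.

```python
import math


def attacker_success_probability(q: float, z: int) -> float:
    """Probability that an attacker with hashpower share q (q < 0.5)
    ever catches up after the honest chain is z blocks ahead."""
    p = 1.0 - q                # honest share of the hashpower
    lam = z * (q / p)          # expected attacker progress (Poisson mean)
    total = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam ** k / math.factorial(k)
        total -= poisson * (1.0 - (q / p) ** (z - k))
    return total


def confirmation_confidence(q: float, z: int) -> float:
    """Confidence that a transaction buried under z blocks stays put."""
    return 1.0 - attacker_success_probability(q, z)


if __name__ == "__main__":
    for z in (0, 1, 5, 10):
        print(z, confirmation_confidence(0.1, z))
```

For q = 0.1 the paper's own table lists a catch-up probability of roughly 0.0009137 after 5 blocks: tiny, but strictly greater than zero, which is exactly the gap between PoW confirmation and probability 1.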
And this key difference drives PoS to be different. In PoS, we can mathematically point to an event that confirms a transaction as impossible to reverse -- with probability 1. Instead of ever-moving work-time-stamps, PoS systems actually use the much older, well-studied concept of “Byzantine Fault Tolerance” (BFT), a property investigated in academia for over 30 years. And indeed, BFT systems do not require a lot of the properties that PoW blockchains do -- the most notable being the linked-list “block” data structure. We have already seen Directed Acyclic Graph structures employed by cryptocurrencies such as NANO, which have begun to take advantage of the innate properties of BFT and the historical work done on it. As another example of a property that is used but not required, (probabilistic) immutability is a staple of “Blockchains”, and while often implemented, it is also not a requirement of BFT.
BFT Cryptocurrencies (which encompass PoS and DPoS, and thus include STEEM), when compared to PoW currencies, have a few more important properties that differentiate them. The first one that is often used to compare the two is the permissionless aspect. Indeed, BFT cryptocurrencies are by nature permissioned: to enter the ecosystem, your entry point requires permission from within the ecosystem. This is often in the form of a barter transaction (buying entry from another user). With PoW systems, the entry point is by nature permissionless. Although you can always enter with permission (again, via barter), you can also enter the ecosystem (with some probability) without permission (by mining a block). Interestingly, by PoW being permissionless, it also removes the bootstrapping problem (the proof of this is left as an exercise to the reader, though if you’re curious, leave a comment and I can try to explain).
A second property is the verifiability. When you are verifying the status of a transaction, with PoW you can never be sure a transaction is valid, as there is always some chance of an attacker holding more hashpower in reserve, even to the point of re-writing the entire history. Although in practice the economic structure makes this unlikely, academically this makes validation hard, as you can never ratify a transaction with probability 1. However, with BFT, we have a structure in place: when 2/3rds+1 of validation entities have signed off on a transaction, this transaction is confirmed. Period. The transaction cannot be undone within the rules of the system.
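The "2/3rds+1" rule can be sketched in a few lines. This is only an illustration under stated assumptions: the validator names and the count of 21 are hypothetical, and the set of signers stands in for signatures that a real client would verify cryptographically.

```python
from typing import Set


def quorum_size(n_validators: int) -> int:
    """Smallest count that is strictly more than two thirds of the validators."""
    return (2 * n_validators) // 3 + 1


def is_confirmed(signers: Set[str], validators: Set[str]) -> bool:
    """A transaction is confirmed once a quorum of known validators signed it.
    Sign-offs from anyone outside the validator set are ignored."""
    valid_signers = signers & validators
    return len(valid_signers) >= quorum_size(len(validators))


if __name__ == "__main__":
    validators = {f"witness{i}" for i in range(21)}   # hypothetical set
    signers = {f"witness{i}" for i in range(15)}
    print(quorum_size(21), is_confirmed(signers, validators))
```

Unlike the PoW case, the answer here is a plain yes or no: once the threshold is met, no later event inside the rules of the system changes it.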
(Two notes on this -- one, a hardfork, which could change this data, is rather the instantiation of a new system, and not the continuation of the old one, and thus does not invalidate this design. Two, if the system by design allowed “reversibility”, this reverse of the transaction is a new transaction, and the system remains incremental-only. As a more understandable example of why this is the case, consider returning a broken product to a store: the return of your money is considered a new transaction, not a deletion of the previous purchase.)
But enough of the history lesson and definition of terms. Can we use the properties of BFT systems to our advantage?
Breaking the Shackles of “Thinking like a Blockchain”: using BFT for Outsourced Verification
Currently, when a light client connects to a trusted endpoint, they are assuming that the endpoint is feeding them the correct information (and thus outsourcing verification to this third party). We inherited this loose sense of data aggregation from “Blockchains,” but constraining our thinking to the restrictions of Blockchains is not required if we consider the system as a traditional BFT system.
An interesting property of BFT is that, not only can we use it for ratifying data (in the form of validating transactions), we can also use it for distributing data. Instead of a light client requesting data from a single trusted source, suppose the client collects data from each of the validation sources, using BFT quorum behaviour. When this data is collected, either the returned information that has a 2/3rds+1 majority consensus is indeed the fully validated and ratified data, or the system of validating sources is itself Byzantine. This is the secret sauce. By collecting data via the BFT quorum, you can get full validation status without performing the validation yourself.
Notably, this ability to both ratify and collect data is unique to BFT cryptocurrencies. PoW systems do not have it: there is no way to collect any data with probability 1, because (i) confirmation never reaches probability 1, and (ii) there is no provable set of validating sources to form a quorum. As a simple proof for (ii), the set of sources that form a quorum for PoW is both known and unknown hashpower -- and hashpower can only be proved to exist; it cannot be proved not to exist. With BFT systems, we have a definition of the quorum by design, and thus the ability to collect any data via quorum.
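A quorum read of this kind might look roughly like the sketch below. Everything here is an assumption made for illustration: the endpoints are plain Python callables rather than real RPC clients, answers are compared as strings, and a real light client would compare a canonical serialization and verify each validator's signature.

```python
from collections import Counter
from typing import Callable, Dict, Optional


def quorum_read(endpoints: Dict[str, Callable[[str], str]], query: str) -> Optional[str]:
    """Ask every validator endpoint the same query and return the answer
    backed by a 2/3rds+1 quorum, or None if no answer reaches it."""
    needed = (2 * len(endpoints)) // 3 + 1
    answers: Counter = Counter()
    for name, fetch in endpoints.items():
        try:
            answers[fetch(query)] += 1
        except Exception:
            # An unreachable or failing validator simply contributes no vote.
            continue
    if not answers:
        return None
    value, votes = answers.most_common(1)[0]
    return value if votes >= needed else None


if __name__ == "__main__":
    honest = lambda q: "balance=10"
    liar = lambda q: "balance=999"
    endpoints = {f"witness{i}": honest for i in range(15)}
    endpoints.update({f"witness{i}": liar for i in range(15, 21)})
    print(quorum_read(endpoints, "get_balance alice"))
```

Returning None rather than the most popular answer is deliberate: if no value reaches the quorum, the client has learned that it cannot trust what it collected, which is different from learning the data.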
How would this enable or benefit dApps?
I will once again acknowledge the Bootstrapping problem: indeed, a dApp does need to first identify the current set of validators (e.g. witnesses). However, once the set of validators is known, changes to the set can be determined via the current set. I will again compare the Bootstrapping problem to what most users already face on the internet. When determining whether a program is a valid one or a virus, some amount of investigation needs to occur. Once the user determines a program is not a virus, and indeed performs what it is advertised to do, the user can operate the program without worry. In a similar vein, once the dApp identifies the current witness set, it can then continue without worry. From the user’s perspective, once the user identifies that the program is indeed a proper implementation of BFT quorum validation (and is not a malicious program), the user can use it without worry.
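The idea that changes to the validator set can be followed from the bootstrapped set could be sketched as below. This is a hypothetical model, not how STEEM encodes witness changes: each update is a pair of (proposed set, approvals), and the approvals stand in for signatures the client has already verified.

```python
from typing import FrozenSet, Iterable, Tuple

Update = Tuple[Iterable[str], Iterable[str]]  # (proposed set, approving validators)


def follow_updates(bootstrap: Iterable[str], updates: Iterable[Update]) -> FrozenSet[str]:
    """Walk forward from the bootstrapped validator set, accepting each change
    only if a 2/3rds+1 quorum of the *current* set approved it."""
    current = frozenset(bootstrap)
    for proposed, approvals in updates:
        needed = (2 * len(current)) // 3 + 1
        if len(set(approvals) & current) < needed:
            raise ValueError("validator set change lacks a quorum of the current set")
        current = frozenset(proposed)
    return current


if __name__ == "__main__":
    start = {"a", "b", "c", "d"}
    updates = [({"a", "b", "c", "e"}, {"a", "b", "c"})]
    print(sorted(follow_updates(start, updates)))
```

Only the first set has to be obtained out of band; every later set is vouched for by the one before it, which is why the bootstrapping step only needs to happen once.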
Upon completing the bootstrapping process, the user can then be sure that the interface and all associated data it retrieves (via quorum) is thus valid, or the system itself is invalid. dApps thus become truly decentralized, with no trusted interface required -- the validation of collected data is outsourced only to the real validators ratifying the system.
So what would such a system look like, or require, for STEEM?
Unfortunately, the current environment of STEEM does not support such a design. As you can imagine, requiring the validators (witnesses) to also validate data requests would require them to offer up public API endpoints that respond to RPC requests for data. To date, while we have a few APIs offered up publicly, there is still a large amount of reliance on the Steemit Inc. provided API. Indeed, we will need 2/3rds+1 of witnesses to offer APIs to ensure dApps can collect validated data when there is no attacker, and all witnesses to offer APIs to ensure validated data collection in the face of a near-Byzantine attacker.
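The arithmetic behind those two thresholds can be checked with a short sketch. It assumes the classical BFT fault bound, where at most (n - 1) // 3 of n validators may be faulty, and it treats a faulty witness as one whose API withholds or falsifies answers; the figure of 21 witnesses is used only as an example.

```python
def quorum_size(n: int) -> int:
    """Matching responses a client needs: strictly more than two thirds of n."""
    return (2 * n) // 3 + 1


def max_faulty(n: int) -> int:
    """Largest number of faulty validators a classical BFT system tolerates."""
    return (n - 1) // 3


def endpoints_needed(n: int, faulty: int) -> int:
    """API endpoints required so that a quorum of honest answers still
    arrives when `faulty` of the endpoints lie or stay silent."""
    return quorum_size(n) + faulty


if __name__ == "__main__":
    n = 21  # example validator count
    print(endpoints_needed(n, 0))              # no attacker
    print(endpoints_needed(n, max_faulty(n)))  # attacker at the BFT limit
```

With no attacker the requirement is just the quorum itself, and with an attacker at the fault limit the quorum plus the faulty endpoints adds up to the whole validator set, which matches the claim that every witness would need to offer an API.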
I will not lie, a shift towards a truly decentralized model is not easy, and there are many technical challenges that do come with it. But I do believe that progressing towards such a model is in our collective best interests. There are two aspects that we can target, (i) decentralizing standalone dApps, and (ii) decentralizing WebApps like Steemit.com.
The first, you can think of as ensuring certain programs like wallets (e.g. Vessel) have a truly decentralized model for interfacing with data on the STEEM blockchain. These standalone Apps would become dApp interfaces that are genuinely powered in a fully verifiable way, without any middlemen.
The second, as a more long-term goal, is decentralizing web apps. This is far harder to do: the front ends (e.g. the site that deploys the website) would be a third party, but the underlying data requests would be directed in a quorum manner to ensure decentralized data collection. To implement such a design for Steemit.com itself, it would require a fair amount of resources from all validators. Further, it would be much harder to prove properties about internal workings if it remains hosted externally (on a website), rather than downloaded locally. Rather, a more decentralized design would not be served from the web, and instead be used from a standalone downloadable app.
It will be a long road to fully decentralize the future, but won’t it be nice to say STEEM is truly end-to-end decentralized, with no trusted parties, for all users?
Putting My Money Where my Mouth Is
While building this train of thought towards the decentralization of dApps, and realizing the solution would be to have witnesses offering API endpoints for dApps to use, I decided I would need to put my money where my mouth is.
Soon I will be announcing a high-performance, custom-built server that I will be deploying solely for the purpose of publicly servicing dApps and end-users. Many witnesses already offer such services, and I believe having a cooperative effort to get more API resources up and running will be a big step towards starting the decentralization process for STEEM.
Stay tuned for the announcement about this! (I’m hoping to have it ready before Steemfest -- I just finished the build today!)
Excellent post, Scott. I remember talking with you about some of these differences a while back, and you've laid it all out quite nicely here as well.
I've been thinking for quite some time that I should get a full API node up and running and this has me thinking all the more how important that is for top witnesses. I look forward to hearing more about your build when you release details. Hopefully we'll continue to see improvements to how Steem manages data so we won't need expensive, experimental servers or days of replay time to run full nodes.
'Bout time I get some thoughts to paper, eh? :)
Indeed, the "part 3" of this series will really just be me geeking out about building the ultimate STEEM server -- hopefully my experience can help inform other witnesses and we can push for a more decentralized data environment!
Re- how steem manages data, I totally agree. It's something I want to dig in to soon, since it seems like it's currently a second class citizen as it currently "works". We can do much, much better.
It may relate to some of the things you mention in your post, but what do you think of this EOS: An Architectural, Performance, and Economic Analysis? I wonder how much of it relates to Steem as well. I've been asked to take a look and give a perspective and figured you might be interested as well.
Saw that this morning, first impression is that it has an extraordinary amount of inaccuracies.
I might do a peer review to explain how misleading their statements are.
Wow, part II really delivered!
First, I just want to congratulate you on stepping up your game developing and releasing your own API node, that's awesome! Hopefully more top witnesses follow your example.
It's really fascinating to learn more about the inner workings of the blockchain (and/or BFTs!). I particularly appreciate that you took the time to explain the distinctions between BFT (PoS/dPoS) and Blockchain (PoW) in such detail. I wasn't aware of the particulars that differentiate BFT from its predecessors.
I'm not going to lie, I'll have to read this post several times to really wrap my head around it but thanks so much for this - it's very valuable information for all Steemians.
Resteemed ;)
Glad you liked it! Indeed, there's a whole history of distributed consensus that's been around for decades. It surprises me that it seems to be a well kept academic secret -- there's a lot to learn by looking at previous work. Lots of problems have already been discovered and solved, and it feels like sometimes we're reinventing the wheel here.
Seems I need to keep writing on this topic :)
For sure, keep going...I'm definitely interested in:
Great work over at greymass as well, you and jesta are a powerhouse combination!
It's been released!
And thank you! It's really awesome to be able to work on two chains that have basically the same internals. All the knowledge transfer back and forth between EOS and Steem has greatly improved my Steem knowledge, and vice versa.
Great exploration and I'm sure your server will be useful.
Fast reader! :) Thanks, I really hope it will!
I've filed away part of the explanation to read later. Read about half of it so far and the last part.
I'm sorry, I had to stop reading at some point when it got too technical for me : (
Thanks
Hehe, sorry. This one is definitely meant for a more technical audience -- developers mostly. But it's an interesting topic to try and learn about if you're interested in decentralization.
I am interested, but it's like learning a whole NEW language LOL
this is good man!
Thanks! :)
It would be great if you analyzed all the dApps built on the Steem blockchain so far and let us know which ones are truly decentralized dApps.
There's quite a few dApps and Apps, so I wouldn't know where to start :)
If you have one in particular you're curious about, I can try and give my opinion.
I would like you to have a look at these:
@steemhunt : https://steemhunt.com/ @musing : https://musing.io/ @dlike : https://dlike.io/ @memestagram : https://memestagram.io/ @utopian-io : https://utopian.io/ @steeveapp : https://steeve.app/
If it takes too much of time, then probably you can start a series and put the details on few of them. And by the way, @utopian-io rewards analysis posts, so you can post them there, to have it rewarded as well and utilize that for a good cause.
Thanks for that informative post about decentralization that was very interesting. I hope that many more witnesses will support your effort and they also bring out more API for dapps.
Very well said, you put this in such better words than I would have.
It's one thing to build apps that can connect to any decentralized endpoint, but it's a whole next level to build apps that connect to an entire quorum of decentralized endpoints to become both more trusting in the data and less prone to failure.
Now if we can just get to the point where more people could run the software for a much lower price lol.
This made me think about the GPL x Apache debate.
Both are open source licenses, but Apache license allows it to be embedded on proprietary code, while GPL forbids that.
This leads highly funded developers (mostly core devs of big software like Wordpress and Linux), or developers that somehow don't need to worry about money, to prefer GPL, because they don't want the product of their work being used for free and with no credit by companies that will profit from it.
By making their software GPL and blocking profit-based companies from using it in their own software, they also remove the interest of such companies in investing in it (after all, they don't need these companies, they are already nicely funded). And by doing so, they also reduce the demand for work from independent developers interested in their software.
Apache-licensed software may end up being used for free by companies that will profit from it, but these companies will also hire developers who are experts in this open source software, and increase demand for people to learn it and advocate for it.
Bringing this to Steem, you propose true decentralization of dApps that consume it. But that would also reduce these dApps owners' influence and profitability. And that would lead to removing interest from them.
First of course we have Steemit.com's Steemit Inc, which clearly wants to keep control of the whole system, and as Steem's value has decreased I've seen many complaints coming from the company, from investors, witnesses and general users.
Then we have software like Steem Monsters, which is the best cryptocurrency-based game to date. And one that limits P2W!
Finally we have companies interested in tokenization. Ethereum grew based on this idea and is now fat due to that, and Waves is coming on strong with its awesome wallet and DEX and its easily created tokens.
With Steem being way ahead of any other cryptocurrency in becoming the top dApp platform, wouldn't it lose the interest of developers and companies if they saw they can't be the middleman of a service? Be it by blockchain design, be it by community rejection...
I remember 20 years ago most Linux distros would gladly include freeware, until they decided to ban it and distribute only open source software. That left some software orphaned and led to some new software being created.
I also remember when Joomla used to be friendly to proprietary software for its extensions, when it was in beta. Then its core devs started forcing its extensions to be open source, which angered many companies that had greatly contributed to its development and would no longer be allowed to provide their proprietary extensions.