Lessons learned from failing as a witness

in #witness-update, 8 years ago (edited)

About a week ago, my witness node crashed abruptly. No errors, no trace. My guess is that the steemd process was killed by the kernel when the system ran out of RAM (32GB setup).

My setup had been a mess. A manually configured server, and a few scripts running on my PC (feed, killswitch). I had no backup node.

To make matters worse, when my node crashed, my kill-switch wasn't running, and I was AFK. This caused me to miss 90 blocks. I tried rebuilding the node, but steemd failed me on the first attempt. So I had to re-sync twice, which caused a day-long outage for my witness.

My negligence got me kicked out of the witness pool.

Never give up

At first, I was going to give up, and quit as a witness. But the thought of that alone made me really sad. Being a witness is part of my Steemit identity, and I've decided to turn my failure into an opportunity to create an awesome setup.

The setup

I now have a 4-server cluster, in 2 datacenters (Paris and Amsterdam).
1.) Witness node A, running the latest version of steemd (i.e. 0.19.0.rc1)
2.) Witness node B, running a known stable version of steemd (i.e. 0.18.2)
3.) Conductor node, publishing price feeds and handling failovers.
4.) A public seed node (seed.steemdata.com:2001)

Witness Nodes

The witness nodes are quad-core Xeons, with 32GB RAM each, and 2 SSDs.
I am hosting steemd in Docker, and its data volume is mounted on the second, larger SSD.
The first SSD also has a 16GB swap file.

The node setup is still manual, and fairly arduous, but fortunately it only has to be done once per server.

Management

Managing a witness can be messy, which is why I developed a neat command line app (conductor).

Generating Keys

I will begin by generating signing keys.
I will generate a new signing key every time I deploy a new node, for security and double-signing prevention reasons.

conductor key-gen

keys.png

Here I generate 2 key-pairs. Each witness node gets its own private key. This is very important to get right, because having the same signing key on more than one node could lead to double-signing, which is a quick way to get into serious trouble.
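The one-key-per-node rule can be checked mechanically before anything is deployed. Below is a minimal sketch, assuming you keep your own mapping of node names to public signing keys; the function name and the inventory are hypothetical and are not part of conductor.

```python
from collections import defaultdict


def find_shared_keys(node_keys):
    """Return {public_key: [node names]} for every key assigned to more than one node."""
    nodes_by_key = defaultdict(list)
    for node, key in node_keys.items():
        nodes_by_key[key].append(node)
    # Keep only the keys that appear on two or more nodes.
    return {key: sorted(nodes) for key, nodes in nodes_by_key.items() if len(nodes) > 1}


# Hypothetical inventory; "SK-A" and "SK-B" are placeholders, not real keys.
inventory = {"witness-a": "SK-A", "witness-b": "SK-B"}
shared = find_shared_keys(inventory)
if shared:
    raise SystemExit(f"Refusing to deploy, keys shared between nodes: {shared}")
```

An empty result means every node has its own key, so two nodes can never sign the same block with the same identity.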

Setting up automated failover

Once both witness nodes are up and running, it's time to prepare the failover plan.

If my main witness node A is signing with key SK-A, and my backup witness node B is signing with key SK-B, I will enable my witness with SK-A, and make it fail over to SK-B.

conductor kill-switch --second-key <SK-B>

or more concretely (see screenshot above):

conductor kill-switch --second-key STM5kZnvU8R1zqSn6yg6ERiGfffDupxznRureyJrLQfSp6QcKsTUk

Here is what happens in the event of a node failure:
1.) Witness node A fails. Once 10 blocks are missed, kill-switch will change the signing node to backup node B, by assigning its public key (SK-B).
2.) If witness node B fails as well, my witness will be automatically disabled to avoid missing more blocks.
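The two-step plan above can be sketched as a small state machine. This is only an illustration of the logic, not conductor's actual code: the class name, the action names, and the choice to reuse the same 10-block threshold for the backup node are all assumptions.

```python
class KillSwitch:
    """Decides what to do as the witness's running missed-block count grows."""

    def __init__(self, backup_key, start_missed, threshold=10):
        self.backup_key = backup_key   # public key of backup node B (SK-B)
        self.baseline = start_missed   # missed-block count when watching began
        self.threshold = threshold     # misses tolerated before acting (assumed for both steps)
        self.on_backup = False
        self.disabled = False

    def observe(self, total_missed):
        """Return (action, key) for the latest missed-block total."""
        if self.disabled or total_missed - self.baseline < self.threshold:
            return ("none", None)
        self.baseline = total_missed   # count afresh after each action
        if not self.on_backup:
            self.on_backup = True
            return ("set_signing_key", self.backup_key)  # step 1: fail over to node B
        self.disabled = True
        return ("disable", None)                         # step 2: node B failed too


ks = KillSwitch("SK-B", start_missed=100)
print(ks.observe(105))  # still below the threshold
print(ks.observe(110))  # threshold reached: switch to the backup key
```

A real tool would poll the chain for the missed-block count and broadcast a witness update for each action; that part is left out here.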

Enabling the witness

Now that the failover is configured, it's time to enable the witness.
This is done with the conductor enable command, passing it the primary node's key (SK-A).

conductor enable <SK-A>

or more concretely (see screenshot above):

conductor enable STM5yjjAg4ApWoMHLTHbAss7EFC3LxyiEoAcivaTNKBtpSaD5WtyJ

Running the price feeds

The last component of running a witness is providing an accurate, reliable feed.

I've built a feed publishing tool into conductor as well:

conductor feed

You can see more options for feeds here.

Thank you

Thank you for supporting my witness. It means a lot to me. Any suggestions for further improvements are welcome.


Hey. Some time ago I was actually thinking about a way to help witnesses and other Steem app developers get informed whenever there are issues with their servers (downtime/functionality etc.) when they aren't online. It would eventually imply a phone call with a prerecorded message to be placed to the owner of the witness/app and the trigger would be from one of the other witnesses to avoid sibyls.
What I thought of obviously preserves privacy and ensures a transparent flow to trigger a call to the affected owner. It would also be offered for free.
As much as we'd want, you can't be on 24/7 or even forecast issues.

Would something like this be of interest to you as a witness?

A Twilio SMS would be preferable to a phone call. Also, if it's something running on the server, it has to be open-source.

I've implemented something like this for my trading platform tymoraPRO that gives alerts and warnings whenever any network line, service, or datafeed goes down or doesn't check in within the designated amount of time. At that point, my server-monitor app sends an alert to prowlapp that immediately pops up on my mobile device (better than SMS, since you can include more information if necessary, and without potential SMS fees either). Of course, you could also easily link Twilio as well (and/or); as I recall, the Twilio API is pretty similar and straightforward.

https://www.prowlapp.com/
API: https://www.prowlapp.com/api.php

If you really want to go all the way with this, here are a few other open source projects that may already do most of the work for you and provide potentially much more robust (albeit more weighty) solutions as well:

libraries:
https://github.com/uniqush/uniqush-push
https://github.com/jreese/znc-push

complete systems:
https://github.com/huginn/huginn
https://github.com/Netflix/Hystrix
https://github.com/OpenNMS/opennms

If you need any help setting something up, let me know, I'd be happy to assist where I can.

This is a great idea and some of the witnesses might want to implement it. The idea I had in mind has zero tech overhead for their servers, as it would rely on other humans triggering the alert through the Steem blockchain.

That aspect could relatively easily be incorporated as well, though you're still talking some tech overhead. You still need the monitor app or script that triggers the SMS, even if it's triggered by humans. But you do bring up an interesting point. Technically, the same app could monitor as many witnesses who'd also like their servers monitored for either discrepancies or outages, etc. as well.

It would definitely be open source and fully transparent in its process. SMS instead of a phone call is also an option. The reason for a phone call is that a text is generally easy to ignore when you're sleeping: it just buzzes like an incoming email.
Another reason for a phone call is that you can get a burner number, register that one with my app, and have it redirected to your real number without exposing that one to anyone.

It also depends on everyone's priorities; maybe someone doesn't want to be woken up by a particular app being down :) and would prefer to see it when they wake up.
OK. I'll think about it and work out a plan then.

Hi @furion

What are the qualifications to become a witness? I think I have the skill set to tackle this challenge; consider it another chapter of my IT life :)

Let me know sir.

Thank you,
@Yehey

It was me, i hax0red ur steem node, I couldnt help myself

here is a screenshot of my setup

and here is how i got into your Steemit server

Thanks for your honest account as a witness. We know the hard work all the witnesses put in to growing the community. It's great that you're learning and growing. I'm glad that you didn't give up and wish you the best of luck!

Are you logging both RAM and CPU every minute to a text file via cron? This would help you see how your server was doing up to a minute before the crash.

Also if you know what block you crashed at, you could go looking for it, to see what it contained.
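As a rough illustration of that suggestion, here is a minimal sketch of a per-minute resource logger, assuming a Linux host. It is not the author's setup: the function names and log format are made up for the example, and the 1-minute load average stands in for CPU usage.

```python
import os
import time


def parse_meminfo(text):
    """Parse the contents of /proc/meminfo into {field: value in kB}."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def format_sample(meminfo, load1, now):
    """Build one log line: UTC timestamp, available RAM, free swap, 1-minute load."""
    avail = meminfo.get("MemAvailable", 0) // 1024
    swap = meminfo.get("SwapFree", 0) // 1024
    stamp = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime(now))
    return f"{stamp} mem_available_mib={avail} swap_free_mib={swap} load1={load1:.2f}"


def append_sample(log_path):
    """Read the current stats (Linux only) and append one line to log_path."""
    with open("/proc/meminfo") as f:
        meminfo = parse_meminfo(f.read())
    line = format_sample(meminfo, os.getloadavg()[0], time.time())
    with open(log_path, "a") as log:
        log.write(line + "\n")
```

Calling append_sample with a log path of your choice from a script that cron runs every minute leaves a trail that ends at most a minute before a crash, which is enough to tell an out-of-memory kill from other causes.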

Good idea, I should set up some monitoring/logging.

I love conductor @furion, started playing with it yesterday. And you fixed a bug I encountered within hours of reporting it!

<3

Sounds like many database admins I know - this happens on the backend frequently. Your new setup is very similar to an AG group and that will do well in time. You experienced failure, you learned, and you're bouncing back. That's winning in my book.

I really have no idea what you are talking about :-) but it is great that you got back up instead of giving up.

Yeah, this.
I was just reading on, pretending to myself that I understood, until your comment hit me in the face.

i appreciate your candor.

Onward ever forward as one endeavors to persevere my friend @furion and keep up the good work

That's the spirit! :-)
With an honest attitude and hard work, big mistakes magically turn into extra valuable experience....

There are good arguments for both sides, sbd peg as well as no price manipulation. I would like to have both, peg and accurate feeds, so as soon as we figure out a way to support a peg without manipulating the feeds, I will make a policy switch.

conductor, the witness toolkit, supports both options.

Thanks for your thoughts about this topic. :)

Failing feels awful, I know so many times where I've failed and I really do know how awful it feels. But like always, you learn from it, and the more you fail, the more you learn.

Therefore, the more you fail, the less likely you are to fail in the future. I'm glad you chose to keep on going even after this failure. Keep it up!

Thanks a lot for your honesty, humility and clarity in this matter. May the force stay with us and with you from here on! ;)

Namaste :)

It seems that I still have a lot to learn. Code and programs, this is too complicated.

nice....you deserve resteem and upvote...

Fall and Stand up - never ending cycle of life. I’m not turning my back on you and you still deserve my witness up vote. Acknowledgment of failure is the first step, working on correction is the second and you have successfully managed both.

Do you need to know a coding language to become a witness?
Looks like it's quite complex.

I'm a newbie. I've heard quite a lot about "witnesses" but I don't quite get what they are. Does anyone have a nice newbie-safe explanation?

It's like the Bitcoin miners who process all the bitcoin transactions! Where do you think all this "free" steem comes from that is rewarded to us here every day? It's MINED! And instead of Steemit miners keeping all the steem they mine, like bitcoin miners do, we basically have volunteers who sacrifice some of what could be potential crypto profit from mining and instead share it with all of us to make Steemit even possible!

Mining an altcoin is something many people could do without realizing it! It doesn't take any extra work or real dollar investment (excluding computer hardware and electricity cost), but the way I see it, a lot of the computers used to mine were already lying around not doing much anyway!

It's like mining cryptocurrency is this new thing someone found to make all these old computers useful again!

NOW we see it as valuable, NOW we see it as "work", but in the beginning of crypto, and to a degree today, it's seen as magic money that your computer just creates. And to a degree, yes, it is, especially when you could mine a bitcoin every day when it was new and anyone could mine!

Now we see people STILL trying to convince the public that it's not profitable to mine bitcoins OR altcoins, even with existing non-ASIC hardware that everyone already has, like Android tablets and laptops... To tell people that it's still not worth it to mine is a dirty lie told by those who want to mine it all for themselves! lol OK, it's not that simple, but it's very close! So many articles out there tell you not to mine at all, when anyone can just install MinerGate and mine easily! And why not? Some people have MASSIVELY fast gaming computers just sitting around barely being used! You can totally mine Bytecoin, Monero, Ethereum, or Ethereum Classic, but Bytecoin is where it's at! And it's not about the dollar amount of how much you can make how fast. When anyone asks that, it's a red flag that they are in this for the wrong reasons... You should just be happy to be mining a quality cryptocurrency with a real future while it's still under 1 penny, and Bytecoin lets you do that! You could also mine Monero and get 1 Monero in a month on a fast computer, or you can get thousands of Bytecoins! Eventually Bytecoin will be $1 and more!

I use this as an example to show how Steem, also an altcoin, has a HUGE future. We are going to be THE user-friendly cryptocurrency for the masses! People were even saying Steem might outdo bitcoin in some respects! Apparently we are still driving a lot of bitcoin users, maybe not as much as Coinbase, but just imagine when we have 1 million users! We don't even have 200,000 yet (we are at 171,000)! God, imagine... and Steemit only started in March 2016! It's BARELY been over a year, people!

We are ALL early adopters! Just try to earn or invest as MUCH Steem Power as you can! I am never powering down! I will pass my Steem Power down to my children when I die.


Wow! Seems like those nodes won't crash anymore! Keep up the good work! :D

Thank you for posting @furion.

So glad to hear you are still in.....appreciate your perseverance.

All the best to you. Cheers.

good luck to u!!!!

Thanks for this peek into the workings and challenges of being a witness!

Congratulations @furion! You have completed some achievement on Steemit and have been rewarded with new badge(s) :

Award for the number of posts published

Click on any badge to view your own Board of Honnor on SteemitBoard.
For more information about SteemitBoard, click here

If you no longer want to receive notifications, reply to this comment with the word STOP

If you want to support the SteemitBoard project, your upvote for this notification is welcome!

wow congrats. so witness nodes are like the mining nodes in NEM and NAV?

very well wishes for you!! good luck

That's great work you have done. I'm planning to maybe be a witness myself in the future, but I don't feel ready yet; I still have a lot to learn.

Now is witnessing another form of minting coins? Are they the ones who power the steemit network? Sorry I am not great at understanding these more technical words

Thanks for sharing, at some point i have as well thought of giving up but reading great posts like this keeps me encouraged.

Witness vote goes to you! :-)

Excellent post, congratulations.

good stuff here

Ouch. That's great that you were able to improve on it. It makes me want to fire up my witness again.

Inspiring stuff @furion, thanks.

Hey man, are there any good tutorials you could point me to so I can learn more about running some servers? I have lots of hardware, I just need to know the ins and outs of what you are doing. Like, do you get paid for this? How much bandwidth is needed? You know what I mean? Thanks.

Great motivator, u can motivate yourself @furion.
Never give up and no white flag..just go ahead dude

Congratulations @furion!
Your post was mentioned in my hit parade in the following categories:

  • Upvotes - Ranked 4 with 567 upvotes
  • Pending payout - Ranked 1 with $ 1106,11

Thank you for creating a python API for Steemit, I develop in Python exclusively so, that is lovely. Could you explain what the benefit of running a steem node witness is?

¡Beautiful! just beautiful. A cute and instructive post to add to my feed page so I don't forget what to do with my big ball of SBD when I finally can be able to fill my wallet with a deluge of them.
On other hand, I am almost to the edge of happy tears to witness how many alike nerdy souls are swimming besides me on this geeky steemit's ocean.

Hahaha it feels about sweet in the know I'm pauper but not a lone.

¡Resteemed! :)

@furion thank you for the knowledge!!! :)

You have my vote!

Hey!

great post!

please follow my new account as my old one ive been locked out of due to my password going for a holiday!

Fitness and health related posts, aiming to create awareness and ultimately motivate everyone to make a healthy change like i did!

Feel free to ask any questions as im here to try help YOU!

thanks :)

All this is way over my head but I do know that you play an important part in the success of steemit and we thank you for that. I gave you a vote, for the little my vote is worth.

Nobody is perfect. And for me, a comeback is for real, so I'll vote for you as our witness :) Good luck and good work.

Namaste @furion,

Going through all the Witness threads to vote for them - your thread is more like a reply on Stackoverflow - LOL.

Nothing wrong with that - you have skills that are essential for Steem.

My vote goes to you :)

interesting! I want to become a witness to have a more vocal role in this community that I like. This will give me an excuse to build more things around Steemit.

You seem to be competent enough for my vote as a witness @furion
I will vote for you.
I only request for improvements in the steemit system favoring minnows to encourage them in using the steemit platform.
Right now I just see capitalism in here. We need a balance in making a profit. That's all I want to say. Thank you.

Despite all the struggles you have gone through, you still managed to handle the situation like a pro and get things working again as they were. Hats off, you deserve a gold medal for not giving up. We all know how hard all the witnesses are working to keep the community growing. I wish you good luck for the future and hope you have learned from your mistake!

Very nice,,,

My profile
https://steemit.com/@atcexperts
My blog
The most effective method to Achieve Perfect Focus.
Follow me and vote for my blog
keep in touch
