If you are still not amazed by what the Python language can do, in this part we are going to learn how to generate a Bitcoin address, or wallet, in Python. I just love how easy it is to talk to your computer through Python if you have a Linux OS, and how many interesting projects you can make with it.
In this article I am going to analyze the source code of Electrum, a Bitcoin wallet written purely in Python. It should work with any Python 2.x, and I believe even with Python 3.x. As far as I can tell, all the dependencies this code uses are in the default packages, so no additional software is needed; it's self-contained.
Disclaimer: Use this code and information at your own risk. I shall not be responsible for any damages resulting from the use of the modified code or of the information provided in this article. It's not recommended to modify the code that generates private keys if you don't know what you are doing!
Playing with the Code
I have downloaded the latest version of Electrum's source code from GitHub:
The seed generator is located in the lib folder, in a file named mnemonic.py, and the function is make_seed(). It's this block of code:
You can actually call this from the terminal as well, through an internal command. If you have Electrum installed, I think it goes like this:
electrum make_seed --nbits 125
This would create a 125 bit seed for you. You can also call the mnemonic script from another Python file and customize it, for example to generate multiple seeds or to integrate it with some other code.
We will create a new file named testcall.py, from which we will call this Mnemonic code. It has to be in the same lib folder, though. It looks like this:
And if we call it from the terminal using the python testcall.py command:
Basically we are importing the Mnemonic class from the mnemonic.py file and using it under that name. I haven't talked about classes yet; they are one of the more advanced parts of the Python language. Basically they are objects that bind functions together. Here the make_seed() function is contained inside the Mnemonic class and is called through it, together with other functions that depend on each other. It could be done with just 1 function, but organizing it like this is more elegant and less error-prone, since the related functions and the data they share stay in one place. I am not much of an expert in classes, so I'm just going to leave it at that.
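As a rough illustration of what "binding functions together" means, here is a toy class written for this article. ToyMnemonic, its four-word list and its methods are made up for the example and are not Electrum's code; it only shows how two methods that depend on each other share data through self.

```python
class ToyMnemonic:
    """Hypothetical toy class, not Electrum's Mnemonic."""

    def __init__(self, words):
        # Data stored on the object is shared by every method below.
        self.words = list(words)

    def encode(self, number):
        # Write the number in base len(words), one word per digit.
        base = len(self.words)
        out = []
        while number:
            number, digit = divmod(number, base)
            out.append(self.words[digit])
        return " ".join(out)

    def decode(self, phrase):
        # Inverse of encode(): read the words back as digits.
        base = len(self.words)
        number = 0
        for word in reversed(phrase.split()):
            number = number * base + self.words.index(word)
        return number


toy = ToyMnemonic(["apple", "bird", "cloud", "door"])
phrase = toy.encode(27)
print(phrase, toy.decode(phrase))
```

Both methods rely on the same self.words list, which is the kind of dependency a class keeps tidy.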
In the Mnemonic class you can define 1 parameter, the language, which has the following values:
- None = English
- en = English
- es = Spanish
- zh = Chinese
- ja = Japanese
- pt = Portuguese
You can see the language codes in the i18n.py file, but only these have wordlists available for now, visible in the wordlist folder. Here is how you create a Chinese seed; just replace the argument with the language code:
print Mnemonic('zh').make_seed('standard', 132, 1)
And this will give out some seed in Chinese:
There are also multiple types of seeds you can generate, which you can see in the version.py file:
- standard - Normal wallet
- segwit - Support for the upcoming Segregated Witness soft fork based addresses of Bitcoin
- 2fa - Two Factor Authentication based wallets
The next argument is the num_bits variable, which from the command line is passed with the nbits option. It is just the number of bits of entropy your seed will have (a minimum of 128 is recommended for security).
The last argument is custom_entropy, an integer by which your seed number gets multiplied, just in case your RNG is bad. It replaces a part of the secret with a number generated by you, of the same entropy size.
So if I call it like this, with a custom entropy number of my choosing, it would generate a seed this way. Of course the entropy number has to be kept secret as well:
print Mnemonic('en').make_seed('standard', 132, 2349823353453453459428932342349489238)
I don't really recommend using this option; it looks kind of weird to me. I am not a cryptography expert, but I just don't like how this inserts entropy into your number. I have heard that multiplying numbers decreases entropy, so I am not sure about this part of the code. In fact I am going to message the dev about this issue and see what his response is. No worries though: the default wallet generation doesn't use the custom entropy part, so if you are generating a wallet in Electrum through the GUI, or leaving custom_entropy at its default value of 1, then this is of no concern to you.
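To make that worry a bit more concrete, here is a small experiment. It is not Electrum's code: it assumes the custom number is simply multiplied into a random secret, as described above, and it uses tiny made-up sizes (8 bits, factor 6) so every case can be enumerated. Multiplying by a fixed factor maps each secret to a different product, so the number of possible results stays the same; what an attacker learns from every result being a multiple of the factor depends on whether the factor itself is secret.

```python
# Toy experiment: what multiplying by a fixed factor does to the set of outcomes.
# The sizes and the factor are made-up example values, not Electrum parameters.
secret_bits = 8
factor = 6  # hypothetical stand-in for custom_entropy

secrets_range = range(2 ** secret_bits)
products = {factor * s for s in secrets_range}

# Every secret maps to a distinct product, so no outcomes are lost or gained.
print(len(products))

# The products are spread over a larger range, and all are multiples of the factor.
print(max(products).bit_length(), all(p % factor == 0 for p in products))
```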
Auditing the Seed Generator
Ok, so now that we know how to generate a seed, let's see what exactly the seed generator does. After all, anyone using Electrum has to rely on the security and integrity of this code; if it were badly written, you could lose all your money. So we really have to trust this code 100% if we want to store a lot of Bitcoin in Electrum.
So let's analyze the make_seed() function, since this is where the action is. First of all I will put many print statements in it to print out each variable at each step:
Basically I just print out each variable at each step. Ok, so we are calling the make_seed() function from our testcall.py file with the python testcall.py command, where the testcall file looks like this:
print Mnemonic('en').make_seed('standard', 132, 1)
This is just a standard seed generation, and it prints out the following:
Well let’s take it step by step.
- First, version.py is imported, which is where the seed prefix codes are defined. It translates the standard argument into 01, which will be the prefix of the seed later. So it sets the prefix to the string 01.
- Then the bpw (bits per word) variable takes the log2 of the length of the word list, that is, how many words there are in it, in this case the English list, english.txt. There are 2048 words in the English list, and log2 of that is 11.
- Then num_bits is divided by bpw, rounded up, turned into an integer and multiplied by bpw again. For our input of 132 this gives back the same value, since 132 is already 12 times 11, but an input such as 125 would be rounded up to 132 so that the seed fills a whole number of words.
- n_custom becomes 0 if we leave custom_entropy at its default of 1, so that no extra entropy is added.
- n remains the same as the num_bits input if no custom entropy is added. So if you generate a default wallet with no extra entropy, n becomes the main number holding the amount of entropy you defined initially through num_bits. In our case it stays the same, since we don't add anything.
- Then my_entropy just picks a random number between 0 and 2^n, with the same n as above, so it will be a large number. This is the prototype of the seed.
- Then we go into a while loop to search for a number whose hash starts with 01, which serves as a kind of checksum of the seed.
- If no custom entropy is used (custom_entropy is 1), we basically just keep adding 1 to the my_entropy number until the first 2 hex characters of its hashed form are 0 and 1. At each step the number is encoded with mnemonic_encode(i) and right after that decoded with mnemonic_decode(seed), I guess to test that the number survives the round trip through words. That is what the assert statement does: it raises an error if the two values differ.
- Then the seed goes into the is_new_seed() function. That is the case if you generate a seed now; if you import an older seed in the old format, it goes into the old function instead. The code that I executed above goes into the new function, and this is where the magic happens. The is_new_seed() function is actually located in the bitcoin.py file:
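The function body is not reproduced here, so the following is only a sketch of the kind of check described in the next steps, not Electrum's actual source. It assumes the hash is HMAC-SHA512 with the string Seed version as the key and the seed text as the message, and that normalization can be approximated by Unicode NFKD, lowercasing and collapsing whitespace. The names normalize_text_sketch and is_new_seed_sketch are made up for this example.

```python
import hashlib
import hmac
import unicodedata


def normalize_text_sketch(seed):
    # Assumed approximation: NFKD normalization, lowercase, single spaces.
    seed = unicodedata.normalize("NFKD", seed).lower()
    return " ".join(seed.split())


def is_new_seed_sketch(seed, prefix="01"):
    # HMAC-SHA512 keyed with "Seed version"; the seed text is the message.
    text = normalize_text_sketch(seed)
    digest = hmac.new(b"Seed version", text.encode("utf8"), hashlib.sha512)
    # The seed counts as valid when the hex digest starts with the prefix.
    return digest.hexdigest().startswith(prefix)


print(is_new_seed_sketch("some example words"))
```

Only about 1 in 256 random phrases should pass with a 2 character prefix, which is why the generator has to search.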
- What happens here is interesting. First the seed gets normalized with the normalize_text() function in the mnemonic.py file. I believe this matters mainly for Chinese and other non-Latin scripts, which get converted into a canonical form, so this function does not do much with the English wordlist.
- Then things get interesting: it takes the HMAC-SHA512 hash of the seed words, which in our case is just the English text, and checks that the first 2 hex characters are 01, since we asked for a standard wallet. Electrum defines a standard wallet as a seed whose HMAC-SHA512, keyed with Seed version, starts with 01, a Segwit wallet as one whose HMAC-SHA512, keyed with Seed version, starts with 02, and so on. So basically that while loop keeps incrementing the number derived from my_entropy by 1 until it yields a word list whose HMAC-SHA512, keyed with Seed version, starts with 01 in our case. Once it has found that number, it exits the loop and returns the seed:
because sister decrease neither cool more car galaxy one upset high allow
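Putting the steps above together, here is a self-contained sketch of the search loop. It is not Electrum's code: the word list is a synthetic stand-in of 2048 made-up words (w0000 to w2047), the helper names are invented, the standard secrets module stands in for whatever random source Electrum uses, and the formula n = num_bits - n_custom is my reading of the steps described above.

```python
import hashlib
import hmac
import math
import secrets

# Synthetic 2048-word list; a real wallet would load english.txt instead.
WORDS = ["w%04d" % k for k in range(2048)]
INDEX = {word: k for k, word in enumerate(WORDS)}


def encode(number):
    # Write the number in base 2048, one word per digit.
    out = []
    while number:
        number, digit = divmod(number, len(WORDS))
        out.append(WORDS[digit])
    return " ".join(out)


def decode(phrase):
    # Inverse of encode().
    number = 0
    for word in reversed(phrase.split()):
        number = number * len(WORDS) + INDEX[word]
    return number


def has_prefix(phrase, prefix):
    # HMAC-SHA512 keyed with "Seed version", compared on its hex prefix.
    mac = hmac.new(b"Seed version", phrase.encode("utf8"), hashlib.sha512)
    return mac.hexdigest().startswith(prefix)


def make_seed_sketch(prefix="01", num_bits=132, custom_entropy=1):
    bpw = int(math.log2(len(WORDS)))             # 11 bits per word
    num_bits = math.ceil(num_bits / bpw) * bpw   # round up to whole words
    n_custom = math.ceil(math.log2(custom_entropy))
    n = num_bits - n_custom                      # assumption, see lead-in
    my_entropy = secrets.randbelow(2 ** n)
    nonce = 0
    while True:
        nonce += 1
        i = custom_entropy * (my_entropy + nonce)
        seed = encode(i)
        assert i == decode(seed)                 # round trip through words
        if has_prefix(seed, prefix):
            return seed, nonce


seed, tries = make_seed_sketch()
print(seed)
print("found after", tries, "tries")
```

Because 2 hex characters can take 256 values, the loop should need around 256 tries on average, though any single run can be much shorter or longer.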
That's it, that is basically how Electrum generates a seed. This seed's HMAC-SHA512 sum will start with 01, and you can even check it yourself. On Linux you can install a tool called GTKHash to calculate hashes, so let me demonstrate. We take the seed and set Seed version as the HMAC key, as defined in that function:
So as you can see, if we use Seed version as the HMAC key together with the seed, it gives us a 512 bit hash that starts with 01, so in this case this is a valid default seed compatible with Electrum.
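If you don't have GTKHash, the same check can be done with Python's standard hmac module. This is a minimal sketch that assumes Seed version is the HMAC key and the seed words are the message; the seed is the one generated earlier in this article.

```python
import hashlib
import hmac

seed = "because sister decrease neither cool more car galaxy one upset high allow"

# Key: the fixed string "Seed version". Message: the seed words.
digest = hmac.new(b"Seed version", seed.encode("utf8"), hashlib.sha512).hexdigest()

print(digest)      # 128 hex characters, i.e. 512 bits
print(digest[:2])  # the two characters that get compared with the prefix
```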
Of course HMAC-SHA512 is a one-way construction with no known practical attacks, and its 512 bit output is generally thought to leave a comfortable margin even against quantum computers, so there is no practical way to reverse engineer the seed from this prefix.
However there is 1 issue: the HMAC-SHA512 output is in hexadecimal format, and if we fix the first 2 characters of it, that loses entropy.
Each hex character stands for 4 bits, so fixing 2 of them costs about 8 bits. That is why we start with 132 bits of entropy: after the search, roughly 124 bits remain. This is a little under the 128 bits one would ideally want, but it is still above the 120 bits or so that is recommended now, given how powerful computers get.
So we start with 132 bits, we lose some bits due to fixing the first 2 characters, and we remain with roughly 124 bits, which is computationally secure. To brute force this, a supercomputer would have to go through on the order of 2^124 combinations, which is pretty much impossible since there is not enough energy on Earth to go through that many combinations. In fact some people say that you can't even count up to a number in this range, not to mention hashing and other memory intensive operations.
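To give a sense of scale for numbers like these, here is a back-of-the-envelope sketch. The guessing rate of 10^18 per second is an arbitrary assumption, chosen to be far beyond any single machine; nothing here comes from Electrum.

```python
# Rough scale of a brute-force search. The guessing rate is a made-up,
# deliberately generous assumption, not a measured figure.
GUESSES_PER_SECOND = 10 ** 18
SECONDS_PER_YEAR = 365 * 24 * 3600


def years_to_search(bits):
    # On average half of the 2**bits candidates must be tried.
    return (2 ** bits // 2) // (GUESSES_PER_SECOND * SECONDS_PER_YEAR)


for bits in (120, 124, 128):
    print(bits, "bits:", years_to_search(bits), "years")
```

Every extra bit doubles the work, so a few bits more or less change the figure by a constant factor but not the conclusion.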
Conclusion
It looks like Electrum is safe to use. It has passed my audit. I am no crypto expert, but from what I have researched and learned it looks safe to me.
I am still skeptical about that custom_entropy thing, and I should ask the dev what it does exactly, but other than that, default wallet generation looks flawless. There are no backdoors in my opinion.
After all, many thousands of people use Electrum, especially people holding large amounts there, so it better damn be safe to use, and in my opinion it is.
I have analyzed its main seed generation code in this article. Of course the code is a lot more than this, but we already know that if you generate a seed on an offline computer with it, it should be safe. I haven't looked into the network related parts of it, but I trust them to be safe.
It’s a cool wallet, use it if you want: https://electrum.org
Sources:
- Electrum software is the copyright of Thomas Voegtlin, licensed under the MIT license.
- Python is a trademark of the Python Software Foundation.
Going to go through this post in more detail later this week, I'll edit and let you know how the experience goes. I just installed Linux on one of my laptops so quite excited to give this a go the same way you did it. Thanks for sharing and all the time you put into it highly appreciate the content.
Opened a thread about that custom_entropy issue. It has to be investigated more thoroughly, it could be a potential bug.
Nice, useful post
This is the wallet I use. In any case thank you for this super tutorial.
Yes I figured many people use this wallet including myself. I have analyzed the code myself especially the seed generator part, because I was skeptical about it at first. I don't really trust javascript based stuff that much, it often has crappy rng. But in this case this python code, which I kind of understand well, is written pretty well.
So Voegtlin did a good job with it, it's a good software, trustworthy.
Very interesting post :) thanks to the author! :)
Nice post, one thing, you don't need Linux to run python :)
Yes you don't but I find it more easy to use on Linux. I never liked Windows commands.
If you ever had to use windows, check this https://msdn.microsoft.com/en-us/commandline/wsl/about