对应的coursera课程 Introduction to Crypto and Cryptocurrencies
Cryptographic Hash Functions
Hash function
Definition
- Input: take any string as input (any string of any size)
- Output: fixed-size, usually a 256 bits.
hash function 就是一个函数,其可以将一个任意长度
的string映射到一个固定长度
(e.g., 256)的string。
Three important security properties
Collision-Free
Suppose the hash function is `H()`.
If there are x and y that `x != y`, but `H(x) = H(y)`,
then we call this a collision. A collision free hash function has no collisions.
However, collisions do exist. There's no hash function in existence which has been proven to be collision free. The possible inputs of the Hash function can be a string of any size, while the possible outputs has to be string of 256 bits in size. In other words, the number of possible inputs is larger than the possible outputs. So, some inputs should be mapped to one output (yes, collisions :) ).
To find a collision, try 2^130 randomly chosen inputs and we have 99.8% chance that two of them will collide. This works, but it takes very very very long time. We can say that if every computer ever made by humanity was computing since the beginning of the entire Universe up to now, the odds that they would have found a collision is still infinitesimally small. So, we can choose to believe that these hash functions are collision free.
Benefits of collision-free
- Hash as message digest
- If
H(x) = H(y)
, then it is safe to assume thatx = y
- To recognize a file that we saw before, just compare its hash value! We do not need to compare the whole file line by line :). Efficient! Good job, Hash!
- If
Collision free 是指两个不同的string永远不会被映射到相通的Hash output。
但是对于hash function而言,collision是一定存在的。因为hash function的input是任意长度的string,也就是有无穷多种input,而output是固定长度的string,其个数是有限的。那么将input映射到output时,势必会有多个input映射到同一个output。
没有完美collision free的hash function,但我们可以让找到collision inputs变得非常困难,难到以当前计算力几乎不可能。那么当H(x)=H(y)
时,我们就可以放心地认为x=y
。
Hiding
Given H(x), it is infeasible to find x.
In other words, it is very hard to map the hash result back to the input x. To make this work, we need to choose x from a probability distribution that is "very spread out" (so that no particular value is chosen with more than negligible probability).
Benefits of Hiding
We want to "seal a value in an envelop" and "open the envelope" later.
For example, suppose we want to send a message msg
. To make sure the message is not tampered:
- Get the hash value
H(msg)
- Encrypt the meesage using key
K
, and get the encrypted resulte_msg
- Tag the
e_msg
withH(msg)
- When sending the message, the public can only see the
e_msg
andH(msg)
. The secret messagemsg
is hiden.
When read the message, we firstly decrypt the e_msg
to get msg
, then check if its hash value equals H(msg)
.
Hiding就是让人无法通过 hash value (output) 去推回 hash key (input)。
如果input set很小,那么我们很容易用暴力手段得到hash input 和 output之间的对应关系。所以解决方案就是让input set尽量大,大到以当前计算力无法暴力解。
利用hiding的属性,我们可以放心地传播数据的hash value,而不用担心别人通过hash value获得数据。
Puzzle-Friendly
The puzzle-friendly property here implies that
there's no solving strategy for this puzzle,
which is much better than just trying random values of x.
So, what this means is basically that if someone wants to target the hash function and want it to come out to some particular output value y, they have to go through all the possible inputs (no shortcuts!).
puzzle friendly 就是说如果想要找到某个hash value y
的对应输入,那么只能去尝试各个可能的输入,没有任何其它fancy的方法 (暴力就是王道)!
这个特性和上面的collision free就相互呼应了。由于没有完美的hash function,我们通过puzzle-friendly的特性,使得找到collisions变得非常困难。
SHA-256 Hash function
Bitcoin uses this hash function. Google it if you feel interested.
同为程序员,谢谢你的分享,关注你了。别断了哦
哈哈 好的哈!谢谢!