Blockchain data is tamper-resistant and traceable. In this respect a blockchain is like history: what happened has happened and cannot be undone, and unofficial accounts cannot displace the record that everyone has agreed on. Blockchain also allows ownership of data to be established, so that value can be transmitted over the Internet. What gives the blockchain these two powerful capabilities? To answer that, we first need to understand what a blockchain is.
1. The essence of blockchain: distributed database
In essence, a blockchain is a distributed database. Each full node in the network holds the latest complete copy of the database, and all nodes jointly maintain its operation and updates through a consensus mechanism.
In other words, data storage is distributed, and every update to the data must pass the consensus of the nodes.
Consider a village where the accountant used to record the accounts and the village chief kept the books. The two could secretly falsify the accounts for personal gain, and the villagers could do nothing about it. Now the village switches to a public ledger: every villager holds a copy, and a set of fair rules selects bookkeepers in turn, each responsible for recording one page. To keep the whole village motivated, whoever records a page is rewarded, but the rules also state that if the public ledger ever becomes invalid, all rewards become invalid with it.
The bookkeeping process works like this. Every transaction in the village is broadcast to the entire village. The current bookkeeper records the transactions that occurred during the period on a page and sends that page to the villagers to check. If the records on the page are false and the village cannot reach consensus on them, the page is treated as fake and invalidated. If more than half of the villagers confirm that the page is valid (this is the threshold behind the popular phrase "51%"), the village reaches consensus on the principle that the minority follows the majority, and the page is officially added to the public ledger. Every villager then appends the new page to their own copy. If the copies held by one or a few villagers differ from the rest, the version held by the majority is taken as the public ledger.
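The acceptance rule in the village example can be sketched in a few lines of Python. This only illustrates simple majority voting, assuming a strict majority (more than half) is required; it is not a real consensus protocol, and the function name `page_accepted` and the list-of-booleans vote format are hypothetical choices made for the example.

```python
def page_accepted(votes):
    """Return True if strictly more than half of the villagers approve the page.

    `votes` holds one boolean per villager (a hypothetical representation).
    """
    if not votes:
        return False
    approvals = sum(1 for vote in votes if vote)
    return approvals * 2 > len(votes)


# 6 of 10 villagers confirm the page, so it joins the public ledger.
print(page_accepted([True] * 6 + [False] * 4))  # True
# Exactly half is not a majority, so the page is rejected.
print(page_accepted([True] * 5 + [False] * 5))  # False
```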
It is this distributed bookkeeping that prevents one node, or even several, from tampering with the data: their changes cannot affect the copies held by other nodes unless they control more than half of the nodes in the entire network (the so-called 51% attack) and modify them at the same time. In the real world, gaining that much control would be enormously expensive. And even if an attacker succeeded, the original public ledger would effectively be invalidated, and with it every reward based on that ledger, so what would be the point of the attack? For practical purposes, this situation can be considered very unlikely.
At the level of the whole network, then, a blockchain relies on distributed bookkeeping and a consensus mechanism to make its data tamper-resistant.
At a more concrete level, how can every node check the authenticity of the data in the blockchain network?
Two things make this possible: the data structure of the blockchain, and the guarantees of cryptography, in particular two core technologies, the hash algorithm and the asymmetric encryption algorithm.
2. Blockchain data structure: block + chain
Why is it called a blockchain? Because it is made up of blocks linked into a chain.
A block, or data block, can be understood as one page of the public ledger on which transaction data is recorded.
So what is a chain?
Let us first take a look at the structure of the block: block header + block body.
The block header records summary information: the hash value of the previous block (calculated from the previous block's header), the Merkle root (which can be understood simply as a hash of the transaction data in this block), a timestamp (the time at which the block was generated), and other fields.
The block body records the actual transaction data generated during this period.
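As a rough illustration of how a header field can summarize the whole block body, the sketch below computes a Merkle root over a list of transactions with SHA-256. The pairing rule (duplicating the last hash when a level has an odd number of entries), the single round of SHA-256, and the sample transactions are all assumptions made for this example; real blockchains differ in these details.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(transactions):
    """Hash each transaction, then hash pairs upward until one root remains."""
    if not transactions:
        return sha256(b"")
    level = [sha256(tx.encode("utf-8")) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # assumption: duplicate the last hash
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]


txs = ["A pays B 5", "B pays C 2", "C pays A 1"]
root = merkle_root(txs)
print(root.hex())

# Changing any single transaction changes the root.
tampered = merkle_root(["A pays B 50", "B pays C 2", "C pays A 1"])
print(root != tampered)  # True
```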
Each block header records the hash value of the previous block, so that value is included when the current block's own hash is calculated. This combination of block structure, hash values, and timestamps locks each block to its neighbor and forms the chain. As a result, every piece of information or transaction record in the blockchain can be traced back to its source and its full history queried. The key factor is the hash value: it makes the block data tamper-evident, because changing any one part affects everything that follows. The next section explains how this works.
3. Hash algorithm
A hash algorithm is a one-way function, sometimes loosely described as one-way encryption.
A hash function, also known as a hash algorithm, is a method of creating a small digital "fingerprint" from any kind of data. It compresses a message or data into a digest, so that the amount of data becomes smaller and its format fixed. The function scrambles and mixes the data to produce a fingerprint called a hash value (also hash code, hash sum, or simply hash). The hash value is usually represented as a short string of random-looking letters and digits. [1]
In short, a hash function is a public function that compresses a message of any length into a short, fixed-length message digest.
In cryptography, hash functions are used to detect tampering. A hash function is one-way: there is a process for computing the digest, but no process for reversing it.
It can be written as y = H(x). At the mathematical level, its main properties are:
- If x1 and x2 are different, then for all practical purposes H(x1) ≠ H(x2), that is, y1 ≠ y2. (Collisions must exist in principle, because the output has a fixed length, but for a secure hash function they are computationally infeasible to find.)
- It is easy to compute y from x with the hash function, but it is not feasible to compute x from y.
- Any change to x changes its hash value y, and the change is unpredictable.
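These properties are easy to observe with a real hash function. The sketch below uses SHA-256 from Python's standard `hashlib` module as one example of H; the choice of SHA-256 and the sample inputs are assumptions for illustration, not something the discussion above prescribes.

```python
import hashlib

a = hashlib.sha256(b"Hello, blockchain.").hexdigest()
b = hashlib.sha256(b"Hello, blockchain!").hexdigest()  # one character changed
c = hashlib.sha256(b"x" * 1_000_000).hexdigest()       # a much longer input

print(a)
print(b)
print(len(a), len(b), len(c))  # 64 64 64: the output length is fixed
print(a == b)                  # False: a tiny change gives a different digest
```

Comparing the two printed digests shows that they have nothing visibly in common, even though the inputs differ by a single character.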
The main function of the hash function is to verify the integrity of the information.
A familiar analogy: every doctoral thesis run through a hash function yields a different hash value, and that value identifies the thesis. The thesis cannot be reconstructed from the hash value, however, and if even one punctuation mark in the thesis changes, the calculated hash value changes too. Suppose each thesis is uploaded together with its hash value. After downloading a thesis and its hash value, we compute the hash of the downloaded file ourselves and compare the two values. If they match, the thesis was not tampered with in transit and the download is complete; if they differ, it was altered or is incomplete.
As we mentioned before, each block header contains the hash value of the previous block and the hash value of the actual data in this block, and the hash value of this block header will be recorded in the next block header. From this we can draw the following inferences:
- The hash value of each block is unique, and the hash value can be used as an identification (data fingerprint) for each block.
- Once the data changes in the block, even a small change, the hash value obtained will also change. The hash value can be used to verify whether the data in the block has been tampered with.
Hash values tie the blocks into a chain with a strict order. If a transaction record in a block is tampered with, the hash value of that block changes, and so it no longer matches the value recorded in the next block's header; repairing that mismatch would require recomputing every subsequent block as well. This is what is meant by one change affecting the whole chain. Nodes can therefore easily detect that block data has been tampered with, and the "fake" block cannot pass the consensus of the network or be added to the chain.
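The chaining and tamper detection described above can be sketched as follows. This is a deliberately simplified model, not how any particular blockchain is implemented: the header fields, the JSON serialization, the use of a single hash of the body in place of a Merkle tree, the counter used as a timestamp, and all function names are assumptions made for the example.

```python
import hashlib
import json


def sha256_hex(obj) -> str:
    # Serialize with sorted keys so equal data always produces the same hash.
    return hashlib.sha256(json.dumps(obj, sort_keys=True).encode("utf-8")).hexdigest()


def make_block(prev_hash: str, transactions: list, timestamp: int) -> dict:
    header = {
        "prev_hash": prev_hash,
        "body_hash": sha256_hex(transactions),  # simplification: no Merkle tree
        "timestamp": timestamp,
    }
    return {"header": header, "body": transactions}


def build_chain(pages: list) -> list:
    chain = []
    prev_hash = "0" * 64  # placeholder "previous hash" for the first block
    for timestamp, transactions in enumerate(pages):
        block = make_block(prev_hash, transactions, timestamp)
        chain.append(block)
        prev_hash = sha256_hex(block["header"])
    return chain


def is_valid(chain: list) -> bool:
    for i, block in enumerate(chain):
        if sha256_hex(block["body"]) != block["header"]["body_hash"]:
            return False  # the body no longer matches its own header
        if i > 0 and block["header"]["prev_hash"] != sha256_hex(chain[i - 1]["header"]):
            return False  # the link to the previous block is broken
    return True


chain = build_chain([["genesis"], ["A pays B 5"], ["B pays C 2"]])
print(is_valid(chain))  # True

chain[1]["body"][0] = "A pays B 500"  # tamper with a recorded transaction
print(is_valid(chain))  # False

# Even if the attacker also rewrites block 1's header to match, block 2
# still points at the old header hash, so the chain stays invalid.
chain[1]["header"]["body_hash"] = sha256_hex(chain[1]["body"])
print(is_valid(chain))  # False
```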
4. Asymmetric encryption algorithm
Someone may ask, what if a piece of transaction data is tampered with during data transfer before it is written into the block? How does the node verify the authenticity of all transaction data in the latest block?
The blockchain combines an asymmetric encryption algorithm with a hash algorithm so that block data is protected against tampering from its source onward.
What is asymmetric encryption?
Asymmetric encryption uses an algorithm to generate a pair of different keys: what one key encrypts, only the other can decrypt, and vice versa. One of them is the public key and the other is the private key. The public key can be given to anyone who asks for it, while the private key must be kept secret by its holder. The private key cannot feasibly be derived from the public key or other public information.
With this mechanism, a single key pair is enough for one-to-many communication: the holder distributes the same public key to many parties, and each of them can use it to send encrypted data that only the holder can read.
Asymmetric encryption has two functions:
- Encryption: prevents data from being leaked or tampered with during transmission (public-key encryption, private-key decryption).
- Digital signature: verifies that the sender of the data is who they claim to be (private-key encryption, public-key decryption; that is, sign with the private key and verify with the public key).
The process works as follows.
First, each party generates its own key pair: the message sender A generates private key A and public key A, and the message receiver B generates private key B and public key B.
For encrypted transmission: A encrypts the plaintext with public key B, which B has provided. When B receives the ciphertext, B decrypts it with private key B to recover the plaintext.
For example, on Xiao Ming's birthday his friends each plan to send him a gift. To stop anyone from peeking at or swapping the gifts during delivery, each delivery box needs to be locked. Xiao Ming gives the same public key to his friends Xiao Liang, Da Xiong, and Dong Dong. Each of them packs a gift and locks the box with the public key from Xiao Ming. When Xiao Ming receives the three boxes, he unlocks all of them with his private key.
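The gift example can be mimicked with a toy version of RSA, one well-known asymmetric algorithm. Python's standard library has no asymmetric encryption, so the sketch below does the arithmetic directly with fixed, publicly known primes and no padding. It is for illustration only and is not secure; the key values and the function names `encrypt` and `decrypt` are assumptions made for this example. It needs Python 3.8 or later for the modular inverse in `pow`.

```python
# Toy "textbook RSA": fixed known primes, no padding. NOT secure.
PRIME_P = 2**61 - 1        # a known Mersenne prime
PRIME_Q = 2**89 - 1        # another known Mersenne prime
MODULUS = PRIME_P * PRIME_Q
PUBLIC_EXP = 65537
PRIVATE_EXP = pow(PUBLIC_EXP, -1, (PRIME_P - 1) * (PRIME_Q - 1))

xiao_ming_public = (MODULUS, PUBLIC_EXP)     # handed out to every friend
xiao_ming_private = (MODULUS, PRIVATE_EXP)   # kept secret by Xiao Ming


def encrypt(message: bytes, public_key) -> int:
    n, e = public_key
    m = int.from_bytes(message, "big")
    if m >= n:
        raise ValueError("message too long for this toy key")
    return pow(m, e, n)


def decrypt(ciphertext: int, private_key) -> bytes:
    n, d = private_key
    m = pow(ciphertext, d, n)
    return m.to_bytes((m.bit_length() + 7) // 8, "big")


# Each friend locks a gift with the same public key;
# only the matching private key opens the boxes.
for gift in [b"a book", b"a pen", b"a kite"]:
    box = encrypt(gift, xiao_ming_public)
    print(decrypt(box, xiao_ming_private))
```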
For a digital signature: A signs the information to be transmitted with private key A. When B receives the signed information, B uses public key A, provided by A, to check that the signature is valid and that the information was indeed issued by A.
In a blockchain, to prevent the information sent by a node from being tampered with or forged, a digital signature scheme based on asymmetric encryption is added on top of the hash algorithm:
- The sending node hashes the information Y to obtain the hash value X, then signs X with its private key to obtain the signature N.
- The sending node sends the information Y and the digital signature N to the receiving node and broadcasts them to all nodes.
- The receiving node decrypts the digital signature N with the sending node's public key. If N was not forged, this recovers X.
- The receiving node hashes the received information again to obtain the hash value Z. If Z = X, the information Y was not tampered with in transit.
- All nodes can verify the authenticity of the information in the same way, because the public key needed to check the signature is public.
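The steps above can be sketched with the same kind of toy RSA arithmetic plus SHA-256. This is an illustration under stated assumptions, not a secure or production scheme: the primes are fixed and publicly known, there is no padding, the hash is reduced to fit the small modulus, and every name in the code is hypothetical. Real blockchains typically use elliptic-curve signature schemes; RSA-style arithmetic is used here only because it mirrors the "sign with the private key, check with the public key" description. Python 3.8 or later is assumed.

```python
import hashlib

# Toy "textbook RSA" key pair: fixed known primes, no padding. NOT secure.
PRIME_P = 2**61 - 1
PRIME_Q = 2**89 - 1
MODULUS = PRIME_P * PRIME_Q
PUBLIC_EXP = 65537                                                # public key
PRIVATE_EXP = pow(PUBLIC_EXP, -1, (PRIME_P - 1) * (PRIME_Q - 1))  # private key


def digest(message: bytes) -> int:
    # Hash value of the information, reduced to fit the toy modulus.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % MODULUS


def sign(message: bytes) -> int:
    # Sender: hash Y to get X, then transform X with the private key to get N.
    return pow(digest(message), PRIVATE_EXP, MODULUS)


def verify(message: bytes, signature: int) -> bool:
    # Receiver: recover X from the signature with the public key,
    # re-hash the received message to get Z, and compare.
    recovered_x = pow(signature, PUBLIC_EXP, MODULUS)
    z = digest(message)
    return recovered_x == z


info_y = b"A pays B 5 coins"
signature_n = sign(info_y)

print(verify(info_y, signature_n))               # True: untouched message
print(verify(b"A pays B 500 coins", signature_n))  # False: content was altered
print(verify(info_y, signature_n + 1))           # False: signature was altered
```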
For example, Xiao Zhang sends an IOU to Xiao Liu. He first hashes the content of the IOU to obtain a hash value x1, then encrypts x1 with his private key to produce a digital signature. When Xiao Liu receives the IOU, he decrypts the digital signature with the public key provided by Xiao Zhang. If decryption succeeds, the IOU was indeed signed and issued by Xiao Zhang. Xiao Liu then hashes the content of the IOU himself to obtain x2 and compares it with the x1 recovered from the signature. If the two hash values are the same, the IOU was not tampered with in transit.
This shows that information transmitted across the blockchain network is also protected against tampering. Cryptography verifies both the authenticity of the data's source and the integrity of its content, and the digital signature ties the data to its producer or sender, which establishes ownership of the data.
This explains not only why blockchain data cannot be tampered with, but also why a blockchain can establish ownership of data. On the information Internet, data is easy to delete, copy, and alter, and it carries no built-in ownership mechanism or label. Genuine items and impostors (the proverbial Li Kui and Li Gui) are everywhere, which makes it impossible to establish rights over data. Existing network security technology cannot guarantee the circulation of high-value data on its own and must rely on the endorsement of trusted third parties such as banks. In the transmission of value, the Internet only records changes in information; it does not itself transfer value. A blockchain ensures the authenticity of data and the uniqueness of data objects through tamper resistance and digital signatures, so data objects on the blockchain cannot be copied. Each object on the blockchain must be bound to its owner's account, which is an address. In this way, value can be transferred directly on the blockchain network without going through a third party.
To sum up: blockchain is a database technology unlike those that came before it. Once data has been written into a block through the consensus of the whole network, it becomes established fact. Like history, it can be traced back link by link, but it can no longer be changed. In other words, data on the blockchain cannot be modified or deleted in place: a record is written once, and any later change is recorded as a new entry. Combined with digital signatures, this allows ownership of data to be established and value to be transmitted over the network.