Bitcoin: A Peer-to-Peer Electronic Cash System

1. Abstract
2. Introduction
3. Transactions
4. Timestamp Server
5. Proof-of-Work
6. Network
7. Incentive
8. Reclaiming Disk Space
9. Simplified Payment Verification
10. Combining and Splitting Value
11. Privacy
12. Calculations
13. Conclusion

https://bitcoin.org/bitcoin.pdf
http://blog.codingnow.com/2011/05/bitcoin.html
Bitcoin Block Explorer : http://blockexplorer.com/

1. Abstract

A purely peer-to-peer version of electronic cash would allow online payments to be sent directly from one party to another without going through a financial institution. Digital signatures provide part of the solution, but the main benefits are lost if a trusted third party is still required to prevent double-spending. 最主要的问题就是不经过第三方来解决double-spending问题.

The network timestamps transactions by hashing them into an ongoing chain of hash-based proof-of-work, forming a record that cannot be changed without redoing the proof-of-work. The longest chain not only serves as proof of the sequence of events witnessed, but proof that it came from the largest pool of CPU power. As long as a majority of CPU power is controlled by nodes that are not cooperating to attack the network, they'll generate the longest chain and outpace attackers. 整个系统是安全的除非attacker掌握了超过世界上50%以上的计算能力. POW-chain可以认为是trx组成的链表, 但是每个trx都是环环相扣的, 所以历史trx是没有办法修改的除非重做POW-chian. 整个系统中只有一个POW-chain是被认可的, 就是最长的POW-chain. 重做POW-chain的代价是高昂的. 如果世界上50%以上的计算运力都使用可信的bitcoin client的话, 那么attacker通过修改trx重做POW-chain并且超过可信计算集群计算出的POW-chain的概率微乎其微.

The network itself requires minimal structure. Messages are broadcast on a best effort basis, and nodes can leave and rejoin the network at will, accepting the longest proof-of-work chain as proof of what happened while they were gone. 对于网络组织上要求非常少. 所有消息都是best-effort广播出去. 节点一旦介入网络只要同步POW-chain即可.

2. Introduction

3. Transactions

一个coin可以表示成为一串trxs. 最新的trx可以表示为payee-pub-key(hash-of-prev-trx). payee得到这个trx之后可以通过private-key来验证这个trx/coin. 但是payee没有办法验证payer是否还将这个coin给其他人了. 这就是double-spending问题: 一个coin被payer支付给了两个人.

The problem of course is the payee can't verify that one of the owners did not double-spend the coin. A common solution is to introduce a trusted central authority, or mint, that checks every transaction for double spending. After each transaction, the coin must be returned to the mint to issue a new coin, and only coins issued directly from the mint are trusted not to be double-spent. The problem with this solution is that the fate of the entire money system depends on the company running the mint, with every transaction having to go through them, just like a bank.

解决这个问题必须引入可信的第三方third-party, third-party管理mint(铸币厂). payer首先将coin转给third-party, third-party将这个coin销毁并且从mint创建一个新的coin. 然后third-party再将这个coin给payee. third-party来解决double-spending问题. 只有从third-party出来的coin才能够确保没有double-spent. 当然如果coin是物理的话那么一切都没有问题.

We need a way for the payee to know that the previous owners did not sign any earlier transactions. For our purposes, the earliest transaction is the one that counts, so we don't care about later attempts to double-spend. The only way to confirm the absence of a transaction is to be aware of all transactions. In the mint based model, the mint was aware of all transactions and decided which arrived first. To accomplish this without a trusted party, transactions must be publicly announced, and we need a system for participants to agree on a single history of the order in which they were received. The payee needs proof that at the time of each transaction, the majority of nodes agreed it was the first received.

要真正解决double-spending问题, 就必须像third-party一样了解到历史上所有进行的transactions.我们只认为第一个被看到的trx是有效的, 对于后面的double-spent trx直接忽略掉. 为了不通过third-party, trx必须通知给所有的参与者. 系统确保参与者(大多数参与者)都能够对当前某一个trx-history达成共识. 交易时候payee必须确保这个trx被大部分参与者第一个接收到(也就是大部分参与者承认这个trx).

4. Timestamp Server

A timestamp server works by taking a hash of a block of items to be timestamped and widely publishing the hash, such as in a newspaper or Usenet post. The timestamp proves that the data must have existed at the time, obviously, in order to get into the hash. Each timestamp includes the previous timestamp in its hash, forming a chain, with each additional timestamp reinforcing the ones before it. # block里面包含许多个trxs. 并且每个block上面都有一个时间戳. 可以认为这些trxs都是在这个时间戳内按顺序提交的. 之所以将这些trxs放在一个block里面是因为考虑到计算代价问题. 每个block上都有hash-value = hash(prev-block-hash, block-content). 一旦这个block确定就被broadcast出去.

可以看到full-node上是可以检查到每个地址上的bitcoin的数量的，所以从某个地址转移到另外一个地址是会进行校验的。

5. Proof-of-Work

To implement a distributed timestamp server on a peer-to-peer basis, we will need to use a proof-of-work system similar to Adam Back's Hashcash, rather than newspaper or Usenet posts. The proof-of-work involves scanning for a value that when hashed, such as with SHA-256, the hash begins with a number of zero bits. The average work required is exponential in the number of zero bits required and can be verified by executing a single hash. # 但是在投递出去之前, block-hash必须以一定的0开头(通过在block里面增加一个字段来影响hash). 这个工作和0的数量成指数级关系, 但是却可以很容易被验证.

For our timestamp network, we implement the proof-of-work by incrementing a nonce in the block until a value is found that gives the block's hash the required zero bits. Once the CPU effort has been expended to make it satisfy the proof-of-work, the block cannot be changed without redoing the work. As later blocks are chained after it, the work to change the block would include redoing all the blocks after it. # 在block里面增加一个nonce字段来影响block-hash. 注意到如果我们想修改某个trx的话, 那么整个chain都必须重新计算.

可以看到hash是以0开头并且有nonce字段.

The proof-of-work also solves the problem of determining representation in majority decision making. If the majority were based on one-IP-address-one-vote, it could be subverted by anyone able to allocate many IPs. Proof-of-work is essentially one-CPU-one-vote. The majority decision is represented by the longest chain, which has the greatest proof-of-work effort invested in it. If a majority of CPU power is controlled by honest nodes, the honest chain will grow the fastest and outpace any competing chains. To modify a past block, an attacker would have to redo the proof-of-work of the block and all blocks after it and then catch up with and surpass the work of the honest nodes. We will show later that the probability of a slower attacker catching up diminishes exponentially as subsequent blocks are added. # POW也解决了如何定义大多数参与者的问题. 通过IP不是一个可靠办法. CPU也就是计算运力是一个可行办法. 后面会证明一个slower attacker试图超越可信计算集群的概率.

To compensate for increasing hardware speed and varying interest in running nodes over time, the proof-of-work difficulty is determined by a moving average targeting an average number of blocks per hour. If they're generated too fast, the difficulty increases. # 考虑到硬件计算能力提升以及计算节点增加, 整体运算能力也会增加. POW困难程度变化可以通过每个小时生成blocks数量来估算.

6. Network

The steps to run the network are as follows:

New transactions are broadcast to all nodes.
Each node collects new transactions into a block. # 每个节点收集trxs并且定期打包成为block
Each node works on finding a difficult proof-of-work for its block.
When a node finds a proof-of-work, it broadcasts the block to all nodes.
Nodes accept the block only if all transactions in it are valid and not already spent. # 如果认为block里面所有trxs有效的话, 那么就会挂载到pow-chain之后.
Nodes express their acceptance of the block by working on creating the next block in the chain, using the hash of the accepted block as the previous hash. #之后block的加工就会都以这个pow-chain为准.

Nodes always consider the longest chain to be the correct one and will keep working on extending it. If two nodes broadcast different versions of the next block simultaneously, some nodes may receive one or the other first. In that case, they work on the first one they received, but save the other branch in case it becomes longer. The tie will be broken when the next proof-of-work is found and one branch becomes longer; the nodes that were working on the other branch will then switch to the longer one. # 过去分支会保存起来等待有一天变长(可能会定期删除) 如果不巧地work在不是longest-chain上的话，那么工作就算白干了。

New transaction broadcasts do not necessarily need to reach all nodes. As long as they reach many nodes, they will get into a block before long. Block broadcasts are also tolerant of dropped messages. If a node does not receive a block, it will request it when it receives the next block and realizes it missed one. # 可以发现是否缺失blocks. 如果丢失blocks的话需要和full-node进行同步。

7. Incentive

By convention, the first transaction in a block is a special transaction that starts a new coin owned by the creator of the block. This adds an incentive for nodes to support the network, and provides a way to initially distribute coins into circulation, since there is no central authority to issue them. The steady addition of a constant of amount of new coins is analogous to gold miners expending resources to add gold to circulation. In our case, it is CPU time and electricity that is expended. # 按照惯例block里面第一个trx比较特殊, 这个trx是用来生产coin的. 一方面可以鼓励节点来支撑整个网络, 另外一个方面可以增发coin到流通中.

The incentive can also be funded with transaction fees. If the output value of a transaction is less than its input value, the difference is a transaction fee that is added to the incentive value of the block containing the transaction. Once a predetermined number of coins have entered circulation, the incentive can transition entirely to transaction fees and be completely inflation free. # 如果一旦不再产生coin的话, 奖励也可以从手续费中获得. 这样就不会有通货膨胀问题.

The incentive may help encourage nodes to stay honest. If a greedy attacker is able to assemble more CPU power than all the honest nodes, he would have to choose between using it to defraud people by stealing back his payments, or using it to generate new coins. He ought to find it more profitable to play by the rules, such rules that favour him with more new coins than everyone else combined, than to undermine the system and the validity of his own wealth. # 鼓励节点诚实. 因为attack的代价比正常计算的代价要高很多.

注意第一个trx. 25btc是挖矿的奖励, 0.1341997btc是手续费

8. Reclaiming Disk Space

通过组织成为Merkel-Tree格式我们可以删除部分不需要的branches上的数据来节省空间. 兼顾灵活性和效率.

A block header with no transactions would be about 80 bytes. If we suppose blocks are generated every 10 minutes, 80 bytes * 6 * 24 * 365 = 4.2MB per year. With computer systems typically selling with 2GB of RAM as of 2008, and Moore's Law predicting current growth of 1.2GB per year, storage should not be a problem even if the block headers must be kept in memory.

如果这样的话，是不是要对当前数据库状态做个snapshot？

9. Simplified Payment Verification

It is possible to verify payments without running a full network node. A user only needs to keep a copy of the block headers of the longest proof-of-work chain, which he can get by querying network nodes until he's convinced he has the longest chain, and obtain the Merkle branch linking the transaction to the block it's timestamped in. He can't check the transaction for himself, but by linking it to a place in the chain, he can see that a network node has accepted it, and blocks added after it further confirm the network has accepted it. # 我们不要运行full-network节点也可以查询交易是否完成, 只需要同步pow-chain并且查询trx是否在上面即可.

As such, the verification is reliable as long as honest nodes control the network, but is more vulnerable if the network is overpowered by an attacker. While network nodes can verify transactions for themselves, the simplified method can be fooled by an attacker's fabricated transactions for as long as the attacker can continue to overpower the network. One strategy to protect against this would be to accept alerts from network nodes when they detect an invalid block, prompting the user's software to download the full block and alerted transactions to confirm the inconsistency. Businesses that receive frequent payments will probably still want to run their own nodes for more independent security and quicker verification. # 如果大部分节点都是可信的话, 那么验证是可靠的. 但是如果attacker控制网络大部分节点的话…

10. Combining and Splitting Value

一个trx可以包含许多input和最多两个output. 这个功能主要就是用来铸币（mint coin）把大量的小钱汇集起来，不过这个操作容易被人窥探到隐私。

可以看到input有两个coin. "ed0889a26367…:1" 和 "ff210a7074e1…:0". 这代表这两个coin分别是trx#ed0889a26367的output1, 和trx#ff210a7074e1的output0. 而这次trx也生成了两个coin. 分别是"a63c3bb1d2c23…:0" 和 "a63c3bb1d2c23…:1".

在网站 http://blockexplorer.com/ 上可以追踪这些coin的流动情况

11. Privacy

As an additional firewall, a new key pair should be used for each transaction to keep them from being linked to a common owner. Some linking is still unavoidable with multi-input transactions, which necessarily reveal that their inputs were owned by the same owner. The risk is that if the owner of a key is revealed, linking could reveal other transactions that belonged to the same owner. # 即使为每次trx都生成key-pair, multi-input/combining trxs还是可以跟踪到owner情况. 所以尽可能低不要把所有的bitoins放在一个key下。

12. Calculations

这个部分是计算一次支付后，如果attack有一定概率可以跑赢主干网络的话，在等待多少个blocks之后确认比较安全。

第一个是如果q=0.1/0.3, 那么等待z个网络之后被反超的概率分布。q=0.1大约在3个blocks出来之后被反超的概率就是0.013，这个就非常小了。而q=0.3则需要等到大约15个blocks出来。

q=0.1
 z=0 P=1.0000000
 z=1 P=0.2045873
 z=2 P=0.0509779
 z=3 P=0.0131722
 z=4 P=0.0034552
 z=5 P=0.0009137
 z=6 P=0.0002428
 z=7 P=0.0000647
 z=8 P=0.0000173
 z=9 P=0.0000046
 z=10 P=0.0000012

q=0.3
 z=0 P=1.0000000
 z=5 P=0.1773523
 z=10 P=0.0416605
 z=15 P=0.0101008
 z=20 P=0.0024804
 z=25 P=0.0006132
 z=30 P=0.0001522
 z=35 P=0.0000379
 z=40 P=0.0000095
 z=45 P=0.0000024
 z=50 P=0.0000006

换个角度，如果我们希望被反超的概率P<0.01的话，那么需要等待多少个blocks出来

P < 0.001
 q=0.10 z=5
 q=0.15 z=8
 q=0.20 z=11
 q=0.25 z=15
 q=0.30 z=24
 q=0.35 z=41
 q=0.40 z=89
 q=0.45 z=340