The Log: What every software engineer should know about real-time data's unifying abstraction
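The article traces the origins of the log, looks at databases and distributed systems from the log's point of view, and discusses how to use the log to design databases and distributed systems better. It closes with many valuable reference links.
Links to articles on consensus algorithms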
Paxos!:
- Original paper is here. Leslie Lamport has an interesting history of how the algorithm was created in the 1980s but not published until 1998 because the reviewers didn't like the Greek parable in the paper and he didn't want to change it.
- Even once the original paper was published it wasn't well understood. Lamport tries again and this time even includes a few of the "uninteresting details" of how to put it to use using these new-fangled automatic computers. It is still not widely understood.
- Fred Schneider and Butler Lampson each give a more detailed overview of applying Paxos in real systems.
- A few Google engineers summarize their experience implementing Paxos in Chubby.
- I actually found all the Paxos papers pretty painful to understand but dutifully struggled through. But you don't need to because this video by John Ousterhout (of log-structured filesystem fame!) will make it all very simple. Somehow these consensus algorithms are much better presented by drawing them as the communication rounds unfold, rather than in a static presentation in a paper. Ironically, this video was created in an attempt to show that Paxos was hard to understand.
- Using Paxos to Build a Scalable Consistent Data Store: This is a cool paper on using a log to build a data store. Jun, one of the co-authors, is also one of the earliest engineers on Kafka.
Paxos has competitors! Actually, each of these maps a lot more closely to the implementation of a log and is probably more suitable for practical implementation:
- Viewstamped Replication by Barbara Liskov is an early algorithm to directly model log replication.
- Zab is the algorithm used by ZooKeeper.
- Raft is an attempt at a more understandable consensus algorithm. The video presentation, also by John Ousterhout, is great too.
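
To make "map closely to the implementation of a log" a bit more concrete, here is a minimal sketch of leader-based log replication with majority acknowledgement. It is not Viewstamped Replication, Zab, or Raft: it leaves out leader election, terms or views, and follower catch-up, and the names `Replica`, `Leader`, `propose`, and `commit_index` are made up for illustration. It only shows the shared core idea that a leader assigns each entry a log position and treats it as committed once more than half of the cluster has stored it.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """A follower holding an ordered log (hypothetical type for this sketch)."""
    name: str
    up: bool = True
    log: list = field(default_factory=list)

    def append(self, index: int, entry: str) -> bool:
        # Accept the entry only if it lands at the next free slot, so the
        # follower's log always remains a prefix of the leader's log.
        if not self.up or index != len(self.log):
            return False
        self.log.append(entry)
        return True


class Leader:
    """Assigns log positions and commits once a majority stores the entry."""

    def __init__(self, followers):
        self.followers = followers
        self.log = []
        self.commit_index = -1  # highest log index known to be committed

    def propose(self, entry: str) -> bool:
        index = len(self.log)
        self.log.append(entry)
        acks = 1  # the leader's own copy counts as one acknowledgement
        for follower in self.followers:
            if follower.append(index, entry):
                acks += 1
        cluster_size = len(self.followers) + 1
        if acks > cluster_size // 2:  # strict majority of the cluster
            self.commit_index = index
            return True
        return False
```

With one leader and two followers, a proposal still commits while one follower is down, but not while both are. A real protocol also has to bring a lagging follower back up to date and replace a failed leader safely, which is where most of the difficulty in the papers above comes from.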