HBase Log Splitting

http://blog.cloudera.com/blog/2012/07/hbase-log-splitting/

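Log splitting is needed because a single region server may serve several regions, and the WAL edits for all of those regions are written to the same file. If a region server dies, its regions have to be served by other region servers, and before they can be served their logs must be recovered. That recovery needs every edit made to the region, which is where log splitting comes in. Log splitting takes one WAL file and splits it by region into multiple files, each containing the edits of exactly one region. It happens before the affected regions are brought back online on a region server.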
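
The splitting step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not HBase's implementation: `WalEdit`, `split_wal`, and the sample edits are hypothetical names, and a real WAL entry carries much more than a region name and a payload.

```python
from collections import defaultdict
from typing import NamedTuple


class WalEdit(NamedTuple):
    """Hypothetical, simplified WAL entry."""
    region: str       # region the edit belongs to
    sequence_id: int  # position of the edit in the shared WAL
    payload: str      # stand-in for the actual mutation


def split_wal(edits):
    """Group the edits of one shared WAL into per-region lists.

    Edits are appended in the order they are read, so each region's
    list keeps the original WAL order.
    """
    per_region = defaultdict(list)
    for edit in edits:
        per_region[edit.region].append(edit)
    return dict(per_region)


# One WAL shared by two regions on the same (crashed) region server.
shared_wal = [
    WalEdit("region-a", 1, "put row1"),
    WalEdit("region-b", 2, "put row9"),
    WalEdit("region-a", 3, "delete row1"),
]
split = split_wal(shared_wal)
```

Each value in `split` plays the role of one per-region file: the server that takes over `region-a` only has to replay `split["region-a"]`.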

Log splitting is done by the HMaster when the cluster starts, or by the ServerShutdownHandler when a region server shuts down. To guarantee consistency, the affected regions stay unavailable until all of their WAL edits have been recovered and replayed. The diagram below shows the log splitting flow:

hbase-log-splitting.png


Times to complete single-threaded log splitting vary, but the process may take several hours if multiple region servers have crashed, because the master does all of the work by itself. Distributed log splitting was added in HBase 0.92 (HBASE-1364) by Prakash Khemani from Facebook. It reduces the time to complete the process dramatically, and hence improves the availability of regions and tables. For example, on one cluster that crashed, recovery took around 9 hours with single-threaded log splitting and only around 6 minutes with distributed log splitting.

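The distributed log splitting mechanism is simple: the WAL files that need splitting are processed in parallel across the cluster. Every such file is first published as a task in ZooKeeper. The machines in the cluster (the region servers) then claim tasks and split the corresponding logs themselves. The design also has to handle a machine dying while it is splitting a log.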

hbase-split-log-manager.png
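
The claim-and-retry idea can be sketched with an in-memory stand-in for the task list. This is a minimal illustration under assumptions, not the HBase or ZooKeeper API: `SplitTaskBoard`, its methods, and the worker and WAL names are all hypothetical, and a lock replaces the coordination that ZooKeeper provides in the real system.

```python
import threading


class SplitTaskBoard:
    """Hypothetical in-memory stand-in for the task list kept in ZooKeeper."""

    def __init__(self, wal_paths):
        self._lock = threading.Lock()
        self._owner = {path: None for path in wal_paths}  # None = unassigned
        self._done = set()

    def claim(self, worker):
        """Atomically hand one unassigned, unfinished task to `worker`."""
        with self._lock:
            for path, owner in self._owner.items():
                if owner is None and path not in self._done:
                    self._owner[path] = worker
                    return path
            return None  # nothing left to claim

    def complete(self, worker, path):
        """Mark a task finished, but only if `worker` still owns it."""
        with self._lock:
            if self._owner.get(path) == worker:
                self._done.add(path)

    def resubmit_tasks_of(self, dead_worker):
        """Release unfinished tasks held by a worker that died mid-split."""
        with self._lock:
            released = []
            for path, owner in self._owner.items():
                if owner == dead_worker and path not in self._done:
                    self._owner[path] = None
                    released.append(path)
            return released

    def all_done(self):
        with self._lock:
            return self._done == set(self._owner)


board = SplitTaskBoard(["wal-1", "wal-2"])
first = board.claim("rs-1")                 # rs-1 claims a log
second = board.claim("rs-2")                # rs-2 claims the other one
board.complete("rs-2", second)
released = board.resubmit_tasks_of("rs-1")  # rs-1 died before finishing
retry = board.claim("rs-2")                 # a surviving worker picks it up
board.complete("rs-2", retry)
```

The point of the sketch is the ownership check: a task only counts as done when its current owner reports it, so a log held by a dead worker can be released and claimed again without being lost.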

This feature is controlled by the parameter hbase.master.distributed.log.splitting = true. The split log manager also starts a monitor thread that watches the ZooKeeper nodes for problems. Its logic is as follows:
