HBase Write Path

http://blog.cloudera.com/blog/2012/06/hbase-write-path/

The client first locates the address of the region server hosting the -ROOT- region from the ZooKeeper quorum. From that root region server, it finds the location of the region server hosting the .META. region, and from .META. it finally finds the region server hosting the data.
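The lookup chain can be sketched with a toy in-memory model. This is only an illustration: the dictionaries, server names, table name and the function locate_region_server are all hypothetical and are not part of the HBase client API. It also assumes that the meta region maps each table to regions identified by their start row key.

```python
import bisect

# Hypothetical stand-ins for ZooKeeper and the two catalog regions.
ZOOKEEPER = {"root_region_server": "rs1.example.com:60020"}
ROOT_REGION = {
    "rs1.example.com:60020": {"meta_region_server": "rs2.example.com:60020"},
}
META_REGION = {
    "rs2.example.com:60020": {
        # (region start key, hosting region server), sorted by start key
        "usertable": [
            ("", "rs3.example.com:60020"),
            ("row-500", "rs4.example.com:60020"),
        ],
    },
}


def locate_region_server(table, row_key):
    """Follow ZooKeeper -> root region -> meta region -> data region."""
    # Step 1: ZooKeeper tells us which server hosts the root region.
    root_server = ZOOKEEPER["root_region_server"]
    # Step 2: the root region tells us which server hosts the meta region.
    meta_server = ROOT_REGION[root_server]["meta_region_server"]
    # Step 3: the meta region tells us which server hosts the row's region,
    # i.e. the region with the greatest start key <= row_key.
    regions = META_REGION[meta_server][table]
    start_keys = [start for start, _ in regions]
    index = bisect.bisect_right(start_keys, row_key) - 1
    return regions[index][1]
```

For example, locate_region_server("usertable", "row-100") walks all three steps and returns the server of the first region in this toy model.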

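The Write Ahead Log (WAL) is stored under /hbase/.logs. Each region server has its own directory, /hbase/.logs/<host>,<port>,<startcode>, and each log file is named /hbase/.logs/<host>,<port>,<startcode>/<host>%2C<port>%2C<startcode>.<timestamp>. My guess is that startcode is the time at which the region server started, and that the timestamp suffix of the file name is the time at which that log file was created.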

/hbase/.logs/srv.example.com,60020,1254173957298
/hbase/.logs/srv.example.com,60020,1254173957298/srv.example.com%2C60020%2C1254173957298.1254173957495
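To make the naming scheme concrete, here is a small sketch that splits such a path into its parts. The function names parse_wal_path and as_utc are made up for this note. Reading startcode and timestamp as milliseconds since the Unix epoch is an assumption that fits the guess above, not something the source confirms.

```python
from datetime import datetime, timezone
from urllib.parse import unquote


def parse_wal_path(path):
    """Split /hbase/.logs/<host>,<port>,<startcode>/<file> into fields."""
    parts = path.strip("/").split("/")
    if len(parts) != 4 or parts[:2] != ["hbase", ".logs"]:
        raise ValueError("not a WAL file path: " + path)
    server_dir, file_name = parts[2], parts[3]
    host, port, startcode = server_dir.split(",")
    # In the file name the commas are URL-encoded as %2C.
    server_part, timestamp = unquote(file_name).rsplit(".", 1)
    if server_part != server_dir:
        raise ValueError("file name does not match its directory: " + path)
    return {
        "host": host,
        "port": int(port),
        "startcode": int(startcode),
        "timestamp": int(timestamp),
    }


def as_utc(millis):
    """Assumption: the value is milliseconds since the Unix epoch."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


wal = parse_wal_path(
    "/hbase/.logs/srv.example.com,60020,1254173957298/"
    "srv.example.com%2C60020%2C1254173957298.1254173957495"
)
print(wal["host"], wal["port"])
print("server started:", as_utc(wal["startcode"]))
print("log created:   ", as_utc(wal["timestamp"]))
```

Under the millisecond assumption, both values in the example path fall in September 2009 and the log file was created 197 ms after the server's startcode, which is at least consistent with the guess.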

A WAL file is rolled in the following cases:

By default, a WAL file is rolled when its size reaches about 95% of the HDFS block size. The multiplier is configured with "hbase.regionserver.logroll.multiplier" and the block size with "hbase.regionserver.hlog.blocksize". A WAL file is also rolled periodically, at the interval set by "hbase.regionserver.logroll.period" (one hour by default), even if the file is still smaller than the configured size limit.
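The two roll conditions can be written down as a small decision function. This is a sketch of the rule as described here, not HBase's actual code: the function should_roll and its parameter names are hypothetical, the 64 MB block size is only an assumed example value, and whether the real size check is strict or inclusive is not stated in the source.

```python
# Defaults as described in the text: 0.95 multiplier, one-hour period.
LOGROLL_MULTIPLIER = 0.95          # hbase.regionserver.logroll.multiplier
LOGROLL_PERIOD_MS = 60 * 60 * 1000  # hbase.regionserver.logroll.period
# Assumed example value for hbase.regionserver.hlog.blocksize.
BLOCK_SIZE_BYTES = 64 * 1024 * 1024


def should_roll(wal_size_bytes, wal_age_ms,
                block_size=BLOCK_SIZE_BYTES,
                multiplier=LOGROLL_MULTIPLIER,
                period_ms=LOGROLL_PERIOD_MS):
    """Return True if either roll condition holds."""
    # Condition 1: the file has grown past multiplier * block size.
    size_limit = block_size * multiplier
    if wal_size_bytes > size_limit:
        return True
    # Condition 2: the roll period has elapsed, regardless of size.
    return wal_age_ms >= period_ms


print(should_roll(62 * 1024 * 1024, 10_000))  # large file, young
print(should_roll(1024, LOGROLL_PERIOD_MS))   # tiny file, one hour old
print(should_roll(1024, 10_000))              # tiny file, young
```

With the assumed 64 MB block size the size limit is roughly 60.8 MB, so the first call rolls on size, the second rolls on age alone, and the third does not roll.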