HBase Write Path
http://blog.cloudera.com/blog/2012/06/hbase-write-path/
At first, it locates the address of the region server hosting the -ROOT- region from the ZooKeeper quorum. From the root region server, the client finds out the location of the region server hosting the -META- region.(首先从Zookeeper里面找到-ROOT- region所在的region server,然后在找到对应的-META- region所在的region server,最后找到数据所在的region server)
写入的Write Ahead Log存放在/hbase/.logs下面,文件路径是 hbase.logs/<host>,<port>,<startcode>,文件名称/hbase/.logs/<host>,<port>,<startcode>/<host>%2C<port>%2C<startcode>.<timestamp>. 个人猜测,startcode表示这个regionserver启动的时间,log文件名后面的timestamp部分表示这个log文件产生时间。
/hbase/.logs/srv.example.com,60020,1254173957298 /hbase/.logs/srv.example.com,60020,1254173957298/srv.example.com%2C60020%2C1254173957298.1254173957495
对于每个WAL文件roll的时机包括下面几个:
- 大小达到HDFS block size (64MB,可以通过hbase.regionserver.hlog.blocksize配置)的95%(可以通过hbase.regionserver.logroll.multiplier配置)
- 定期(1小时)进行(可以通过hbase.regionserver.logroll.period配置)
By default, WAL file is rolled when its size is about 95% of the HDFS block size. You can configure the multiplier using parameter: “hbase.regionserver.logroll.multiplier”, and the block size using parameter: “hbase.regionserver.hlog.blocksize”. WAL file is also rolled periodically based on configured interval “hbase.regionserver.logroll.period”, an hour by default, even the WAL file size is smaller than the configured limit.