Required for interview: HBase block cache

Big data 2021-10-14 07:48:01
Big data
Mainly shares the principles and source code of big data frameworks such as Spark, Flink, Kafka, and HBase, and also covers data warehousing, graph computing, and other trending fields.

Block Cache

HBase provides two different BlockCache implementations for caching data read from HDFS:

  1. LruBlockCache, the default, which lives in heap memory (on-heap)

  2. BucketCache, which usually lives in off-heap memory

The following sections discuss the advantages and disadvantages of each implementation, how to choose between them, and the configuration options for each.

Cache Choices

LruBlockCache is the original implementation and lives entirely within the Java heap. BucketCache is optional and is mainly intended for keeping block cache data off-heap, although BucketCache can also serve as a file-backed cache.

When BucketCache is enabled, the system runs a two-tier cache. The tiers used to be called "L1" and "L2", but this terminology was deprecated in hbase-2.0.0. "L1" referred to the LruBlockCache and "L2" to an off-heap BucketCache. From hbase-2.0.0 onward, when BucketCache is enabled, all DATA blocks are kept in the BucketCache tier, and meta blocks (INDEX and BLOOM blocks) are kept on-heap in the LruBlockCache. CombinedBlockCache manages the two tiers and the policy that dictates how blocks move between them.
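The placement rule described above can be sketched as a small function. This is a toy model written for illustration, not HBase code; the function name `choose_tier` and the string labels are hypothetical.

```python
# Toy model of the tiering rule described above; not HBase code.
ON_HEAP_TYPES = {"INDEX", "BLOOM"}  # meta blocks stay in the on-heap LruBlockCache


def choose_tier(block_type: str, bucket_cache_enabled: bool) -> str:
    """Return the cache tier that would hold a block of the given type."""
    if not bucket_cache_enabled:
        return "LruBlockCache"  # a single on-heap cache holds everything
    if block_type in ON_HEAP_TYPES:
        return "LruBlockCache"
    return "BucketCache"  # DATA blocks go to the BucketCache tier


for block_type in ("DATA", "INDEX", "BLOOM"):
    print(block_type, "->", choose_tier(block_type, bucket_cache_enabled=True))
```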


General Cache Configuration

Apart from the cache implementation itself, some general configuration options control how the cache behaves. For details, see the CacheConfig documentation.

After setting or changing any of these properties, restart the HBase cluster for the configuration to take effect. If anything goes wrong, check the HBase logs for errors.


LruBlockCache Design

LruBlockCache is an LRU cache with three levels of block priority, which allow for scan resistance and in-memory column families. The three priorities are:

  1. Single access priority: a block has this priority the first time it is loaded from HDFS, and it belongs to the first group considered when cache space has to be reclaimed (eviction). The advantage is that blocks read by scans are more likely to be evicted than blocks that are used more often.

  2. Multi access priority: if a block with single access priority is accessed again, it is upgraded to multi access priority. It belongs to the second group considered during eviction.

  3. In-memory access priority: if the block's column family is configured as "in-memory", the block has this priority regardless of how many times it has been accessed. The HBase catalog tables are configured this way. This group is the last one considered during eviction. To mark a column family with this priority:

    1. In Java, call HColumnDescriptor.setInMemory(true);

    2. In the hbase shell, set IN_MEMORY => true when creating or altering a table, for example: create 't', {NAME => 'f', IN_MEMORY => 'true'}

For more detail, see the LruBlockCache source code.
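To make the three priorities concrete, here is a deliberately simplified toy model. It assumes a strict eviction order (single, then multi, then in-memory) and counts blocks rather than bytes; the real LruBlockCache is more nuanced, and the class name `ToyPriorityCache` is hypothetical.

```python
from collections import OrderedDict


class ToyPriorityCache:
    """Toy model: evicts single-access blocks first, then multi, then in-memory."""

    EVICTION_ORDER = ("single", "multi", "memory")

    def __init__(self, capacity: int):
        self.capacity = capacity
        # One LRU-ordered map per priority; the oldest entry comes first.
        self.groups = {name: OrderedDict() for name in self.EVICTION_ORDER}

    def __len__(self):
        return sum(len(group) for group in self.groups.values())

    def _find(self, key):
        for name, group in self.groups.items():
            if key in group:
                return name
        return None

    def get(self, key):
        name = self._find(key)
        if name is None:
            return None
        value = self.groups[name].pop(key)
        # A repeated access promotes a single-access block to multi-access.
        target = "multi" if name == "single" else name
        self.groups[target][key] = value
        return value

    def put(self, key, value, in_memory=False):
        existing = self._find(key)
        if existing is not None:
            self.groups[existing].pop(key)
        self.groups["memory" if in_memory else "single"][key] = value
        while len(self) > self.capacity:
            self._evict_one()

    def _evict_one(self):
        for name in self.EVICTION_ORDER:
            if self.groups[name]:
                self.groups[name].popitem(last=False)  # drop the least recently used
                return


cache = ToyPriorityCache(capacity=3)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")                        # "a" is promoted to multi access
cache.put("meta", 0, in_memory=True)  # in-memory priority
cache.put("c", 3)                     # cache is over capacity, so "b" is evicted
print(sorted(k for group in cache.groups.values() for k in group))
```

In this run the block that was read only once ("b") is the one that gets evicted, while the re-read block and the in-memory block survive.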


LruBlockCache Usage

Block caching is enabled by default for all user tables, which means that any read operation loads the LRU cache. This works for most scenarios, but some tuning is usually needed to get better performance. One of the most important concepts is the working set size (WSS): the amount of memory needed to compute the answer to a problem. For a website, this is the data needed to answer requests over a short period of time. The memory available for caching in HBase is calculated as:

number of region servers * heap size * hfile.block.cache.size * 0.99

The default value of hfile.block.cache.size is 0.4, which represents 40% of the available heap. The last value (99%) is the default acceptable load factor of the LRU cache, after which eviction starts. It is included because assuming that 100% of the memory is usable would be unrealistic: new blocks are still being loaded while eviction runs. Some examples using this formula:

  1. One region server with a 1 GB heap and the default block cache size has about 405 MB of block cache available

  2. 20 region servers with an 8 GB heap and the default block cache size have about 63.4 GB of block cache available (20 × 8 × 0.4 × 0.99)

  3. 100 region servers with a 24 GB heap and a block cache size of 0.5 have about 1.16 TB of block cache available
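The formula is easy to check with a few lines of Python. This is a minimal sketch; the function name `block_cache_capacity_gb` is made up for illustration, and sizes use binary units (1 GB = 1024 MB).

```python
def block_cache_capacity_gb(region_servers, heap_gb, cache_fraction=0.4, load_factor=0.99):
    """Cluster-wide block cache capacity in GB, following the formula above."""
    return region_servers * heap_gb * cache_fraction * load_factor


examples = [
    (1, 1, 0.4),     # one server, 1 GB heap, default cache fraction
    (20, 8, 0.4),    # 20 servers, 8 GB heap, default cache fraction
    (100, 24, 0.5),  # 100 servers, 24 GB heap, cache fraction 0.5
]
for servers, heap_gb, fraction in examples:
    gb = block_cache_capacity_gb(servers, heap_gb, fraction)
    print(f"{servers} x {heap_gb} GB x {fraction}: {gb * 1024:.1f} MB = {gb:.2f} GB = {gb / 1024:.2f} TB")
```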

Your data is not the only thing that lives in the block cache. Other residents to take into account:

  1. Catalog tables: the hbase:meta table is forced into the block cache and has in-memory priority, which means its contents are rarely evicted. (Depending on the number of regions, the hbase:meta table takes up a small amount of memory.)

  2. HFile indexes: HFile is the file format HBase uses to store data in HDFS. It contains a multi-level index that lets HBase find the target data without reading the whole file. The size of these indexes depends on the block size (64KB by default), the size of your keys, and the amount of data stored. For large data sets, around 1GB of index per region server is not unusual, although not all of it will be in the cache (the LRU evicts indexes that are rarely used).

  3. Keys: the values that are stored are only half the picture, since each value is stored along with its keys (row key, family qualifier, and timestamp).

  4. Bloom filters: like the HFile indexes, these data structures (when enabled) are stored in the LRU.

Currently, the recommended way to measure the size of HFile indexes and Bloom filters is to look at the region server UI and check the relevant metrics. For keys, you can sample with the HFile command line tool and look at the average key size. Since HBase 0.98.3, detailed BlockCache stats and metrics are available in the Block Cache section of the UI.

In general, block caching is not recommended when the WSS does not fit in the available memory. For example, suppose the block caches of all region servers in the cluster add up to 40GB, but you need to process 1TB of data: this scenario is a poor fit for block caching. One reason is that the churn caused by evictions triggers more garbage collections than necessary. Two such scenarios:

  1. Fully random read pattern: the application almost never reads the same row twice within a short period of time, so the chance of hitting the cache is close to 0. Enabling block caching on such a table wastes memory and CPU cycles, and it also generates more JVM garbage collection events.

  2. Mapping a table: in a typical MapReduce job that takes a table as input, each row is read only once, so there is no need to put this data in the block cache. In Java, the Scan object can turn block caching off for the scan with setCacheBlocks(false). You can still keep block caching enabled on the table itself if you need fast random read access.
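A small simulation illustrates why a fully random read pattern gets so little out of the cache when the working set is far larger than the cache. This is a toy LRU over block IDs, not HBase code; the 40-of-1000 ratio is chosen only to roughly mirror the 40GB versus 1TB example above.

```python
import random
from collections import OrderedDict


def lru_hit_ratio(cache_blocks, total_blocks, reads, seed=0):
    """Simulate uniformly random block reads against an LRU cache; return the hit ratio."""
    rng = random.Random(seed)
    cache = OrderedDict()
    hits = 0
    for _ in range(reads):
        block = rng.randrange(total_blocks)
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / reads


print(f"cache holds 4% of the data:  {lru_hit_ratio(40, 1000, 20000):.3f}")
print(f"cache holds all of the data: {lru_hit_ratio(1000, 1000, 20000):.3f}")
```

With uniformly random reads the hit ratio stays close to the fraction of the data that fits in the cache, so nearly every read still goes to disk while the cache keeps churning.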

Caching Only META Blocks (DATA Blocks in fscache)

An interesting setup is to cache only META blocks and read the DATA blocks on every access. This can make sense when the DATA blocks fit in fscache and access is completely random across a very large dataset. To enable this setup, alter the table and set BLOCKCACHE => 'false' for each column family. This disables the BlockCache for that column family only. Caching of META blocks, however, can never be disabled: even with the BlockCache turned off, META blocks are still loaded into the cache.

Off-heap Block Cache

How to Enable BucketCache

BucketCache is usually deployed through a managing class that sets up two cache tiers: an on-heap cache implemented by LruBlockCache, and a second tier implemented by BucketCache. The default managing class is CombinedBlockCache. In short, its caching policy keeps meta blocks (INDEX and BLOOM) on-heap in the LruBlockCache tier and DATA blocks in the BucketCache tier.

Before HBase 2.0.0

Before hbase 2.0.0, fetching data from BucketCache is slower than fetching from the on-heap LruBlockCache. However, read latencies tend to be more stable, because less garbage collection happens when BucketCache is used (BucketCache manages the BlockCache allocations, not the GC). If BucketCache is deployed in off-heap mode, this memory is not managed by the GC at all. This is why you would use BucketCache before 2.0.0: latencies are more stable, the impact of GC and heap fragmentation is reduced, and more memory can be used safely.

Before 2.0.0, BucketCache can be configured to receive the blocks evicted from the LruBlockCache. All data and index blocks are cached in L1 first. When an eviction happens in L1, the evicted blocks are moved to L2. In Java, set cacheDataInL1 with HColumnDescriptor.setCacheDataInL1(true); in the hbase shell, set CACHE_DATA_IN_L1 to true, for example: create 't1', {NAME => 't1', CONFIGURATION => {CACHE_DATA_IN_L1 => 'true'}}

HBase 2.0.0 and Later

HBASE-11425 changed the HBase read path so that data can be read directly from off-heap memory. Off-heap latencies can approach on-heap latencies, with the added benefit that off-heap access does not provoke GC.

From HBase 2.0.0 onward, the notions of L1 and L2 are deprecated. When BucketCache is enabled, DATA blocks always go to the BucketCache, and INDEX/BLOOM blocks go to the on-heap LRUBlockCache. Support for cacheDataInL1 has been removed.

The BucketCache block cache can be deployed in three modes: off-heap, file, or mmapped file (configured through hbase.bucketcache.ioengine). Setting it to offheap makes BucketCache manage the block cache in off-heap memory. Setting it to file:PATH_TO_FILE (the default in EMR is files:/mnt/hbase/bucketcache) makes BucketCache use a file-backed cache, which is especially useful when the volume is a fast disk such as an SSD. From 2.0.0 it is also possible to use multiple file paths, which helps when a large cache is required. The basic form is files:PATH_TO_FILE1,PATH_TO_FILE2,PATH_TO_FILE3. BucketCache can also be configured to use an mmapped file: set ioengine to mmap:PATH_TO_FILE.

Before hbase 2.0.0, it was also possible to set up a tiered cache that bypasses the CombinedBlockCache policy, with BucketCache as a strict L2 cache behind the L1 LruBlockCache. For this setup, set hbase.bucketcache.combinedcache.enabled to false. In this mode, blocks evicted from L1 go to L2. When a block is cached, it is cached in L1 first. When looking up a cached block, L1 is checked first, and L2 is searched only if the block is not found there. This deployment is called Raw L1+L2. Note that the L1+L2 mode was removed in hbase 2.0.0. When BucketCache is used, it holds strictly the DATA blocks, and INDEX/META blocks go to the LruBlockCache.
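The Raw L1+L2 behavior described above (cache in L1 first, move evicted blocks to L2, look up L1 before L2) can be sketched with two small LRU maps. This is a toy model counting blocks rather than bytes; the class `RawL1L2Cache` and its method names are hypothetical and do not correspond to HBase classes.

```python
from collections import OrderedDict


class RawL1L2Cache:
    """Toy two-level cache: blocks evicted from L1 are moved into L2."""

    def __init__(self, l1_capacity, l2_capacity):
        self.l1_capacity = l1_capacity
        self.l2_capacity = l2_capacity
        self.l1 = OrderedDict()
        self.l2 = OrderedDict()

    def cache_block(self, key, block):
        self.l2.pop(key, None)          # keep a block in only one level
        self.l1[key] = block            # new blocks always enter L1 first
        self.l1.move_to_end(key)
        if len(self.l1) > self.l1_capacity:
            victim_key, victim = self.l1.popitem(last=False)
            self.l2[victim_key] = victim  # the L1 victim moves to L2
            if len(self.l2) > self.l2_capacity:
                self.l2.popitem(last=False)

    def get_block(self, key):
        """Return (block, level), checking L1 first and then L2."""
        if key in self.l1:
            self.l1.move_to_end(key)
            return self.l1[key], "L1"
        if key in self.l2:
            self.l2.move_to_end(key)
            return self.l2[key], "L2"
        return None, None


tiers = RawL1L2Cache(l1_capacity=2, l2_capacity=2)
for name in ("a", "b", "c"):
    tiers.cache_block(name, name.upper())
print(tiers.get_block("a"))  # "a" was evicted from L1 and is now served from L2
print(tiers.get_block("c"))
```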

Other BucketCache settings include the path where the cache is persisted so that it survives restarts, the number of threads used to write the cache, and so on. To check whether BucketCache is enabled, look in the logs for the line describing the cache setup, which details how BucketCache has been deployed. The UI also shows the cache tiers and their configuration in detail.

BucketCache Example Configuration

This example configures a 4GB off-heap BucketCache together with a 1GB on-heap cache. The configuration is done on the RegionServer.

Setting hbase.bucketcache.ioengine and setting hbase.bucketcache.size > 0 enables CombinedBlockCache. Assume that the RegionServer is configured with a 5GB heap (HBASE_HEAPSIZE=5g).

  1. First, edit the RegionServer's environment file and set HBASE_OFFHEAPSIZE to a value larger than the off-heap size required. Here the required off-heap size is 4GB, so set the value to 5GB: 4GB for the off-heap cache and the remaining 1GB for other users of off-heap memory (for example, the DFSClient in the RegionServer uses off-heap memory; see Direct Memory Usage in HBase below). HBASE_OFFHEAPSIZE=5G

  2. Next, add the following configuration to hbase-site.xml:

<property>
  <name>hbase.bucketcache.ioengine</name>
  <value>offheap</value>
</property>
<property>
  <name>hfile.block.cache.size</name>
  <value>0.2</value>
</property>
<property>
  <name>hbase.bucketcache.size</name>
  <value>4196</value>
</property>

  3. Restart the cluster, and check the logs if there are any problems.

In the configuration above, BucketCache is set to 4G, and the on-heap LruBlockCache is set to 20% of the RegionServer heap (0.2 × 5G = 1G).
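The arithmetic behind these settings can be double-checked with a short script. This is a sketch under the assumption that hbase.bucketcache.size is given in megabytes when its value is 1 or more; the function `summarize_cache_sizing` is made up for illustration.

```python
def summarize_cache_sizing(heap_gb, offheap_gb, lru_fraction, bucketcache_mb):
    """Derive the cache sizes implied by the sample settings (all results in GB)."""
    lru_gb = heap_gb * lru_fraction       # hfile.block.cache.size is a fraction of the heap
    bucket_gb = bucketcache_mb / 1024     # assumption: hbase.bucketcache.size is in MB
    headroom_gb = offheap_gb - bucket_gb  # off-heap memory left for other users
    if headroom_gb <= 0:
        raise ValueError("HBASE_OFFHEAPSIZE must exceed the BucketCache size")
    return {"lru_gb": lru_gb, "bucket_gb": bucket_gb, "headroom_gb": headroom_gb}


sizing = summarize_cache_sizing(heap_gb=5, offheap_gb=5, lru_fraction=0.2, bucketcache_mb=4196)
for name, value in sizing.items():
    print(f"{name}: {value:.2f}")
```

Under that assumption, 4196 MB is slightly more than 4 GB, so the headroom left under HBASE_OFFHEAPSIZE comes out a little under 1 GB.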

HBASE-10641 introduced, in HBase 0.98 and later, the ability to configure multiple bucket sizes for BucketCache. To configure multiple bucket sizes, set hbase.bucketcache.bucket.sizes to a list of block sizes ordered from smallest to largest. The goal is to optimize the bucket sizes for your data access pattern. The following example configures bucket sizes of 4096 and 8192:

<property>
  <name>hbase.bucketcache.bucket.sizes</name>
  <value>4096,8192</value>
</property>
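One way to picture why the list is ordered from smallest to largest: a block is placed in the smallest bucket size that can hold it. The sketch below illustrates that idea only and is an assumption about the general approach, not the actual BucketCache allocator; `pick_bucket_size` is a hypothetical helper.

```python
from bisect import bisect_left


def pick_bucket_size(block_bytes, bucket_sizes):
    """Return the smallest configured bucket size that fits the block, or None."""
    sizes = sorted(bucket_sizes)
    index = bisect_left(sizes, block_bytes)  # first size >= block_bytes
    if index == len(sizes):
        return None                          # block is larger than every bucket size
    return sizes[index]


configured = [4096, 8192]  # values from the example configuration
for block in (3000, 4096, 5000, 9000):
    print(block, "->", pick_bucket_size(block, configured))
```

Bucket sizes that sit just above the block sizes your tables actually produce waste the least space per cached block.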

Direct Memory Usage in HBase

The default maximum amount of direct memory varies by JVM. Traditionally it is 64M, or some relation to the allocated heap size (-Xmx), or no limit at all (as with JDK7). HBase servers use direct memory. In particular, with short-circuit reading (where the client reads the file directly, without going through the DataNode), the DFSClient hosted in the RegionServer allocates direct memory buffers. How much memory the DFSClient uses is not easy to quantify: it is the number of open HFiles multiplied by the short-circuit read buffer size, which HBase sets to 128k (see the default configuration in hbase-default.xml). Off-heap block caching also uses direct memory. The RPC server uses a ByteBuffer pool as well, and from hbase 2.0.0 these buffers are off-heap ByteBuffers. When starting the JVM, make sure the -XX:MaxDirectMemorySize setting takes into account the off-heap BlockCache (hbase.bucketcache.size), DFSClient usage, and the maximum size of the RPC-side ByteBufferPool. It has to be somewhat larger than the sum of the off-heap BlockCache size and the maximum ByteBufferPool size. In general, allocating an extra 1-2GB on top of the expected direct memory size works. Direct memory is part of the Java process heap and is separate from the object heap (allocated by -Xmx). The value of MaxDirectMemorySize must not exceed physical RAM, and is likely to be less than the total available RAM because of other memory requirements and system constraints.
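A rough estimate for -XX:MaxDirectMemorySize can be put together from the components listed above. This is a back-of-the-envelope sketch: the function name, the 256 MB ByteBuffer pool, and the 2000 open HFiles are hypothetical inputs chosen for illustration, and the 1024 MB headroom reflects the lower end of the 1-2GB extra mentioned in the text.

```python
def estimate_max_direct_memory_mb(bucketcache_mb, rpc_buffer_pool_mb, open_hfiles,
                                  short_circuit_buffer_kb=128, headroom_mb=1024):
    """Sum the direct-memory consumers described above, plus extra headroom (in MB)."""
    # DFSClient usage: number of open HFiles times the short-circuit read buffer size.
    dfs_client_mb = open_hfiles * short_circuit_buffer_kb / 1024
    return bucketcache_mb + rpc_buffer_pool_mb + dfs_client_mb + headroom_mb


estimate = estimate_max_direct_memory_mb(
    bucketcache_mb=4196,     # off-heap BlockCache size from the sample configuration
    rpc_buffer_pool_mb=256,  # hypothetical maximum size of the RPC ByteBuffer pool
    open_hfiles=2000,        # hypothetical number of open HFiles
)
print(f"-XX:MaxDirectMemorySize={int(estimate)}m")
```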

The Server Metrics: Memory tab in the UI shows how much memory (on-heap and off-heap/direct) a RegionServer is configured to use. The same data is also available through JMX.

Compressed BlockCache

HBASE-11331 introduced lazy BlockCache decompression. When this feature is enabled, DATA (and ENCODED_DATA) blocks are cached in the BlockCache in their on-disk format. This differs from the default behavior, in which a block read from HDFS is decompressed and decrypted before it is stored in the cache. With lazy BlockCache decompression, blocks are stored in the cache as they are.

For a RegionServer hosting more data than fits in the cache, enabling this feature with SNAPPY compression has been shown in experiments to increase throughput by 50% and improve mean latency by 30%, while increasing garbage collection by 80% and overall CPU load by 2%.

For a RegionServer whose data already fits comfortably in the cache, or for a workload that is particularly sensitive to extra CPU or GC load, this option brings little benefit.
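The trade-off can be illustrated with a toy experiment: for a fixed memory budget, count how many blocks fit when they are stored compressed versus uncompressed. zlib stands in for SNAPPY here because it ships with Python, and the synthetic blocks are far more repetitive than real data, so the ratio is only illustrative.

```python
import zlib


def blocks_that_fit(blocks, budget_bytes, compressed):
    """Count how many blocks fit in the budget, stored compressed or raw."""
    used = 0
    count = 0
    for block in blocks:
        size = len(zlib.compress(block)) if compressed else len(block)
        if used + size > budget_bytes:
            break
        used += size
        count += 1
    return count


# Fifty synthetic, highly repetitive blocks of 54,000 bytes each.
blocks = [(b"row-%06d,value-abcabcabc;" % i) * 2000 for i in range(50)]
budget = 256 * 1024  # a 256 KB cache budget

raw_count = blocks_that_fit(blocks, budget, compressed=False)
compressed_count = blocks_that_fit(blocks, budget, compressed=True)
print("raw blocks cached:", raw_count)
print("compressed blocks cached:", compressed_count)
```

More blocks fit in the same memory, at the cost of decompressing a block each time it is read from the cache, which is where the extra CPU and garbage collection load comes from.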

This feature is disabled by default. To enable it, set the corresponding property to true in the hbase-site.xml file on all RegionServers.

