How Ambari Metrics works

The Ambari Metrics System, referred to as AMS, provides cluster performance monitoring for system administrators. Metrics are generally divided into three levels: Cluster, Host and Service.

The Cluster and Host levels are mainly concerned with the performance of the cluster machines, while the Service level covers the performance of Host Components. The modules that make up AMS are shown in the figure below:

Figure 1. Ambari Metrics architecture

Within AMS itself, the main modules are the Metrics Monitor, the Hadoop Sinks and the Metrics Collector.

AMS is also a Master-Slave architecture. The Master module is the Metrics Collector; the Slave modules are the Metrics Monitor and the Hadoop Sinks. The Slave modules collect metrics and send them to the Collector.

The Metrics Monitor and the Hadoop Sinks have different responsibilities: the former collects metrics about the machine itself, such as CPU, memory and disk usage, while the latter collects performance data for the Hadoop-related Service modules, for example how much memory a module uses or its CPU utilization.

Ambari 2.1 and later versions (2.1 included) support configurable Widgets.

A Widget is a metrics chart control in Ambari Web. It takes the Metrics values, performs a simple aggregation, and renders the result in a chart.

AMS continuously collects cluster performance data, which the Timeline Server inside the Metrics Collector finally saves into an HBase database (through Phoenix).

As time goes on, the Metrics data becomes huge, so the Metrics Collector supports two storage modes: Embedded Mode and Distributed Mode.

Simply put, in embedded mode HBase uses the local file system as its storage layer, while in distributed mode HBase uses HDFS as the storage layer.

Distributed mode therefore lets AMS make full use of the physical storage of the whole cluster. So far, however, AMS does not support automated data migration; that is, when a user switches from embedded mode to distributed mode, the HBase data cannot be exported and imported automatically.

Fortunately, we can complete the Metrics data migration with a simple HDFS CLI command that copies the whole local HBase data directory into HDFS.
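For example, here is a minimal sketch of such a migration, run on the Collector host with the Collector stopped. Both paths are assumptions: /var/lib/ambari-metrics-collector/hbase is the usual default for the embedded HBase root directory and /amshbase is an arbitrary target, so check them against hbase.rootdir in the ams-hbase-site configuration before running anything.

#!/bin/sh
# Copy the embedded (local) AMS HBase data into HDFS before switching
# the Collector to distributed mode. Paths are assumed defaults -- verify
# hbase.rootdir in ams-hbase-site first.
LOCAL_HBASE_DIR=/var/lib/ambari-metrics-collector/hbase
HDFS_TARGET=/amshbase

hdfs dfs -mkdir -p ${HDFS_TARGET}
hdfs dfs -copyFromLocal ${LOCAL_HBASE_DIR} ${HDFS_TARGET}/

# Afterwards point hbase.rootdir at the new HDFS location
# (for example hdfs://<namenode-host>:8020${HDFS_TARGET}/hbase) and restart the Collector.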

One more thing to note: if AMS runs in distributed mode, the machine hosting the Metrics Collector must also have an HDFS DataNode deployed on it.

Ambari Widget

The introduction of Ambari Widgets further improved Ambari's usability and configurability. As mentioned above, the purpose of a Widget control is to display the Metrics attributes collected by AMS.

At present, however, Ambari only supports customizing Widget components on the Service Summary page. There are four main categories of Ambari Widgets: Graph, Gauge, Number and Template, of which the first three are the most commonly used.

A Graph is a line or stacked chart used to show the values of one (or more) of a Service's Metrics attributes over time. The effect looks like this (five kinds of Graph charts):

Figure 2. Graph examples

A Gauge is generally used to display a percentage. Its value is computed from one (or more) Metrics attributes and is less than 1. The effect looks like this:

Figure 3. Gauge examples

A Number directly displays a value, optionally with a configured unit such as MB. The displayed value also comes from one or more Metrics attributes.

Because Widgets can only be customized on the Service Summary page, Widgets in effect only support customization of Service Component Metrics attributes.

Yet the same Service Component may consist of several Host Components, so a Widget has to apply some kind of aggregation operation to arrive at a single value.

Ambari provides four aggregation modes for Widget components: max, min, avg and sum.

Simply put, max is the maximum of the same metrics attribute collected across the Host Components, min is the minimum, avg is the average and sum is the total. Widgets use avg as the default aggregation mode, and users can override it in the widget configuration file.

Predefined Metrics and Widgets in Ambari

Before going into the details of the predefined definitions, let us first look at the related pages in Ambari Web. The first is the set of Metrics Widget controls shown on the Cluster Dashboard; these are the so-called Cluster-level Metrics, as shown in Figure 4.

Figure 4. Cluster-level Metrics

On a machine's Component page we can see the Metrics information related to that machine. In addition, on the Summary page of some Services, users can see the Metrics information related to that Service.

At present, Ambari only predefines Metrics and Widget information for some Services, such as YARN, HDFS and Ambari Metrics.

Here we take HDFS as an example to see how Metrics and Widgets are defined for a Service in Ambari. First, let's look at the directory structure of the HDFS service definition on the Ambari Server, shown in Figure 5.

Figure 5. HDFS service definition directory structure

In this directory structure we can see two files, metrics.json and widgets.json. When a Service is added from Ambari Web, the Ambari Server reads the definitions in these two JSON files.

For HDFS, the Metrics and Widget configuration is quite rich and complicated, so here we only look at a small section of these two files.

In Ambari Web, open the HDFS Summary page and look at the control named NameNode Heap, shown in Figure 6 (this is the summary chart; you can click on it to see the details).

Figure 6. NameNode Heap summary chart

As the name suggests, this chart records the heap memory usage of the NameNode process. So somewhere in AMS there must be a module that collects that information and sends it to the Metrics Collector.

Finally, a chart control has to be defined to read and display the collected Metrics information. The Widget definition looks like this:

Listing 1. The NameNode Heap widget definition (widgets.json)
{
  "widget_name": "NameNode Heap",
  "description": "Heap memory committed and Heap memory used with respect to time.",
  "widget_type": "GRAPH",
  "is_visible": true,
  "metrics": [
    {
      "name": "jvm.JvmMetrics.MemHeapCommittedM",
      "metric_path": "metrics/jvm/memHeapCommittedM",
      "service_name": "HDFS",
      "component_name": "NAMENODE",
      "host_component_criteria": "host_components/metrics/dfs/FSNamesystem/HAState=active"
    },
    {
      "name": "jvm.JvmMetrics.MemHeapUsedM",
      "metric_path": "metrics/jvm/memHeapUsedM",
      "service_name": "HDFS",
      "component_name": "NAMENODE",
      "host_component_criteria": "host_components/metrics/dfs/FSNamesystem/HAState=active"
    }
  ],
  "values": [
    {
      "name": "JVM heap committed",
      "value": "${jvm.JvmMetrics.MemHeapCommittedM}"
    },
    {
      "name": "JVM heap used",
      "value": "${jvm.JvmMetrics.MemHeapUsedM}"
    }
  ],
  "properties": {
    "display_unit": "MB",
    "graph_type": "LINE",
    "time_range": ""
  }
}

The configuration above consists of four main parts: the widget's descriptive fields, the metrics binding, the values, and the display properties of the chart.

The first few lines give the widget its name, "NameNode Heap", and its description. widget_type defines the type of the chart, here GRAPH.

is_visible configures whether the chart is visible. The metrics section configures the metrics attributes of the widget.

In the code above, the widget binds two metrics attributes: jvm.JvmMetrics.MemHeapCommittedM and jvm.JvmMetrics.MemHeapUsedM.

The values section defines how the bound metrics values are used. name is the label shown for the attribute in the chart, and value is a mathematical expression that defines how the metric values are used (computed with, for example, addition, subtraction, multiplication or division).

As defined above, the metric values are used directly, without any computation. The properties section defines the properties of the Graph chart. display_unit defines the display unit (it could also be "%" and so on).

graph_type defines the kind of Graph chart, either LINE (a line chart) or STACK (a stacked chart). time_range is the sampling time range. The picture below shows the detailed NameNode Heap chart.

Figure 7. NameNode Heap detailed chart

In Listing 1 we explained a piece of the HDFS widgets.json configuration; now let's find the corresponding Metrics configuration code.

First open the HDFS metrics.json and search for the strings "jvm.JvmMetrics.MemHeapUsedM" and "jvm.JvmMetrics.MemHeapCommittedM". You will find code like the following (I have glued the two pieces together):

Listing 2. metrics.json (excerpt)
{
  "NAMENODE": {
    "Component": [
      {
        "type": "ganglia",
        "metrics": {
          "default": {
            "metrics/jvm/memHeapUsedM": {
              "metric": "jvm.JvmMetrics.MemHeapUsedM",
              "unit": "MB",
              "pointInTime": false,
              "temporal": true
            },
            "metrics/jvm/memHeapCommittedM": {
              "metric": "jvm.JvmMetrics.MemHeapCommittedM",
              "unit": "MB",
              "pointInTime": true,
              "temporal": true
            }
          }
        }
      }
    ]
  }
}

From the code above we can see that these Metrics attributes are configured for the HDFS NameNode. The type field indicates the implementation type of the metrics request.

Readers who have read about ganglia will know that it is an open source framework whose purpose is to monitor cluster performance; I won't go into it here.

In the metrics section, each metrics attribute is defined in detail. In the code, "metrics/jvm/memHeapUsedM" and "metrics/jvm/memHeapCommittedM" are the metrics keys; a key is unique within a service and uniquely identifies a metric request.

metric defines the metric name and unit its unit. pointInTime indicates whether the Metric attribute allows point-in-time queries, that is, queries without a time range that simply return the latest value; if it is false, such queries are not allowed.

temporal indicates whether queries over a time period are supported, and it is usually true. Because Widgets are configured with a time range property, a widget cannot display any data if temporal is false.

Adding Metrics and Widgets for a third-party Service

Ambari is now highly configurable. For Metrics and Widgets, we first need to write a simple metrics.json and widgets.json.

The example below uses a Sample Service as illustration (Ambari version 2.2, HDP Stack 2.3). First let's look at the Sample directory structure, shown below.

Figure 8. Sample Service directory structure

I added metrics.json and widgets.json to the Sample Service. The control script master.py only defines the related control functions as stubs, without any real logic. The Sample metainfo.xml is as follows:

Listing 3. The Sample Service metainfo.xml

<metainfo>
  <schemaVersion>2.0</schemaVersion>
  <services>
    <service>
      <name>SAMPLE</name>
      <displayName>my sample</displayName>
      <comment>Service definition for sample</comment>
      <version>1.0</version>
      <components>
        <component>
          <name>MY_MASTER</name>
          <displayName>My Master</displayName>
          <category>MASTER</category>
          <cardinality></cardinality>
          <timelineAppid>sample</timelineAppid>
          <commandScript>
            <script>scripts/master.py</script>
            <scriptType>PYTHON</scriptType>
            <timeout></timeout>
          </commandScript>
        </component>
      </components>
      <osSpecifics>
        <osSpecific>
          <osFamily>any</osFamily>
        </osSpecific>
      </osSpecifics>
    </service>
  </services>
</metainfo>

Note the timelineAppid field here. Its value must be unique; it is usually just the Service name and is case-insensitive. The Timeline Server inside the Metrics Collector uses timelineAppid to tell the Metrics information of the different modules apart. In Sample I simply use sample as the app id. The metrics.json then declares the metric attributes themselves; it follows the same structure as the HDFS metrics.json in Listing 2, and the listing below is a minimal sketch of such a definition for the four test metrics (the type and the pointInTime/temporal flags are assumptions that mirror the HDFS example):

Listing 4. The Sample metrics.json
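{
  "MY_MASTER": {
    "Component": [
      {
        "type": "ganglia",
        "metrics": {
          "default": {
            "metrics/test1": {
              "metric": "test1",
              "pointInTime": true,
              "temporal": true
            },
            "metrics/test2": {
              "metric": "test2",
              "pointInTime": true,
              "temporal": true
            },
            "metrics/test3": {
              "metric": "test3",
              "pointInTime": true,
              "temporal": true
            },
            "metrics/test4": {
              "metric": "test4",
              "pointInTime": true,
              "temporal": true
            }
          }
        }
      }
    ]
  }
}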

The code above configures four metric attributes for the Sample Master module: test1, test2, test3 and test4. The widgets.json below then predefines a Graph widget (a line chart) for it. The specific code is as follows:

Listing 5. The Sample widgets.json
{
  "layouts": [
    {
      "layout_name": "default_sample_dashboard",
      "display_name": "Standard TEST Dashboard",
      "section_name": "SAMPLE_SUMMARY",
      "widgetLayoutInfo": [
        {
          "widget_name": "test_widget",
          "description": "test widget",
          "widget_type": "GRAPH",
          "is_visible": true,
          "metrics": [
            {
              "name": "test1",
              "metric_path": "metrics/test1",
              "service_name": "SAMPLE",
              "component_name": "MY_MASTER"
            }
          ],
          "values": [
            {
              "name": "test1",
              "value": "${test1}"
            }
          ],
          "properties": {
            "graph_type": "LINE",
            "time_range": ""
          }
        }
      ]
    }
  ]
}
 

From the code above, we can see that this Widget binds test1 from the Metrics and uses test1 directly as its value attribute. We can then deploy the Sample Service to the Ambari cluster from Ambari Web. After installation, the Service Summary page looks like this:

Figure 9. Sample Service Summary page

Here we can already see the predefined test_widget control. You may wonder why it displays "no data available". The reason is simple: we have not implemented any module for the Sample Service that collects and sends Metrics data.

As covered in the first section, AMS uses the Metrics Monitor and the Hadoop Sinks to collect the relevant Metrics data, but these do not serve third-party services.

The reason we can see Widget data for the HDFS Service is that Ambari ships the corresponding Hadoop Sink for it.

Simply put, a Hadoop Sink is just a jar package whose Java code runs an HTTP client thread that periodically collects and sends the relevant data.

The Metrics Monitor, in turn, is a Python process started by Ambari in the background through a shell script; it uses a Python HTTP client library to periodically send data to the Collector.

Here I use a simple shell script to do something similar. The sample code is as follows.

Listing 6. A script that sends Metrics data (metric_sender.sh)
#!/bin/sh
# Usage: ./metric_sender.sh <collector_host> <metric_name> <app_id>
url=http://$1:6188/ws/v1/timeline/metrics
while true
do
  # current time in milliseconds
  millon_time=$(( $(date +%s%N) / 1000000 ))
  # a random value between 0 and 9
  random=`expr $RANDOM % 10`
  json="{
    \"metrics\": [
      {
        \"metricname\": \"$2\",
        \"appid\": \"$3\",
        \"hostname\": \"localhost\",
        \"timestamp\": ${millon_time},
        \"starttime\": ${millon_time},
        \"metrics\": {
          \"${millon_time}\": ${random}
        }
      }
    ]
  }"
  echo "${json}" | tee -a /root/my_metric.log
  curl -i -X POST -H "Content-Type: application/json" -d "${json}" ${url}
  sleep 5
done
 

Run it with the following command (note that the first parameter is the host where the Metrics Collector runs, not the Ambari Server):

./metric_sender.sh ambari_collector_host test1 sample

The script sends a random number to the Metrics Collector every 5 seconds as the value of the test1 metrics attribute; the sending itself is done with curl.

Readers can also use Firefox's Poster tool for testing. If you use the curl command to do it, pay attention to the use of the -H parameter.
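If you only want a single test request rather than the loop in Listing 6, the same POST can be issued on its own. This is just the payload from Listing 6 inlined; <collector_host> is a placeholder and 42 is an arbitrary sample value:

# One-off POST of a single metric value (same payload shape as Listing 6).
now=$(( $(date +%s%N) / 1000000 ))
curl -i -X POST -H "Content-Type: application/json" \
  -d "{\"metrics\":[{\"metricname\":\"test1\",\"appid\":\"sample\",\"hostname\":\"localhost\",\"timestamp\":${now},\"starttime\":${now},\"metrics\":{\"${now}\":42}}]}" \
  "http://<collector_host>:6188/ws/v1/timeline/metrics"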

After metric_sender.sh has been running for a while, we can see the Widget on the Summary page; clicking it opens the detailed data chart:

Figure 10. test_widget chart

With the work above we have successfully customized the test1 metrics attribute for the Sample Service and displayed its changes dynamically through test_widget, which makes the related monitoring work much easier for the system administrator.

Looking back at the Sample Service metrics.json, we defined four metrics attributes there, yet our single widget component uses only one of them.

Some readers may find that unreasonable. I defined it that way deliberately, to illustrate that in a real environment a third-party service can define its Metrics attributes in finer detail, and the system administrator can then selectively monitor some of the key attributes.

The operation is simple (provided the Service already defines a widgets.json): on the Service Summary page select Action -> Create Widget, then choose the corresponding Metrics attributes and define the related mathematical expression.

As shown below, I defined a Gauge (percentage) widget whose purpose is to show the value of test2 (a random number less than 10) divided by 10 (so the result is less than 1), for example with a value expression like ${test2/10}.

The Warning threshold is 0.7 and the Error threshold is 0.9; that is, when the percentage is above 70% but below 90% the chart turns yellow, above 90% it turns red, and green is the normal color.

Figure 11. Gauge chart showing the test2 attribute

Figure 12. Summary page rendering

Trouble-shooting the AMS integration

Using the logs

The AMS system is not complicated; the core of the integration is the Metrics Collector. If a problem occurs, first look at the Metrics Collector and Ambari Server logs.

After you have written metrics.json and widgets.json, but before the Service is successfully deployed, most problems only require checking the Ambari Server log; the majority of them turn out to be format problems in the two JSON files.

After a successful deployment, most problems only require checking the Metrics Collector log. It is best to turn on the DEBUG log level for the Metrics Collector.

To do that, find the ams-log4j section on the Config page of the AMS Service and change log4j.rootLogger to DEBUG.

One more reminder: the Metrics Collector is not the Ambari Server and the two may not be on the same machine; plenty of people have searched the Ambari Server machine for the Metrics Collector log.
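Once you are on the right machine, tailing the Collector log is usually the quickest check. The path below is the common default log location and is an assumption; if your ams-env configuration points the AMS logs elsewhere, use that directory instead.

# Run on the Metrics Collector host, not on the Ambari Server.
tail -f /var/log/ambari-metrics-collector/ambari-metrics-collector.log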

The Metrics Collector REST API

We can also test whether the Metrics Collector is working through its REST API. The API is simple and supports only two kinds of requests, POST and GET.

For the POST method, refer to the curl command in the script from the previous section. The GET method works as follows; here we fetch the value of the test2 attribute from the Sample example above. (A reminder: the web port of the Timeline Server on the Collector defaults to 6188, but it can sometimes be 6189, so find the ams-site section on the AMS Config page and confirm the port.)
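A quick, Ambari-independent way to see which port the Timeline Server actually bound to is to list the listening sockets on the Collector host; this is only a generic OS check, not an AMS command.

# On the Collector host: look for 6188 or 6189 among the listening TCP ports.
# Use "netstat -ltn" instead if ss is not available.
ss -ltn | grep -E ':(6188|6189)'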

Execute the following GET command:

curl -X GET "http://xalin64sw8:6188/ws/v1/timeline/metrics?metricNames=test2&appid=sample"

You get output like the following (the most recent Metrics attribute value):

{"metrics":[{"timestamp":,"metricname":"test2","appid":"sample","starttime":,"metrics":{"":7.875}}]}

Querying the HBase database through Phoenix

Phoenix is another well-known Apache project; its main function is to support SQL operations on top of HBase. Interested readers can visit the Phoenix website to learn more. AMS creates the following eight tables in HBase through Phoenix:

Table 1. HBase tables created by AMS through Phoenix

Table name              | Description                                                                                   | Retention
METRIC_RECORD           | Every Metrics attribute collected on each machine, at 10-second precision                    | 1 day
METRIC_RECORD_MINUTE    | Same as above, at 5-minute precision                                                          | 1 week
METRIC_RECORD_HOURLY    | Same as above, at 1-hour precision                                                            | 30 days
METRIC_RECORD_DAILY     | Same as above, at 1-day precision                                                             | 1 year
METRIC_AGGREGATE        | Cluster-level Metrics (aggregated over each machine's Metrics), sampled at 30-second precision | 1 week
METRIC_AGGREGATE_MINUTE | Same as above, at 5-minute precision                                                          | 30 days
METRIC_AGGREGATE_HOURLY | Same as above, at 1-hour precision                                                            | 1 year
METRIC_AGGREGATE_DAILY  | Same as above, at 1-day precision                                                             | 2 years

Now that we know the relevant table names, we can query the corresponding tables through Phoenix. The example below uses the sqlline tool from Phoenix 4.2.2 to access HBase. By querying the HBase database, we can tell whether the module that sends a Metrics attribute is working correctly.

Figure 13. Phoenix usage example
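For readers who want to reproduce the screenshot, a typical session looks roughly like the following. Everything here is an assumption to verify locally: the sqlline.py path depends on where Phoenix is installed, and 61181 with /ams-hbase-unsecure are only the usual defaults for the embedded AMS HBase (see hbase.zookeeper.property.clientPort and zookeeper.znode.parent in ams-hbase-site).

# Connect Phoenix's sqlline to the embedded AMS HBase (defaults assumed).
/usr/lib/phoenix/bin/sqlline.py localhost:61181:/ams-hbase-unsecure

# Then, inside sqlline, a query such as the following shows whether data for
# the custom metric is arriving (column names can differ between versions):
#   SELECT * FROM METRIC_RECORD WHERE METRIC_NAME = 'test2' LIMIT 10;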
