Zookeeper tutorial
14 Lin 2021-06-29 09:55:46

Zookeeper

1.       Zookeeper overview

1.1 Overview

Zookeeper is an open-source, distributed Apache project that provides coordination services for distributed applications.

1.2 Characteristics

1) Zookeeper: a cluster made up of one leader and multiple followers.

2) The Leader initiates and resolves votes and updates the system state.

3) Followers receive client requests and return results to the client, and they vote during Leader election.

4) As long as more than half of the nodes in the cluster are alive, the Zookeeper cluster can serve requests normally.

5) Global data consistency: every server keeps a copy of the same data, so a client sees consistent data no matter which server it connects to.

6) Update requests are applied in order: update requests from the same client are executed in the order in which they were sent.

7) Atomic data updates: a data update either succeeds or fails.

8) Timeliness: within a certain time bound, a client can read the latest data.
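The "more than half" rule in item 4 can be checked with a little arithmetic. The following is a minimal sketch; the class and method names are hypothetical and are not part of the ZooKeeper API.

```java
// Hypothetical helper for illustration only; not part of ZooKeeper.
class QuorumCheck {

    // The cluster can serve requests only while a strict majority of nodes is alive.
    static boolean hasQuorum(int ensembleSize, int aliveNodes) {
        return aliveNodes > ensembleSize / 2;
    }

    // Largest number of node failures that still leaves a strict majority alive.
    static int toleratedFailures(int ensembleSize) {
        return (ensembleSize - 1) / 2;
    }

    public static void main(String[] args) {
        for (int size = 3; size <= 6; size++) {
            System.out.println(size + " nodes tolerate " + toleratedFailures(size) + " failure(s)");
        }
    }
}
```

Under this rule a 4-node cluster tolerates no more failures than a 3-node cluster, which is one reason odd cluster sizes are a common choice.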

1.3 Data structure

The structure of the ZooKeeper data model is very similar to a Unix file system. As a whole it can be viewed as a tree in which each node is called a ZNode. By default each ZNode can store up to 1MB of data, and each ZNode is uniquely identified by its path.
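To make the tree model concrete, here is a toy in-memory sketch of path-identified nodes with a 1MB data limit. It is only an illustration under simplifying assumptions; the class `ZNodeTreeSketch` and its methods are hypothetical and do not reflect how ZooKeeper stores data internally.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Toy in-memory model of the znode tree; NOT the real ZooKeeper implementation.
class ZNodeTreeSketch {
    static final int MAX_DATA_BYTES = 1024 * 1024; // default limit of 1MB per znode

    // Every znode is keyed by its absolute path, e.g. "/app1/server101".
    private final TreeMap<String, byte[]> nodes = new TreeMap<>();

    ZNodeTreeSketch() {
        nodes.put("/", new byte[0]); // the root always exists
    }

    // Path of the parent node; the root is treated as its own parent.
    static String parentOf(String path) {
        int idx = path.lastIndexOf('/');
        return idx == 0 ? "/" : path.substring(0, idx);
    }

    void create(String path, byte[] data) {
        if (!path.startsWith("/") || path.length() < 2 || path.endsWith("/")) {
            throw new IllegalArgumentException("invalid path: " + path);
        }
        if (data.length > MAX_DATA_BYTES) {
            throw new IllegalArgumentException("data exceeds 1MB");
        }
        if (!nodes.containsKey(parentOf(path))) {
            throw new IllegalStateException("parent does not exist: " + parentOf(path));
        }
        if (nodes.containsKey(path)) {
            throw new IllegalStateException("node already exists: " + path);
        }
        nodes.put(path, data.clone());
    }

    // Names of the direct children of the given path, in sorted order.
    List<String> getChildren(String path) {
        List<String> children = new ArrayList<>();
        for (String candidate : nodes.keySet()) {
            if (!candidate.equals("/") && parentOf(candidate).equals(path)) {
                children.add(candidate.substring(candidate.lastIndexOf('/') + 1));
            }
        }
        return children;
    }

    public static void main(String[] args) {
        ZNodeTreeSketch tree = new ZNodeTreeSketch();
        tree.create("/app1", "hello app1".getBytes(StandardCharsets.UTF_8));
        tree.create("/app1/server101", "192.168.1.101".getBytes(StandardCharsets.UTF_8));
        System.out.println(tree.getChildren("/"));     // prints [app1]
        System.out.println(tree.getChildren("/app1")); // prints [server101]
    }
}
```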

 

Data structure chart

1.4 Application scenarios

The services provided include: unified naming service, unified configuration management, unified cluster management, dynamic online/offline detection of server nodes, soft load balancing, and so on.

1.4.1 Unified naming service

In a distributed environment, applications and services often need to be named uniformly so that different services are easy to identify.

(1) This is similar to the mapping between domain names and IP addresses: an IP address is hard to remember, while a domain name is easy to remember.

(2) Information such as the address or provider of a resource or service can be looked up by name.

1.4.2 Unified configuration management

1) In a distributed environment, managing and synchronizing configuration files is a common problem.

(1) Within a cluster, the configuration of all nodes is expected to be consistent, for example in a Hadoop cluster.

(2) After a configuration file is modified, the change should be synchronized to every node quickly.

2) Configuration management can be delegated to ZooKeeper.

(1) The configuration information is written to a Znode on ZooKeeper.

(2) Every node watches this Znode.

(3) Once the data in the Znode is modified, ZooKeeper notifies every node.
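The write, watch, and notify flow above can be sketched with a toy in-memory stand-in for a configuration Znode. This is not the ZooKeeper client API: `ConfigNodeSketch` and `Subscriber` are hypothetical names. The sketch assumes the standard ZooKeeper behavior that a watch fires only once and that the notification does not carry the new data, so each subscriber re-reads the value and registers the watch again.

```java
import java.util.ArrayList;
import java.util.List;

// Toy in-memory stand-in for a configuration znode; NOT the ZooKeeper client API.
class ConfigNodeSketch {
    private String data;
    private List<Runnable> watchers = new ArrayList<>();

    ConfigNodeSketch(String initialData) {
        this.data = initialData;
    }

    // Read the current value and register a one-time watcher.
    String getAndWatch(Runnable watcher) {
        watchers.add(watcher);
        return data;
    }

    // Change the value, then notify each registered watcher exactly once.
    void set(String newData) {
        data = newData;
        List<Runnable> toNotify = watchers;
        watchers = new ArrayList<>(); // one-shot: old watchers are dropped before firing
        for (Runnable watcher : toNotify) {
            watcher.run();
        }
    }

    // A cluster member that keeps a local copy of the configuration.
    static class Subscriber {
        private final ConfigNodeSketch node;
        String localConfig;

        Subscriber(ConfigNodeSketch node) {
            this.node = node;
            refresh();
        }

        // Re-read the value and re-register the watch so later changes are also seen.
        void refresh() {
            localConfig = node.getAndWatch(this::refresh);
        }
    }

    public static void main(String[] args) {
        ConfigNodeSketch node = new ConfigNodeSketch("replicas=1");
        Subscriber a = new Subscriber(node);
        Subscriber b = new Subscriber(node);
        node.set("replicas=3");
        System.out.println(a.localConfig + " / " + b.localConfig); // prints replicas=3 / replicas=3
    }
}
```

A watcher that does not re-register itself is notified of the first change only, which is why the subscriber registers again inside `refresh()`.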

1.4.3 Unified cluster management

1) In a distributed environment, it is necessary to know the state of each node in real time.

(1) Adjustments can then be made according to the real-time state of the nodes.

2) This can be delegated to ZooKeeper.

(1) Node information is written to a Znode on ZooKeeper.

(2) Watching this Znode yields its state changes in real time.

3) Typical applications

(1) Master state monitoring and election in HBase.

1.4.4 Dynamic online/offline detection of server nodes

Clients can see in real time when servers come online or go offline.

1.4.5 Soft load balancing

Load balancing (Load Balance) means distributing the load (work tasks) evenly across multiple operating units, such as FTP servers, Web servers, enterprise core application servers, and other main task servers, so that they complete the work together.
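One simple way a client could spread work over a list of servers is round-robin selection. The sketch below is an assumption for illustration: the article does not prescribe a balancing strategy, and `RoundRobinSketch` is a hypothetical class. In practice the server list might be read from the children of a Znode.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical round-robin picker; one possible soft load balancing strategy.
class RoundRobinSketch {
    private final List<String> servers;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobinSketch(List<String> servers) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("server list must not be empty");
        }
        this.servers = new ArrayList<>(servers); // defensive copy
    }

    // Return the servers in turn; floorMod keeps the index valid even if the counter overflows.
    String pick() {
        int index = Math.floorMod(next.getAndIncrement(), servers.size());
        return servers.get(index);
    }

    public static void main(String[] args) {
        RoundRobinSketch balancer = new RoundRobinSketch(Arrays.asList("hadoop01", "hadoop02", "hadoop03"));
        for (int i = 0; i < 4; i++) {
            System.out.println(balancer.pick()); // hadoop01, hadoop02, hadoop03, hadoop01
        }
    }
}
```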

1.5 Download address

1) Official website:

https://zookeeper.apache.org/

2.       Zookeeper installation

2.1.    Cluster planning

Deploy Zookeeper on three nodes: hadoop01, hadoop02, and hadoop03.

2.2.    Unpack and install

(1) Extract the zookeeper installation package into the /opt/module/ directory

[root@hadoop102 software]$ tar -zxvf zookeeper-3.4.10.tar.gz -C /opt/module/

(2) Create a data directory under /opt/module/zookeeper-3.4.10/

       mkdir -p data

(3) In the /opt/module/zookeeper-3.4.10/conf directory, rename zoo_sample.cfg to zoo.cfg

       mv zoo_sample.cfg zoo.cfg

2.3.    Configure the zoo.cfg file

       (1) Specific configuration

       dataDir=/opt/module/zookeeper-3.4.10/data

       Add the following configuration

       #######################cluster##########################

server.1=hadoop01:2888:3888

server.2=hadoop02:2888:3888

server.3=hadoop03:2888:3888

(2) Explanation of the configuration parameters

server.A=B:C:D

A is a number that identifies which server this is;

B is the IP address (or hostname) of this server;

C is the port this server uses to exchange information with the Leader of the cluster;

D is the port used for Leader election: if the Leader goes down, a new Leader has to be elected, and the servers communicate with each other through this port during the election.

In cluster mode a file named myid must be configured. It lives in the dataDir directory and contains a single value, the A of this server. Zookeeper reads this file at startup and compares its content with the configuration in zoo.cfg to determine which server it is.
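As an illustration of the server.A=B:C:D format and of the myid comparison, the sketch below parses one such line into its four parts. `ServerEntry` is a hypothetical class written for this article, not ZooKeeper's own configuration parser, and it only handles the three-part form shown above.

```java
// Hypothetical parser for one "server.A=B:C:D" line; not ZooKeeper's own parser.
class ServerEntry {
    final int id;            // A: the server number
    final String host;       // B: IP address or hostname
    final int quorumPort;    // C: port for exchanging information with the Leader
    final int electionPort;  // D: port used during Leader election

    ServerEntry(int id, String host, int quorumPort, int electionPort) {
        this.id = id;
        this.host = host;
        this.quorumPort = quorumPort;
        this.electionPort = electionPort;
    }

    static ServerEntry parse(String line) {
        String[] keyValue = line.trim().split("=", 2);
        if (keyValue.length != 2 || !keyValue[0].startsWith("server.")) {
            throw new IllegalArgumentException("not a server line: " + line);
        }
        int id = Integer.parseInt(keyValue[0].substring("server.".length()));
        String[] parts = keyValue[1].split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected host:port:port in: " + line);
        }
        return new ServerEntry(id, parts[0], Integer.parseInt(parts[1]), Integer.parseInt(parts[2]));
    }

    // True if the content of a myid file identifies this entry.
    boolean matchesMyId(String myIdFileContent) {
        return Integer.parseInt(myIdFileContent.trim()) == id;
    }

    public static void main(String[] args) {
        ServerEntry entry = ServerEntry.parse("server.2=hadoop02:2888:3888");
        System.out.println(entry.id + " " + entry.host + " " + entry.quorumPort + " " + entry.electionPort);
        System.out.println(entry.matchesMyId("2\n")); // prints true
    }
}
```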

2.4.    Cluster operation

(1) Create a file named myid in the /opt/module/zookeeper-3.4.10/data directory

       touch myid

Be sure to create the myid file on Linux. If it is created in Notepad++, its content is likely to be garbled.

(2) Edit the myid file

       vi myid

       Add the number that corresponds to this server in zoo.cfg to the file, for example 1

(3) Copy the configured zookeeper to the other machines, then change the content of myid to 2 on hadoop02 and to 3 on hadoop03

(4) Start zookeeper on each node

       [root@hadoop01 zookeeper-3.4.10]# bin/zkServer.sh start

[root@hadoop02 zookeeper-3.4.10]# bin/zkServer.sh start

[root@hadoop03 zookeeper-3.4.10]# bin/zkServer.sh start

(5) Check the status

[root@hadoop01 zookeeper-3.4.10]# bin/zkServer.sh status

JMX enabled by default

Using config: /opt/module/zookeeper-3.4.10/bin/../conf/zoo.cfg

Mode: follower

[root@hadoop02 zookeeper-3.4.10]# bin/zkServer.sh status

JMX enabled by default

Using config: /opt/module/zookeeper-3.4.10/bin/../conf/zoo.cfg

Mode: leader

[root@hadoop03 zookeeper-3.4.10]# bin/zkServer.sh status

JMX enabled by default

Using config: /opt/module/zookeeper-3.4.10/bin/../conf/zoo.cfg

Mode: follower

2.5.    Appendix: meaning of the parameters in zoo.cfg

1) tickTime=2000: heartbeat interval of the Zookeeper server, in milliseconds

tickTime is the basic time unit used by Zookeeper. It is the interval at which heartbeats are exchanged between servers and between clients and servers: one heartbeat is sent every tickTime milliseconds.

It drives the heartbeat mechanism and also determines the minimum session timeout, which is twice the heartbeat time (the minimum session timeout is 2*tickTime).

2) initLimit=10: time limit for the initial communication between Leader and Follower

The maximum number of heartbeats (multiples of tickTime) that can be tolerated while a follower server makes its initial connection to the leader server. It limits how long a Zookeeper server in the cluster may take to connect to the Leader.

This is the initialization time for electing a new leader.

During startup, a Follower synchronizes all the latest data from the Leader and then determines the state from which it can start serving requests.

The Leader allows a Follower to finish this work within initLimit * tickTime.

3) syncLimit=5: time limit for synchronization between Leader and Follower

The maximum response time between the Leader and a Follower, measured in units of tickTime. If a response takes longer than syncLimit * tickTime, the Leader considers the Follower dead and removes it from the server list.

At run time, the Leader communicates with all machines in the ZK cluster, for example through a heartbeat mechanism, to detect whether they are alive.

If the Leader sends a heartbeat packet and has not received a response from a Follower after syncLimit * tickTime, that Follower is considered to be offline.

4) dataDir: data file directory and data persistence path

The directory where snapshots of the in-memory database are saved. Unless specified otherwise, the transaction log of updates is also saved in this directory.

5) clientPort=2181: client connection port

The port on which the server listens for client connections.
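The time limits above are all multiples of tickTime, so they are easy to work out. The sketch below computes them for the sample values; `ZkTimingSketch` is a hypothetical helper. The lower session timeout bound of 2*tickTime comes from the text above, while the upper bound of 20*tickTime is an assumption based on the default maxSessionTimeout described in the ZooKeeper administrator documentation.

```java
// Hypothetical helper that works out the time limits derived from zoo.cfg values.
class ZkTimingSketch {

    // Time a Follower is given to connect to and sync with the Leader at startup.
    static long initLimitMillis(int tickTime, int initLimit) {
        return (long) tickTime * initLimit;
    }

    // Time after which an unresponsive Follower is considered dead.
    static long syncLimitMillis(int tickTime, int syncLimit) {
        return (long) tickTime * syncLimit;
    }

    // Session timeout after clamping the client's request to the server bounds.
    // Lower bound 2*tickTime is from the text; upper bound 20*tickTime is the
    // assumed default maxSessionTimeout.
    static int negotiatedSessionTimeout(int tickTime, int requestedMillis) {
        int min = 2 * tickTime;
        int max = 20 * tickTime;
        return Math.max(min, Math.min(max, requestedMillis));
    }

    public static void main(String[] args) {
        System.out.println(initLimitMillis(2000, 10));            // prints 20000
        System.out.println(syncLimitMillis(2000, 5));             // prints 10000
        System.out.println(negotiatedSessionTimeout(2000, 2000)); // prints 4000
    }
}
```

With tickTime=2000, a client that asks for a 2000 ms session timeout would therefore end up with 4000 ms under these assumptions.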

3.       Zookeeper shell client operations

Command basic syntax        Function description

help                        Display all operation commands

ls path [watch]             List what the given znode contains

ls2 path [watch]            List what the given znode contains together with its status data, such as the update count

create                      Create a normal node; -s adds a sequence number; -e creates an ephemeral node (it disappears on restart or timeout)

get path [watch]            Get the value of the node

set                         Set the value of the node

stat                        View the node status

delete                      Delete the node

rmr                         Recursively delete the node

1) Start client

[root@hadoop103 zookeeper-3.4.10]$ bin/zkCli.sh

2) Display all operation commands

[zk: localhost:2181(CONNECTED) 1] help

3) View what the current znode contains

[zk: localhost:2181(CONNECTED) 0] ls /

[zookeeper]

4) View the current node's data together with its update count and other status data

[zk: localhost:2181(CONNECTED) 1] ls2 /

[zookeeper]

cZxid = 0x0

ctime = Thu Jan 01 08:00:00 CST 1970

mZxid = 0x0

mtime = Thu Jan 01 08:00:00 CST 1970

pZxid = 0x0

cversion = -1

dataVersion = 0

aclVersion = 0

ephemeralOwner = 0x0

dataLength = 0

numChildren = 1

5) Create a normal node

[zk: localhost:2181(CONNECTED) 2] create /app1 "hello app1"

Created /app1

[zk: localhost:2181(CONNECTED) 4] create /app1/server101 "192.168.1.101"

Created /app1/server101

6) Get the value of the node

[zk: localhost:2181(CONNECTED) 6] get /app1

hello app1

cZxid = 0x20000000a

ctime = Mon Jul 17 16:08:35 CST 2017

mZxid = 0x20000000a

mtime = Mon Jul 17 16:08:35 CST 2017

pZxid = 0x20000000b

cversion = 1

dataVersion = 0

aclVersion = 0

ephemeralOwner = 0x0

dataLength = 10

numChildren = 1

[zk: localhost:2181(CONNECTED) 8] get /app1/server101

192.168.1.101

cZxid = 0x20000000b

ctime = Mon Jul 17 16:11:04 CST 2017

mZxid = 0x20000000b

mtime = Mon Jul 17 16:11:04 CST 2017

pZxid = 0x20000000b

cversion = 0

dataVersion = 0

aclVersion = 0

ephemeralOwner = 0x0

dataLength = 13

numChildren = 0

7) Create an ephemeral node

[zk: localhost:2181(CONNECTED) 9] create -e /app 8888

(1) The node is visible in the current client

[zk: localhost:2181(CONNECTED) 10] ls /

[app1, app, zookeeper]

(2) Exit the current client and restart the client

       [zk: localhost:2181(CONNECTED) 12] quit

[root@hadoop104 zookeeper-3.4.10]$ bin/zkCli.sh

(3) Check again: the ephemeral node in the root directory has been deleted

       [zk: localhost:2181(CONNECTED) 0] ls /

[app1, zookeeper]
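The behavior seen above, where /app disappears once the client that created it quits, can be modeled by recording which session owns each node. This is a toy sketch with hypothetical names (`EphemeralSketch` and its methods), not the ZooKeeper implementation. It follows the convention visible in the stat output, where persistent nodes show ephemeralOwner = 0x0.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of session-owned nodes; NOT the real ZooKeeper implementation.
class EphemeralSketch {
    // path -> owning session id; 0 marks a persistent node (like ephemeralOwner = 0x0)
    private final Map<String, Long> owners = new TreeMap<>();

    void createPersistent(String path) {
        owners.put(path, 0L);
    }

    void createEphemeral(String path, long sessionId) {
        if (sessionId == 0L) {
            throw new IllegalArgumentException("session id must be non-zero");
        }
        owners.put(path, sessionId);
    }

    // When a session ends, every node it owns is removed.
    void closeSession(long sessionId) {
        if (sessionId == 0L) {
            return; // 0 is reserved for persistent nodes
        }
        owners.values().removeIf(owner -> owner == sessionId);
    }

    List<String> list() {
        return new ArrayList<>(owners.keySet());
    }

    public static void main(String[] args) {
        EphemeralSketch tree = new EphemeralSketch();
        tree.createPersistent("/zookeeper");
        tree.createPersistent("/app1");
        tree.createEphemeral("/app", 0x1L);
        System.out.println(tree.list()); // prints [/app, /app1, /zookeeper]
        tree.closeSession(0x1L);
        System.out.println(tree.list()); // prints [/app1, /zookeeper]
    }
}
```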

8) Create nodes with sequence numbers

       (1) First create a normal root node app2

       [zk: localhost:2181(CONNECTED) 11] create /app2 "app2"

       (2) Create nodes with sequence numbers

       [zk: localhost:2181(CONNECTED) 13] create -s /app2/aa 888

Created /app2/aa0000000000

[zk: localhost:2181(CONNECTED) 14] create -s /app2/bb 888

Created /app2/bb0000000001

[zk: localhost:2181(CONNECTED) 15] create -s /app2/cc 888

Created /app2/cc0000000002

If the parent node already has 1 child node, numbering starts from 1, and so on.

[zk: localhost:2181(CONNECTED) 16] create -s /app1/aa 888

Created /app1/aa0000000001
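The names in the output above are the requested path followed by a counter padded to ten digits. A minimal sketch of that naming, assuming the counter value is already known (the helper `SequentialNameSketch` is hypothetical, and the real counter is maintained by the parent node on the server):

```java
import java.util.Locale;

// Hypothetical helper that mimics how sequential node names are formed.
class SequentialNameSketch {

    // Append the counter to the requested path, zero-padded to 10 digits.
    static String sequentialName(String requestedPath, int counter) {
        return requestedPath + String.format(Locale.ROOT, "%010d", counter);
    }

    public static void main(String[] args) {
        System.out.println(sequentialName("/app2/aa", 0)); // prints /app2/aa0000000000
        System.out.println(sequentialName("/app2/bb", 1)); // prints /app2/bb0000000001
        System.out.println(sequentialName("/app1/aa", 1)); // prints /app1/aa0000000001
    }
}
```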

9) Modify node data values

[zk: localhost:2181(CONNECTED) 2] set /app1 999

10) Watch for changes to a node's value

       (1) On host 104, register a watch for data changes on the /app1 node

[zk: localhost:2181(CONNECTED) 26] get /app1 watch

       (2) On host 103, modify the data of the /app1 node

[zk: localhost:2181(CONNECTED) 5] set /app1  777

       (3) Observe that host 104 receives the data change notification

WATCHER::

WatchedEvent state:SyncConnected type:NodeDataChanged path:/app1

11) Watch for changes to a node's children (path changes)

       (1) On host 02, register a watch for changes to the children of the /app1 node

[zk: localhost:2181(CONNECTED) 1] ls /app1 watch

[aa0000000001, server101]

       (2) On host 03, create a child node under the /app1 node

[zk: localhost:2181(CONNECTED) 6] create /app1/bb 666

Created /app1/bb

       (3) Observe that host 02 receives the child change notification

WATCHER::

WatchedEvent state:SyncConnected type:NodeChildrenChanged path:/app1

12) Delete node

[zk: localhost:2181(CONNECTED) 4] delete /app1/bb

13) Recursively delete nodes

[zk: localhost:2181(CONNECTED) 7] rmr /app2

14) View node status

[zk: localhost:2181(CONNECTED) 12] stat /app1

cZxid = 0x20000000a

ctime = Mon Jul 17 16:08:35 CST 2017

mZxid = 0x200000018

mtime = Mon Jul 17 16:54:38 CST 2017

pZxid = 0x20000001c

cversion = 4

dataVersion = 2

aclVersion = 0

ephemeralOwner = 0x0

dataLength = 3

numChildren = 2
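The cZxid, mZxid, and pZxid values in the stat output are 64-bit transaction ids. Per the ZooKeeper documentation, the high 32 bits hold the leader epoch and the low 32 bits hold a counter within that epoch; the article itself does not state this, so treat it as background. The hypothetical helper below splits the values shown above.

```java
// Hypothetical helper that splits a zxid into its epoch and counter parts.
class ZxidSketch {

    // High 32 bits: the leader epoch.
    static long epoch(long zxid) {
        return zxid >>> 32;
    }

    // Low 32 bits: the transaction counter within that epoch.
    static long counter(long zxid) {
        return zxid & 0xFFFFFFFFL;
    }

    public static void main(String[] args) {
        long cZxid = 0x20000000aL;
        long mZxid = 0x200000018L;
        System.out.println(epoch(cZxid) + " " + counter(cZxid)); // prints 2 10
        System.out.println(epoch(mZxid) + " " + counter(mZxid)); // prints 2 24
    }
}
```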

4.       Java API application

4.1.    Create the ZooKeeper client

private static String connectString = "hadoop01:2181,hadoop02:2181,hadoop03:2181";

    private static int sessionTimeout = 2000;

    private ZooKeeper zkClient = null;

 

    @Before

    public void init() throws Exception {

 

    zkClient = new ZooKeeper(connectString, sessionTimeout, new Watcher() {

            @Override

            public void process(WatchedEvent event) {

                // Callback invoked after an event notification (the user's business logic goes here)

                System.out.println(event.getType() + "--" + event.getPath());

 

                // Register the watch again, because a watch fires only once

                try {

                    zkClient.getChildren("/", true);

                } catch (Exception e) {

                    e.printStackTrace();

                }

            }

        });

    }

4.2.    Create child nodes

    // Create child nodes

    @Test

    public void create() throws Exception {

        // Create, read, update, and delete operations on data

        // Parameter 1: path of the node to create; parameter 2: node data; parameter 3: node ACL (permissions); parameter 4: node type

        String nodeCreated = zkClient.create("/eclipse", "hello zk".getBytes(), Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT);

    }

4.3.    Get child nodes and listen

// Get child nodes

    @Test

    public void getChildren() throws Exception {

        List<String> children = zkClient.getChildren("/", true);

 

        for (String child : children) {

            System.out.println(child);

        }

 

        // Block so the process stays alive and can keep receiving watch events

        Thread.sleep(Long.MAX_VALUE);

    }

4.4.    Check whether a znode exists

// Check whether a znode exists

    @Test

    public void exist() throws Exception {

        Stat stat = zkClient.exists("/eclipse", false);

 

        System.out.println(stat == null ? "not exist" : "exist");

    }

5.       Hands-on case

5.1.    Case: watching server nodes go online and offline dynamically

1)         Requirement: in a distributed system there can be multiple master nodes that go online and offline dynamically, and any client can sense in real time when a master server goes online or offline

5.2.    Requirements analysis

5.3.    Code implementation

(1) First create the /servers node on the cluster

[zk: localhost:2181(CONNECTED) 10] create /servers "servers"

Created /servers

(2) Create the project and import the dependencies

<dependency>

    <groupId>junit</groupId>

    <artifactId>junit</artifactId>

    <scope>test</scope>

</dependency>

<dependency>

    <groupId>org.apache.zookeeper</groupId>

    <artifactId>zookeeper</artifactId>

    <version>3.4.10</version>

</dependency>

 

(3) Server-side code

import java.io.IOException;

import org.apache.zookeeper.CreateMode;

import org.apache.zookeeper.WatchedEvent;

import org.apache.zookeeper.Watcher;

import org.apache.zookeeper.ZooKeeper;

import org.apache.zookeeper.ZooDefs.Ids;

 

public class DistributeServer {

 

    private static String connectString = "hadoop01:2181,hadoop02:2181,hadoop03:2181";

    private static int sessionTimeout = 2000;

    private ZooKeeper zk = null;

    private String parentNode = "/servers";

   

    // Create the client connection to zk

    public void getConnect() throws IOException{

       

        zk = new ZooKeeper(connectString, sessionTimeout, new Watcher() {

 

            @Override

            public void process(WatchedEvent event) {

 

            }

        });

    }

   

    // Register the server

    public void registServer(String hostname) throws Exception{

        String create = zk.create(parentNode + "/server", hostname.getBytes(), Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);

       

        System.out.println(hostname + " is online " + create);

    }

   

    // Business function

    public void business(String hostname) throws Exception{

        System.out.println(hostname+" is working ...");

       

        Thread.sleep(Long.MAX_VALUE);

    }

   

    public static void main(String[] args) throws Exception {

        // Get the zk connection

        DistributeServer server = new DistributeServer();

        server.getConnect();

       

        // Register the server information through the zk connection

        server.registServer(args[0]);

       

        // Start the business function

        server.business(args[0]);

    }

}

(4) Client code

import java.io.IOException;

import java.util.ArrayList;

import java.util.List;

import org.apache.zookeeper.WatchedEvent;

import org.apache.zookeeper.Watcher;

import org.apache.zookeeper.ZooKeeper;

 

public class DistributeClient {

    private static String connectString = "hadoop01:2181,hadoop02:2181,hadoop03:2181";

    private static int sessionTimeout = 2000;

    private ZooKeeper zk = null;

    private String parentNode = "/servers";

    private volatile ArrayList<String> serversList = new ArrayList<>();

 

    // Create the client connection to zk

    public void getConnect() throws IOException {

        zk = new ZooKeeper(connectString, sessionTimeout, new Watcher() {

 

            @Override

            public void process(WatchedEvent event) {

 

                // Refresh the server list and register the watch again

                try {

                    getServerList();

                } catch (Exception e) {

                    e.printStackTrace();

                }

            }

        });

    }

 

    // Fetch the current server list

    public void getServerList() throws Exception {

       

        // Get the child nodes of the servers node and watch the parent node

        List<String> children = zk.getChildren(parentNode, true);

        ArrayList<String> servers = new ArrayList<>();

       

        for (String child : children) {

            byte[] data = zk.getData(parentNode + "/" + child, false, null);

 

            servers.add(new String(data));

        }

 

        // Assign servers to the member serversList so that the business threads can use it

        serversList = servers;

 

        System.out.println(serversList);

    }

 

    // Business function

    public void business() throws Exception {

        System.out.println("client is working ...");

        Thread.sleep(Long.MAX_VALUE);

    }

 

    public static void main(String[] args) throws Exception {

 

        // Get the zk connection

        DistributeClient client = new DistributeClient();

        client.getConnect();

 

        // Read the children of /servers to obtain the list of server information

        client.getServerList();

 

        // Start the business process

        client.business();

    }

}

 

 
