ZooKeeper is very popular, which raises some basic questions:
- What is ZooKeeper used for?
- There was no ZK before; why was ZK created?
Here are my answers to the questions above, stated intuitively:
- ZooKeeper simplifies distributed application development by shielding developers from the low-level details of that process
- ZooKeeper exposes a simple API that supports distributed application development
- While providing these functions, ZooKeeper is itself a high-performance, highly available, highly reliable distributed cluster
In short, ZK addresses the problems of distributed application development, and addresses them well. That leads to more questions:
- What are the common problems in distributed application development, and how does ZK shield developers from these low-level details?
- What API does ZooKeeper expose? How does this API support distributed application development? Can it be simplified further? What are its semantics?
- ZooKeeper is itself a high-performance, highly available, highly reliable distributed cluster, which raises some simple questions:
  - What does high performance mean, and what does ZooKeeper do to achieve it?
  - The same questions apply to high availability
  - The same questions apply to high reliability
Note: this wiki article addresses the first question. (The other questions will be covered step by step in other blog posts.)
When an application involves multiple cooperating processes, its business logic code ends up containing a lot of complex process-coordination logic.
This multi-process coordination logic has 2 characteristics:
- It is complex to handle
- The handling logic is reusable
Therefore, the common problems of multi-process collaboration can be factored out as infrastructure, so that RD (developers) can focus on business logic. In other words:
ZooKeeper is one such basic service for multi-process collaboration.
ZooKeeper has a few simple characteristics:
- The ZooKeeper API: inspired by file system APIs, it is deliberately simple
- ZooKeeper runs on dedicated servers, separate from business logic, to ensure high fault tolerance and scalability
ZooKeeper is a storage facility, but note in particular:
- The data stored in ZK is mainly metadata rather than application data; application data has its own storage solution, for example HDFS
- ZK can essentially be seen as a special kind of file system
In particular:
Application data and metadata are used in different scenarios and have different consistency and durability requirements, so in architecture design and data governance the 2 classes of data should be treated and stored independently.
The core problem ZK solves:
ZK's goal is to simplify distributed application development by solving the multi-process collaboration problem. It provides distributed applications with a reliable distributed coordination service (a basic service), for example:
- Unified naming service
- Distributed locks
- Process crash detection
- Leader election
- Configuration management: when the configuration changes, push it to all clients promptly
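To make the configuration management case concrete, here is a minimal sketch. `ConfigStore`, its `watch` and `set` methods, and the `/app/db_url` key are hypothetical names for an in-memory stand-in, not the ZooKeeper API. It only illustrates the idea of pushing a change to every registered client.

```python
from typing import Callable, Dict, List


class ConfigStore:
    """Hypothetical in-memory stand-in for a coordination service."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}
        self._watchers: Dict[str, List[Callable[[str, str], None]]] = {}

    def watch(self, key: str, callback: Callable[[str, str], None]) -> None:
        # Register a client callback that fires on every change to `key`.
        self._watchers.setdefault(key, []).append(callback)

    def set(self, key: str, value: str) -> None:
        # Store the new value, then notify every registered client.
        self._data[key] = value
        for callback in self._watchers.get(key, []):
            callback(key, value)


store = ConfigStore()
seen_by_a: List[str] = []
seen_by_b: List[str] = []
store.watch("/app/db_url", lambda key, value: seen_by_a.append(value))
store.watch("/app/db_url", lambda key, value: seen_by_b.append(value))
store.set("/app/db_url", "db-1:5432")
store.set("/app/db_url", "db-2:5432")
```

Both clients end up with the same sequence of values. Note that this sketch keeps callbacks registered permanently for brevity, whereas classic ZooKeeper watches are one-time triggers that a client has to re-register.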
A simple question: what is multi-process collaboration? The term keeps coming up, so let's answer it head-on.
Multi-process collaboration falls into 2 classes overall:
- Cooperation: multiple processes work on something together, and one process acts so that the others can do their work. For example, in a master-slave structure, M assigns tasks to S and S executes them; otherwise S just stays idle
- Competition: two processes cannot work at the same time, so one must wait for the other to finish. For example, in a master-slave structure, after the M node fails, many S nodes want to become M; a mutex is needed, and only the first S to acquire the lock becomes M
In particular:
- Collaboration that does not cross the network: the processes run on the same physical host, where synchronization primitives are readily available (for example pipes, shared memory, message queues, semaphores)
- Collaboration across the network: the processes are distributed over different physical hosts; ZK focuses on this category
For multi-process collaboration across the network, there are 2 basic approaches to communication between processes:
- Message passing: processes exchange information directly over the network and implement synchronization primitives with message-passing algorithms
- Shared storage: processes collaborate through external shared storage, which must provide ordered access; ZK takes this approach
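The shared-storage approach can be sketched as follows. `OrderedStore` is a hypothetical in-memory class, not a ZooKeeper interface. The assumption is that the store stamps every request with a global sequence number, and the process holding the lowest outstanding number is the one allowed to proceed.

```python
import itertools
from typing import Dict, Optional


class OrderedStore:
    """Hypothetical shared store that gives every request a global order."""

    def __init__(self) -> None:
        self._counter = itertools.count(1)
        self._waiting: Dict[int, str] = {}  # sequence number -> process name

    def enqueue(self, process: str) -> int:
        # Ordered access: each request receives the next sequence number.
        seq = next(self._counter)
        self._waiting[seq] = process
        return seq

    def holder(self) -> Optional[str]:
        # The process with the lowest outstanding number holds the lock.
        if not self._waiting:
            return None
        return self._waiting[min(self._waiting)]

    def release(self, seq: int) -> None:
        self._waiting.pop(seq, None)


store = OrderedStore()
seq_s1 = store.enqueue("S1")
seq_s2 = store.enqueue("S2")
first_holder = store.holder()
store.release(seq_s1)
second_holder = store.holder()
```

Because the ordering is decided in one place, the competing processes never need to agree among themselves on who came first.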
In real systems, communication across the network runs into several common problems:
- Message delay: because of network transmission, a message sent later may arrive first
- Processor performance: because of OS scheduling, a message may be processed late even after it has arrived
- Clock skew: the clocks of different physical hosts drift apart
ZK is carefully designed to shield applications from these 3 common problems, making them transparent at the application service level.
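One general way to cope with out-of-order arrival is to number messages and deliver them strictly by number. The sketch below shows that technique only; `InOrderReceiver` is a hypothetical name, and this is not a description of ZooKeeper's internals.

```python
from typing import Dict, List


class InOrderReceiver:
    """Buffers messages and delivers them in sequence-number order."""

    def __init__(self) -> None:
        self._next_seq = 1
        self._pending: Dict[int, str] = {}
        self.delivered: List[str] = []

    def receive(self, seq: int, payload: str) -> None:
        # Hold the message until every earlier one has been delivered.
        self._pending[seq] = payload
        while self._next_seq in self._pending:
            self.delivered.append(self._pending.pop(self._next_seq))
            self._next_seq += 1


receiver = InOrderReceiver()
# Message 2 overtakes message 1 on the network.
for seq, payload in [(2, "b"), (1, "a"), (3, "c")]:
    receiver.receive(seq, payload)
```

The application only ever reads `receiver.delivered`, so the reordering on the wire stays invisible to it. The ordering also relies on counters rather than wall-clock time, which sidesteps clock skew.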
Consistency in distributed systems:
- Message passing, delay: a message sent first does not necessarily arrive first;
- Message passing, loss: a message that was sent may be lost;
- Node crashes: any node in a distributed system can crash;
Under these conditions, how can data consistency be ensured?
- Voting on proposals: a voting-based strategy, as in 2PC
- Voting in elections: a voting-based strategy in which nodes vote for the highest-priority node (the node that holds the latest data)
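The election vote can be illustrated with a heavily simplified sketch. It assumes that every live node already knows the last transaction id of every other live node and that ties are broken by node name; both are assumptions made for illustration. A real election exchanges votes over several rounds. The function name `elect` and the node names are hypothetical.

```python
from typing import Dict, Optional


def elect(cluster_size: int, live_nodes: Dict[str, int]) -> Optional[str]:
    """Return the elected node, or None if no majority is alive.

    `live_nodes` maps node name -> last transaction id seen by that node.
    """
    if not live_nodes:
        return None
    # Every live node votes for the candidate with the newest data;
    # ties are broken by node name (an assumption of this sketch).
    candidate = max(live_nodes, key=lambda name: (live_nodes[name], name))
    votes = len(live_nodes)
    # The result only counts if more than half of the cluster voted.
    return candidate if votes > cluster_size // 2 else None


winner = elect(5, {"n1": 10, "n2": 12, "n3": 12})
no_majority = elect(5, {"n1": 10, "n2": 12})
```

With three of five nodes alive, the node holding the newest data wins; with only two alive, no leader is elected.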
Paxos's goal: solve the distributed consistency problem; it is a consistency algorithm that improves the fault tolerance of distributed systems.
Paxos's essence: a highly fault-tolerant consistency algorithm based on message passing.
ZooKeeper is:
- A distributed coordination service
- Efficient and reliable
- Convenient for applications: it lets them focus on business logic development without paying too much attention to the details of distributed inter-process collaboration
ZooKeeper does not expose primitives directly. Instead, it exposes an API made up of a set of method calls, similar to a file system API, that lets applications implement their own primitives.
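To show how a primitive can be built on top of a file-system-like API, here is a minimal sketch. `TinyTree`, `NodeExistsError`, and `try_lock` are hypothetical names for an in-memory model, not ZooKeeper's client API. The lock relies on one assumption: creating a node that already exists fails.

```python
from typing import Dict, List


class NodeExistsError(Exception):
    pass


class TinyTree:
    """Hypothetical in-memory tree of named data nodes."""

    def __init__(self) -> None:
        self._nodes: Dict[str, bytes] = {"/": b""}

    def create(self, path: str, data: bytes = b"") -> None:
        parent = path.rsplit("/", 1)[0] or "/"
        if parent not in self._nodes:
            raise KeyError(f"parent {parent} does not exist")
        if path in self._nodes:
            raise NodeExistsError(path)
        self._nodes[path] = data

    def get(self, path: str) -> bytes:
        return self._nodes[path]

    def delete(self, path: str) -> None:
        # Simplification: does not check for remaining children.
        del self._nodes[path]

    def children(self, path: str) -> List[str]:
        prefix = path.rstrip("/") + "/"
        names = (p[len(prefix):] for p in self._nodes
                 if p != "/" and p.startswith(prefix))
        return sorted(n for n in names if "/" not in n)


def try_lock(tree: TinyTree, path: str, owner: str) -> bool:
    # The primitive: whoever creates the node first holds the lock.
    try:
        tree.create(path, owner.encode())
        return True
    except NodeExistsError:
        return False


tree = TinyTree()
tree.create("/locks")
got_a = try_lock(tree, "/locks/job", "A")
got_b = try_lock(tree, "/locks/job", "B")
tree.delete("/locks/job")  # A releases the lock
got_b_retry = try_lock(tree, "/locks/job", "B")
```

The service itself knows nothing about locks; the application derives the lock from plain create and delete calls.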
ZooKeeper guarantees the following distributed consistency properties:
- Sequential consistency: transaction requests issued by the same client are applied strictly in the order they were issued
- Atomicity: a transaction request is either applied on all nodes or applied on none
- Single view: whichever node a client connects to, the server-side data it sees is consistent (Note: this is imprecise; in practice it is eventual consistency)
- Reliability: once a transaction has executed successfully, its resulting state is kept permanently
- Timeliness: after a transaction executes successfully, a client is not guaranteed to see the latest data immediately, but ZooKeeper guarantees eventual consistency
ZooKeeper aims to provide a high-performance, highly available, sequentially consistent distributed coordination service that guarantees eventual consistency of data.
- Data nodes are organized in a tree structure;
- The full set of data nodes is stored in memory;
- Followers and Observers handle non-transactional requests directly;
- As long as more than half of the machines are alive, the service runs normally
- Leader election is automatic
- Every transaction request is forwarded to the Leader for processing
- Every transaction is assigned a globally increasing id (zxid, 64 bits: epoch + auto-incrementing counter)
- Proposal voting ensures that transaction commits are reliable
- Proposal voting only guarantees that, by the time the client is told the commit succeeded, more than half of the nodes can see the latest data
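The zxid layout and the "more than half" rule can be sketched in a few lines. The text above only says that the 64 bits consist of an epoch plus a counter; the 32/32 split used here (epoch in the high bits, counter in the low bits) matches how ZooKeeper is commonly described, but treat it as an assumption of this sketch. The function names are hypothetical.

```python
from typing import Tuple

COUNTER_BITS = 32  # assumed split: high 32 bits epoch, low 32 bits counter
COUNTER_MASK = (1 << COUNTER_BITS) - 1


def make_zxid(epoch: int, counter: int) -> int:
    # Pack the leader epoch and the per-epoch counter into one integer.
    return (epoch << COUNTER_BITS) | (counter & COUNTER_MASK)


def split_zxid(zxid: int) -> Tuple[int, int]:
    return zxid >> COUNTER_BITS, zxid & COUNTER_MASK


def has_quorum(acks: int, cluster_size: int) -> bool:
    # A proposal commits only when more than half of the nodes acknowledge.
    return acks > cluster_size // 2


zxid = make_zxid(epoch=2, counter=5)
epoch, counter = split_zxid(zxid)
```

Putting the epoch in the high bits means any transaction from a newer epoch compares greater than every transaction from an older one, so plain integer comparison orders transactions globally.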
Before ZK appeared, distributed systems achieved multi-process collaboration in two ways:
- Distributed lock managers
- Distributed databases
ZK focuses more on process collaboration; it does not provide a lock interface or a general-purpose data storage interface. (Open question: ZK could provide them as well; we would simply not have to use them.)
Application servers commonly have the following needs:
- Master-slave leader election: a way to elect the master node
- Process liveness tracking and crash detection: a way to track whether a process is still alive
- Distributed locks: exclusive locks
ZK provides the basic API for these needs.
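Crash detection is commonly built on heartbeats with a timeout. The sketch below shows that idea with a hypothetical `SessionTracker` class; time is passed in explicitly so the example is deterministic. It is an illustration of the technique, not ZooKeeper's session implementation.

```python
from typing import Dict, List


class SessionTracker:
    """Hypothetical liveness tracker based on heartbeats and a timeout."""

    def __init__(self, timeout: float) -> None:
        self._timeout = timeout
        self._last_seen: Dict[str, float] = {}

    def heartbeat(self, process: str, now: float) -> None:
        self._last_seen[process] = now

    def expire(self, now: float) -> List[str]:
        # A process is considered crashed once it has been silent
        # for longer than the timeout.
        dead = sorted(name for name, seen in self._last_seen.items()
                      if now - seen > self._timeout)
        for name in dead:
            del self._last_seen[name]
        return dead

    def alive(self) -> List[str]:
        return sorted(self._last_seen)


tracker = SessionTracker(timeout=5.0)
tracker.heartbeat("master", now=0.0)
tracker.heartbeat("worker-1", now=0.0)
tracker.heartbeat("worker-1", now=4.0)  # the master stays silent
crashed = tracker.expire(now=6.0)
```

Once the master is reported as crashed, the remaining processes can start a leader election among themselves.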
Scenarios where ZooKeeper is not suitable:
- Mass data storage: ZK is essentially a special file system, but it is meant for storing metadata; application data must be stored separately