Symptoms:

The data center reported that a switch in the machine room failed at 9 p.m., causing network problems.

The business team then reported that one of their interfaces was timing out.

Preliminary investigation: the business logs showed that an error was raised when the application accessed one of the MongoDB collections.

The error was reported when writing data:

Mongo::Error::OperationFailure - no progress was made executing batch write op in jdb3.images after rounds ( ops completed in rounds total) ():

From this I tentatively concluded that the problem lay in the MongoDB sharded cluster.

I logged in to the mongos node and ran a findOne operation, which returned:

"errmsg" : "None of the hosts for replica set configReplSet could be contacted."

Checking the shard information:

--- Sharding Status ---
sharding version: {
"_id" : ,
"minCompatibleVersion" : ,
"currentVersion" : ,
"clusterId" : ObjectId("58c99a8257905f85f1828f52")
}
shards:
{ "_id" : "shard01", "host" : "shard01/100.106.23.22:27017,100.106.23.32:27017,100.111.9.19:27017" }
{ "_id" : "shard02", "host" : "shard02/100.106.23.23:27017,100.106.23.33:27017,100.111.9.20:27017" }
{ "_id" : "shard03", "host" : "shard03/100.106.23.24:27017,100.106.23.34:27017,100.111.17.3:27017" }
{ "_id" : "shard04", "host" : "shard04/100.106.23.25:27017,100.106.23.35:27017,100.111.17.4:27017" }
active mongoses:
"3.2.7" :
balancer:
Currently enabled: yes
Currently running: no
Balancer active window is set between : and : server local time
Failed balancer rounds in last attempts:
Migration Results for the last hours:
: Success
databases:
{ "_id" : "jdb3", "primary" : "shard01", "partitioned" : true }
jdb3.images
shard key: { "uuid" : }
unique: false
balancing: true
chunks:
shard01
shard02
shard03
shard04
too many chunks to print, use verbose if you want to force print
{ "_id" : "gongan", "primary" : "shard02", "partitioned" : true }
{ "_id" : "tmp", "primary" : "shard03", "partitioned" : false }
{ "_id" : "1_n", "primary" : "shard04", "partitioned" : true }
{ "_id" : "upload", "primary" : "shard04", "partitioned" : true }
upload.images
shard key: { "uuid" : }
unique: false
balancing: true
chunks:
shard01
shard02
shard03
shard04
too many chunks to print, use verbose if you want to force print
{ "_id" : "test", "primary" : "shard03", "partitioned" : false }

Nothing abnormal showed up there, so I checked the logs of each shard node one by one.
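To list the members whose logs need checking, the host strings from the "shards:" section of the status output can be split into individual addresses. The following is a minimal Python sketch, not part of the original investigation; the function name parse_shard_host is a hypothetical helper, and the host strings are copied from the output above.

```python
from typing import Dict, List, Tuple


def parse_shard_host(host: str) -> Tuple[str, List[Tuple[str, int]]]:
    """Split 'replSet/host1:port,host2:port' into (replSet, [(host, port), ...])."""
    repl_set, _, members = host.partition("/")
    parsed = []
    for member in members.split(","):
        # rpartition keeps everything before the last ':' as the host name.
        name, _, port = member.rpartition(":")
        parsed.append((name, int(port)))
    return repl_set, parsed


# Host strings copied from the "shards:" section of the sharding status output.
SHARDS: Dict[str, str] = {
    "shard01": "shard01/100.106.23.22:27017,100.106.23.32:27017,100.111.9.19:27017",
    "shard02": "shard02/100.106.23.23:27017,100.106.23.33:27017,100.111.9.20:27017",
    "shard03": "shard03/100.106.23.24:27017,100.106.23.34:27017,100.111.17.3:27017",
    "shard04": "shard04/100.106.23.25:27017,100.106.23.35:27017,100.111.17.4:27017",
}

if __name__ == "__main__":
    for shard_id, host in SHARDS.items():
        _, members = parse_shard_host(host)
        for name, port in members:
            print(f"{shard_id}: check the log on {name}:{port}")
```

This only produces a checklist of addresses; it does not connect to any node.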

I found that the shard04 replica at 100.106.23.25 could not find its master, so I then checked the error log on the shard04 master.

Error in the log on 100.106.23.25:

--10T11::53.546+ W SHARDING [replSetDistLockPinger] pinging failed for distributed lock pinger :: caused by :: ReplicaSetNotFound: None of the hosts for replica set configReplSet could be contacted.

Error in the log on the master, 100.106.23.35:

--10T09::02.282+ W SHARDING [conn7204619] could not remotely refresh metadata for jdb3.images :: caused by :: None of the hosts for replica set configReplSet could be contacted.
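Checking each node's log by hand can be partly automated by filtering for the config replica set error. The sketch below is an illustration in Python under a few assumptions: the function name find_config_errors and the regular expression are my own, and the sample lines are the two log lines quoted above, copied exactly as they appear here (including their incomplete timestamps).

```python
import re
from typing import Iterable, List, Tuple

# Message text taken from the log lines quoted above.
CONFIG_UNREACHABLE = (
    "None of the hosts for replica set configReplSet could be contacted"
)

# Matches the "<severity> <component> [<context>] <message>" part of a log line.
LINE_RE = re.compile(
    r"\s(?P<severity>[A-Z])\s+(?P<component>[A-Z]+)\s+"
    r"\[(?P<context>[^\]]+)\]\s+(?P<message>.*)"
)


def find_config_errors(lines: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Return (severity, context, message) for lines reporting the error."""
    hits = []
    for line in lines:
        if CONFIG_UNREACHABLE not in line:
            continue
        match = LINE_RE.search(line)
        if match:
            hits.append(
                (match.group("severity"), match.group("context"), match.group("message"))
            )
    return hits


SAMPLE_LINES = [
    "--10T11::53.546+ W SHARDING [replSetDistLockPinger] pinging failed for "
    "distributed lock pinger :: caused by :: ReplicaSetNotFound: None of the "
    "hosts for replica set configReplSet could be contacted.",
    "--10T09::02.282+ W SHARDING [conn7204619] could not remotely refresh "
    "metadata for jdb3.images :: caused by :: None of the hosts for replica "
    "set configReplSet could be contacted.",
]

if __name__ == "__main__":
    for severity, context, message in find_config_errors(SAMPLE_LINES):
        print(severity, context, message)
```

In practice the lines would come from each node's mongod log file rather than from an inline list.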

A query run directly on the 100.106.23.35 server returned the same error as the query through mongos:

"errmsg" : "None of the hosts for replica set configReplSet could be contacted."

Locating the problem:

I picked a document from each of the other shards (shard01 to shard03) and queried it by its indexed key through the mongos node; those queries returned data. Every document found on shard04, however, produced an error when queried through mongos.

Resolution: I restarted the slave (100.106.23.25) and watched its log; the errors no longer appeared.

I then restarted the master (100.106.23.35). The errors disappeared, and a status check showed that the master role had switched to the 100.106.23.25 server.

The business team confirmed that the problem was resolved.

Open question:

1. The failure was caused by a network problem, so why did the following error continue to be reported after the network had recovered?

--10T11::53.546+ W SHARDING [replSetDistLockPinger] pinging failed for distributed lock pinger :: caused by :: ReplicaSetNotFound: None of the hosts for replica set configReplSet could be contacted.

Isn't the connection between a MongoDB shard and mongos a long-lived (persistent) connection?

If you know the answer, please let me know. Thank you very much!
