MongoDB replica set: mongos fails to start
Background
The project stores its data in a MongoDB replica set. Unexpected power outages frequently leave the mongos on one of the nodes unable to start.
Environment
MongoDB replica set:
mongo01: 192.168.36.218
mongo02: 192.168.36.219
mongo03: 192.168.36.220
Error message
mongos on mongo03 would not start; running the start command failed with:
[root@localhost ~]# mongos --configdb 192.168.36.218:20000,192.168.36.219:20000,192.168.36.220:20000 --port 30000 --chunkSize 500 --logpath /home/mongo/logs/mongos.log --logappend --fork
about to fork child process, waiting until server is ready for connections.
forked process: 79748
ERROR: child process failed, exited with error number 5
Checking mongos.log shows the following error:
2018-06-25T09:13:47.607+0800 I CONTROL [main] ***** SERVER RESTARTED *****
2018-06-25T09:13:47.612+0800 I CONTROL [main] ** WARNING: You are running this process as the root user, which is not recommended.
2018-06-25T09:13:47.613+0800 I CONTROL [main]
2018-06-25T09:13:47.613+0800 I SHARDING [mongosMain] MongoS version 3.2.1 starting: pid=80904 port=30000 64-bit host=MongoDB03 (--help for usage)
2018-06-25T09:13:47.613+0800 I CONTROL [mongosMain] db version v3.2.1
2018-06-25T09:13:47.613+0800 I CONTROL [mongosMain] git version: a14d55980c2cdc565d4704a7e3ad37e4e535c1b2
2018-06-25T09:13:47.613+0800 I CONTROL [mongosMain] allocator: tcmalloc
2018-06-25T09:13:47.613+0800 I CONTROL [mongosMain] modules: none
2018-06-25T09:13:47.613+0800 I CONTROL [mongosMain] build environment:
2018-06-25T09:13:47.613+0800 I CONTROL [mongosMain] distarch: x86_64
2018-06-25T09:13:47.613+0800 I CONTROL [mongosMain] target_arch: x86_64
2018-06-25T09:13:47.613+0800 I CONTROL [mongosMain] options: { net: { port: 30000 }, processManagement: { fork: true }, sharding: { chunkSize: 500, configDB: "192.168.36.218:20000,192.168.36.219:20000,192.168.36.220:20000" }, systemLog: { destination: "file", logAppend: true, path: "/home/mongo/logs/mongos.log" } }
2018-06-25T09:13:47.613+0800 I SHARDING [mongosMain] Updating config server connection string to: 192.168.36.218:20000,192.168.36.219:20000,192.168.36.220:20000
2018-06-25T09:13:47.625+0800 W SHARDING [mongosMain] config servers 192.168.36.218:20000 and 192.168.36.220:20000 differ
2018-06-25T09:13:47.627+0800 W SHARDING [mongosMain] config servers 192.168.36.218:20000 and 192.168.36.220:20000 differ
2018-06-25T09:13:47.628+0800 W SHARDING [mongosMain] config servers 192.168.36.218:20000 and 192.168.36.220:20000 differ
2018-06-25T09:13:47.630+0800 W SHARDING [mongosMain] config servers 192.168.36.218:20000 and 192.168.36.220:20000 differ
2018-06-25T09:13:47.630+0800 E SHARDING [mongosMain] Error initializing sharding system: ConfigServersInconsistent hash from 192.168.36.218:20000: { chunks: "d41d8cd98f00b204e9800998ecf8427e", databases: "95954cb16c029767f4ad050712a28f49", shards: "68f4b37fec8c2ac97cc985aa01f37717", version: "b25e55c19a8c75c87b4f950dcf5eb088" } vs hash from 192.168.36.220:20000: {}
From the log above, the key line is:
config servers 192.168.36.218:20000 and 192.168.36.220:20000 differ
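Config server 192.168.36.218:20000 and the damaged config server 192.168.36.220:20000 are inconsistent; that is, the config data on mongo01:20000 and mongo03:20000 differ. The final error line shows how: mongo01 returns hashes for chunks, databases, shards and version, while mongo03 returns an empty document ({}).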
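When the log is long, it can help to pull out just the pairs of config servers that mongos complained about. The following is a minimal sketch and was not part of the original troubleshooting session; the function name differing_config_servers is made up for illustration, and it assumes the warning lines have exactly the format shown in the log above.

```shell
#!/bin/sh
# Print each distinct pair of config servers that mongos reported as differing.
# Hypothetical helper; usage: differing_config_servers /home/mongo/logs/mongos.log
differing_config_servers() {
    # Matches warning lines such as:
    #   ... W SHARDING [mongosMain] config servers A and B differ
    # and prints "A B" once per distinct pair.
    sed -n 's/.*config servers \([^ ]*\) and \([^ ]*\) differ.*/\1 \2/p' "$1" | sort -u
}
```

Run against the log in this article, it would collapse the four repeated warnings into a single pair, which makes it easier to see which node needs repair.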
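Fix
So how do we fix it? We can import the config database from mongo01:20000 into the config database on mongo03:20000, which resolves the inconsistency above.
Procedure
Step 1: back up the config database on mongo01:20000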
[root@localhost ~]# mongodump --host 192.168.36.218:20000 -d config -o /home/config
2018-06-25T10:06:06.225+0800 writing config.actionlog to
2018-06-25T10:06:06.226+0800 writing config.locks to
2018-06-25T10:06:06.226+0800 writing config.mongos to
2018-06-25T10:06:06.226+0800 writing config.lockpings to
2018-06-25T10:06:06.227+0800 done dumping config.locks (3 documents)
2018-06-25T10:06:06.228+0800 done dumping config.lockpings (2 documents)
2018-06-25T10:06:06.229+0800 done dumping config.mongos (3 documents)
2018-06-25T10:06:06.229+0800 writing config.shards to
2018-06-25T10:06:06.229+0800 writing config.settings to
2018-06-25T10:06:06.229+0800 writing config.version to
2018-06-25T10:06:06.230+0800 done dumping config.shards (1 document)
2018-06-25T10:06:06.230+0800 writing config.databases to
2018-06-25T10:06:06.230+0800 done dumping config.settings (1 document)
2018-06-25T10:06:06.230+0800 done dumping config.version (1 document)
2018-06-25T10:06:06.230+0800 writing config.changelog to
2018-06-25T10:06:06.230+0800 writing config.chunks to
2018-06-25T10:06:06.231+0800 done dumping config.databases (1 document)
2018-06-25T10:06:06.231+0800 writing config.tags to
2018-06-25T10:06:06.232+0800 done dumping config.chunks (0 documents)
2018-06-25T10:06:06.232+0800 done dumping config.changelog (1 document)
2018-06-25T10:06:06.232+0800 done dumping config.tags (0 documents)
2018-06-25T10:06:06.355+0800 done dumping config.actionlog (8160 documents)
[root@localhost ~]#
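Before restoring, it may be worth confirming that the dump contains the four collections named in the hash error (chunks, databases, shards, version). This is only a sketch: check_config_dump is a hypothetical helper name, and it assumes mongodump wrote one <collection>.bson file per collection under /home/config/config, which matches the file paths in the restore output in step 2.

```shell
#!/bin/sh
# Check that a mongodump directory holds the collections mongos compares.
# Hypothetical helper; usage: check_config_dump /home/config/config
check_config_dump() {
    dir=$1
    status=0
    for coll in chunks databases shards version; do
        # Test for existence only: an empty collection (config.chunks has
        # 0 documents in this article) still gets a .bson file, of size zero.
        if [ ! -f "$dir/$coll.bson" ]; then
            echo "missing: $dir/$coll.bson" >&2
            status=1
        fi
    done
    return "$status"
}
```

The function returns a non-zero status and names each missing file on standard error, so it can gate the restore step in a script.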
Step 2: import the config backup into mongo03:20000
[root@localhost ~]# mongorestore --host 192.168.36.220:20000 -d config /home/config/config
2018-06-25T10:08:09.136+0800 building a list of collections to restore from /home/config/config dir
2018-06-25T10:08:09.162+0800 reading metadata for config.actionlog from /home/config/config/actionlog.metadata.json
2018-06-25T10:08:09.163+0800 reading metadata for config.locks from /home/config/config/locks.metadata.json
2018-06-25T10:08:09.163+0800 reading metadata for config.changelog from /home/config/config/changelog.metadata.json
2018-06-25T10:08:09.164+0800 reading metadata for config.mongos from /home/config/config/mongos.metadata.json
2018-06-25T10:08:09.164+0800 restoring config.locks from /home/config/config/locks.bson
2018-06-25T10:08:09.165+0800 restoring config.mongos from /home/config/config/mongos.bson
2018-06-25T10:08:09.175+0800 error: multiple errors in bulk operation:
- E11000 duplicate key error collection: config.mongos index: _id_ dup key: { : "MongoDB01:30000" }
- E11000 duplicate key error collection: config.mongos index: _id_ dup key: { : "MongoDB02:30000" }
2018-06-25T10:08:09.175+0800 restoring indexes for collection config.mongos from metadata
2018-06-25T10:08:09.204+0800 finished restoring config.mongos (3 documents)
2018-06-25T10:08:09.204+0800 restoring indexes for collection config.locks from metadata
2018-06-25T10:08:09.204+0800 reading metadata for config.lockpings from /home/config/config/lockpings.metadata.json
2018-06-25T10:08:09.205+0800 restoring config.lockpings from /home/config/config/lockpings.bson
2018-06-25T10:08:09.215+0800 restoring config.actionlog from /home/config/config/actionlog.bson
2018-06-25T10:08:09.269+0800 restoring config.changelog from /home/config/config/changelog.bson
2018-06-25T10:08:09.284+0800 restoring indexes for collection config.changelog from metadata
2018-06-25T10:08:09.284+0800 finished restoring config.locks (3 documents)
2018-06-25T10:08:09.284+0800 reading metadata for config.shards from /home/config/config/shards.metadata.json
2018-06-25T10:08:09.284+0800 restoring config.shards from /home/config/config/shards.bson
2018-06-25T10:08:09.286+0800 finished restoring config.changelog (1 document)
2018-06-25T10:08:09.286+0800 reading metadata for config.version from /home/config/config/version.metadata.json
2018-06-25T10:08:09.286+0800 restoring config.version from /home/config/config/version.bson
2018-06-25T10:08:09.287+0800 error: multiple errors in bulk operation:
- E11000 duplicate key error collection: config.lockpings index: _id_ dup key: { : "MongoDB01:30000:1523346496:1804289383" }
- E11000 duplicate key error collection: config.lockpings index: _id_ dup key: { : "MongoDB02:30000:1529566939:1804289383" }
2018-06-25T10:08:09.287+0800 restoring indexes for collection config.lockpings from metadata
2018-06-25T10:08:09.297+0800 restoring indexes for collection config.shards from metadata
2018-06-25T10:08:09.318+0800 finished restoring config.lockpings (2 documents)
2018-06-25T10:08:09.318+0800 reading metadata for config.databases from /home/config/config/databases.metadata.json
2018-06-25T10:08:09.318+0800 restoring config.databases from /home/config/config/databases.bson
2018-06-25T10:08:09.330+0800 restoring indexes for collection config.version from metadata
2018-06-25T10:08:09.370+0800 finished restoring config.shards (1 document)
2018-06-25T10:08:09.370+0800 reading metadata for config.settings from /home/config/config/settings.metadata.json
2018-06-25T10:08:09.370+0800 restoring config.settings from /home/config/config/settings.bson
2018-06-25T10:08:09.372+0800 finished restoring config.version (1 document)
2018-06-25T10:08:09.372+0800 reading metadata for config.tags from /home/config/config/tags.metadata.json
2018-06-25T10:08:09.372+0800 restoring config.tags from /home/config/config/tags.bson
2018-06-25T10:08:09.374+0800 restoring indexes for collection config.tags from metadata
2018-06-25T10:08:09.382+0800 restoring indexes for collection config.databases from metadata
2018-06-25T10:08:09.400+0800 finished restoring config.tags (0 documents)
2018-06-25T10:08:09.400+0800 reading metadata for config.chunks from /home/config/config/chunks.metadata.json
2018-06-25T10:08:09.400+0800 restoring config.chunks from /home/config/config/chunks.bson
2018-06-25T10:08:09.409+0800 restoring indexes for collection config.settings from metadata
2018-06-25T10:08:09.409+0800 finished restoring config.databases (1 document)
2018-06-25T10:08:09.411+0800 finished restoring config.settings (1 document)
2018-06-25T10:08:09.555+0800 restoring indexes for collection config.chunks from metadata
2018-06-25T10:08:09.581+0800 finished restoring config.chunks (0 documents)
2018-06-25T10:08:09.754+0800 restoring indexes for collection config.actionlog from metadata
2018-06-25T10:08:09.754+0800 finished restoring config.actionlog (8160 documents)
2018-06-25T10:08:09.754+0800 done
[root@localhost ~]#
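Note: I ran this in production. At the time I backed up the config databases on both mongo01:20000 and mongo02:20000, then imported them in that order into mongo03:20000 (using the same dump and restore commands as above). In principle, importing only the dump from mongo01:20000 should be enough, but since this was production I did not test that. I will try it the next time the problem occurs. You are also welcome to test it yourself and post the result in the comments to help others with the same problem.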
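The E11000 duplicate key errors in the restore output mean that mongorestore skipped documents whose _id already existed on mongo03, because it inserts documents and does not overwrite existing ones. One possible variation, untested on this cluster, is to back up the target first and then restore with mongorestore's --drop option, which drops each collection on the target before restoring it. The sketch below is an assumption about how that could be scripted: run and resync_config are hypothetical helper names, the working directory layout is made up, and DRY_RUN=1 only prints the commands so they can be reviewed before anything touches production.

```shell
#!/bin/sh
# Hypothetical helper: run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Re-sync the config database from a healthy config server to a damaged one.
# Arguments: source host:port, target host:port, working directory.
resync_config() {
    src=$1
    dst=$2
    work=$3
    # Keep a copy of the damaged node's data before changing anything.
    run mongodump --host "$dst" -d config -o "$work/target-before" || return 1
    # Dump the healthy node, as in step 1.
    run mongodump --host "$src" -d config -o "$work/source" || return 1
    # --drop removes each collection on the target before restoring it,
    # so existing documents cannot cause duplicate key errors.
    run mongorestore --host "$dst" -d config --drop "$work/source/config"
}

# Example (prints the three commands without running them):
#   DRY_RUN=1
#   resync_config 192.168.36.218:20000 192.168.36.220:20000 /home/config-resync
```

Because --drop discards whatever is on the target, the first mongodump is what makes the operation reversible; whether this is safe while the other config servers are serving traffic is something to verify in a test environment first.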
Try starting mongos on mongo03
[root@localhost ~]# mongos --configdb 192.168.36.218:20000,192.168.36.219:20000,192.168.36.220:20000 --port 30000 --chunkSize 500 --logpath /home/mongo/logs/mongos.log --logappend --fork
about to fork child process, waiting until server is ready for connections.
forked process: 4286
child process started successfully, parent exiting
[root@localhost ~]# ps -ef | grep mongos
root 44760 1 0 Jun21 ? 00:25:14 mongos --configdb 192.168.36.218:20000,192.168.36.219:20000,192.168.36.220:20000 --port 30000 --chunkSize 500 --logpath /home/mongo/logs/mongos.log --logappend --fork
root 66128 66090 0 11:40 pts/0 00:00:00 grep mongos
[root@localhost ~]#
As shown above, mongos started successfully and keeps running in the background.
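The ps listing above shows a mongos with a different PID and start date than the process that was just forked (4286), so checking the forked PID directly may be a more precise confirmation. A small sketch, assuming the startup messages have the format shown above and were saved to a file; forked_pid and pid_alive are hypothetical helper names.

```shell
#!/bin/sh
# Read saved `mongos ... --fork` startup output on stdin and print the child PID.
forked_pid() {
    sed -n 's/^forked process: \([0-9][0-9]*\)$/\1/p'
}

# Succeed only if a process with the given PID exists and can be signalled
# by the current user (signal 0 performs the check without sending anything).
pid_alive() {
    kill -0 "$1" 2>/dev/null
}

# Example, with the startup output saved to a hypothetical file:
#   pid=$(forked_pid < /tmp/mongos-start.out)
#   pid_alive "$pid" && echo "mongos $pid is running"
```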