Fixing checkpoint failures caused by inconsistent edits between the HDFS active and standby nodes
Background
One of our project's HDFS clusters began to suffer occasional NameNode crashes in April. The crashes grew more and more frequent, until eventually the NameNode would go down just four or five minutes after starting, at which point we were asked to investigate.
Troubleshooting
Errors in the NameNode log
2023-05-27 22:50:21,844 FATAL namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(390)) - Error: flush failed for required journal (JournalAndStream(mgr=QJM to [192.xxx.xxx.xxx:8485, 192.xxx.xxx.yyy:8485, 192.xxx.xxx.zzz:8485], stream=QuorumOutputStream starting at txid 8169027))
java.io.IOException: Timed out waiting 20000ms for a quorum of nodes to respond.
at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:137)
at org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:109)
at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:525)
at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:385)
at org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:55)
at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:521)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:710)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.run(FSEditLogAsync.java:188)
at java.lang.Thread.run(Thread.java:748)
This pointed to communication between the NameNode and the JournalNodes timing out.
Following other blog posts, we tried raising dfs.qjournal.write-txns.timeout.ms to one minute (the default is 20 s). The NameNode crashes dropped to once every 4-6 hours, but the root cause was clearly still there. A sketch of the configuration change is shown below.
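For reference, a minimal sketch of that change, assuming hdfs-site.xml lives at /etc/hadoop/conf/hdfs-site.xml (on an Ambari-managed cluster the same property is set in the Ambari UI and pushed out from there, followed by a NameNode restart):

# Property to add or update on both NameNodes; the value is in milliseconds
# and the default is 20000 (20 s):
#
#   <property>
#     <name>dfs.qjournal.write-txns.timeout.ms</name>
#     <value>60000</value>
#   </property>
#
# Quick check of what is currently configured:
grep -A 1 'dfs.qjournal.write-txns.timeout.ms' /etc/hadoop/conf/hdfs-site.xml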
Second round of troubleshooting
2023-05-29 22:00:25,517 WARN client.QuorumJournalManager (QuorumCall.java:waitFor(186)) - Waited 58044 ms (timeout=60000 ms) for a response for sendEdits. Succeeded so far: [192.xxx.xxx.xxx:8485]
2023-05-29 22:00:26,518 WARN client.QuorumJournalManager (QuorumCall.java:waitFor(186)) - Waited 59045 ms (timeout=60000 ms) for a response for sendEdits. Succeeded so far: [192.xxx.xxx.xxx:8485]
2023-05-29 22:00:27,474 FATAL namenode.FSEditLog (JournalSet.java:mapJournalsAndReportErrors(390)) - Error: flush failed for required journal (JournalAndStream(mgr=QJM to [192.xxx.xxx.xxx:8485, 192.xxx.xxx.yyy:8485, 192.xxx.xxx.zzz:8485], stream=QuorumOutputStream starting at txid 8180981))
java.io.IOException: Timed out waiting 60000ms for a quorum of nodes to respond.
at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:137)
at org.apache.hadoop.hdfs.qjournal.client.QuorumOutputStream.flushAndSync(QuorumOutputStream.java:109)
at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:113)
at org.apache.hadoop.hdfs.server.namenode.EditLogOutputStream.flush(EditLogOutputStream.java:107)
at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream$8.apply(JournalSet.java:525)
at org.apache.hadoop.hdfs.server.namenode.JournalSet.mapJournalsAndReportErrors(JournalSet.java:385)
at org.apache.hadoop.hdfs.server.namenode.JournalSet.access$100(JournalSet.java:55)
at org.apache.hadoop.hdfs.server.namenode.JournalSet$JournalSetOutputStream.flush(JournalSet.java:521)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.logSync(FSEditLog.java:710)
at org.apache.hadoop.hdfs.server.namenode.FSEditLogAsync.run(FSEditLogAsync.java:188)
at java.lang.Thread.run(Thread.java:748)
2023-05-29 22:00:27,477 WARN client.QuorumJournalManager (QuorumOutputStream.java:abort(74)) - Aborting QuorumOutputStream starting at txid 8180981
2023-05-29 22:00:27,482 INFO util.ExitUtil (ExitUtil.java:terminate(210)) - Exiting with status 1: Error: flush failed for required journal (JournalAndStream(mgr=QJM to [192.xxx.xxx.xxx:8485, 192.xxx.xxx.yyy:8485, 192.xxx.xxx.zzz:8485], stream=QuorumOutputStream starting at txid 8180981))
So the NameNode was timing out while waiting for acknowledgements after sending edits to the JournalNodes: only one JournalNode had responded successfully and the other two had not, which triggered the NameNode's shutdown hook and terminated the NameNode process.
We then went through the JournalNode logs and found that all three nodes had in fact completed their edits file writes normally:
2023-05-29 22:01:00,255 INFO server.Journal (Journal.java:scanStorageForLatestEdits(211)) - Latest log is EditLogFile(file=/hadoop/hdfs/journal/journalid/current/edits_inprogress_0000000000008180981,first=0000000000008180981,last=0000000000008180985,inProgress=true,hasCorruptHeader=false) ; journal id: journalid
2023-05-29 22:01:00,308 INFO server.Journal (Journal.java:getSegmentInfo(740)) - getSegmentInfo(8180981): EditLogFile(file=/hadoop/hdfs/journal/journalid/current/edits_inprogress_0000000000008180981,first=0000000000008180981,last=0000000000008180985,inProgress=true,hasCorruptHeader=false) -> startTxId: 8180981 endTxId: 8180985 isInProgress: true ; journal id: journalid
2023-05-29 22:01:00,308 INFO server.Journal (Journal.java:prepareRecovery(784)) - Prepared recovery for segment 8180981: segmentState { startTxId: 8180981 endTxId: 8180985 isInProgress: true } lastWriterEpoch: 92 lastCommittedTxId: 8180984 ; journal id: journalid
2023-05-29 22:01:00,378 INFO server.Journal (Journal.java:getSegmentInfo(740)) - getSegmentInfo(8180981): EditLogFile(file=/hadoop/hdfs/journal/journalid/current/edits_inprogress_0000000000008180981,first=0000000000008180981,last=0000000000008180985,inProgress=true,hasCorruptHeader=false) -> startTxId: 8180981 endTxId: 8180985 isInProgress: true ; journal id: journalid
2023-05-29 22:01:00,378 INFO server.Journal (Journal.java:acceptRecovery(872)) - Skipping download of log startTxId: 8180981 endTxId: 8180985 isInProgress: true: already have up-to-date logs ; journal id: journalid
2023-05-29 22:01:00,386 WARN util.AtomicFileOutputStream (AtomicFileOutputStream.java:close(96)) - Unable to delete tmp file /hadoop/hdfs/journal/journalid/current/paxos/8180981.tmp
2023-05-29 22:01:00,386 INFO server.Journal (Journal.java:acceptRecovery(905)) - Accepted recovery for segment 8180981: segmentState { startTxId: 8180981 endTxId: 8180985 isInProgress: true } acceptedInEpoch: 93 ; journal id: journalid
2023-05-29 22:01:00,424 INFO server.Journal (Journal.java:finalizeLogSegment(634)) - Validating log segment /hadoop/hdfs/journal/journalid/current/edits_inprogress_0000000000008180981 about to be finalized ; journal id: journalid
2023-05-29 22:01:00,426 INFO namenode.FileJournalManager (FileJournalManager.java:finalizeLogSegment(143)) - Finalizing edits file /hadoop/hdfs/journal/journalid/current/edits_inprogress_0000000000008180981 -> /hadoop/hdfs/journal/journalid/current/edits_0000000000008180981-0000000000008180985
2023-05-29 22:01:00,566 INFO namenode.TransferFsImage (TransferFsImage.java:copyFileToStream(396)) - Sending fileName: /hadoop/hdfs/journal/journalid/current/edits_0000000000008180981-0000000000008180985, fileSize: 1048576. Sent total: 1048576 bytes. Size of last segment intended to send: -1 bytes.
Digging further, we found that the edits files on the three JournalNodes were inconsistent with each other. The field team had reported that the NameNodes had been restarted many times in the preceding weeks, and Ambari had been alerting that no checkpoint had been performed for several months, so we suspected that the edits files no longer matched the fsimage, which was causing the crashes. We therefore decided to manually rebuild the standby NameNode and the JournalNodes.
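For anyone chasing the same symptom, one rough way to see the inconsistency is to compare the edits segments across the three JournalNodes. The sketch below assumes the journal directory from the logs above (/hadoop/hdfs/journal/journalid) and uses jn1/jn2/jn3 as placeholder hostnames:

# On a healthy cluster the finalized segment names (edits_<firstTxId>-<lastTxId>)
# and the single in-progress segment should line up across all three nodes.
for h in jn1 jn2 jn3; do
  echo "== $h =="
  ssh "$h" 'ls /hadoop/hdfs/journal/journalid/current/ | grep "^edits_" | sort | tail -n 10'
done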
Rebuilding the JournalNodes
- Stop both NameNodes and all three JournalNodes
- Delete every edits file in each JournalNode's current directory, both the finalized segments and the in-progress one
- Start the JournalNodes again (see the shell sketch after this list)
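A minimal shell sketch of these three steps, run on each JournalNode host; the journal directory is taken from the logs above, and on an Ambari-managed cluster the stop/start steps themselves would normally be done from the Ambari UI:

# Run only after both NameNodes and all three JournalNodes have been stopped.
JN_CURRENT=/hadoop/hdfs/journal/journalid/current

# Back up the existing edits files before deleting them, just in case.
tar czf /tmp/jn_edits_backup_$(hostname).tar.gz "$JN_CURRENT"/edits_*

# Remove every edits segment, both finalized (edits_X-Y) and in-progress (edits_inprogress_X).
rm -f "$JN_CURRENT"/edits_*

# Then start the JournalNode again, e.g.:
#   hdfs --daemon start journalnode      (Hadoop 3.x)
#   hadoop-daemon.sh start journalnode   (Hadoop 2.x)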
Rebuilding the standby NameNode
- Clear the standby NameNode's current directory, keeping only the seen_txid and VERSION files
- Copy the latest fsimage and its matching .md5 file over from the active NameNode, and change their owner and group to hdfs:hadoop
- Start the standby NameNode and wait for it to sync the edits files from the JournalNodes and merge them into the fsimage automatically (see the shell sketch after this list)
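A minimal shell sketch of the standby rebuild, run on the standby host. The metadata directory used here (/hadoop/hdfs/namenode) is an assumption and should be taken from dfs.namenode.name.dir in your hdfs-site.xml; active-nn is a placeholder for the active NameNode's hostname:

NN_CURRENT=/hadoop/hdfs/namenode/current

# 1. Keep only seen_txid and VERSION in the standby's current directory.
find "$NN_CURRENT" -maxdepth 1 -type f ! -name seen_txid ! -name VERSION -delete

# 2. Copy the newest fsimage and its .md5 from the active NameNode, then fix ownership.
LATEST=$(ssh active-nn 'ls /hadoop/hdfs/namenode/current/fsimage_* | grep -v md5 | sort | tail -n 1')
scp "active-nn:${LATEST}" "active-nn:${LATEST}.md5" "$NN_CURRENT"/
chown hdfs:hadoop "$NN_CURRENT"/fsimage_*

# 3. Start the standby NameNode; it tails the edits from the JournalNodes
#    and applies them on top of the copied fsimage.

On many clusters hdfs namenode -bootstrapStandby achieves much the same thing, since it also copies the latest fsimage over from the active NameNode.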
Resolution
After rebuilding the JournalNodes and the standby NameNode, the NameNodes have been running normally for more than 24 hours, and the issue is considered resolved. A couple of quick health checks are sketched below.
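Two quick ways to confirm the cluster is healthy again, assuming NameNode IDs nn1/nn2 as configured under dfs.ha.namenodes and the same metadata directory as in the sketch above:

# HA state: one NameNode should report "active" and the other "standby".
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2

# Checkpointing: the newest fsimage_<txid> on the standby should keep advancing
# (by default once an hour, or after 1,000,000 transactions).
ls -lt /hadoop/hdfs/namenode/current/fsimage_* | head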