Solved: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try
Caused by: java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage[172.27.10.191:50010,DS-51d68378-35de-4c70-b27e-7a98a53919cc,DISK], DatanodeInfoWithStorage[172.27.10.223:50010,DS-f172c682-713d-4a8f-b8af-69198ddc6756,DISK]], original=[DatanodeInfoWithStorage[172.27.10.191:50010,DS-51d68378-35de-4c70-b27e-7a98a53919cc,DISK], DatanodeInfoWithStorage[172.27.10.223:50010,DS-f172c682-713d-4a8f-b8af-69198ddc6756,DISK]]). The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:925)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:988)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:1156)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:454)
[2017-11-10 07:08:03,097] ERROR Task hdfs-sink-prqt-stndln-rpro-fct-0 threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerSinkTask:455)
java.lang.RuntimeException: java.util.concurrent.ExecutionException: io.confluent.connect.hdfs.errors.HiveMetaStoreException: Invalid partition for default.revpro-perf-oracle-jdbc-stdln-raw-KFK_RPRO_RI_WF_SUMM_GV: partition=0
at io.confluent.connect.hdfs.DataWriter.write(DataWriter.java:226)
at io.confluent.connect.hdfs.HdfsSinkTask.put(HdfsSinkTask.java:103)
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:435)
at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:251)
at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:180)
at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:148)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:146)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:190)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.concurrent.ExecutionException: io.confluent.connect.hdfs.errors.HiveMetaStoreException: Invalid partition for default.revpro-perf-oracle-jdbc-stdln-raw-KFK_RPRO_RI_WF_SUMM_GV: partition=0
at java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.util.concurrent.FutureTask.get(FutureTask.java:192)
at io.confluent.connect.hdfs.DataWriter.write(DataWriter.java:220)
… 12 more
Caused by: io.confluent.connect.hdfs.errors.HiveMetaStoreException: Invalid partition for default.revpro-perf-oracle-jdbc-stdln-raw-KFK_RPRO_RI_WF_SUMM_GV: partition=0
at io.confluent.connect.hdfs.hive.HiveMetaStore.addPartition(HiveMetaStore.java:107)
at io.confluent.connect.hdfs.TopicPartitionWriter$3.call(TopicPartitionWriter.java:662)
at io.confluent.connect.hdfs.TopicPartitionWriter$3.call(TopicPartitionWriter.java:659)
… 4 more
Caused by: InvalidObjectException(message:default.revpro-perf-oracle-jdbc-stdln-raw-KFK_RPRO_RI_WF_SUMM_GV table not found)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$append_partition_by_name_with_environment_context_result$append_partition_by_name_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:51619)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$append_partition_by_name_with_environment_context_result$append_partition_by_name_with_environment_context_resultStandardScheme.read(ThriftHiveMetastore.java:51596)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$append_partition_by_name_with_environment_context_result.read(ThriftHiveMetastore.java:51519)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_append_partition_by_name_with_environment_context(ThriftHiveMetastore.java:1667)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.append_partition_by_name_with_environment_context(ThriftHiveMetastore.java:1651)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.appendPartition(HiveMetaStoreClient.java:606)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.appendPartition(HiveMetaStoreClient.java:600)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:152)
at com.sun.proxy.$Proxy51.appendPartition(Unknown Source)
at io.confluent.connect.hdfs.hive.HiveMetaStore$1.call(HiveMetaStore.java:97)
at io.confluent.connect.hdfs.hive.HiveMetaStore$1.call(HiveMetaStore.java:91)
at io.confluent.connect.hdfs.hive.HiveMetaStore.doAction(HiveMetaStore.java:87)
at io.confluent.connect.hdfs.hive.HiveMetaStore.addPartition(HiveMetaStore.java:103)
… 6 more
Root cause:
This error typically occurs on clusters with three or fewer DataNodes, which can experience append failures under heavy load because the client cannot find a replacement DataNode for the write pipeline. The configuration parameters below address it.
Solution:
Add the following configuration to the hdfs-site.xml file used by your client (not the NameNode or DataNodes). For the Kafka Connect HDFS sink in the stack trace above, the client is the Connect worker JVM.
<property>
<name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
<value>true</value>
<description>
If there is a datanode/network failure in the write pipeline,
DFSClient will try to remove the failed datanode from the pipeline
and then continue writing with the remaining datanodes. As a result,
the number of datanodes in the pipeline is decreased. The feature is
to add new datanodes to the pipeline.
This is a site-wide property to enable/disable the feature.
When the cluster size is extremely small, e.g. 3 nodes or less, cluster
administrators may want to set the policy to NEVER in the default
configuration file or disable this feature. Otherwise, users may
experience an unusually high rate of pipeline failures since it is
impossible to find new datanodes for replacement.
See also dfs.client.block.write.replace-datanode-on-failure.policy.
</description>
</property>
<property>
<name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
<value>DEFAULT</value>
<description>
This property is used only if the value of
dfs.client.block.write.replace-datanode-on-failure.enable is true.
ALWAYS: always add a new datanode when an existing datanode is removed.
NEVER: never add a new datanode.
DEFAULT:
Let r be the replication number.
Let n be the number of existing datanodes.
Add a new datanode only if r is greater than or equal to 3 and either
(1) floor(r/2) is greater than or equal to n; or
(2) r is greater than n and the block is hflushed/appended.
</description>
</property>
<property>
<name>dfs.client.block.write.replace-datanode-on-failure.best-effort</name>
<value>false</value>
<description>
This property is used only if the value of
dfs.client.block.write.replace-datanode-on-failure.enable is true.
Best effort means that the client will try to replace a failed datanode
in write pipeline (provided that the policy is satisfied), however, it
continues the write operation in case that the datanode replacement also
fails.
Suppose the datanode replacement fails.
false: An exception should be thrown so that the write will fail.
true: The write should be resumed with the remaining datanodes.
Note that setting this property to true allows writing to a pipeline
with a smaller number of datanodes. As a result, it increases the
probability of data loss.
</description>
</property>
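To make the DEFAULT rule above concrete, here is a minimal sketch in Java of the decision it describes. This is an illustration only, not Hadoop's internal code; the method and parameter names are made up.

// Illustrative only: the DEFAULT replace-datanode-on-failure rule as
// described in the property text above. Not Hadoop's actual implementation.
static boolean needReplacementDatanode(int r,        // replication factor
                                       int n,        // datanodes still in the pipeline
                                       boolean hflushedOrAppended) {
    if (r < 3) {
        return false;                                 // the rule only applies when r >= 3
    }
    boolean ruleOne = (r / 2) >= n;                   // (1) floor(r/2) >= n
    boolean ruleTwo = (r > n) && hflushedOrAppended;  // (2) r > n and block is hflushed/appended
    return ruleOne || ruleTwo;
}

For example, with replication 3 on a 3-node cluster, losing one pipeline DataNode leaves n = 2: rule (2) fires for an appended block, the client asks for a replacement DataNode, and the request fails because no third healthy node exists, which is exactly the exception at the top of this post.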
The descriptions above explain each of these properties in more detail.
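If editing hdfs-site.xml on the client machine is not convenient, the same properties can also be set programmatically on the client's Hadoop Configuration before the FileSystem is created. The sketch below assumes a plain Java HDFS client; the class name and NameNode URI are placeholders, and it sets the policy to NEVER, as the enable property's description suggests for clusters of three nodes or fewer.

import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class SmallClusterHdfsClient {   // placeholder class name
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // On clusters with 3 or fewer DataNodes, the description above suggests
        // setting the policy to NEVER (or disabling the feature) so the client
        // does not look for replacement DataNodes that cannot exist.
        conf.set("dfs.client.block.write.replace-datanode-on-failure.enable", "true");
        conf.set("dfs.client.block.write.replace-datanode-on-failure.policy", "NEVER");

        // Placeholder NameNode URI; use your cluster's fs.defaultFS value.
        FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf);
        System.out.println("Connected to " + fs.getUri());
        fs.close();
    }
}

For the Kafka Connect HDFS sink, the usual route is still the XML file: place the hdfs-site.xml shown above in the directory referenced by the connector's hadoop.conf.dir setting so the Connect worker's DFSClient picks it up.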