Tag: spark streaming

Tips & Tricks – Spark Streaming and Amazon S3

As we all know, Amazon S3 is an excellent storage service for persisting hot and cold data in the big-data era. Amazon claims 99.99% uptime for it; see the Amazon documentation for more details. While we all agree that S3 storage is the easiest, resilient…

ERROR when writing a file to an S3 bucket from an EMRFS-enabled Spark cluster

ERROR:

18/03/02 01:42:17 INFO RetryInvocationHandler: Exception while invoking ConsistencyCheckerS3FileSystem.mkdirs over null. Retrying after sleeping for 10000ms.
com.amazon.ws.emr.hadoop.fs.consistency.exception.ConsistencyException: Directory 'bucket/folder/_temporary' present in the metadata but not s3
    at com.amazon.ws.emr.hadoop.fs.consistency.ConsistencyCheckerS3FileSystem.getFileStatus(ConsistencyCheckerS3FileSystem.java:506)

Root cause: The consistency problem is mostly caused by: manual deletion of files and directories from the S3 console; retry logic in Spark and Hadoop…
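
When this error repeats across retries, it can help to first pull out which directories EMRFS reports as present in its metadata but missing from S3. The following is a minimal Python sketch, assuming the message text has exactly the form quoted in the log above; the function name `inconsistent_paths` and the regular expression are illustrative and not part of EMRFS or Spark.

```python
import re

# Matches the ConsistencyException message quoted in the log above.
# The quote characters may be straight or curly depending on how the
# log text was copied, so both are accepted (an assumption of this sketch).
_CONSISTENCY_MSG = re.compile(
    r"ConsistencyException: Directory [‘'](?P<path>[^’']+)[’'] "
    r"present in the metadata but not s3"
)


def inconsistent_paths(log_text):
    """Return the distinct directory paths named in ConsistencyException
    messages, in the order they first appear in the log text."""
    paths = []
    for match in _CONSISTENCY_MSG.finditer(log_text):
        path = match.group("path")
        if path not in paths:  # retries log the same path repeatedly
            paths.append(path)
    return paths


if __name__ == "__main__":
    sample = (
        "com.amazon.ws.emr.hadoop.fs.consistency.exception."
        "ConsistencyException: Directory 'bucket/folder/_temporary' "
        "present in the metadata but not s3"
    )
    for path in inconsistent_paths(sample):
        print(path)
```

The extracted paths can then be compared against what actually exists in the bucket. Reconciling the EMRFS metadata afterwards (for example with the EMRFS CLI `emrfs sync` command) is one possible follow-up, but that is an assumption here, since the excerpt is cut off before any fix is described.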