Created January 31, 2023 17:06
Spark Command: /usr/lib/jvm/java-1.11.0-openjdk-amd64/bin/java -cp spark-3.3.1-bin-hadoop3/conf/:spark-3.3.1-bin-hadoop3/jars/* -Dcom.amazonaws.services.s3.enableV4=true -Xmx1g org.apache.spark.deploy.worker.Worker --webui-port 8081 localhost:7077
org.apache.hadoop.fs.s3a.AWSBadRequestException: getFileStatus on s3a://bucket/lakefs/projects/project/_lakefs/retention/gc/commits/run_id=cc8f0b1c-a563-48fe-981e-c3004e7e7bd6/commits.csv: com.amazonaws.services.s3.model.AmazonS3Exception: The authorization header is malformed; the region 'vpce' is wrong; expecting 'us-gov-west-1' (Service: Amazon S3; Status Code: 400; Error Code: AuthorizationHeaderMalformed; Request ID: <redacted>; S3 Extended Request ID: <redacted>; Proxy: null), S3 Extended Request ID: <redacted>=:AuthorizationHeaderMalformed: The authorization header is malformed; the region 'vpce' is wrong; expecting 'us-gov-west-1' (Service: Amazon S3; Status Code: 400; Error Code: AuthorizationHeaderMalformed; Request ID: <redacted>; S3 Extended Request ID: <redacted>=; Proxy: null)
	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:243)
	at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:170)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:3348)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3185)
	at org.apache.hadoop.fs.s3a.S3AFileSystem.isDirectory(S3AFileSystem.java:4277)
	at org.apache.spark.sql.execution.streaming.FileStreamSink$.hasMetadata(FileStreamSink.scala:54)
	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:370)
	at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:228)
	at org.apache.spark.sql.DataFrameReader.$anonfun$load$2(DataFrameReader.scala:210)
	at scala.Option.getOrElse(Option.scala:189)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:210)
	at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:537)
	at org.apache.spark.sql.DataFrameReader.csv(DataFrameReader.scala:443)
	at io.treeverse.clients.GarbageCollector.getCommitsDF(GarbageCollector.scala:95)
	at io.treeverse.clients.GarbageCollector.getExpiredAddresses(GarbageCollector.scala:193)
	at io.treeverse.clients.GarbageCollector$.markAddresses(GarbageCollector.scala:456)
	at io.treeverse.clients.GarbageCollector$.main(GarbageCollector.scala:350)
	at io.treeverse.clients.GarbageCollector.main(GarbageCollector.scala)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:958)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:180)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:203)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:90)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1046)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1055)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
./spark-3.3.1-bin-hadoop3/bin/spark-submit --class io.treeverse.clients.GarbageCollector \
  --packages org.apache.hadoop:hadoop-aws:3.3.1 \
  --master spark://localhost:7077 \
  -c spark.hadoop.lakefs.api.url=http://lakefs:8000/api/v1 \
  -c spark.hadoop.lakefs.api.access_key=<lakeFS credentials> \
  -c spark.hadoop.lakefs.api.secret_key=<lakeFS credentials> \
  -c spark.hadoop.fs.s3a.access.key=<AWS console credentials> \
  -c spark.hadoop.fs.s3a.secret.key=<AWS console credentials> \
  -c spark.hadoop.fs.s3a.session.token=<AWS console credentials> \
  -c spark.hadoop.fs.s3a.endpoint=http://vpce-<myvpcID>.s3.us-gov-west-1.vpce.amazonaws.com \
  lakefs-spark-client-312-hadoop3-assembly-0.6.0.jar \
  project us-gov-west-1
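The `AuthorizationHeaderMalformed` error ("the region 'vpce' is wrong; expecting 'us-gov-west-1'") suggests the s3a client is inferring the signing region from the VPC endpoint hostname and picking up the `vpce` label instead of `us-gov-west-1`. One possible workaround, assuming hadoop-aws 3.3.1 (which introduced the `fs.s3a.endpoint.region` property), is to pin the signing region explicitly instead of relying on hostname parsing. This is a sketch, not a verified fix; the placeholders (`<myvpcID>`, credentials, jar name) are carried over from the command above:

```shell
# Sketch: same submit as above, with the s3a signing region pinned explicitly
# via fs.s3a.endpoint.region (available in hadoop-aws 3.3.1+).
./spark-3.3.1-bin-hadoop3/bin/spark-submit --class io.treeverse.clients.GarbageCollector \
  --packages org.apache.hadoop:hadoop-aws:3.3.1 \
  --master spark://localhost:7077 \
  -c spark.hadoop.fs.s3a.endpoint=http://vpce-<myvpcID>.s3.us-gov-west-1.vpce.amazonaws.com \
  -c spark.hadoop.fs.s3a.endpoint.region=us-gov-west-1 \
  lakefs-spark-client-312-hadoop3-assembly-0.6.0.jar \
  project us-gov-west-1
```

(The lakeFS API and credential `-c` options from the original command still apply; they are omitted here for brevity.)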