Matthew Pick matthewpick

View DeltaWriter.scala
import io.delta.tables.DeltaTable
import org.apache.spark.sql.{AnalysisException, DataFrame, SparkSession}

object DeltaWriter {

  // Generates a symlink-format manifest for the Delta table at deltaPath,
  // so that engines reading via manifests can see the current snapshot.
  def generateSymlinkManifest(deltaPath: String, sparkSession: SparkSession): Unit = {
    val deltaTable = DeltaTable.forPath(sparkSession, deltaPath)
    deltaTable.generate("symlink_format_manifest")
  }
}
View FileNotFound_Delta.io
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 25 in stage 79.0 failed 4 times, most recent failure: Lost task 25.3 in stage 79.0 (TID 8326, ip-10-4-40-120.ec2.internal, executor 1): java.io.FileNotFoundException: No such file or directory: s3a://mybucket/mypath/delta_table/part-00018-d3f8bcb6-f5de-4d7d-88d7-becd5d3d9874-c000.snappy.parquet
It is possible the underlying files have been updated. You can explicitly invalidate the cache in Spark by running 'REFRESH TABLE tableName' command in SQL or by recreating the Dataset/DataFrame involved.
    at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.org$apache$spark$sql$execution$datasources$FileScanRDD$$anon$$readCurrentFile(FileScanRDD.scala:160)
    at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.nextIterator(FileScanRDD.scala:211)
    at org.apache.spark.sql.execution.datasources.FileScanRDD$$anon$1.hasNext(FileScanRDD.scala:130)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedI
matthewpick / .htaccess
Created Jul 29, 2019
Docker-Compose Wordpress + MySQL + Large File uploads (htaccess)
View .htaccess
# BEGIN WordPress
php_value upload_max_filesize 20280M
php_value post_max_size 20280M
php_value memory_limit 256M
php_value max_execution_time 300
php_value max_input_time 300
# END WordPress
View mysql_docker_setup.md

Running MySQL in a Docker container

Docker

Step 1

Clone the MySQL Dockerfile repository and change into the 5.7 directory:

git clone https://github.com/docker-library/mysql.git
cd mysql/5.7
matthewpick / extra_paycheck_calculation.rb
Last active Sep 8, 2017
Based on a bi-weekly pay cycle, determine which months you will receive an extra paycheck.
View extra_paycheck_calculation.rb
require 'active_support/time'

def paycheck_count(begin_date, years)
  month_count = {}
  end_date = begin_date + years.years
  time_counter = begin_date
  while time_counter < end_date do
    key = "#{time_counter.year}.#{time_counter.month}"
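The preview above is cut off inside the loop. As a rough, self-contained sketch of the idea in the gist description (not the author's actual implementation), the version below steps through a bi-weekly cycle and reports the months that contain more than two paydays. It assumes that an "extra paycheck" means a third payday in the same calendar month, uses Ruby's standard `Date` class rather than `active_support`, and the method name `extra_paycheck_months` and the `YYYY-MM` key format are hypothetical choices made for illustration.

```ruby
require 'date'

# Hypothetical helper: count paydays per calendar month on a 14-day cycle
# starting at begin_date, then keep only the months with more than two.
def extra_paycheck_months(begin_date, years)
  month_count = Hash.new(0)               # missing keys start at 0
  end_date = begin_date.next_year(years)  # exclusive upper bound
  payday = begin_date
  while payday < end_date
    month_count[payday.strftime('%Y-%m')] += 1
    payday += 14                          # Date#+ advances by days
  end
  # Assumption: "extra" means a third payday within one month.
  month_count.select { |_month, count| count > 2 }.keys
end

p extra_paycheck_months(Date.new(2017, 1, 6), 1)
# => ["2017-03", "2017-09"]
```

With a first payday of Friday, January 6, 2017, the third paydays fall on March 31 and September 29, which is why those two months are returned.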
View nginx.conf
worker_processes 1;

events {
    worker_connections 1024;
}

http {
    include mime.types;
    default_type application/octet-stream;