Fokko Driesprong Fokko

2023-01-26T13:53:29.161 [206 Partial Content] s3.GetObject minio:9000/warehouse/wh/nyc/taxis/metadata/00006-84798267-d930-4ade-bd98-69e9a56d80fa.metadata.json 172.18.0.3 847µs ↑ 169 B ↓ 11 KiB
2023-01-26T13:53:29.203 [200 OK] s3.HeadObject 127.0.0.1:9000/warehouse/wh/nyc/taxis/metadata/snap-4964512940758329203-1-0c16c49c-2497-4706-95b3-da2a0f056e83.avro 172.18.0.1 505µs ↑ 153 B ↓ 0 B
2023-01-26T13:53:29.206 [206 Partial Content] s3.GetObject 127.0.0.1:9000/warehouse/wh/nyc/taxis/metadata/snap-4964512940758329203-1-0c16c49c-2497-4706-95b3-da2a0f056e83.avro 172.18.0.1 1.023ms ↑ 159 B ↓ 3.9 KiB
2023-01-26T13:53:29.225 [200 OK] s3.HeadObject 127.0.0.1:9000/warehouse/wh/nyc/taxis/metadata/89e24c2b-36ef-4fb1-bbf7-0ff7056b77f6-m0.avro 172.18.0.1 332µs ↑ 153 B ↓ 0 B
2023-01-26T13:53:29.225 [200 OK] s3.HeadObject 127.0.0.1:9000/warehouse/wh/nyc/taxis/metadata/973a3c07-6f09-41ff-b0dc-5d52747f42d4-m0.avro 172.18.0.1 279µs ↑ 153 B ↓ 0 B
2023-01-26T13:53:29.2
2023-01-26T13:54:37.006 [206 Partial Content] s3.GetObject minio:9000/warehouse/wh/nyc/taxis/metadata/00006-84798267-d930-4ade-bd98-69e9a56d80fa.metadata.json 172.18.0.3 1.752ms ↑ 169 B ↓ 11 KiB
2023-01-26T13:54:37.017 [200 OK] s3.HeadObject 127.0.0.1:9000/warehouse/wh/nyc/taxis/metadata/snap-4964512940758329203-1-0c16c49c-2497-4706-95b3-da2a0f056e83.avro 172.18.0.1 578µs ↑ 153 B ↓ 0 B
2023-01-26T13:54:37.020 [206 Partial Content] s3.GetObject 127.0.0.1:9000/warehouse/wh/nyc/taxis/metadata/snap-4964512940758329203-1-0c16c49c-2497-4706-95b3-da2a0f056e83.avro 172.18.0.1 1.176ms ↑ 159 B ↓ 3.9 KiB
2023-01-26T13:54:37.038 [200 OK] s3.HeadObject 127.0.0.1:9000/warehouse/wh/nyc/taxis/metadata/dedc80dd-0878-4cfb-869d-7f5c445b03af-m0.avro 172.18.0.1 376µs ↑ 153 B ↓ 0 B
2023-01-26T13:54:37.039 [200 OK] s3.HeadObject 127.0.0.1:9000/warehouse/wh/nyc/taxis/metadata/973a3c07-6f09-41ff-b0dc-5d52747f42d4-m0.avro 172.18.0.1 240µs ↑ 153 B ↓ 0 B
2023-01-26T13:54:37.
2023-01-26T08:26:42.769 [206 Partial Content] s3.GetObject minio:9000/warehouse/wh/nyc/taxis/metadata/00006-84798267-d930-4ade-bd98-69e9a56d80fa.metadata.json 172.18.0.3 2.845ms ↑ 169 B ↓ 11 KiB
2023-01-26T08:26:42.784 [200 OK] s3.HeadObject 127.0.0.1:9000/warehouse/wh/nyc/taxis/metadata/snap-4964512940758329203-1-0c16c49c-2497-4706-95b3-da2a0f056e83.avro 172.18.0.1 519µs ↑ 153 B ↓ 0 B
2023-01-26T08:26:42.788 [206 Partial Content] s3.GetObject 127.0.0.1:9000/warehouse/wh/nyc/taxis/metadata/snap-4964512940758329203-1-0c16c49c-2497-4706-95b3-da2a0f056e83.avro 172.18.0.1 1.968ms ↑ 159 B ↓ 3.9 KiB
2023-01-26T08:26:42.808 [200 OK] s3.HeadObject 127.0.0.1:9000/warehouse/wh/nyc/taxis/metadata/89e24c2b-36ef-4fb1-bbf7-0ff7056b77f6-m0.avro 172.18.0.1 648µs ↑ 153 B ↓ 0 B
2023-01-26T08:26:42.808 [200 OK] s3.HeadObject 127.0.0.1:9000/warehouse/wh/nyc/taxis/metadata/0c16c49c-2497-4706-95b3-da2a0f056e83-m0.avro 172.18.0.1 335µs ↑ 153 B ↓ 0 B
2023-01-26T08:26:42.
2023-01-26T08:28:57.406 [206 Partial Content] s3.GetObject minio:9000/warehouse/wh/nyc/taxis/metadata/00006-84798267-d930-4ade-bd98-69e9a56d80fa.metadata.json 172.18.0.3 1.018ms ↑ 169 B ↓ 11 KiB
2023-01-26T08:28:57.416 [200 OK] s3.HeadObject 127.0.0.1:9000/warehouse/wh/nyc/taxis/metadata/snap-4964512940758329203-1-0c16c49c-2497-4706-95b3-da2a0f056e83.avro 172.18.0.1 440µs ↑ 153 B ↓ 0 B
2023-01-26T08:28:57.419 [206 Partial Content] s3.GetObject 127.0.0.1:9000/warehouse/wh/nyc/taxis/metadata/snap-4964512940758329203-1-0c16c49c-2497-4706-95b3-da2a0f056e83.avro 172.18.0.1 868µs ↑ 159 B ↓ 3.9 KiB
2023-01-26T08:28:57.437 [200 OK] s3.HeadObject 127.0.0.1:9000/warehouse/wh/nyc/taxis/metadata/89e24c2b-36ef-4fb1-bbf7-0ff7056b77f6-m0.avro 172.18.0.1 433µs ↑ 153 B ↓ 0 B
2023-01-26T08:28:57.437 [200 OK] s3.HeadObject 127.0.0.1:9000/warehouse/wh/nyc/taxis/metadata/f371bff9-d197-4314-ae33-75afd24e8781-m0.avro 172.18.0.1 793µs ↑ 153 B ↓ 0 B
2023-01-26T08:28:57.4
@Fokko
Fokko / Dockerfile
Created September 2, 2022 09:44
Hive Metastore Dockerfile
FROM openjdk:8-jre
RUN apt-get update && apt-get install -y libpostgresql-jdbc-java procps libsasl2-modules libsasl2-dev && rm -rf /var/lib/apt/lists/*
# Install Apache Hadoop
ENV HADOOP_VERSION=3.1.0
ENV HADOOP_HOME=/opt/hadoop-$HADOOP_VERSION
ENV HADOOP_CONF_DIR=$HADOOP_HOME/conf
ENV PATH=$PATH:$HADOOP_HOME/bin
RUN curl -L \
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
Fokko / serializers.pyi
Created August 27, 2020 15:11
serializers.pyi
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
*****************************************************
Summary
-------
Generated at: 2020-05-23T11:00:53+02:00
Notes: 3
Binaries: 0
Archives: 0
Standards: 31
Fokko / diff
Created November 17, 2019 20:15
$ colordiff -y <(xxd parquet-1-11-0.parquet) <(xxd parquet-1-10-1.parquet)
00000000: 5041 5231 1500 1514 1538 158b 8cf0 8805 PAR1.....8 | 00000000: 5041 5231 1500 1514 1538 2c15 0215 0015 PAR1.....8
00000010: 1c15 0215 0015 0615 0800 001f 8b08 0000 .......... | 00000010: 0615 081c 1804 0100 0000 1804 0100 0000 ..........
00000020: 0000 0000 0063 6260 6060 6664 0492 0030 .....cb``` | 00000020: 1600 2804 0100 0000 1804 0100 0000 0000 ..(.......
00000030: 8437 e40a 0000 0015 0015 1615 3a15 8fe7 .7........ | 00000030: 001f 8b08 0000 0000 0000 0063 6260 6060 ..........
00000040: 84fd 071c 1502 1500 1506 1508 0000 1f8b .......... | 00000040: 6664 0492 0030 8437 e40a 0000 0015 0015 fd...0.7..
00000050: 0800 0000 0000 0000 6362 6060 6066 6404 ........cb | 00000050: 1615 3a2c 1502 1500 1506 1508 1c18 0161 ..:,......
00000060: 9289 006b b98a ce0b 0000 0019 1102 1918 ...k...... | 00000060: 1801 6116 0028 0161 1801 6100 0000 1f8b ..a..(.a..
00000070: 0401 0000 0019 1804 0100 0000 1502 1916 .......... | 000

Read Delta Lake as plain Apache Parquet

Let's create a very simple Delta Lake table with three rows (three people) and write it to local disk at /tmp/delta/:

Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/  '_/
   /___/ .__/\_,_/_/ /_/\_\   version 2.4.4
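
As a rough illustration of why a Delta table cannot simply be treated as a folder of Parquet files, here is a minimal Python sketch (not taken from the gist) that replays the table's transaction log to find the data files that are still live. It assumes the standard Delta layout, where `/tmp/delta/_delta_log/` holds zero-padded JSON commit files whose lines carry `add` and `remove` actions. The function name `live_parquet_files` is hypothetical, and checkpoints are ignored, so the result only holds while every JSON commit is still present.

```python
import json
from pathlib import Path


def live_parquet_files(table_dir):
    """Replay the JSON commits in _delta_log and return the data files still referenced.

    Hypothetical helper for illustration; it ignores checkpoint files.
    """
    log_dir = Path(table_dir) / "_delta_log"
    live = set()
    # Commit files are named with zero-padded version numbers,
    # so sorting by name replays them in commit order.
    for commit in sorted(log_dir.glob("*.json")):
        for line in commit.read_text().splitlines():
            if not line.strip():
                continue
            action = json.loads(line)
            if "add" in action:
                # The file became part of the table in this commit.
                live.add(action["add"]["path"])
            elif "remove" in action:
                # The file was logically deleted; it may still exist on disk.
                live.discard(action["remove"]["path"])
    return sorted(live)
```

Calling `live_parquet_files("/tmp/delta")` on the table above would list only the files in the current version. A plain Parquet read of the same directory would also pick up any files that later commits removed but that have not yet been deleted from disk.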