Anthony Townsend anthonymobile

@CodeBear801
CodeBear801 / csv_to_parquet.py
Created June 27, 2019 15:43
convert csv into parquet
import pandas as pd
import pyarrow as pa
import pyarrow.parquet as pq
csv_file = 'id2ids.csv'
parquet_file = 'id2ids.parquet'
chunksize = 10_000_000
csv_stream = pd.read_csv(csv_file, sep='\t', chunksize=chunksize, low_memory=False)
@rvaidya
rvaidya / database_to_parquet.py
Last active May 16, 2024 14:13
Dump a database table to a Parquet file using SQLAlchemy and fastparquet. Useful for loading large tables into pandas or Dask, since read_sql_table will hammer the server with queries if the number of partitions or chunks is high. With this approach you write a temporary Parquet file, then use read_parquet to load the data into a DataFrame.
import pandas as pd
import numpy as np
import fastparquet
from sqlalchemy import create_engine, schema, Table
# Copied from pandas with modifications
def __get_dtype(column, sqltype):
    import sqlalchemy.dialects as sqld
@devdrops
devdrops / example.md
Last active March 25, 2024 15:09
Mysqldump from Docker container


docker exec -i mysql_container mysqldump -uroot -proot --databases database_name --skip-comments > /path/to/my/dump.sql

Notes

  • This will generate a dump.sql file in your host machine. Awesome, eh?
  • Avoid using --compact on your dump. It makes MySQL check your constraints when the file is read back, which causes trouble on import (damn you, MySQL). And don't use --force to work around this: recreate your dump without --compact instead ¯_(ツ)_/¯
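The same dump can be driven from a script. What follows is a minimal sketch, assuming Python 3 and the placeholder container name, credentials, and database used in the command above; `build_dump_command` and `dump_to_file` are hypothetical helper names, not part of the original gist.

```python
import subprocess


def build_dump_command(container, user, password, database):
    """Return the argv list for the docker exec + mysqldump call shown above."""
    return [
        "docker", "exec", "-i", container,
        "mysqldump", f"-u{user}", f"-p{password}",
        "--databases", database, "--skip-comments",
    ]


def dump_to_file(command, path):
    """Run the command and write its stdout to a file on the host machine."""
    with open(path, "wb") as out:
        # check=True raises CalledProcessError if docker or mysqldump fails.
        subprocess.run(command, stdout=out, check=True)


# Placeholder values taken from the command above.
command = build_dump_command("mysql_container", "root", "root", "database_name")
print(" ".join(command))
```

Calling `dump_to_file(command, "/path/to/my/dump.sql")` would then perform the dump. Passing the arguments as a list avoids shell quoting problems, and the redirect is handled by Python rather than the shell.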
@georgy7
georgy7 / extract_mbox_attachments.py
Last active May 19, 2024 10:07
Extract attachments from mbox file.
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
# Modified.
# Original script source:
# http://blog.marcbelmont.com/2012/10/script-to-extract-email-attachments.html
# https://web.archive.org/web/20150312172727/http://blog.marcbelmont.com/2012/10/script-to-extract-email-attachments.html
# Usage:
# Run the script from a folder with file "all.mbox"
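The preview above stops before the extraction logic. As a rough illustration of the general technique, and not the gist author's implementation, here is a minimal sketch using only the standard `mailbox` module; `extract_attachments` and the output directory layout are assumptions made for this example.

```python
import mailbox
import os


def extract_attachments(mbox_path, out_dir):
    """Save every MIME part that carries a filename; return the saved paths."""
    os.makedirs(out_dir, exist_ok=True)
    saved = []
    # create=False raises an error for a missing file instead of creating one.
    for index, message in enumerate(mailbox.mbox(mbox_path, create=False)):
        for part in message.walk():
            filename = part.get_filename()
            if not filename:
                continue  # body text or a multipart container, not an attachment
            payload = part.get_payload(decode=True)
            if payload is None:
                continue
            # Prefix with the message index so equal names do not collide;
            # basename() drops any path components found in the header.
            target = os.path.join(out_dir, f"{index}_{os.path.basename(filename)}")
            with open(target, "wb") as handle:
                handle.write(payload)
            saved.append(target)
    return saved
```

Following the usage note above, `extract_attachments("all.mbox", "attachments")` would be run from the folder that contains `all.mbox`; the `attachments` directory name is arbitrary.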
@joshuaadickerson
joshuaadickerson / gist:3938763
Created October 23, 2012 13:34
CSV MySQL dump to gzip
mysqldump -uroot -ppass --tab="." --fields-enclosed-by=\" --fields-terminated-by="," myschema mytable | gzip > myschema.mytable.csv.gz
@JonJanzen
JonJanzen / ddns.py
Created August 23, 2012 04:26
Dynamic DNS on WebFaction
#!/usr/bin/python
# Use this script to update a DNS override using the webfaction API
# Be sure to set your username, password, DNS override, and ethernet interface.
# Then add a crontab entry for the script; I run it every 5 minutes:
# */5 * * * * /path/to/ddns.py
# This is safe because the script calls exit(0) if the IP is the same as what is recorded in the file.
# Webfaction documentation on DNS overrides
# http://docs.webfaction.com/user-guide/domains.html#overriding-dns-records-with-the-control-panel
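The "exit if the IP is unchanged" guard described in the comments can be sketched as below. This is a minimal illustration under stated assumptions: the helper names (`ip_changed`, `record_ip`, `run`) and the `update_dns` callback are hypothetical, and the actual WebFaction API call is deliberately left out because the preview does not show it.

```python
import os


def ip_changed(current_ip, state_file):
    """Return True when current_ip differs from the address stored in state_file."""
    if os.path.exists(state_file):
        with open(state_file) as handle:
            if handle.read().strip() == current_ip:
                return False
    return True


def record_ip(current_ip, state_file):
    """Remember the address that was last pushed to DNS."""
    with open(state_file, "w") as handle:
        handle.write(current_ip)


def run(current_ip, state_file, update_dns):
    """Call update_dns only when the address changed; return True if it was called."""
    if not ip_changed(current_ip, state_file):
        return False  # a cron-driven script would call sys.exit(0) at this point
    update_dns(current_ip)
    record_ip(current_ip, state_file)
    return True
```

Because the state file is written only after `update_dns` returns, a failed update is retried on the next cron run rather than being recorded as done.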
@tecoholic
tecoholic / osm2geo.js
Created November 27, 2011 04:57
OSM2GEO - A JS Converter to convert OSM to GeoJSON
/**************************************************************************
* OSM2GEO - OSM to GeoJSON converter
* OSM to GeoJSON converter takes in a .osm XML file as input and produces
* corresponding GeoJSON object.
*
* AUTHOR: P.Arunmozhi <aruntheguy@gmail.com>
* DATE : 26 / Nov / 2011
* LICENSE : WTFPL - Do What The Fuck You Want To Public License
* LICENSE URL: http://sam.zoy.org/wtfpl/
*
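For illustration only, the kind of conversion the header comment describes can be sketched for the simplest case, tagged OSM nodes becoming GeoJSON Point features. This is written in Python with the standard library rather than in the gist's JavaScript, it ignores ways and relations, and `osm_nodes_to_geojson` is a hypothetical name, not a function from osm2geo.js.

```python
import xml.etree.ElementTree as ET


def osm_nodes_to_geojson(osm_xml):
    """Convert tagged <node> elements of an .osm XML string to a FeatureCollection."""
    root = ET.fromstring(osm_xml)
    features = []
    for node in root.iter("node"):
        tags = {tag.get("k"): tag.get("v") for tag in node.findall("tag")}
        if not tags:
            continue  # assumption of this sketch: untagged nodes are skipped
        features.append({
            "type": "Feature",
            "geometry": {
                "type": "Point",
                # GeoJSON orders coordinates as [longitude, latitude].
                "coordinates": [float(node.get("lon")), float(node.get("lat"))],
            },
            "properties": tags,
        })
    return {"type": "FeatureCollection", "features": features}
```

The result is a plain dictionary, so `json.dumps(osm_nodes_to_geojson(xml_text))` would produce the GeoJSON text.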