@arammaliachi
Last active May 31, 2020 16:49
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Decompress LZO files from Python 3\n",
"Requires [lzop](https://www.lzop.org/). Install it with your operating system's package manager, e.g. [Homebrew](https://brew.sh/) on macOS or [apt-get](https://help.ubuntu.com/community/AptGet/Howto) on Ubuntu."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import subprocess\n",
"\n",
"# List the working directory to confirm the .lzo archive is present.\n",
"print(os.listdir())"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"`['mm_impressions_20200516_2020051601_0000.tsv.lzo']`"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
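"# lzop -d keeps the original .lzo file; check=True raises CalledProcessError if lzop fails.\n",
"result = subprocess.run(['lzop', '-d', 'mm_impressions_20200516_2020051601_0000.tsv.lzo'], check=True)"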
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
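"Listing the directory again shows the decompressed file alongside the original archive:\n",
"\n",
"`['mm_impressions_20200516_2020051601_0000.tsv.lzo', 'mm_impressions_20200516_2020051601_0000.tsv']`"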
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Sparkmagic (PySpark)",
"language": "",
"name": "pysparkkernel"
},
"language_info": {
"codemirror_mode": {
"name": "python",
"version": 3
},
"mimetype": "text/x-python",
"name": "pyspark",
"pygments_lexer": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
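
The notebook above calls `lzop` directly with a hard-coded file name. As a minimal sketch of how the same step might be wrapped for reuse, the helpers below build the command, derive the expected output name, and fail early when `lzop` is not installed. The function names (`lzop_command`, `decompressed_path`, `decompress_lzo`) are hypothetical and not part of the original notebook, and the sketch assumes `lzop` writes its output next to the input file, which is its default behavior.

```python
import shutil
import subprocess
from pathlib import Path


def lzop_command(archive):
    """Build the argument list for decompressing `archive` with lzop."""
    # -d decompresses; lzop keeps the input file unless told otherwise.
    return ["lzop", "-d", str(archive)]


def decompressed_path(archive):
    """Return the expected output path: the archive name without '.lzo'."""
    archive = Path(archive)
    if archive.suffix != ".lzo":
        raise ValueError(f"expected a .lzo file, got {archive.name}")
    return archive.with_suffix("")


def decompress_lzo(archive):
    """Decompress `archive` with lzop and return the expected output path."""
    # Assumption: the lzop binary is installed and reachable through PATH.
    if shutil.which("lzop") is None:
        raise RuntimeError("lzop was not found on PATH; install it first")
    target = decompressed_path(archive)
    # check=True raises CalledProcessError on a non-zero exit status;
    # capture_output=True keeps lzop's messages available on the exception.
    subprocess.run(lzop_command(archive), check=True, capture_output=True, text=True)
    return target
```

With `lzop` installed, `decompress_lzo('mm_impressions_20200516_2020051601_0000.tsv.lzo')` should return the path of the `.tsv` file. If that file already exists, `lzop` may refuse to overwrite it and the call would then raise `subprocess.CalledProcessError`.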