Matthew Hawkes_ONS mh0w

@mh0w
mh0w / Extract table schemas from an EPIDD.py
Created June 26, 2024 15:49
Extract table schemas from an EPIDD
"""
Extract table schemas from an EPIDD.
Requirements: pip install pandas xlsxwriter python-docx
"""
import pandas as pd
from docx import Document
doc_path = "C:/path/to/my.docx"
@mh0w
mh0w / BOTO3 basics.py
Last active June 21, 2024 13:39
BOTO3 basics
#################################################################
# Reading and writing xlsx files Sparklessly from S3 with BOTO3 #
#################################################################
import boto3
import raz_client
import pandas as pd
import io
my_bucket = "bucket_name_goes_here"
@mh0w
mh0w / Basic unix.sh
Last active June 26, 2024 09:47
Basic unix
# Resources:
# https://mally.stanford.edu/~sr/computing/basic-unix.html
# https://mally.stanford.edu/~sr/computing/more-unix.html
###############
# Console use #
###############
# Get help manual on a specific command
@mh0w
mh0w / Basic intro to Object Oriented Programming (OOP), classes, and objects in Python.md
Last active June 5, 2024 09:00
Basic intro to Object Oriented Programming (OOP), classes, and objects in Python

There are objects in the real world (phone, microphone, popcorn). Objects have attributes (what they are or have; properties) and methods (what they can do). In Python, a class is used to create objects: it is a blueprint or template that describes which attributes and methods its objects have.

Class: a data type, such as int, str (string), function, or a user-defined Car.

Object: an instance of a class (an instantiated class).

Method: a function defined within a class, such as the string class's .upper() method.

Encapsulation: data in a class are hidden from other classes and accessible only via the class's methods ('data hiding'). Objects (e.g., Cat) manage their own state/attributes (e.g., energy, age), have private methods (e.g., .meow(), .sleep()) that they can call whenever they want, and can only be changed by other classes via public methods (e.g., .feed()). In Python, privacy is a naming convention (a leading underscore) rather than something the language enforces.
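The encapsulation idea can be sketched with a small Cat class. This is a minimal illustration, not a definitive pattern: the attribute names, the starting energy value, and the read-only `energy` property are assumptions added for the example, and the leading underscores mark members as private by convention only.

```python
class Cat:
    """A cat that manages its own state (illustrative example)."""

    def __init__(self, name, energy=5):
        self.name = name
        self._energy = energy  # leading underscore: private by convention

    def _meow(self):
        # "Private" method: intended to be called only from inside the class
        return f"{self.name} says meow"

    def _sleep(self):
        # "Private" method: the cat decides for itself when to sleep
        self._energy += 2

    def feed(self):
        """Public method: the intended way for other code to change the cat's state."""
        self._energy += 1
        return self._meow()

    @property
    def energy(self):
        # Read-only view of the internal state
        return self._energy


cat = Cat("Tom")
message = cat.feed()
print(message)     # Tom says meow
print(cat.energy)  # 6
```

Other code interacts with the cat through `feed()` and reads `energy`, without touching `_energy` directly.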

@mh0w
mh0w / Using spark locally.md
Last active May 31, 2024 14:11
Using spark locally
@mh0w
mh0w / intro.sql
Last active May 1, 2024 14:33
SQL intro
-- SQL: Structured Query Language for communicating with relational (tabular) databases
-- RDBMS: Relational Database Management Systems
-- CRUD: Create, Read, Update, and Delete
-- Databases contain tables, plus potentially Views and Queries too
-- Once connected to the server database, the following example snippets work
------------------
-- Querying data -
------------------
@mh0w
mh0w / Quarto.ps1
Last active May 1, 2024 09:13
Quarto
# Generate a fully standalone html preview of an md file (in the same folder as the md file)
quarto preview //network/path/foldername/Matthew/repos/sape_hamlet/README.md --no-browser --no-watch-inputs --embed-resources
# See https://quarto.org/docs/reference/formats/html.html
# SEO terms: quartro, quatro, quato
@mh0w
mh0w / dtypes data types cast casts casting downcast downcasting.md
Last active February 21, 2024 16:07
dtypes data types cast casts casting downcast downcasting

NB: Floats work differently to integers. Not all values in the range can be represented. For example, float16 cannot store 12345; the closest it can get is 12344.
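The float16 rounding behaviour can be checked directly. This is a small sketch assuming NumPy is installed; the variable names are illustrative.

```python
import numpy as np

# int16 stores every whole number in its range exactly
as_int = np.int16(12345)

# float16 has only 11 bits of precision, so around 12345 it can
# only represent multiples of 8; 12345 is rounded to the nearest one
as_float = np.float16(12345)

print(int(as_int))      # 12345
print(float(as_float))  # 12344.0

# The largest finite float16 value
print(float(np.finfo(np.float16).max))  # 65504.0
```

So 12345 is well inside the float16 range (up to 65504), yet it is not one of the values float16 can store.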


@mh0w
mh0w / R basics.r
Last active March 19, 2024 14:26
R basics
# Package management: install packages and any dependencies
install.packages("rmarkdown", dependencies = TRUE, type = "win.binary")
# Load Packages
library(ggplot2)
library(tidyverse)
# Create toy dataframe
ab.seq <- as.data.frame(seq(from = 1, to = 100, by = 1))
ab.seq$new_var <- rep(0, times = 100)
@mh0w
mh0w / Create animated plotly express line chart.py
Last active February 6, 2024 14:51
Create animated plotly express line or scatter chart
# From https://plotly.com/python/sliders/
import plotly.express as px
df = px.data.gapminder()
fig1 = px.scatter(
    df,
    x="gdpPercap",
    y="lifeExp",
    animation_frame="year",
    animation_group="country",