Matt Oates MattOates

#!/usr/bin/env perl6
use v6;
use Stats;
#Quick benchmark script for https://www.reddit.com/r/perl6/comments/8tdfvz/handwavy_speed_test/
# $ perl6 annual-rate.p6 --upto=100
# annual-rate-with-helper: ran in 0.0551038961038961 seconds (σ = 0.015335042908917152 seconds)
# Result was 1.3232569411493311
# annual-rate-rat-exp: ran in 0.7545948945615983 seconds (σ = 0.03632803218476478 seconds)
# Result was 1.3232569411482977
#!/usr/bin/env perl
use strict;
use warnings;
open my $file, '<', '../../example-input.csv'
    or die "Cannot open ../../example-input.csv: $!";
while (my $record = <$file>) {
    chomp $record;
    my ($ElementId, $VehicleId, $Term, $Mileage, $Value) = split ',', $record;
}
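Splitting on a bare comma, as the loop above does, breaks as soon as a field contains a quoted comma. A minimal Python sketch of the same record loop using the standard `csv` module follows. The column names come from the Perl snippet; the `read_records` name and the sample rows are made up for illustration and stand in for the real input file.

```python
import csv
import io

# Column names taken from the Perl snippet above.
FIELDS = ["ElementId", "VehicleId", "Term", "Mileage", "Value"]

def read_records(handle):
    """Yield one dict per CSV record, keyed by FIELDS."""
    for row in csv.reader(handle):
        if not row:  # skip blank lines
            continue
        yield dict(zip(FIELDS, row))

# Hypothetical sample data standing in for example-input.csv.
sample = io.StringIO('1,42,36,10000,199.99\n2,"43,A",48,20000,149.50\n')
records = list(read_records(sample))
print(records[1]["VehicleId"])
```

The second sample row carries a quoted comma in its second field, which a plain split would cut in two.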
MattOates / gist:fefadbf7b99573886b205893d4bff1d7
Last active June 8, 2018 10:20
Weirdness in reindex depending on whether the index is an epoch-offset datetime or one defined from a date-level integer...
In [37]: date = datetime.datetime.now()
...: pd.DataFrame(data={'value': {date:1.0}}).reindex(pd.date_range('2018-05-30','2018-06-15'), fill_value=0)
Out[37]:
            value
2018-05-30    0.0
2018-05-31    0.0
2018-06-01    0.0
2018-06-02    0.0
2018-06-03    0.0
2018-06-04    0.0
MattOates / explode.p6
Created January 5, 2018 17:09
Explode records of files into new files based on the value of a given field
#!/usr/bin/env perl6
%*SUB-MAIN-OPTS<named-anywhere> = True;
my %out_files;
sub MAIN( @in-files where { .IO.f // die "Input file $_ not found" },
          Int :$f(:$field) = 0,
          Str :$d(:$delimiter) = ',',
          Str :$t(:$template) = '%') {
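The preview stops at the signature, so the body is not shown. As a rough Python sketch of the idea in the description (grouping records by the value of one field), assuming that `%` in the template is the placeholder for the field value, which the truncated source does not confirm:

```python
from collections import defaultdict

def explode(records, field=0, delimiter=",", template="%"):
    """Group records by the value of one field.

    Returns a dict mapping an output file name to its list of records.
    Treating '%' in the template as the placeholder for the field value
    is an assumption, not something the Perl 6 preview shows.
    """
    out_files = defaultdict(list)
    for record in records:
        record = record.rstrip("\n")
        if not record:  # skip blank lines
            continue
        key = record.split(delimiter)[field]
        out_files[template.replace("%", key)].append(record)
    return dict(out_files)

grouped = explode(["a,1", "b,2", "a,3"], field=0, template="out-%.csv")
for name, rows in grouped.items():
    print(name, rows)
```

The sketch keeps everything in memory and leaves the actual file writing out; the parameter names mirror the `field`, `delimiter`, and `template` options of the signature above.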
MattOates / timking_any_primes.p6
Created December 15, 2017 21:22
Timking's original elegant attempt at finding primes.
sub primes-any (Int $max) {
    my Int @primes;
    @primes.push($_) unless $_ %% any(@primes) for 2 .. $max;
    return @primes;
}
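For readers less familiar with junctions, the same trial-division idea can be sketched in Python: a candidate is kept only if none of the primes found so far divides it. The name `primes_any` is an illustrative translation of the sub above, not part of the gist.

```python
def primes_any(max_value: int) -> list:
    """Return all primes up to and including max_value by trial division."""
    primes = []
    for n in range(2, max_value + 1):
        # any() over an empty list is False, so 2 is accepted first,
        # just as any(@primes) is false for an empty array.
        if not any(n % p == 0 for p in primes):
            primes.append(n)
    return primes

print(primes_any(20))
```

Like the original, this tests each candidate against every earlier prime rather than stopping at its square root, so it favours brevity over speed.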
MattOates / perf_improvements.sh
Last active December 15, 2017 21:08
Get some numbers on % and X increases in performance since last Christmas
git log --since 2016-12-25 --oneline | perl6 -e 'say lines().grep(/(\d+\.?\d*?)\%(" "|$)/).map({/(\d+\.?\d*?)\%(" "|$)/; ~$/[0]}).sort.join("\n")' > %_increase.txt
git log --since 2016-12-25 --oneline | perl6 -e 'say lines().grep(/(\d+\.?\d*?)x(" "|$)/).map({/(\d+\.?\d*?)x(" "|$)/; ~$/[0]}).sort.join("\n")' > x_increase.txt
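The two one-liners above pull a number followed by `%` or `x` out of each commit subject. A minimal Python sketch of the same extraction is below; the `extract_gains` name and the sample subjects are invented for illustration. One deliberate difference: the values are converted to floats so they sort numerically, whereas the one-liners appear to sort the captured text as strings.

```python
import re

def extract_gains(subjects, unit):
    """Return the sorted numbers that directly precede `unit` in each line.

    A match requires the unit to be followed by a space or the end of the
    line, mirroring the pattern used in the shell one-liners.
    """
    pattern = re.compile(r"(\d+\.?\d*)" + re.escape(unit) + r"(?: |$)")
    gains = []
    for line in subjects:
        match = pattern.search(line)
        if match:
            gains.append(float(match.group(1)))
    return sorted(gains)

# Hypothetical commit subjects, in `git log --oneline` shape.
log = [
    "abc1234 Make foo 12.5% faster",
    "def5678 Make bar 3x faster",
    "0123abc Fix typo in docs",
    "89abcde Speed up baz by 7%",
]
print(extract_gains(log, "%"))
print(extract_gains(log, "x"))
```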
#!/usr/bin/env perl6
use v6;
# Test files produced with mkfile -n 10m 10m for each file size
my @file_sizes = <1m 10m 100m 1g 10g>;
my @read_sizes = <1024 65536>;
my %langs = ( ruby => {cmd => "\{time -p ruby -e 'while buf = STDIN.read(%d) do end' < %s\} 2>&1"},
perl5 => {cmd => "\{time -p perl -e 'while(read(STDIN,\$buf,%d)) {}' < %s\} 2>&1"},
use strict;
use warnings;
use English;
use v5.10;
my %pins;
#Read lines from files given as cli arguments
while (my $line = <<>>) {
#Trim the newline
MattOates / gist:b8bb6387ec7fb92370f8c3c6be2a6af4
Created November 23, 2017 15:21
Getting the following error trying to use levenshtein func in postgres
normalised_name = 'joe bloggs'
team_uuid = UUID('B54D2B9B-AEB3-4782-B452-CCB6D8F2850F')
db.session.query(User.user_id).filter(User.user_id == UserTeam.user_id) \
    .filter(UserTeam.team_uuid == team_uuid) \
    .filter(func.levenshtein(
        func.unaccent(
            func.replace(
                func.lower(
                    func.concat(
                        User.first_name, ' ', User.last_name
MattOates / random_rows.py
Created November 14, 2017 15:34
The following code doesn't work, but a hand-tailored equivalent, `random.choice(None or [None])`, works just fine.
def random_rows(db, model, columns={}, num=100):
    column_types = [ {'name': col.name, 'type': str(col.type)} for col in model.__table__.columns]
    rows = []
    for n in range(1, num):
        rows.append([random.choice(columns.get(col['name']) or [None]) or __random_data(col['type']) for col in column_types])
    return rows
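The gist does not say what the failure is, so the following is only a sketch of one way to write the same idea defensively, not a diagnosis. It assumes the column descriptions are passed in directly as a list of dicts (avoiding the SQLAlchemy model), and `random_data` is a hypothetical stand-in for the `__random_data` helper, whose body is not shown. It also sidesteps three things that can bite in the original: the mutable `{}` default, `range(1, num)` producing `num - 1` rows, and the chained `or` replacing a legitimately falsy choice such as `0` or `''`.

```python
import random
import string

def random_data(col_type):
    """Hypothetical stand-in for the gist's __random_data helper."""
    if col_type.upper().startswith("INT"):
        return random.randint(0, 1000)
    return "".join(random.choice(string.ascii_lowercase) for _ in range(8))

def random_rows(column_types, columns=None, num=100):
    """Build `num` rows, one value per entry in column_types.

    `columns` maps a column name to the values to choose from; columns
    without an entry get generated data instead.
    """
    columns = columns or {}
    rows = []
    for _ in range(num):
        row = []
        for col in column_types:
            choices = columns.get(col["name"])
            if choices:
                row.append(random.choice(choices))
            else:
                row.append(random_data(col["type"]))
        rows.append(row)
    return rows

# Hypothetical column descriptions, shaped like the gist's column_types.
column_types = [
    {"name": "id", "type": "INTEGER"},
    {"name": "colour", "type": "VARCHAR"},
]
rows = random_rows(column_types, columns={"colour": ["red", "blue"]}, num=5)
print(len(rows))
```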