@cpatulea
cpatulea / gist:8446030f4a70e92fd66fa9a090c238c3
Created April 30, 2024 04:54
Foulab wiki: non-ASCII characters
with h as (
  select *, convert(data using utf8) as data_utf8 from tiki_history
  where data <> convert(convert(data using utf8) using ascii)
),
h1b as (
  with recursive cte as (
    select pageName, version, data_utf8, 0 as ascii, ' ' as c, 1 as pos
    from h
    union all
    select pageName, version, data_utf8,
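The query preview above is cut off, so its remaining steps are not shown here. As a rough illustration of the idea in the gist title (locating non-ASCII characters in page text), the following Python sketch reports each non-ASCII character with a 1-based position, mirroring the `1 as pos` start value in the query. The function name `non_ascii_chars` and the sample string are hypothetical and are not taken from the gist.

```python
def non_ascii_chars(text):
    """Return (position, character, code point) for each non-ASCII character.

    Positions are 1-based. This is an illustrative sketch, not the gist's query.
    """
    return [
        (pos, ch, ord(ch))
        for pos, ch in enumerate(text, start=1)
        if ord(ch) > 127  # ASCII covers code points 0..127
    ]


# Hypothetical sample text; \u00e9 is "e" with an acute accent.
sample = "caf\u00e9 menu"
for pos, ch, code in non_ascii_chars(sample):
    print(pos, ch, hex(code))
```

Scanning in application code like this avoids the recursive query, at the cost of pulling every row out of the database first.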
cpatulea / gist:8818127
Last active November 6, 2018 03:08
last.fm data export readme.txt
== Your Last.fm archive ==
This archive contains musical data you sent to Last.fm, conveniently packaged
in easy-to-process formats. Use them to discover your listening habits, to
bootstrap your own apps or to sync them with other data.
We're always interested to see what you're doing with your data, so make sure
to let us know on the forum: http://www.last.fm/forum/21717
== Types of data ==
There are 6 different types of data in this archive, some of them less obvious.
cpatulea / gist:7394412
Created November 10, 2013 05:59
Find Python string literals that should probably be Unicode
#!/usr/bin/python
import ast, _ast, os
for root, dirs, files in os.walk('.'):
    for name in files:
        if name.endswith('.py'):
            full = os.path.join(root, name)
            t = ast.parse(open(full).read())
            for n in ast.walk(t):
                if isinstance(n, _ast.Str) and not isinstance(n.s, unicode):
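The preview above targets Python 2 (`unicode`, `_ast.Str`) and is cut off before the body of the final `if`. In Python 3 every `str` literal is already Unicode, so a roughly comparable check might instead look for bytes literals. The sketch below does that with `ast.Constant`. It is an assumption about how the idea could carry over, not the gist's code, and the function name `find_bytes_literals` is hypothetical.

```python
import ast


def find_bytes_literals(source):
    """Return (line number, value) for each bytes literal in the source text.

    Python 3 sketch: str literals are always Unicode, so only bytes
    literals are reported.
    """
    tree = ast.parse(source)
    return [
        (node.lineno, node.value)
        for node in ast.walk(tree)
        if isinstance(node, ast.Constant) and isinstance(node.value, bytes)
    ]


# Hypothetical two-line module: one bytes literal, one str literal.
example = 'name = b"abc"\ngreeting = "hello"\n'
print(find_bytes_literals(example))
```

To scan a directory tree, the same function could be called on the contents of each `.py` file found by `os.walk`, as in the original loop.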
cpatulea / Explation:
Created October 31, 2013 03:03 — forked from ant6n/Explation:
Question: "Python:
>>> LINE A
>>> LINE B
>>> LINE C
False
>>> LINE A; LINE B; LINE C
True
what are the lines? nothing stateful allowed & no __methods."
(https://twitter.com/akaptur/status/395252265117687808).
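The excerpt does not include the answer. One commonly cited candidate, offered here only as an assumption, relies on a CPython implementation detail: equal constants within a single compiled unit are stored once, while separately compiled inputs usually create separate objects. The sketch below imitates the interactive prompt by compiling each input on its own. The helper name `same_object` and the value `257` are illustrative choices, and the result for the separate case can vary across interpreters and versions.

```python
def same_object(chunks):
    """Compile and run each chunk separately, then report whether a is b."""
    ns = {}
    for chunk in chunks:
        # Each chunk gets its own code object, like one line typed at the prompt.
        exec(compile(chunk, "<input>", "exec"), ns)
    return ns["a"] is ns["b"]


# Two separate inputs: each compilation builds its own 257 constant.
separate = same_object(["a = 257", "b = 257"])

# One input: both assignments share the single 257 constant of that code object.
together = same_object(["a = 257; b = 257"])

print(separate, together)
```

Under this reading, LINE A and LINE B would be the two assignments and LINE C would be `a is b`. Values from -5 to 256 would not work in CPython, because those small integers are cached and shared everywhere.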