Leonardo Monteiro Fernandes leonardo-fernandes
@leonardo-fernandes
leonardo-fernandes / plagiarism.sh
Last active June 5, 2016 12:02
Detect plagiarism in Word documents containing images
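# Hash the image geometry values found inside each file, then list the files that share a hash.
find . -type f -print0 | while IFS= read -r -d '' file; do
  # Pass the file name as a printf argument, not in the format string, so names containing % or \ print correctly.
  printf '%s (%s)\n' "$(unzip -p "$file" | grep -Eo '(left|right|cx|cy)="[0-9]+"|<xdr:(col|colOff|row|rowOff)>[0-9]+</xdr:(col|colOff|row|rowOff)>' | md5sum)" "$file"
done | awk '{
  # On the second occurrence of a hash, print the first file that had it and mark the hash as reported.
  if (assoc[$1] && assoc[$1] != 1) { print assoc[$1]; assoc[$1] = 1; }
  # Print every further file whose hash has already been reported.
  if (assoc[$1] && assoc[$1] == 1) { print $0; }
  # On the first occurrence, remember the line.
  if (!assoc[$1]) { assoc[$1] = $0; }
}'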
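The same idea can be sketched in Python for readers who prefer it to a shell pipeline. This is a minimal sketch, not the author's script: the names `GEOMETRY`, `geometry_fingerprint`, and `find_duplicates` are hypothetical, and it assumes the documents are zip-based Office files. Unlike the one-liner, it skips files that are not zip archives or that contain no geometry values, since those would otherwise all share the hash of empty input and be reported as one group.

```python
import hashlib
import re
import zipfile
from collections import defaultdict
from pathlib import Path

# Same pattern as the grep expression above, applied to raw bytes.
GEOMETRY = re.compile(
    rb'(?:left|right|cx|cy)="[0-9]+"'
    rb'|<xdr:(?:col|colOff|row|rowOff)>[0-9]+</xdr:(?:col|colOff|row|rowOff)>'
)


def geometry_fingerprint(path):
    """Return an MD5 hex digest of the geometry values in a zip-based file, or None."""
    matches = []
    try:
        with zipfile.ZipFile(path) as archive:
            for name in archive.namelist():
                data = archive.read(name)
                matches.extend(m.group(0) for m in GEOMETRY.finditer(data))
    except (zipfile.BadZipFile, RuntimeError, OSError):
        # Not a readable zip archive (assumption: such files are simply ignored).
        return None
    if not matches:
        return None
    return hashlib.md5(b"\n".join(matches)).hexdigest()


def find_duplicates(root="."):
    """Return lists of files under root that share a geometry fingerprint."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digest = geometry_fingerprint(path)
            if digest is not None:
                groups[digest].append(path)
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `find_duplicates(".")` returns one list per group of files whose images have identical geometry values. A match is only a hint that the documents share images, not proof of copying.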
@leonardo-fernandes
leonardo-fernandes / tt2srt.py
Last active April 19, 2016 15:29
Convert YouTube timedtext XML format to SRT subtitles
#!/usr/bin/env python
# Usage: python tt2srt.py source.xml output.srt
# Download the .xml file from YouTube by opening the Network tab
# in the browser's Developer Tools and searching for requests to
# the "timedtext" endpoint.
from xml.dom.minidom import parse
import sys