Created
July 10, 2012 14:03
mkcards: Turn DVD audio or songs into Anki cards
This toolkit will allow you to turn TV episodes and transcriptions into
high-quality Anki cards. Quite a lot of assembly is required the first
time through, but after that, 90% of your effort will be either (1) looking
at the text or (2) aligning the start and end of each audio clip using a
halfway civilized tool. In other words, it's still going to take time, but
you'll actually spend that time studying the language, not messing around
with a 19-step process.

There's a good chance you can get this working on a Mac or on Linux. Windows
will be a fairly major challenge.

Tools needed:

  - The transcode utility or another way to rip audio tracks from a DVD
  - The ffmpeg command-line tool
  - Ruby 1.9
  - TranscriberAG
  - Anki
  - Optional: Dropbox and AnkiDroid for mobile use

To download TranscriberAG, check out the following sites:

    http://transag.sourceforge.net/ (Mac, Windows)
    https://gitorious.org/transcriberag-aapo (Modern Linux systems)

Rip an audio track from a DVD. Replace $CHAPTER with the episode number
(1, 2, etc.) and $OUTPUT with a file name.

    transcode -i /dev/dvd -x dvd -T $CHAPTER,-1 -a 1 -y null,wav -m $OUTPUT.wav

If you wish, you can convert stereo WAV to mono.

    ffmpeg -i buffy_209_stereo.wav -ac 1 buffy_209.wav

To preserve your sanity, replace all spaces in the audio file name with "_".

If you want to work with music instead of DVD audio, just grab your MP3
file and use it in place of the *.wav.
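
The file-name tip above is easy to automate. This is a minimal sketch, not part of mkcards; the helper name underscore_name is made up for illustration, and it only touches the file name, not the directories leading to it.

```ruby
# Hypothetical helper (not part of mkcards): replace each run of whitespace
# in the file name portion of a path with a single "_".
def underscore_name(path)
  dir  = File.dirname(path)
  base = File.basename(path).gsub(/\s+/, '_')
  dir == '.' ? base : File.join(dir, base)
end

puts underscore_name("buffy 209 stereo.wav")

# To rename a file on disk, something like this should do:
#   File.rename(path, underscore_name(path))
```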

Use TranscriberAG to align your transcription with the sound file. Read the
tutorials and take a few minutes to experiment; it's a very powerful and
efficient tool, designed for professionals, but you need to get used to it.

Note that I currently assign speaker names when transcribing. If you
assign "no speaker" or leave the text of a segment blank, no card will be
made for that segment.
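
To make the skip rule concrete, here is a rough sketch of how the script reads one STM line: seven space-separated fields, with "//" in the text marking a line break and "///" separating the dialog from an optional note. The sample lines and the helper name card_fields are made up for illustration; real TranscriberAG exports may differ in their label field.

```ruby
# Made-up STM lines for illustration (not real TranscriberAG output).
SAMPLE = [
  'buffy_209 1 buffy_209_Buffy 12.500 14.750 <o,f0,female> Hello. // How are you? /// informal greeting',
  'buffy_209 1 inter_segment_gap 14.750 20.000 <o,f0,>',
]

# Hypothetical helper: returns a hash of card fields, or nil when the
# segment would not produce a card.
def card_fields(line)
  fields = line.split(' ', 7)
  # "No speaker" segments show up as inter_segment_gap and are skipped.
  return nil if fields[2] == 'inter_segment_gap'
  # "///" separates dialog from note; "//" becomes an HTML line break.
  dialog, note = fields[6].to_s.split(%r{///}, 2).map do |t|
    t.split(%r{//}).map(&:strip).join('<br>')
  end
  # Blank text means no card.
  return nil if dialog.nil? || dialog.empty?
  { speaker: fields[2].sub("#{fields[0]}_", ''), dialog: dialog, note: note || '' }
end

SAMPLE.each { |line| p card_fields(line) }
```

The first sample yields a card for speaker "Buffy"; the second is a gap with no text and yields nil.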

When done, export your transcript in *.stm format, and invoke 'mkcards' as
follows:

    ./mkcards -c da_vinci_claude.stm da_vinci_claude.mp3

You may need to 'mv mkcards.rb mkcards && chmod +x mkcards' the first time,
or call it as 'ruby mkcards.rb', or something along those lines.

The '-c' here is optional. If you specify it, the volume will be cut in
half, which is generally the right choice for music, since music is roughly
twice as loud as television episodes. This will let you do reps in Anki
without crazy volume shifts.
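
For a sense of what "cut in half" means: the script passes '-vol 128' to ffmpeg. The sketch below assumes that ffmpeg treats 256 as the unchanged level and that the value scales amplitude linearly; the helper names are made up for illustration.

```ruby
# Assumption: ffmpeg's -vol value is a linear gain where 256 means
# "leave the volume unchanged".
FFMPEG_UNITY_VOL = 256.0

# Linear gain factor for a given -vol value.
def vol_to_gain(vol)
  vol / FFMPEG_UNITY_VOL
end

# The same gain expressed in decibels.
def gain_to_db(gain)
  20 * Math.log10(gain)
end

gain = vol_to_gain(128)
puts format("gain %.2f, about %.1f dB", gain, gain_to_db(gain))
# prints "gain 0.50, about -6.0 dB"
```

Newer ffmpeg builds may prefer the volume audio filter over '-vol'; check what your build accepts.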

This will generate da_vinci_claude.tsv and a media/ directory. Copy all
the MP3 files in the media directory to your Anki media folder, and import
the TSV file.
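
Each clip in media/ is named after its source and padded start time. The sketch below mirrors the script's arithmetic for one segment that starts well after zero; the segment times are made up, and clip_details is a hypothetical helper, not a function in mkcards.

```ruby
PADDING = 0.25  # seconds added before and after each clip, as in mkcards

# Returns [start, duration, clip_name] as strings for one segment.
def clip_details(source, seg_start, seg_end)
  start     = format("%.2f", seg_start - PADDING)
  duration  = format("%.2f", (seg_end - seg_start) + PADDING * 2)
  clip_name = "#{source}-#{start}".gsub('.', '_') + ".mp3"
  [start, duration, clip_name]
end

# A made-up segment running from 12.5 s to 14.75 s.
start, duration, clip_name = clip_details("da_vinci_claude", 12.5, 14.75)
puts "seek to #{start}, keep #{duration} seconds, write media/#{clip_name}"
puts "Sound field: [sound:#{clip_name}]"
```

Dots in the name are replaced with "_" so the start time does not look like a file extension.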

You'll want to create a new card model before importing the TSV file. The
model fields I use are:

  - Text
  - Speaker
  - Source
  - Start
  - Sound
  - Note

I usually put "Speaker" and "Sound" on the front of the card, and
everything else on the back.
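
The field order above matches the column order the script writes to the TSV file. As a quick check before importing, a sketch like this pairs one row with the field names; the sample row is made up, and row_to_fields is a hypothetical helper.

```ruby
# Field names in the order mkcards writes its TSV columns.
FIELDS = %w[Text Speaker Source Start Sound Note]

# Hypothetical helper: map one TSV row onto the model fields.
def row_to_fields(row)
  # The -1 limit keeps a trailing empty Note column instead of dropping it.
  Hash[FIELDS.zip(row.chomp.split("\t", -1))]
end

# A made-up row with an empty note.
row = "Hello.<br>How are you?\tBuffy\tbuffy_209\t12.25\t[sound:buffy_209-12_25.mp3]\t"
row_to_fields(row).each { |name, value| puts "#{name}: #{value.inspect}" }
```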
#!/usr/bin/env ruby

require 'fileutils'

# Represents one line of an STM file.
class Line
  # Time in seconds to add to the start and end of each clip.  This makes
  # it much less time-consuming and error prone to deal with fast
  # conversations and individual lines in songs.
  PADDING = 0.25

  # Given a line of an STM file, parse it into the fields we need.
  def initialize(text)
    fields = text.split(' ', 7)
    # Debugging output: show the parsed fields for each line.
    puts fields.inspect
    @source = fields[0]
    if fields[2] != 'inter_segment_gap'
      # Strip file-specific speaker prefixes.
      @speaker = fields[2].sub("#{@source}_", '')
    end
    @start = fields[3]
    @end = fields[4]
    # Split apart dialog on "///" in case there are notes.  A segment with
    # no text has no seventh field, so fall back to an empty string.
    @dialog, @note = (fields[6] || '').split(%r{///}, 2).map {|t| process_text(t) }
    @note ||= ''
  end

  # Do we have a speaker field?  If not, this is dead space.
  def blank?
    @speaker.nil? || @dialog.nil? || @dialog.empty?
  end

  # What file name should we use for this clip?
  def clip_name
    "#{@source}-#{start}".gsub('.', '_') + ".mp3"
  end

  # Convert this line to tab-separated values for import into Anki.
  def to_tsv
    [@dialog, @speaker, @source, start, "[sound:#{clip_name}]",
     @note].join("\t")
  end

  # Generate a command which we can use to export an MP3 file.
  def export_command(audio_source_path, outdir, cut_volume)
    cmd = ['ffmpeg', '-i', audio_source_path, '-ss', start, '-t', duration]
    if cut_volume
      # ffmpeg treats 256 as normal volume, so 128 cuts it in half.  Newer
      # ffmpeg builds may want '-af', 'volume=0.5' instead.
      cmd += ['-vol', '128']
    end
    cmd << File.join(outdir, clip_name)
    cmd
  end

  # Parse the file at the specified location into an array of Line objects.
  def self.parse_stm(path)
    lines = []
    File.open(path, 'r') do |f|
      f.each_line do |line|
        line.chomp!
        next if line =~ /^;/ || line.empty?
        parsed = Line.new(line)
        next if parsed.blank?
        lines << parsed
      end
    end
    lines
  end

  private

  # When does this clip start, in seconds, represented as a string?
  def start
    sprintf("%.2f", padded_start)
  end

  # How long is this clip in seconds, represented as a string?  We use this
  # to invoke ffmpeg.
  def duration
    sprintf("%.2f", (@end.to_f + PADDING) - padded_start)
  end

  # Start time with PADDING applied, clamped at zero so that a segment at
  # the very beginning of the audio never yields a negative seek time.
  def padded_start
    [@start.to_f - PADDING, 0.0].max
  end

  # Split a text field into lines at "//" and clean up whitespace.
  def process_text(text)
    text.split(%r{//}).map {|l| l.strip }.join("<br>")
  end
end

# Parse our command-line args.
cut_volume = false
if ARGV.length >= 1 && ARGV[0] == '-c'
  ARGV.shift # Drop first argument.
  cut_volume = true
end
if ARGV.length != 2
  STDERR.puts "Usage: mkcards [-c] input.stm input.wav"
  STDERR.puts "  -c will cut the volume in half, making songs more like DVDs"
  exit 1
end
stm, audio_source_path = ARGV

# Create our output media directory.
FileUtils.mkdir_p('media')

# Process our STM file and generate our output files.
lines = Line.parse_stm(stm)
File.open(stm.sub(/\.stm$/, '.tsv'), 'w') do |tsv|
  lines.each do |line|
    tsv.puts line.to_tsv
    system(*line.export_command(audio_source_path, "media", cut_volume))
  end
end