@kobaltz
Created January 7, 2018 03:51
Convert Audio from MP4 to Text
#!/usr/bin/env ruby
# Install the required tools first:
#   brew install ffmpeg
#   brew install lame
#   brew install --HEAD watsonbox/cmu-sphinx/cmu-sphinxbase
#   brew install --HEAD watsonbox/cmu-sphinx/cmu-pocketsphinx
#
# Place this file in the directory that holds the files you want to convert.
# Make this file executable:
#   chmod +x convert.rb
# Let 'er rip!
#   ./convert.rb
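TYPE_OF_FILE_TO_CONVERT = "*.mp4"

# Runs every matching file in the current directory through the pipeline:
# MP3 -> prepend silence -> re-encode with lame -> 16 kHz mono WAV -> text.
def convert
  Dir.glob(TYPE_OF_FILE_TO_CONVERT) do |file|
    # The filename helpers (output, wave_output, ...) read @file.
    @file = file
    # This step can be commented out if your file is already an MP3. In that
    # case also drop `output` from the cleanup list below: it would name the
    # source file itself, and cleanup would delete it.
    convert_file_to_mp3(file, output)
    add_silence_to_audio_file(output, file_with_silence)
    remix_file_to_sanitize_headers(file_with_silence, file_with_remix)
    convert_to_wav(file_with_remix, wave_output)
    convert_audio_to_text(wave_output, text_output)
    # Remove the intermediate files; only the .txt transcript is kept.
    cleanup([output, file_with_silence, file_with_remix, wave_output])
  end
end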
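# Each command is passed to system as an argument list rather than a shell
# string, so filenames with spaces or shell metacharacters are not split
# or interpreted by a shell.

# Prepends silence.mp3 (made by create_silence) using ffmpeg's concat protocol.
def add_silence_to_audio_file(input, output)
  system('ffmpeg', '-i', "concat:silence.mp3|#{input}", '-codec', 'copy', output, '-y')
end

# Generates 10 seconds of silence as silence.mp3.
def create_silence
  system('ffmpeg', '-f', 'lavfi', '-i', 'aevalsrc=0:0::duration=10', '-ab', '320k', 'silence.mp3', '-y')
end

# Extracts the audio track (-vn drops the video) as a 320 kbit/s MP3.
def convert_file_to_mp3(input, output)
  system('ffmpeg', '-i', input, '-b:a', '320K', '-vn', output, '-y')
end

# Re-encodes the concatenated MP3 with lame so it gets clean headers.
def remix_file_to_sanitize_headers(input, output)
  system('lame', input, output)
end

# Converts to 16-bit PCM, mono, 16 kHz WAV for pocketsphinx.
def convert_to_wav(input, output)
  system('ffmpeg', '-i', input, '-acodec', 'pcm_s16le', '-ac', '1', '-ar', '16000', output, '-y')
end

# Transcribes the WAV file; stdout is redirected into the text file.
def convert_audio_to_text(input, output)
  system('pocketsphinx_continuous', '-infile', input, '-logfn', '/dev/null', out: output)
end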
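# Name of the current input file without directory or extension.
def basename
  File.basename(@file, '.*')
end

# MP3 extracted from the input file.
def output
  basename + '.mp3'
end

# MP3 with the silence prepended.
def file_with_silence
  "with_silence_#{basename}.mp3"
end

# MP3 after re-encoding with lame.
def file_with_remix
  "remix_#{basename}.mp3"
end

# WAV file handed to pocketsphinx.
def wave_output
  basename + '.wav'
end

# Final transcript.
def text_output
  basename + '.txt'
end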
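# Deletes the given files without going through a shell, so names with
# spaces are handled; files that do not exist are skipped.
def cleanup(array_of_files)
  array_of_files.each do |file|
    File.delete(file) if File.exist?(file)
  end
end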
create_silence
convert
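
The script above does not check whether each external tool succeeded, so a failed ffmpeg or lame step can silently yield an empty transcript. Below is a minimal sketch of one way to stop at the first failure, assuming that is the behavior you want. The helper names `run!` and `pipeline_names` are hypothetical and not part of the original gist; `pipeline_names` only mirrors the naming scheme the script already uses.

```ruby
# Hypothetical helper (not in the original gist): runs a command given as an
# argument list and raises if it cannot be started or exits non-zero.
# Keyword options are forwarded to Kernel#system (for example out: "file").
def run!(*command, **options)
  ok = system(*command, **options)
  raise "command failed: #{command.join(' ')}" unless ok
  true
end

# Hypothetical helper: the intermediate and final filenames the script
# derives for one input file, collected in a single hash.
def pipeline_names(file)
  base = File.basename(file, '.*')
  {
    mp3: "#{base}.mp3",
    with_silence: "with_silence_#{base}.mp3",
    remix: "remix_#{base}.mp3",
    wav: "#{base}.wav",
    text: "#{base}.txt"
  }
end

# Example usage (requires ffmpeg on the PATH, so it is left commented out):
#   names = pipeline_names('talk.mp4')
#   run!('ffmpeg', '-i', 'talk.mp4', '-b:a', '320K', '-vn', names[:mp3], '-y')
```

Because `run!` takes an argument list, no shell is involved and filenames with spaces need no quoting. Whether aborting on the first failure is preferable to skipping the bad file and continuing is a design choice the original script leaves open.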