@secwang
Forked from seungwon0/srt2txt.pl
Created January 20, 2017 08:28
Convert SRT into Text
#!/usr/bin/env perl
#
# srt2txt - Convert SRT into Text
#
# Usage: srt2txt.pl [FILE.srt ...] > FILE.txt
#        (reads standard input when no file is given)
#
# Seungwon Jeong <seungwon0@gmail.com>
#
# Copyright (C) 2012 by Seungwon Jeong
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful, but
# WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
# General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see
# <http://www.gnu.org/licenses/>.
use strict;
use warnings;
use English qw< -no_match_vars >;
use HTML::Strip;
# Paragraph mode: each read returns one blank-line-separated subtitle block
local $INPUT_RECORD_SEPARATOR = q{};
my $hs = HTML::Strip->new();
while ( defined( my $input = <> ) ) {

    # Remove subtitle number and start/end time
    $input =~ s{\A \d+ \n [\d:,]+ [ ]-->[ ] [\d:,]+ \n}{}xms;

    # Strip HTML-like markup
    $input = $hs->parse($input);
    $hs->eof();

    print $input;
}