Skip to content

Instantly share code, notes, and snippets.

@fredriko
Last active September 6, 2023 13:33
Show Gist options
  • Save fredriko/a8a6ee8e9bcb72d4232ba8207c48a976 to your computer and use it in GitHub Desktop.
Save fredriko/a8a6ee8e9bcb72d4232ba8207c48a976 to your computer and use it in GitHub Desktop.
Keyword in context concordancer
#!/usr/bin/perl -w
##
## kwic.perl --
##
## A simple key-word in context concordancer.
##
## Fredrik Olsson, 8/10 1999. Modified 15/8 2012 to work with stdin as the only input
##
## Usage:
##
## ./kwic.perl <pattern> <maxoutputwidth>
##
## Example:
##
## cat data.txt | kwic.perl handset 80 | less
##
## t Apple iPhone 6 NEW DELHI: Chinese handset maker OnePlus is set to officially
## y, founded by executives of another handset maker Oppo in late 2013, has made r
## n other markets, where it sells the handset only through the invitation system.
## e upcoming launch, stating that the handset will be making its way to the count
## rm has seen the addition of several handset vendors this year, and it look like
## e upcoming launch, stating that the handset will be making its way to the count
## inalise our partner this week.\nThe handset vendor will deviate away from the i
## nePlus will not be the only Chinese handset vendor launching its wares in Decem
## Xiaomi, have found success selling handsets exclusively through local e-commer
## rm has seen the addition of several handset vendors this year, and it look like
## e upcoming launch, stating that the handset will be making its way to the count
## get it Android's on the march to a handset near you Related stories HTC and Sa
## te within a day of it hitting Nexus handsets, so the wait might be tiny. HTC \n
## head and assume that most other HTC handsets released in the last year or so wi
## and some of the other recent Desire handsets are likely to get the update too,
##
##########################################################################
use strict;
use vars qw($PATTERN $WIDTH $CONTEXT);
die ("\n\t$0: <string> <outputwidth>\n") unless scalar @ARGV > 1;
$PATTERN = quotemeta(shift @ARGV);
$WIDTH = shift @ARGV;
warn ("$0: ignoring @ARGV\n") if @ARGV;
$CONTEXT = int(($WIDTH - length($PATTERN))/2);
$/ = '';
my(@tmp, $build, $prev);
while(<>) {
$_ =~ s/\n/ /g;
push @tmp, ($_ =~ /(.{0,$CONTEXT}$PATTERN)(?=(.{0,$CONTEXT}))/gio);
foreach (@tmp) {
if($build) {
($prev = " " x ($CONTEXT - length($prev) + 6) . $prev)
unless length($prev) == $CONTEXT;
print " " . $prev . $_, "\n";
$build = 0;
} else {
$prev = $_;
$build++;
}
}
@tmp = ();
}
exit;
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment