Skip to content

Instantly share code, notes, and snippets.

@heyajulia
Created November 7, 2023 22:35
Show Gist options
  • Save heyajulia/b54a476be8eab73209afca67b194f07d to your computer and use it in GitHub Desktop.
Save heyajulia/b54a476be8eab73209afca67b194f07d to your computer and use it in GitHub Desktop.
Given a string, return the shortest "unambiguous abbreviation" of that string.
Assume that your input string is a non-empty string consisting of lowercase letters (a-z) and spaces.
Example:
UnambiguousAbbreviation("peter piper picked a peck of pickled peppers") = "pet pip picke a pec o pi p"
Initially, the word bank looks like:
B = ("peter", "piper", "picked", "a", "peck", "of", "pickled", "peppers")
To type "peter", you'd need to type "pet" as that's the shortest unambiguous abbreviation. The word bank now looks like:
B = ("piper", "picked", "a", "peck", "of", "pickled", "peppers")
To type "piper", you'd need to type "pip" as that's the shortest unambiguous abbreviation. The word bank now looks like:
B = ("picked", "a", "peck", "of", "pickled", "peppers")
To type "picked", you'd need to type "picke" as that's the shortest unambiguous abbreviation. The word bank now looks like:
B = ("a", "peck", "of", "pickled", "peppers")
To type "a", you'd need to type "a" as that's the shortest unambiguous abbreviation. The word bank now looks like:
B = ("peck", "of", "pickled", "peppers")
To type "peck", you'd need to type "pec" as that's the shortest unambiguous abbreviation. The word bank now looks like:
B = ("of", "pickled", "peppers")
To type "of", you'd need to type "o" as that's the shortest unambiguous abbreviation. The word bank now looks like:
B = ("pickled", "peppers")
To type "pickled", you'd need to type "pi" as that's the shortest unambiguous abbreviation. The word bank now looks like:
B = ("peppers")
To type "peppers", you'd need to type "p" as that's the shortest unambiguous abbreviation. The word bank now looks like:
B = ()
The word bank is now empty, so we're done. The shortest unambiguous abbreviation of the input string is thus:
"pet pip picke a pec o pi p"
Thus, implement a function like:
def UnambiguousAbbreviation(bank: list[str]) -> str:
...
...in the programming language of your choice.
@heyajulia
Copy link
Author

heyajulia commented Nov 8, 2023

My original solution in Raku (hidden to prevent spoilers)
sub prefixes($word) { (^$word.chars).map({ $word.comb[0 .. $_ ].join }) }

sub is-unambiguous($prefix, @words) { @words.map(*.starts-with($prefix)).one }

sub unambiguous-abbreviation($s) {
  my @words = words $s;
  my @unambiguous-abbreviation;

  while @words.elems > 0 {
      my $word = @words[0];
      LEAVE shift @words;
      my @prefixes = prefixes($word);

      for @prefixes -> $prefix {
          if is-unambiguous($prefix, @words) {
              @unambiguous-abbreviation.push: $prefix;
              last;
          }
      }
  }

  return @unambiguous-abbreviation.join(' ');
}

sub MAIN() {
  say unambiguous-abbreviation('peter piper picked a peck of pickled peppers'); # => pet pip picke a pec o pi p
}
Updated solution (fixes the duplicate word/raku raku bug)
unit sub MAIN($sentence);

say unambiguous-abbreviation($sentence);

sub prefixes($word) { (^$word.chars).map({ $word.comb[0 .. $_].join }) }

sub is-unambiguous($prefix, @words) { @words.grep(*.starts-with($prefix)).unique.elems == 1 }

sub unambiguous-abbreviation($sentence) {
  my @words = words $sentence;
  my @unambiguous-abbreviation;

  while @words.elems > 0 {
      my $word = @words[0];
      LEAVE shift @words;
      my @prefixes = prefixes($word);

      for @prefixes -> $prefix {
          if is-unambiguous($prefix, @words) {
              @unambiguous-abbreviation.push: $prefix;
              last;
          }
      }
  }

  return @unambiguous-abbreviation.join(" ");
}
Updated solution (fixes duplicate prefix/lovely love bug)
unit sub MAIN($sentence);

say unambiguous-abbreviation($sentence);

sub prefixes($word) { (^$word.chars).map({ $word.comb[0 .. $_].join }) }

sub is-unambiguous($prefix, @words) { @words.grep(*.starts-with($prefix)).unique.elems == 1 }

sub unambiguous-abbreviation($sentence) {
  my @words = words $sentence;
  my @unambiguous-abbreviation;

  while @words.elems > 0 {
      my $word = @words[0];
      LEAVE shift @words;
      my @prefixes = prefixes($word);
      my $ambiguous = True;

      for @prefixes -> $prefix {
          if is-unambiguous($prefix, @words) {
              $ambiguous = False;
              @unambiguous-abbreviation.push: $prefix;
              last;
          }
      }

      @unambiguous-abbreviation.push($word) if $ambiguous;
  }

  return @unambiguous-abbreviation.join(" ");
}

Property-based testing

I realized that, given a string S and its unambiguous abbreviation A, two things should always hold:

  1. A contains as many words as S, and
  2. Each word in A is a prefix of the corresponding word in S.

So, naturally, I built a quick-n-dirty proptest """framework""":

sub random-words() { (("a".."z").pick((1..30).roll).join xx (1..25).roll).join(" ") }

my $tests = 0;

while True {
    $tests++;

    my $sentence = random-words;
    my $abbreviation = unambiguous-abbreviation($sentence.join(" "));

    die "[test $tests] Expected $sentence to have the same number of words as $abbreviation" unless $sentence.words.elems == $abbreviation.words.elems;
    
    for ^$sentence.words.elems -> $i {
        my $word = $sentence.words[$i];
        my $prefix = $abbreviation.words[$i];

        die "[test $tests] Expected $prefix to be a prefix of $word" unless $word.starts-with($prefix);
    }

    if $tests % 100 == 0 {
        say "Passed $tests tests";
    }
}

It identified problems with my first two attempts fairly quickly, but it hasn't identified any problems with the latest version.

@massa
Copy link

massa commented Nov 17, 2023

I ended up writing your `unambiguous-abbreviations` sub as a one-liner... Like so:
sub unambiguous-abbreviations(+w) {
  w.map({ .&{ [\~] .comb }.first({ w.grep(*.begins-with($_)).uniq.elems == 1 }) or $_ })
}

The .&{ [\~] .comb } gets all the prefixes of $_;
The .first({ w.grep(*.begins-with($_)).uniq.elems == 1 }) is the smallest unambiguous prefix (or Nil, in case there isn't one and you need the full word... hence the or $_ part in the end

@heyajulia
Copy link
Author

@massa Awesome! 🤩 I love how versatile Raku is.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment