@Davidebyzero
Last active November 25, 2020 07:58
Solutions to teukon's Draft Regex Golf Bonus Levels (SPOILERS! SPOILERS!) and discussion of mathematical regexes
@teukon

teukon commented Mar 18, 2014

> I think this is a nice feature.

We're just going to have to agree to disagree.

@Davidebyzero
Author

Found another bizarre PCRE bug:

$ echo 'abc def'|pcregrep -o '^.*?\b'
abc
$ echo 'abc def'|pcregrep -o '\babc'
abc
$ echo 'abc def'|pcregrep -o 'abc def\b'
abc def
$ echo 'aaa'|pcregrep -o '^.*?(?=a)'
a
$ echo 'aaa'|pcregrep -o '^.*?(?=aaa)'

$ echo 'aaa'|pcregrep -o '^.*(?=a)'
aa
$ echo 'aaa'|pcregrep -o '^.*?(^|$)'
aaa
$ echo 'aaa'|pcregrep -o '^.*?a'
a

It seems that a lazy quantifier with a minimum count of 0 tries a count of 1 first when whatever follows it can match with zero length, and backtracks to a count of 0 only if it has to. When the following match consumes characters, as in the last example, the result is what you would expect.
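To make that guess concrete, here is a minimal Python sketch that models only the order in which repetition counts are tried for `^.*?` followed by a zero-width tail. It is not PCRE's actual implementation; the helper names (`lazy_dot_star`, `correct_order`, `hypothesised_order`) are made up for illustration, and the subject is assumed to contain no newline. Python's `re` is used only to test the tail at a given position.

```python
import re

def lazy_dot_star(subject, tail, counts):
    """Model ^.*? followed by `tail`, trying repetition counts in `counts` order.

    `tail` is a compiled regex that is tested at the position where the
    repetition stops. Returns the matched text, or None if nothing matches.
    """
    for n in counts:
        m = tail.match(subject, n)   # context before position n is still visible
        if m is not None:
            return subject[:m.end()]
    return None

def correct_order(subject):
    # What a lazy quantifier should do: 0 first, then 1, 2, ...
    return list(range(0, len(subject) + 1))

def hypothesised_order(subject):
    # The guessed buggy order: 1, 2, ... first, with 0 only as a last resort.
    return list(range(1, len(subject) + 1)) + [0]

# Zero-width tails taken from the pcregrep session above.
cases = [
    ('abc def', r'\b'),
    ('aaa', r'(?=a)'),
    ('aaa', r'(?=aaa)'),
    ('aaa', r'(^|$)'),
]

for subject, tail_src in cases:
    tail = re.compile(tail_src)
    good = lazy_dot_star(subject, tail, correct_order(subject))
    odd = lazy_dot_star(subject, tail, hypothesised_order(subject))
    print(f'^.*?{tail_src} on {subject!r}: correct={good!r} hypothesised={odd!r}')
```

Under this model the hypothesised order gives `abc`, `a`, an empty match and `aaa` for the four cases, which is what pcregrep printed, while the correct order gives an empty match every time.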

Perl does not have this bug:

$ echo 'abc def'|perl -E '@m = <> =~ /^.*?\b/g; print @m[0]'

$ echo 'abc def'|perl -E '@m = <> =~ /^.*\b/g; print @m[0]'
abc def
$ echo 'aaa'|perl -E '@m = <> =~ /^.*?(?=a)/g; print @m[0]'

$ echo 'aaa'|perl -E '@m = <> =~ /^.*(?=a)/g; print @m[0]'
aa
$ echo 'aaa'|perl -E '@m = <> =~ /^.*?(^|$)/g; print @m[0]'

$ echo 'aaa'|perl -E '@m = <> =~ /^.*?a/g; print @m[0]'
a
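As a cross-check in a third engine, the same patterns can be run through Python's `re` module. This is only a sketch: the subjects are the strings from the sessions above without the trailing newline that `echo` adds, and `first_match` is a helper name made up here.

```python
import re

# (pattern, subject) pairs taken from the pcregrep session above.
cases = [
    (r'^.*?\b', 'abc def'),
    (r'^.*?(?=a)', 'aaa'),
    (r'^.*?(?=aaa)', 'aaa'),
    (r'^.*(?=a)', 'aaa'),
    (r'^.*?(^|$)', 'aaa'),
    (r'^.*?a', 'aaa'),
]

def first_match(pattern, subject):
    """Return the text of the first match, or None if there is no match."""
    m = re.search(pattern, subject)
    return None if m is None else m.group(0)

results = {pattern: first_match(pattern, subject) for pattern, subject in cases}

for pattern, subject in cases:
    print(f'{pattern!r} on {subject!r} -> {results[pattern]!r}')
```

Every lazy pattern with a zero-width tail gives an empty match here, the same as the Perl results.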

@teukon

teukon commented Mar 18, 2014

Weird. I'm surprised that there are so many bugs in common regex engines.

@Davidebyzero
Author

I implemented character classes :) and of course the first thing I tried was our robust Triples solution. It works perfectly.

@teukon

teukon commented Mar 19, 2014

> I implemented character classes :) and of course the first thing I tried was our robust Triples solution. It works perfectly.

Brilliant.

@Davidebyzero
Author

Hi teukon!

Well, I finally got my regex engine into a releasable state and posted it on GitHub. The name isn't final. There's still some polishing to be done (in particular, adding parser error messages), but it is quite usable. Hopefully you'll be able to compile it without any trouble.

@teukon

teukon commented Mar 22, 2014

Great. I'll put this on my to-do list but I'm currently snowed under with work.

@Davidebyzero
Author

This gist has gotten very long, so I've started a new one to continue the discussion.
