@drgomesp
Last active December 14, 2015 18:09
XML Shallow Parsing using Greppy API
<?php

// [^<]+
$textScanningExpression =
    p()->not('<')->more;

// [^-]*-
$untilHyphen =
    p()->not('-')->until->literal('-');

// $untilHyphen(?:[^-]$untilHyphen)*-
$untilTwoHyphens =
    $untilHyphen .
    p()->silent(
        p()->not('-') .
        $untilHyphen
    ) . p()->until->literal('-');

// $untilTwoHyphens>?
$commentCompletionExpression =
    $untilTwoHyphens .
    p()->literal('>')->optional;

// [^\]]*]([^\]]+])*]+
$rightSquareBrackets =
    p()->not(']')->until->capture(
        p()->not(']')->literal(']')->more
    ) . p()->any->literal(']')->more;
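
For comparison, the target patterns in the comments above can be checked directly with PHP's built-in PCRE functions. This is only a sketch of the plain-regex equivalents, not of Greppy itself: the variable names and the `matchesFully()` helper are made up for illustration, and the sample subjects are arbitrary.

```php
<?php
// Plain PCRE strings mirroring the regex comments in the gist (no Greppy involved).
$textSE        = '[^<]+';
$untilHyphen   = '[^-]*-';
$until2Hyphens = $untilHyphen . '(?:[^-]' . $untilHyphen . ')*-';
$commentCE     = $until2Hyphens . '>?';
$untilRSBs     = '[^\]]*]([^\]]+])*]+';

// Hypothetical helper: true when the fragment matches the whole subject.
function matchesFully(string $fragment, string $subject): bool
{
    return preg_match('~\A(?:' . $fragment . ')\z~', $subject) === 1;
}

// The text that follows "<!--" in "<!-- a-b -->" is " a-b -->".
var_dump(matchesFully($commentCE, ' a-b -->'));  // bool(true)

// Text up to and including a run of closing square brackets.
var_dump(matchesFully($untilRSBs, 'x]y]]'));     // bool(true)

// Character data stops at the first "<".
var_dump(matchesFully($textSE, 'plain text'));   // bool(true)
var_dump(matchesFully($textSE, 'a<b'));          // bool(false)
```

Composing the fragments by string concatenation is what the fluent calls are meant to replace, so the raw strings are a handy reference when checking what a builder emits.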
@alganet

alganet commented Mar 11, 2013

I didn't get what "beginningOfLine" means!

@alganet

alganet commented Mar 11, 2013

Got it. It's mixed up. Outside sets, "^" means line start; inside [sets], as the first character, it means a negated set.
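
The two meanings of "^" are easy to see with plain PCRE. A small sketch using only `preg_match`, with arbitrary sample strings:

```php
<?php
// Outside a set, ^ anchors the match at the start of the subject
// (or at the start of each line when the "m" modifier is used).
var_dump(preg_match('/^foo/', 'foobar'));     // int(1)
var_dump(preg_match('/^foo/', 'barfoo'));     // int(0)
var_dump(preg_match('/^foo/m', "bar\nfoo"));  // int(1)

// As the first character inside a set, ^ negates the set.
preg_match('/[^<]+/', 'text<tag>', $negated);
var_dump($negated[0]);                        // string(4) "text"

// Anywhere else inside a set, ^ is just a literal caret.
preg_match('/[a^]+/', 'x^a^y', $literal);
var_dump($literal[0]);                        // string(3) "^a^"
```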

A set can only contain literals and ranges. Perhaps ->set('a', 'b', '0-9') would be short while still meaningful, and we could then remove the closeSet instruction. Groups could work the same way: every group wraps another expression, so we could have something like ...->capture(p()->set('/')). Negated sets could simply be named not.

The last regex written as suggested:

<?php
// [^\]]*]([^\]]+])*]+
$rightSquareBrackets =
p()->not(']')->any->capture(
    p()->not(']')->literal(']')->more
)->any->literal(']')->more;
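
To make the set/not/capture suggestion concrete, here is a minimal sketch of a builder that emits regex text. It is not Greppy's implementation: the `PatternSketch` class, the `sketch()` function and the escaping rules are all assumptions made for illustration, and quantifiers such as `more` and `any` are left out.

```php
<?php
// Hypothetical sketch, NOT Greppy's API: a tiny builder that emits regex text.
final class PatternSketch
{
    private string $regex = '';

    // Escape the characters that are special inside a [...] set.
    private static function escapeInSet(string $s): string
    {
        return strtr($s, ['\\' => '\\\\', ']' => '\\]', '^' => '\\^', '-' => '\\-']);
    }

    // Turn 'a', ']' or a range such as '0-9' into the text that goes inside [...].
    private static function setItem(string $item): string
    {
        if (preg_match('/\A(.)-(.)\z/s', $item, $m) === 1) {
            return self::escapeInSet($m[1]) . '-' . self::escapeInSet($m[2]);
        }
        return self::escapeInSet($item);
    }

    private function appendSet(string $prefix, array $items): self
    {
        $body = '';
        foreach ($items as $item) {
            $body .= self::setItem($item);
        }
        $this->regex .= '[' . $prefix . $body . ']';
        return $this;
    }

    // set('a', 'b', '0-9') opens and closes the set in one call.
    public function set(string ...$items): self
    {
        return $this->appendSet('', $items);
    }

    // not(']') is the negated set.
    public function not(string ...$items): self
    {
        return $this->appendSet('^', $items);
    }

    // A group always wraps another expression.
    public function capture(self $inner): self
    {
        $this->regex .= '(' . $inner . ')';
        return $this;
    }

    public function __toString(): string
    {
        return $this->regex;
    }
}

function sketch(): PatternSketch
{
    return new PatternSketch();
}

echo sketch()->set('a', 'b', '0-9'), "\n";         // [ab0-9]
echo sketch()->not(']'), "\n";                     // [^\]]
echo sketch()->capture(sketch()->set('/')), "\n";  // ([/])
```

Because each call closes what it opens, there is no separate closeSet step to forget, which is the main point of the suggestion.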

@alganet

alganet commented Mar 11, 2013

PS: Don't give up on p()! We could have $p() with pure PSR-0 compatibility, or Greppy::autodefine('function_name') based on the autodefine RFC https://wiki.php.net/rfc/autodefine

@drgomesp
Author

Thanks for the contribution.

The gist is updated.

About the p() function, yes, I do want to provide that.
