
@raiph

raiph/p6 -a .md Secret

Last active December 8, 2018 16:08
p6 -a option (autosplit)

What

This gist outlines a plan for implementing a perl6 -a command line option for autosplitting. P5 already has this feature, and it is supposed to be implemented for P6 one day, but it has not yet attracted a champion.

Why

It would be useful for:

  • Users. They could much more easily write a particular category of one liners.

  • Demos. Some one liners will work great on the perl6.org home page.

  • Helping to clarify that P6 is for one liners too.

When

I think a good goal would be having this and other one liner cleanups and powerups done in time for a Christmas 2019 Perl one liners calendar which focuses on P6 syntax but also uses P5 and Python modules in one liners.

Who

Me and anyone else interested would write docs, tests, code, blog posts, etc.

Inchstones

  • Write a gist. [X]

  • Try to write up a nice Minimum Viable Pretotype. Publish to P6 community. [ ]

  • Discuss and improve until there is rough consensus that we have a nice MVP feature proposal and its doc. [ ]

  • Publish and discuss at other one liner communities, eg. r/awk, r/commandline etc. [ ]

  • Restart with another MVP or continue. [ ]

  • Write tests matching MVP doc. [ ]

  • Write an initial implementation of MVP without touching the compiler. [ ]

  • Discuss integration into the compiler. [ ]

  • Beef implementation up to be sufficiently robust. [ ]

  • Integrate it into the compiler. [ ]

  • Discuss and fix it prior to merging into master to ensure folk are happy with it. [ ]

  • Write some examples. [ ]

  • Write a calendar entry. [ ]

  • Write a marketing flyer about P6 use for one liners. [ ]

  • Write an opensource.org article about P6 use for one liners. [ ]

Where

Collaboration can happen anywhere participants want -- the #perl6 IRC channel, reddit, GH, email, mailing lists, gists, etc.

Once done, articles about P6 one liners including autosplit can happen on twitter, reddit, opensource.org, etc.

Inspiration

Brian Kernighan started his one-hour 2015 Nottingham University lecture, "Computer Science - Brian Kernighan on successful language design", with a bang. He showed students a coding example that instantly and powerfully demonstrated the commanding lead awk still had/has for producing an ultra simple one liner solution to a classic simple problem. I'll return to it in a mo.

Imo simple use, including command line one-liners, is an important use case for Perl 6, especially for encouraging its take up. This gist focuses on the classic awk showcase example that Brian demoed. Perl 5 failed to catch up with this case, and Perl 6 is currently a backward step compared to Perl 5. But Perl 6 could instead not only close the gap but pull ahead for this example.

This gist includes a strawman proposal of a way for Perl 6 to duplicate awk for this case, at least for a simple case, and some notes about where we might go from there. If you think this idea might have legs, please consider sharing links to this gist to fuel discussion, writing tests for the proposal, giving feedback, etc. TIA.

P6 design speculation

The mentions of awk in the design/speculation docs are in DRAFT: Synopsis 19: Command Line Interface.

In a section titled Unchanged Syntactic Features it notes:

Several features have not changed from Perl 5, including:

  Option...                            Still means...
  -a                                   Autosplit
  -c                                   Check syntax
  -e *line*                            Execute
  -F *expression*                      Specify autosplit field separator
  -h                                   Display help and exit
  -I *directory*[,*directory*[,...]]   Unshift CompUnitRepo::Local('s) to @?INC
  -n                                   Act like awk
  -p                                   Act like sed
  -S                                   Search PATH for script
  -T                                   Enable taint mode
  -v                                   Display version info
  -V                                   Display verbose config info

Remember that these documents were written as if the features they describe had already been implemented, but they were speculative. They are now just historical artifacts: they don't record what actually got done, and they don't record where things ended up being designed differently than they speculated.

Of the above list, -c, -e, -h, -I, -n, -p, and -v are implemented, more or less as speculated.

-a isn't.

Nor is --autoloop-delim, -F expression:

Pattern to split on (used with -a).
Substitutes an expression for the default split function, which is {split ' '}.
Accepts Unicode strings (as long as your shell lets you pass them).
Allows passing a closure (e.g. -F "{use Text::CSV}").
Awk's not better any more :)

Note the last line. Actually, Awk's better because this hasn't been implemented yet.

Rosettacode

Quoting the corresponding Rosettacode page, Kernighan's large earthquake problem:

You are given a data file of thousands of lines; each of three whitespace separated fields: a date, a one word name and the magnitude of the event. Example lines from the file would be lines like:

8/27/1883    Krakatoa            8.8
5/18/1980    MountStHelens       7.6
3/13/2009    CostaRica           5.1

Create a program or script invocation to find all the events with magnitude greater than 6.

Here's the awk solution:

$3 > 6
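As a rough sketch of how that one-line program is invoked, here it is run from a shell against the three sample lines. The temporary file and the variable name big_quakes are illustrative assumptions, not part of the Rosettacode task.

```shell
# Write the three sample lines from the problem statement to a temporary file.
data=$(mktemp)
cat > "$data" <<'EOF'
8/27/1883    Krakatoa            8.8
5/18/1980    MountStHelens       7.6
3/13/2009    CostaRica           5.1
EOF

# awk splits every input line on whitespace into $1, $2, $3, ...
# A pattern with no action prints each line for which the pattern is true.
big_quakes=$(awk '$3 > 6' "$data")
printf '%s\n' "$big_quakes"

rm -f "$data"
```

The whole program is the pattern: there is no explicit loop, split, or print, which is the bar an -a option has to get close to.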

With good reason, Brian spends 3-4 minutes on this example. Here's a link that starts exactly one minute into the video of his lecture where he introduces the example: Kernighans large earthquake problem.

Perl rose to fame in large part because it was competitive in several programming domains, and awk-style one-liners were one of them. But Brian's chosen one liner emphasizes that, technically speaking, Perls 1 thru 5 got close enough to be a noteworthy runner up -- but definitely just a runner up.

Brian goes on to mention Perl 6 near the end but basically dismisses it, saying:

[Perl has] stopped evolving in some sense and Perl 6 is never going to arrive really and so Perl really in some sense has missed a boat -- permanently, I don't know, it's still a very useful language but it'll be very interesting to see what happens...

"Awk's not better any more :)" <-- so far, a pipe dream

Would anyone be willing to help do something about this? Something small, simple, and tightly focused, something that cries out to be done, something that is, I think, guaranteed to be a hit at some level, that may also catalyze going on a journey that can go much further?

Returning to the design (aka speculations) doc S19 -- Command Line Interface:

Option...                            Still means...
-a                                   Autosplit
-F *expression*                      Specify autosplit field separator
-n                                   Act like awk
-p                                   Act like sed

Per the doc and my Rakudo tests, -n and -p have been implemented; -a and -F have not.

This post is about:

  • Implementing -a;

  • How important -a could be;

  • Some ideas about a better -a design.

-a in Perl 5

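In Perl 5, -a (Autosplit) is used together with -n or -p: each input line is split into the array @F as the first thing inside the implicit loop. By default the split is on whitespace (split ' '); -Fpattern (Field separator) specifies a different pattern to split on.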

-a in Perl 6

Emulating Perl 5's -a without -Fpattern being specified in P6 would mean wrapping the user's code something like this:

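my @F;
for lines() {   # parens stop `lines` from gobbling the block as an argument
  @F = .split: /\s+/, :skip-empty;   # split the current line on whitespace
  # user's code runs here
}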

My thinking, for making progress with Perl 6 in this regard, is that we can ignore -F and the especially arcane details of how Perl 5's myriad split special cases work, and implement an -a that:

  • Is generally backwards compatible with Perl 5's -a without use of -F;

  • Supports -aPattern instead of -Fpattern to specify a pattern;

  • Nicely covers Perl 5's default functionality by default.

Thoughts?
