@r00k
Last active August 29, 2015 14:06
2014-03-20
Line 1
Line 2
2014-03-21
Line 1
Line 2
# What I want:
[{:body=>"Line 1\n\nLine 2", :year=>"2014", :month=>"03", :day=>"20"},
{:body=>"Line 1\n\nLine 2", :year=>"2014", :month=>"03", :day=>"21"}]
# What I'm getting
[{:year=>"2014"@0,
:month=>"03"@5,
:day=>"20"@8,
:body=>"Line 1\n\nLine 2\n\n2014-03-21\n\nLine 1\n\nLine 2"@12}]
class Parser < Parslet::Parser
  rule(:integer) { match('[0-9]').repeat(1) }
  rule(:hyphen) { str('-') }
  rule(:date) { integer.repeat(1,4).as(:year) >>
                hyphen >>
                integer.repeat(1,2).as(:month) >>
                hyphen >>
                integer.repeat(1,2).as(:day) >>
                str("\n\n") }
  rule(:body) { (any.repeat >> (date.present? | any.absent?)).as(:body) }
  rule(:entry) { date >> body }
  rule(:entries) { entry.repeat }
  root(:entries)
end
@r00k
Author

r00k commented Sep 27, 2014

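I believe the problem is that my :body rule is too greedy: any.repeat consumes everything up to the end of the input, including the next date, so the lookahead after it never gets a chance to stop it. I can't figure out how to remedy that.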

@josephgrossberg

@r00k Sorry to jump in, mid-conversation, but I'm confused as to why :body has extra \n's in both your desired results and actual results -- shouldn't there be just one, between Line 1 and Line 2? It looks like file_to_parse.txt has one newline between Line 1 and Line 2.

@josephgrossberg

Also, I hope you're coding outside, or at least with the window open. It's amazing, outside, today. 😄

@jferris

jferris commented Sep 27, 2014

Not familiar with this parser, but I think you want something besides any.

@calebhearth

  rule(:body) { (date.absent? >> any).repeat.as(:body) }
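
To see why checking for a date before every single character fixes the greediness, here is a rough stdlib-only sketch of the same idea using a plain Ruby regex instead of Parslet. It is an analogy, not Parslet code: `(?!...)` plays the role of `date.absent?` and `.` plays the role of `any`. The names `DATE_SHAPE`, `GREEDY_ENTRY`, `GUARDED_ENTRY`, and `parse_entries` are made up for illustration, the input string is reconstructed from the "What I'm getting" output above, and the `strip` call is an assumption added so the result matches the desired hashes.

```ruby
# Shape of a date header followed by a blank line (no capture groups,
# so it can be reused inside a lookahead without adding captures).
DATE_SHAPE = /\d{1,4}-\d{1,2}-\d{1,2}\n\n/

# Greedy body, comparable to any.repeat: the body swallows everything
# up to the end of the input, including later dates.
GREEDY_ENTRY = /(\d{1,4})-(\d{1,2})-(\d{1,2})\n\n(.*)/m

# Guarded body, comparable to (date.absent? >> any).repeat: before each
# character is consumed, a negative lookahead checks that no date starts here.
GUARDED_ENTRY = /(\d{1,4})-(\d{1,2})-(\d{1,2})\n\n((?:(?!#{DATE_SHAPE}).)*)/m

# Hypothetical helper: turns every match into a hash shaped like the
# desired output. strip removes the blank line left before the next date.
def parse_entries(text, entry_pattern)
  text.scan(entry_pattern).map do |year, month, day, body|
    { body: body.strip, year: year, month: month, day: day }
  end
end

# Assumed file contents, reconstructed from the actual parse result.
input = "2014-03-20\n\nLine 1\n\nLine 2\n\n2014-03-21\n\nLine 1\n\nLine 2"

greedy  = parse_entries(input, GREEDY_ENTRY)   # one entry, second date stuck in the body
guarded = parse_entries(input, GUARDED_ENTRY)  # one entry per date

p greedy.length
p guarded
```

As far as I can tell, the Parslet rule above behaves the same way as the guarded pattern before `strip` is applied: the blank line preceding the next date still ends up at the end of the first :body, so it may need trimming afterwards or a rule of its own.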
