
@ryanrolds
Created March 5, 2013 21:51
<ryanrolds> Having an issue where JSON we are sending isn't getting parsed.
<ryanrolds> It looks like if we POST a JSON array with mixed strings and objects it only parses a fraction of the messages in the array.
<r0tha> hmmm
<r0tha> have an example of the message?
<ryanrolds> One sec
<ryanrolds> "UPDATING LOAD DATA",{"message":"queryserver apps","ip":"192.168.0.42","fqdn":"office-moth","apps":{},"empty":true},{"total_ram":4061912,"free_ram":2958524,"os_used":1000000,"qs_rss":8680,"qs_mmap":8680,"ip":"192.168.0.42","fqdn":"office-moth","load_five":0,"swap":0,"used_mem":0.2483264039201243,"overfull":false,"full":false,"draining":true,"empty":true,"message":"queryserver load"},"FINISHED UPDATING LOAD DATA"
<ryanrolds> ,"STARTUP: Now accepting requests on port 80",{"message":"moth started"}]
<ryanrolds> Er, that's missing a [ at the start
<ryanrolds> copy paste issue, it's there
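With the leading `[` restored, the pasted payload is a root-level JSON array mixing bare strings and objects, which any spec-compliant parser accepts. A quick check against an abridged version of the data above:

```python
import json

# Abridged version of the pasted payload, with the leading "[" restored.
payload = '''[
  "UPDATING LOAD DATA",
  {"message": "queryserver apps", "ip": "192.168.0.42", "fqdn": "office-moth",
   "apps": {}, "empty": true},
  "FINISHED UPDATING LOAD DATA",
  {"message": "moth started"}
]'''

events = json.loads(payload)  # no error: a mixed-type root array is valid JSON
print(type(events).__name__, len(events))  # list 4
```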
<r0tha> ah so check this out
<r0tha> it's not valid json
<r0tha> which is why it's only indexing some as JSON and leaving the others as raw text
<ryanrolds> What part isn't valid?
<ryanrolds> I run it through jsonlint and it's valid.
<ryanrolds> I'm asserting that it's valid, do you have anything to backup your claim that's not valid JSON?
<r0tha> I could be missing something from the paste
<r0tha> but I'm doing the same, running it through lint
<r0tha> aahhh sec
<ryanrolds> Probably that [ that I said wasn't in the copy + paste. Also it wraps to different lines because of the IRC client I'm using.
<r0tha> http://fpaste.org/
<ryanrolds> http://fpaste.org/F1Ip/
<ryanrolds> Included the method, uri and content-type header
<ryanrolds> If it's not parsing all of the lines correctly then why is the system reporting 200 OK?
<r0tha> that's just saying we received the data
<ryanrolds> If it hasn't been processed yet, shouldn't that be a 202 Accepted (which essentially means deferred processing)?
<ryanrolds> http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html
<r0tha> it could
<ryanrolds> It's not a big deal, just 202 fits the situation better than 200. 200 implies that it's been processed and should show up if I immediately do a search.
<r0tha> ah yea it's def. legalese
<r0tha> anyway I'm looking into the JSON format
<ryanrolds> Thanks
<r0tha> one thing to remember is that the way we handle search is key value for json
<r0tha> i'll be back in an hour or so
<ryanrolds> It's possible that my search queries have a problem, but I don't believe they do. This looks like a parsing or indexing issue to me. Even if I don't use any filters, messages that I'm sure were sent, and that we received a 200 for, don't show up.
<r0tha> yea this has to do with the way we handle JSON. All the data needs to have a key / corresponding value
<r0tha> even if it's null that's fine
* jbartus (~ec2-user@ec2-75-101-242-179.compute-1.amazonaws.com) has left #loggly
<ryanrolds> I don't understand what that requires me to do.
<ryanrolds> Is it an issue with it being in an array? Empty objects? What?
<r0tha> My recommendation is to use something like this: http://fpaste.org/rmhB/
<ryanrolds> So it's an issue with the array.
<ryanrolds> The array is valid JSON.
<ryanrolds> I'm not willing to change a perfectly normal and valid JSON structure (an array containing strings and objects) to something that requires I litter the structure with nulls and unnecessary nesting.
<ryanrolds> Thank you for your help.
<r0tha> np
<ryanrolds> If you guys ever improve your parsing and support root-level arrays, let me know. ryan@moonshadowmobile.com
<ryanrolds> One thing to note that is absolutely wrong with your suggestion: JSON objects do not ensure that property order remains the same.
<ryanrolds> So, depending on the parser, the order of the keys may change, which is a serious problem. It completely changes the meaning of the logs.
<ryanrolds> That make sense?
<r0tha> to a certain extent
<ryanrolds> The JSON spec only mandates that arrays maintain order. Many parsers sort objects alphabetically, so when you iterate over the object keys the order may be different than when it was originally serialized.
<r0tha> correct
<ryanrolds> So, in your example, "STARTING LOAD AND PROVISIONING UPDATE" and "UPDATING LOAD DATA" may be processed after "FINISHED UPDATING LOAD DATA" even though they appear earlier in the serialized format.
<r0tha> my suggestion is intended to help you get the data into loggly...
<ryanrolds> Only an array [ ] ensures order.
<r0tha> yes they might be out of order
<ryanrolds> What good are logs if they are out of order?
<ryanrolds> A dev has no idea if the order is a bug in the code or a bug in the log collector.
<ryanrolds> This is why you guys should support arrays at the root of a JSON message body.
<r0tha> totally understand
<r0tha> the next update to the site will handle your array and not bork the ordering
<ryanrolds> ETA?
<r0tha> some time next quarter
<ryanrolds> Yeah, not having charts of our provisioning process for that long isn't going to fly with my boss or anyone else here. Looks like I'm going to have to hack something together and send everything as a series of POSTs, one message at a time, until I move to a different system.
<ryanrolds> I wish your system reported things it couldn't parse, even if it was just a way to search for failed messages. This issue explains some odd things we have been seeing for a while.
<ryanrolds> Frankly, the fact that your system reports OK when things aren't OK is enough to justify moving away from Loggly.
<r0tha> there are sometimes technical hurdles that need to be considered for responses
<r0tha> while verifying that each log event has been processed correctly is very doable for 200mb of log data
<ryanrolds> There are, but nothing that wouldn't be solved by auto scaled front-ends that either completely process or perform prechecks before returning a response.
<ryanrolds> Would you want to work with someone or something that lied and required you to double-check its work all the time?
<r0tha> at the volume of data accepted it is not possible to perform these checks in realtime
<r0tha> we will give a 404 or 503 if the data is not accepted
<r0tha> if it's a valid event and we give back a response ok then the event gets processed....
<ryanrolds> Right, but that OK can't be trusted.
<ryanrolds> You're suggesting that we double-check everything we send to Loggly.
<ryanrolds> That's a broken model.
<r0tha> no
<r0tha> i just said if you send an event
<ryanrolds> Ok, then I misunderstood what you meant when you said "while verifying that each log event has been processed correctly is very doable for 200mb of log data"
<ryanrolds> Also, it is possible to do parsing checks in real time w/ regionally dispersed load balancers in front of auto- or dynamically-scaled front-ends.
<r0tha> if you send us a valid event, the event will be processed, you don't need to double check
<ryanrolds> Could probably even do full processing with that solution.
<ryanrolds> Ok, then it comes down to what's a valid event.
<ryanrolds> It has to be more than just valid JSON.
<ryanrolds> Is there documentation I can read outlining the correct way to send multiple "events" in a single message body using JSON?
<ryanrolds> That's a loaded question; from our conversation, you guys don't support that without risking log order getting screwed up.
<r0tha> we require the outer piece of json to be a dictionary
<r0tha> whatever you put inside is fine
<ryanrolds> Right, but dictionaries are an "unordered collection".
<r0tha> yes so order can be thrown off
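One possible workaround, sketched here as an assumption rather than Loggly's documented format: satisfy the outer-dictionary requirement by nesting the ordered array under a single key, so the array (not the dict) carries the ordering. Whether Loggly's indexer actually descends into a nested array is not confirmed anywhere in this conversation.

```python
import json

events = ["UPDATING LOAD DATA",
          {"message": "queryserver load", "empty": True},
          "FINISHED UPDATING LOAD DATA"]

# Outer piece is a dictionary, as required; ordering lives in the inner array.
body = json.dumps({"events": events})

# Round trip: the array still comes back in production order.
print(json.loads(body)["events"] == events)  # True
```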
<ryanrolds> Logs that are unordered are significantly less useful; I'm surprised that issue wasn't spotted during dev.
<ryanrolds> Loggly's implementation makes Loggly's own product less useful to Loggly's customers.
<r0tha> we look at each POST as a different log event
<ryanrolds> So, every event has to be its own request/message? Doesn't that mean users have to send more requests than required, as they can't group/batch JSON messages?
<ryanrolds> Here is the deal: I'm batching our messages/events to minimize the number of requests, which is in both our and Loggly's interests. It looks like you're telling me to break up the batch and send 4+ times more requests.
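The fallback being described can be sketched as splitting the batch client-side and issuing one POST per event. The endpoint URL below is a placeholder, not a real Loggly input, and wrapping bare strings under a `"message"` key is an assumption about what the service will index:

```python
import json
import urllib.request

ENDPOINT = "https://logs.example.com/inputs/TOKEN"  # placeholder URL


def split_batch(batch_json):
    """Turn one batched JSON array into one dict-wrapped body per event."""
    events = json.loads(batch_json)
    # Wrap bare strings so every body is an object with a "message" key.
    return [json.dumps(e if isinstance(e, dict) else {"message": e})
            for e in events]


def post_events(bodies):
    # One request per event: roughly 4x the traffic of the batched POST.
    for body in bodies:
        req = urllib.request.Request(
            ENDPOINT, data=body.encode("utf-8"),
            headers={"Content-Type": "application/json"})
        urllib.request.urlopen(req)


bodies = split_batch('["UPDATING LOAD DATA", {"message": "queryserver load"}]')
print(len(bodies))  # 2
```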
<r0tha> http://forum.loggly.com/discussion/comment/667/#Comment_667
<ryanrolds> If I'm reading that thread correctly, what I'm doing should work if JSON is enabled on the input.
<r0tha> order is implemented by timestamp
<r0tha> the timestamp for all the events is the same when bulk uploading
<r0tha> as I mentioned it looks like only one event within the main dashboard
<ryanrolds> What I'm reporting is that some messages aren't being processed.
<ryanrolds> Well, indexed. So when I search for JSON entries in the array they do not appear.
<ryanrolds> Now we are back at square one.....
<ryanrolds> Is there someone I can talk to that is aware of these issues? I just spent a lot of time on this only to have it come back around to the original question.
<ryanrolds> This now suggests that the issue is arrays of mixed types, so strings and objects can't be mixed in an array.
* r0tha has changed the topic to: "Welcome to #Loggly, if you have any questions ping an operator, docs are located @ http://loggly.com/support, you can also email support@loggly.com if no one appears to be on"
<r0tha> please email support@loggly.com
<r0tha> sorry I couldn't answer all of your questions
<ryanrolds> One sec, I'm confirming it's an issue with arrays containing strings AND objects. All of the examples in that thread are arrays containing objects.
<ryanrolds> That's what it was, or is at least related to.
<ryanrolds> So, ["foo", {"message": "bar"}] doesn't work.
<ryanrolds> But, [{"message": "foo"}, {"message": "bar"}] does work.
<ryanrolds> It's either an issue with arrays containing mixed types or arrays containing strings.
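Both repro cases are valid JSON per the spec, which is easy to confirm; the difference in indexing behavior is on the service side, not a parsing ambiguity:

```python
import json

# Reported as not fully indexed: array mixing a string and an object.
mixed = json.loads('["foo", {"message": "bar"}]')

# Reported as working: array of objects only.
uniform = json.loads('[{"message": "foo"}, {"message": "bar"}]')

# Both parse without error into two-element lists.
print(len(mixed), len(uniform))  # 2 2
```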