Skip to content

Instantly share code, notes, and snippets.

@pomu0325
Created August 24, 2011 16:48
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save pomu0325/1168507 to your computer and use it in GitHub Desktop.
Save pomu0325/1168507 to your computer and use it in GitHub Desktop.
comparing >- and >#
def time(fn: => Unit) = {
val start = System.currentTimeMillis
fn
println("%d ms".format(System.currentTimeMillis - start))
}
import dispatch._
import json._
import JsHttp._
time { Http(url("http://search.twitter.com/search.json?q=dispatch") >- {s => Js(s)}) } // 400-500ms NORMAL
time { Http(url("http://search.twitter.com/search.json?q=dispatch") >- {s => Js(s)}) } // 400-500ms STILL NORMAL
time { Http(url("http://search.twitter.com/search.json?q=dispatch") ># {s => s}) } // 16000 ms ># TOO SLOW !
time { Http(url("http://search.twitter.com/search.json?q=dispatch") >- {s => Js(s)}) } // 16000 ms >- ALSO TOO SLOW!
time { Http(url("http://search.twitter.com/search.json?q=dispatch") >- {s => Js(s)}) } // 500 ms >- BACK TO NORMAL
time { Http(url("http://search.twitter.com/search.json?q=dispatch") ># {s => s}) } // 16000-17000 ms ># SLOW AGAIN
import java.io._
val js = """{"A":"a","B":[1,2],"C":"c"}"""
time { Js(js) } // 0-1ms
time { Js(new ByteArrayInputStream(js.getBytes)) } // 50-60 ms
@n8han
Copy link

n8han commented Aug 24, 2011

Yeah, this is pretty strange, thanks for looking into it. Dispatch is still ultimately using an inputstream either way, just with >- it has passed through a scala.io.Source.

@pomu0325
Copy link
Author

i found out not only >#, >- sometime becomes very slow.
it seems that using JsonParser(CharArrayReader) and JsonParser(StreamReader) alternately causes parsing to be slow.

what i've found so far:

@pomu0325
Copy link
Author

eclipse debug screenshot while running slow JsonParser(StreamReader) after JsonParser(CharArrayReader)
https://picasaweb.google.com/lh/photo/7XFKrIe0pRZgcKZqSjX0wg?feat=directlink

  • lastNoSuccess is created from CharSequenceReader at offset 86869
  • current StreamReader is still at offset 12504 which is smaller than lastNoSuccess
  • lastNoSuccess was created at the previous call of JsonParser(CharArrayReader) ?

@pomu0325
Copy link
Author

i got it!

there are two "lastNoSuccess" in different "trait Parsers" involved here.

first, look at class hierarchy:

trait Scanners extends Parsers
 ↑
abstract class Lexical extends Scanners with Tokens
 ↑
class StdLexical extends Lexical with StdTokens
 ↑
class Lexer extends StdLexical with ImplicitConversions

trait TokenParsers extends Parsers
 ↑
trait StdTokenParsers extends TokenParsers
 ↑
object JsonParser extends StdTokenParsers with ImplicitConversions

both dispatch.json.JsonParser and scala.util.parsing.json.Lexer (which has type aliased as Tokens) extends trait scala.util.parsing.combinator.Parsers.
thus, both JsonParser and Lexer has "var lastNoSuccess"

JsonParser#lastNoSuccess is updated to null inside JsonParser#phrase, but Lexer#lastNoSuccess is NOT.
thus, lastNoSuccess created at previous parse (if it has larger offset) remains still...

@pomu0325
Copy link
Author

@n8han
Copy link

n8han commented Aug 25, 2011

I pulled that commit from your branch into master. Thanks for the extensive debugging!

@pomu0325
Copy link
Author

above patch works in single thread, but might not in multi thread...

@n8han
Copy link

n8han commented Aug 26, 2011

Hm... glancing through the inherited traits they don't strike me as thread safe either. It might be better to move the lexical entirely into the apply function so that we have one instance per application.

I should also mention that this JSON parser was not written by me and as you can tell I'm not deeply familiar with its internals. The spray-json library is pretty similar to it and I've been considering deprecating it in favor of binding to spray. Any thoughts on that?

@pomu0325
Copy link
Author

trait scala.util.parsing.combinator.Parsers itself is not thread safe due to "var lastNoSuccess".
it is also mentioned here: http://scala-programming-language.1934581.n4.nabble.com/Scala-Parsers-are-not-thread-safe-td2243477.html

scala.util.parsing.json.JSON has similar implementation (having 2 subclass of Parsers) as dispatch.json's, but it only accept CharArrayReader, so performance issue does not happen.
note scala.util.parsing.json.JSON never returns error detail, so overwriting "var lastNoSuccess" by other thread technically seems to be not a problem.

if you give away the memory usage and change JsonParser.apply(Reader[Char]) to JsonParser.apply(CharSequenceReader), I think it's technically ok (in terms of performance issue).

about spray-json: +1 for it. I've never used spray-json, but I can see it's pretty similar at a glance :)
one thing I worry about is that sjson (and maybe some other library I don't know of) has a dependency to dispatch-json and change of binding breaks the dependency...

@pomu0325
Copy link
Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment