Skip to content

Instantly share code, notes, and snippets.

@anjackson
Last active August 29, 2015 14:23
Show Gist options
  • Star 0 You must be signed in to star a gist
  • Fork 0 You must be signed in to fork a gist
  • Save anjackson/06971ff43e50645e3f2f to your computer and use it in GitHub Desktop.
Save anjackson/06971ff43e50645e3f2f to your computer and use it in GitHub Desktop.
OpenWayback Stacktrace Analysis

Most of the threads look like this - waiting for something to do:

"http-8080-198" daemon prio=10 tid=0x00007f06cc18e800 nid=0x10ca in Object.wait() [0x00007f073b6f5000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x0000000703eeafa8> (a org.apache.tomcat.util.net.JIoEndpoint$Worker)
        at java.lang.Object.wait(Object.java:503)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.await(JIoEndpoint.java:458)
        - locked <0x0000000703eeafa8> (a org.apache.tomcat.util.net.JIoEndpoint$Worker)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:484)
        at java.lang.Thread.run(Unknown Source)

The first suspicious thing is the two threads using getRemoteAddr - this can be a significant overhead in some cases, expecially if the server is not set up properly for reverse DNS lookups, in which case this can lead to nasty timeout pauses.

"Thread 211 all handling: 20090908045216js_/http://cn.yimg.com/a/alen/js/cn_flash_general_20090201.js" daemon prio=10 tid=0x00007f06cc182000 nid=0x10c4 runnable [0x00007f073bcfb000]
   java.lang.Thread.State: RUNNABLE
        at java.net.Inet4AddressImpl.getHostByAddr(Native Method)
        at java.net.InetAddress$1.getHostByAddr(Unknown Source)
        at java.net.InetAddress.getHostFromNameService(Unknown Source)
        at java.net.InetAddress.getHostName(Unknown Source)
        at java.net.InetAddress.getHostName(Unknown Source)
        at org.apache.coyote.http11.Http11Processor.action(Http11Processor.java:1067)
        at org.apache.coyote.Request.action(Request.java:351)
        at org.apache.catalina.connector.Request.getLocalName(Request.java:1278)
        at org.apache.catalina.connector.RequestFacade.getLocalName(RequestFacade.java:910)
        at org.archive.wayback.core.WaybackRequest.extractHttpRequestInfo(WaybackRequest.java:1116)
        at org.archive.wayback.webapp.AccessPoint.handleRequest(AccessPoint.java:284)
        at org.archive.wayback.util.webapp.RequestMapper.handleRequest(RequestMapper.java:198)
        at org.archive.wayback.util.webapp.RequestFilter.doFilter(RequestFilter.java:146)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:861)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:606)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
        at java.lang.Thread.run(Unknown Source)

And one BLOCKED thread, waiting for something to do with the logging to the console. This kind of thing can be a bad idea when there are many threads writing to the same console. Consider disabling it and logging only to files instead.

"http-8080-146" daemon prio=10 tid=0x00007f06cc124800 nid=0x1094 waiting for monitor entry [0x00007f0742bea000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at java.util.logging.StreamHandler.flush(Unknown Source)
        - locked <0x000000070004b3c8> (a java.util.logging.ConsoleHandler)
        at java.util.logging.ConsoleHandler.publish(Unknown Source)
        at java.util.logging.Logger.log(Unknown Source)
        at java.util.logging.Logger.doLog(Unknown Source)
        at java.util.logging.Logger.log(Unknown Source)
        at java.util.logging.Logger.finer(Unknown Source)
        at org.archive.wayback.util.webapp.PortMapper.getRequestHandlerContext(PortMapper.java:156)
        at org.archive.wayback.util.webapp.RequestMapper.mapRequest(RequestMapper.java:159)
        at org.archive.wayback.util.webapp.RequestMapper.handleRequest(RequestMapper.java:188)
        at org.archive.wayback.util.webapp.RequestFilter.doFilter(RequestFilter.java:146)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:861)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:606)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
        at java.lang.Thread.run(Unknown Source)

Then there are about 30 CDX lookups, all in read0 because RandomAccessFile.readLine uses byte-wise reads.

"Thread 208 all handling: 20090908045216im_/http://cn.yimg.com/hxlia/hxli/729933_l2_090809_595x90.jpg" daemon prio=10 tid=0x00007f06cc17b800 nid=0x10c1 runnable [0x00007f073bffe000]
   java.lang.Thread.State: RUNNABLE
        at java.io.RandomAccessFile.read0(Native Method)
        at java.io.RandomAccessFile.read(Unknown Source)
        at java.io.RandomAccessFile.readLine(Unknown Source)
        at org.archive.wayback.util.flatfile.FlatFile.findKeyOffset(FlatFile.java:106)
        at org.archive.wayback.util.flatfile.FlatFile.getRecordIterator(FlatFile.java:184)
        at org.archive.wayback.resourceindex.cdx.CDXIndex.getPrefixIterator(CDXIndex.java:61)
        at org.archive.wayback.resourceindex.CompositeSearchResultSource.getPrefixIterator(CompositeSearchResultSource.java:78)
        at org.archive.wayback.resourceindex.LocalResourceIndex.doCaptureQuery(LocalResourceIndex.java:210)
        at org.archive.wayback.resourceindex.LocalResourceIndex.query(LocalResourceIndex.java:326)
        at org.archive.wayback.webapp.AccessPoint.queryIndex(AccessPoint.java:597)
        at org.archive.wayback.webapp.AccessPoint.searchCaptures(AccessPoint.java:1015)
        at org.archive.wayback.webapp.AccessPoint.handleReplay(AccessPoint.java:771)
        at org.archive.wayback.webapp.AccessPoint.handleRequest(AccessPoint.java:313)
        at org.archive.wayback.util.webapp.RequestMapper.handleRequest(RequestMapper.java:198)
        at org.archive.wayback.util.webapp.RequestFilter.doFilter(RequestFilter.java:146)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:861)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:606)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
        at java.lang.Thread.run(Unknown Source)

Then there are three threads that are sat opening the RandomAccessFile objects, presumable waiting for the OS finish opening the file. The code opens and closes RandomAccessFile objects for every CDX file, on every request. This is not a very good tactic from a performance point of view.

"Thread 205 all handling: 20090908045815im_/http://cn.yimg.com/lm1201/1/c819.jpg" daemon prio=10 tid=0x00007f06cc175800 nid=0x10be runnable [0x00007f07403c2000]
   java.lang.Thread.State: RUNNABLE
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.<init>(Unknown Source)
        at org.archive.wayback.util.flatfile.FlatFile.getRecordIterator(FlatFile.java:183)
        at org.archive.wayback.resourceindex.cdx.CDXIndex.getPrefixIterator(CDXIndex.java:61)
        at org.archive.wayback.resourceindex.CompositeSearchResultSource.getPrefixIterator(CompositeSearchResultSource.java:78)
        at org.archive.wayback.resourceindex.LocalResourceIndex.doCaptureQuery(LocalResourceIndex.java:210)
        at org.archive.wayback.resourceindex.LocalResourceIndex.query(LocalResourceIndex.java:326)
        at org.archive.wayback.webapp.AccessPoint.queryIndex(AccessPoint.java:597)
        at org.archive.wayback.webapp.AccessPoint.searchCaptures(AccessPoint.java:1015)
        at org.archive.wayback.webapp.AccessPoint.handleReplay(AccessPoint.java:771)
        at org.archive.wayback.webapp.AccessPoint.handleRequest(AccessPoint.java:313)
        at org.archive.wayback.util.webapp.RequestMapper.handleRequest(RequestMapper.java:198)
        at org.archive.wayback.util.webapp.RequestFilter.doFilter(RequestFilter.java:146)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:861)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:606)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
        at java.lang.Thread.run(Unknown Source)

To really understand what's happening, it would be good to use jmap -dump to create a heap dump when the system is loaded up and then using something like jvisualvm or jhat to work out what classes of objects are consuming the most heapspace. My suspicion is that there will be a lot of RandomAccessFile objects and their associated input stream wrappers.

The current CDX binary-seach logic could be improved in a number of different ways. The most obvious one would be to re-use the RandomAccessFile objects rather than creating/destroying them (which is not only a heavy JVM overhead, but may also stress the OS as it struggles to cope with the rapid consumption and turnover of file handles).

However, RandomAccessFile is not threadsafe, so in the context of OpenWayback, we would have to wrap them up as ThreadLocal variables. This would mean that each thread would open a separate RandomAccessFile object for each CDX file, hold them open and re-use them. The current code expects to close the input streams that are generated from the files, so that would need to be changed/prevented in order to allow re-use.

However, just doing that would still leave us reliant on unbuffered reads, which are very slow on modern kit. We could re-use an existing BufferedRandomAccessFile implementation, like this one (no longer used in Cassandra):

https://github.com/facebookarchive/cassandra/blob/master/src/org/apache/cassandra/io/BufferedRandomAccessFile.java

Or, if we can expect most folks to be on 64bit platforms, we could move to a version that allows OS memory mapping to be used, which will make things very fast if the index size is not too far off RAM size. e.g.

http://dsiutils.di.unimi.it/docs/it/unimi/dsi/io/ByteBufferInputStream.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment