Last active
August 29, 2015 14:07
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/quasiben/Research/ContinuumDev/Memex/nutch_application/nutch/runtime/local/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/quasiben/anaconda/envs/nutchpy/lib/python2.7/site-packages/nutchpy/java_libs/seqreader-app-1.0-SNAPSHOT-jar-with-dependencies.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
2014-09-29 16:31:55.463 java[17872:5403] Unable to load realm info from SCDynamicStore
URL: /var/folders/1t/t94brwgx7sjcn8jgz4gr3_c00000gq/T/tmpaJeuZN
14/09/29 16:31:56 INFO crawl.Injector: Injector: starting at 2014-09-29 16:31:56
14/09/29 16:31:56 INFO crawl.Injector: Injector: crawlDb: /Users/quasiben/Research/ContinuumDev/Memex/nutchpy/crawl
14/09/29 16:31:56 INFO crawl.Injector: Injector: urlDir: /var/folders/1t/t94brwgx7sjcn8jgz4gr3_c00000gq/T/tmpaJeuZN
14/09/29 16:31:56 INFO crawl.Injector: tempDir: /tmp/hadoop-quasiben/mapred/temp/inject-temp-587370943
14/09/29 16:31:56 INFO crawl.Injector: CONF: Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, nutch-default.xml, nutch-site.xml
14/09/29 16:31:56 INFO crawl.Injector: Injector: Converting injected urls to crawl db entries.
14/09/29 16:31:56 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/09/29 16:31:56 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/09/29 16:31:56 WARN snappy.LoadSnappy: Snappy native library not loaded
14/09/29 16:31:56 INFO mapred.FileInputFormat: Total input paths to process : 1
14/09/29 16:31:56 INFO mapred.JobClient: Running job: job_local409772736_0001
14/09/29 16:31:56 INFO mapred.LocalJobRunner: Waiting for map tasks
14/09/29 16:31:56 INFO mapred.LocalJobRunner: Starting task: attempt_local409772736_0001_m_000000_0
14/09/29 16:31:56 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/09/29 16:31:56 INFO mapred.MapTask: Processing split: file:/var/folders/1t/t94brwgx7sjcn8jgz4gr3_c00000gq/T/tmpaJeuZN/seed.txt:0+25
14/09/29 16:31:56 INFO mapred.MapTask: numReduceTasks: 0
14/09/29 16:31:56 INFO plugin.PluginRepository: Plugins: looking in: /Users/quasiben/Research/ContinuumDev/Memex/nutch_application/nutch/runtime/local/plugins
14/09/29 16:31:57 INFO plugin.PluginRepository: Plugin Auto-activation mode: [true]
14/09/29 16:31:57 INFO plugin.PluginRepository: Registered Plugins:
14/09/29 16:31:57 INFO plugin.PluginRepository: the nutch core extension points (nutch-extensionpoints)
14/09/29 16:31:57 INFO plugin.PluginRepository: Basic URL Normalizer (urlnormalizer-basic)
14/09/29 16:31:57 INFO plugin.PluginRepository: Html Parse Plug-in (parse-html)
14/09/29 16:31:57 INFO plugin.PluginRepository: Basic Indexing Filter (index-basic)
14/09/29 16:31:57 INFO plugin.PluginRepository: SOLRIndexWriter (indexer-solr)
14/09/29 16:31:57 INFO plugin.PluginRepository: HTTP Framework (lib-http)
14/09/29 16:31:57 INFO plugin.PluginRepository: Regex URL Filter (urlfilter-regex)
14/09/29 16:31:57 INFO plugin.PluginRepository: Pass-through URL Normalizer (urlnormalizer-pass)
14/09/29 16:31:57 INFO plugin.PluginRepository: Http Protocol Plug-in (protocol-http)
14/09/29 16:31:57 INFO plugin.PluginRepository: Regex URL Normalizer (urlnormalizer-regex)
14/09/29 16:31:57 INFO plugin.PluginRepository: CyberNeko HTML Parser (lib-nekohtml)
14/09/29 16:31:57 INFO plugin.PluginRepository: Tika Parser Plug-in (parse-tika)
14/09/29 16:31:57 INFO plugin.PluginRepository: OPIC Scoring Plug-in (scoring-opic)
14/09/29 16:31:57 INFO plugin.PluginRepository: Anchor Indexing Filter (index-anchor)
14/09/29 16:31:57 INFO plugin.PluginRepository: Regex URL Filter Framework (lib-regex-filter)
14/09/29 16:31:57 INFO plugin.PluginRepository: Registered Extension-Points:
14/09/29 16:31:57 INFO plugin.PluginRepository: Nutch URL Normalizer (org.apache.nutch.net.URLNormalizer)
14/09/29 16:31:57 INFO plugin.PluginRepository: Nutch Protocol (org.apache.nutch.protocol.Protocol)
14/09/29 16:31:57 INFO plugin.PluginRepository: Nutch Segment Merge Filter (org.apache.nutch.segment.SegmentMergeFilter)
14/09/29 16:31:57 INFO plugin.PluginRepository: Nutch URL Filter (org.apache.nutch.net.URLFilter)
14/09/29 16:31:57 INFO plugin.PluginRepository: Nutch Index Writer (org.apache.nutch.indexer.IndexWriter)
14/09/29 16:31:57 INFO plugin.PluginRepository: Nutch Indexing Filter (org.apache.nutch.indexer.IndexingFilter)
14/09/29 16:31:57 INFO plugin.PluginRepository: HTML Parse Filter (org.apache.nutch.parse.HtmlParseFilter)
14/09/29 16:31:57 INFO plugin.PluginRepository: Nutch Content Parser (org.apache.nutch.parse.Parser)
14/09/29 16:31:57 INFO plugin.PluginRepository: Nutch Scoring (org.apache.nutch.scoring.ScoringFilter)
14/09/29 16:31:57 INFO conf.Configuration: found resource regex-normalize.xml at file:/Users/quasiben/Research/ContinuumDev/Memex/nutch_application/nutch/runtime/local/conf/regex-normalize.xml
14/09/29 16:31:57 INFO conf.Configuration: found resource regex-urlfilter.txt at file:/Users/quasiben/Research/ContinuumDev/Memex/nutch_application/nutch/runtime/local/conf/regex-urlfilter.txt
14/09/29 16:31:57 INFO regex.RegexURLNormalizer: can't find rules for scope 'inject', using default
14/09/29 16:31:57 INFO mapred.Task: Task:attempt_local409772736_0001_m_000000_0 is done. And is in the process of commiting
14/09/29 16:31:57 INFO mapred.LocalJobRunner:
14/09/29 16:31:57 INFO mapred.Task: Task attempt_local409772736_0001_m_000000_0 is allowed to commit now
14/09/29 16:31:57 INFO mapred.FileOutputCommitter: Saved output of task 'attempt_local409772736_0001_m_000000_0' to file:/tmp/hadoop-quasiben/mapred/temp/inject-temp-587370943
14/09/29 16:31:57 INFO mapred.LocalJobRunner: file:/var/folders/1t/t94brwgx7sjcn8jgz4gr3_c00000gq/T/tmpaJeuZN/seed.txt:0+25
14/09/29 16:31:57 INFO mapred.Task: Task 'attempt_local409772736_0001_m_000000_0' done.
14/09/29 16:31:57 INFO mapred.LocalJobRunner: Finishing task: attempt_local409772736_0001_m_000000_0
14/09/29 16:31:57 INFO mapred.LocalJobRunner: Map task executor complete.
14/09/29 16:31:57 INFO mapred.JobClient: map 100% reduce 0%
14/09/29 16:31:57 INFO mapred.JobClient: Job complete: job_local409772736_0001
14/09/29 16:31:57 INFO mapred.JobClient: Counters: 11
14/09/29 16:31:57 INFO mapred.JobClient: File Input Format Counters
14/09/29 16:31:57 INFO mapred.JobClient: Bytes Read=25
14/09/29 16:31:57 INFO mapred.JobClient: File Output Format Counters
14/09/29 16:31:57 INFO mapred.JobClient: Bytes Written=160
14/09/29 16:31:57 INFO mapred.JobClient: injector
14/09/29 16:31:57 INFO mapred.JobClient: urls_injected=1
14/09/29 16:31:57 INFO mapred.JobClient: FileSystemCounters
14/09/29 16:31:57 INFO mapred.JobClient: FILE_BYTES_READ=546517
14/09/29 16:31:57 INFO mapred.JobClient: FILE_BYTES_WRITTEN=635913
14/09/29 16:31:57 INFO mapred.JobClient: Map-Reduce Framework
14/09/29 16:31:57 INFO mapred.JobClient: Map input records=1
14/09/29 16:31:57 INFO mapred.JobClient: Spilled Records=0
14/09/29 16:31:57 INFO mapred.JobClient: Total committed heap usage (bytes)=515375104
14/09/29 16:31:57 INFO mapred.JobClient: Map input bytes=25
14/09/29 16:31:57 INFO mapred.JobClient: SPLIT_RAW_BYTES=125
14/09/29 16:31:57 INFO mapred.JobClient: Map output records=1
14/09/29 16:31:57 INFO crawl.Injector: Injector: Total number of urls rejected by filters: 0
14/09/29 16:31:57 INFO crawl.Injector: Injector: Total number of urls after normalization: 1
14/09/29 16:31:57 INFO crawl.Injector: Injector: Merging injected urls into crawl db.
14/09/29 16:31:57 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/09/29 16:31:57 INFO mapred.FileInputFormat: Total input paths to process : 2
14/09/29 16:31:57 INFO mapred.JobClient: Running job: job_local917149307_0002
14/09/29 16:31:57 INFO mapred.LocalJobRunner: Waiting for map tasks
14/09/29 16:31:57 INFO mapred.LocalJobRunner: Starting task: attempt_local917149307_0002_m_000000_0
14/09/29 16:31:57 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/09/29 16:31:57 INFO mapred.MapTask: Processing split: file:/Users/quasiben/Research/ContinuumDev/Memex/nutchpy/crawl/current/part-00000/data:0+148
14/09/29 16:31:57 INFO mapred.MapTask: numReduceTasks: 1
14/09/29 16:31:57 INFO mapred.MapTask: io.sort.mb = 100
14/09/29 16:31:57 INFO mapred.MapTask: data buffer = 79691776/99614720
14/09/29 16:31:57 INFO mapred.MapTask: record buffer = 262144/327680
14/09/29 16:31:57 INFO mapred.MapTask: Starting flush of map output
14/09/29 16:31:57 INFO mapred.MapTask: Finished spill 0
14/09/29 16:31:57 INFO mapred.Task: Task:attempt_local917149307_0002_m_000000_0 is done. And is in the process of commiting
14/09/29 16:31:57 INFO mapred.LocalJobRunner: file:/Users/quasiben/Research/ContinuumDev/Memex/nutchpy/crawl/current/part-00000/data:0+148
14/09/29 16:31:57 INFO mapred.Task: Task 'attempt_local917149307_0002_m_000000_0' done.
14/09/29 16:31:57 INFO mapred.LocalJobRunner: Finishing task: attempt_local917149307_0002_m_000000_0
14/09/29 16:31:57 INFO mapred.LocalJobRunner: Starting task: attempt_local917149307_0002_m_000001_0
14/09/29 16:31:57 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/09/29 16:31:57 INFO mapred.MapTask: Processing split: file:/tmp/hadoop-quasiben/mapred/temp/inject-temp-587370943/part-00000:0+148
14/09/29 16:31:57 INFO mapred.MapTask: numReduceTasks: 1
14/09/29 16:31:57 INFO mapred.MapTask: io.sort.mb = 100
14/09/29 16:31:58 INFO mapred.MapTask: data buffer = 79691776/99614720
14/09/29 16:31:58 INFO mapred.MapTask: record buffer = 262144/327680
14/09/29 16:31:58 INFO mapred.MapTask: Starting flush of map output
14/09/29 16:31:58 INFO mapred.MapTask: Finished spill 0
14/09/29 16:31:58 INFO mapred.Task: Task:attempt_local917149307_0002_m_000001_0 is done. And is in the process of commiting
14/09/29 16:31:58 INFO mapred.LocalJobRunner: file:/tmp/hadoop-quasiben/mapred/temp/inject-temp-587370943/part-00000:0+148
14/09/29 16:31:58 INFO mapred.Task: Task 'attempt_local917149307_0002_m_000001_0' done.
14/09/29 16:31:58 INFO mapred.LocalJobRunner: Finishing task: attempt_local917149307_0002_m_000001_0
14/09/29 16:31:58 INFO mapred.LocalJobRunner: Map task executor complete.
14/09/29 16:31:58 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/09/29 16:31:58 INFO mapred.LocalJobRunner:
14/09/29 16:31:58 INFO mapred.Merger: Merging 2 sorted segments
14/09/29 16:31:58 INFO mapred.Merger: Down to the last merge-pass, with 2 segments left of total size: 116 bytes
14/09/29 16:31:58 INFO mapred.LocalJobRunner:
14/09/29 16:31:58 INFO crawl.Injector: Injector: overwrite: false
14/09/29 16:31:58 INFO crawl.Injector: Injector: update: false
14/09/29 16:31:58 INFO compress.CodecPool: Got brand-new compressor
14/09/29 16:31:58 INFO mapred.Task: Task:attempt_local917149307_0002_r_000000_0 is done. And is in the process of commiting
14/09/29 16:31:58 INFO mapred.LocalJobRunner:
14/09/29 16:31:58 INFO mapred.Task: Task attempt_local917149307_0002_r_000000_0 is allowed to commit now
14/09/29 16:31:58 INFO mapred.FileOutputCommitter: Saved output of task 'attempt_local917149307_0002_r_000000_0' to file:/Users/quasiben/Research/ContinuumDev/Memex/nutchpy/crawl/1532362367
14/09/29 16:31:58 INFO mapred.LocalJobRunner: reduce > reduce
14/09/29 16:31:58 INFO mapred.Task: Task 'attempt_local917149307_0002_r_000000_0' done.
14/09/29 16:31:58 INFO mapred.JobClient: map 100% reduce 100%
14/09/29 16:31:58 INFO mapred.JobClient: Job complete: job_local917149307_0002
14/09/29 16:31:58 INFO mapred.JobClient: Counters: 19
14/09/29 16:31:58 INFO mapred.JobClient: File Input Format Counters
14/09/29 16:31:58 INFO mapred.JobClient: Bytes Read=320
14/09/29 16:31:58 INFO mapred.JobClient: File Output Format Counters
14/09/29 16:31:58 INFO mapred.JobClient: Bytes Written=389
14/09/29 16:31:58 INFO mapred.JobClient: injector
14/09/29 16:31:58 INFO mapred.JobClient: urls_merged=1
14/09/29 16:31:58 INFO mapred.JobClient: FileSystemCounters
14/09/29 16:31:58 INFO mapred.JobClient: FILE_BYTES_READ=3280972
14/09/29 16:31:58 INFO mapred.JobClient: FILE_BYTES_WRITTEN=3817822
14/09/29 16:31:58 INFO mapred.JobClient: Map-Reduce Framework
14/09/29 16:31:58 INFO mapred.JobClient: Reduce input groups=1
14/09/29 16:31:58 INFO mapred.JobClient: Map output materialized bytes=124
14/09/29 16:31:58 INFO mapred.JobClient: Combine output records=0
14/09/29 16:31:58 INFO mapred.JobClient: Map input records=2
14/09/29 16:31:58 INFO mapred.JobClient: Reduce shuffle bytes=0
14/09/29 16:31:58 INFO mapred.JobClient: Reduce output records=1
14/09/29 16:31:58 INFO mapred.JobClient: Spilled Records=4
14/09/29 16:31:58 INFO mapred.JobClient: Map output bytes=108
14/09/29 16:31:58 INFO mapred.JobClient: Total committed heap usage (bytes)=1546125312
14/09/29 16:31:58 INFO mapred.JobClient: Map input bytes=124
14/09/29 16:31:58 INFO mapred.JobClient: Combine input records=0
14/09/29 16:31:58 INFO mapred.JobClient: Map output records=2
14/09/29 16:31:58 INFO mapred.JobClient: SPLIT_RAW_BYTES=262
14/09/29 16:31:58 INFO mapred.JobClient: Reduce input records=2
14/09/29 16:31:58 INFO crawl.Injector: Injector: URLs merged: 1
14/09/29 16:31:58 INFO crawl.Injector: Injector: Total new urls injected: 0
14/09/29 16:31:58 INFO crawl.Injector: Injector: finished at 2014-09-29 16:31:58, elapsed: 00:00:02
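
The numbers that matter in a run like this are the handful of `crawl.Injector` summary lines (rejected, normalized, merged, newly injected). Below is a rough sketch of pulling them out of a captured log with Python. The function names `parse_line` and `injector_summary` are made up for illustration and are not part of nutchpy or Nutch; the regular expression only assumes the `date time LEVEL logger: message` layout seen in the output above.

```python
import re

# Assumed layout, taken from the log above:
#   "14/09/29 16:31:58 INFO crawl.Injector: Injector: URLs merged: 1"
LOG_LINE = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<logger>[\w.]+): ?(?P<message>.*)$"
)


def parse_line(line):
    """Split one log line into its fields, or return None if it does not match.

    Trailing whitespace and stray '|' column separators (which can appear
    when a log is copied out of a web page) are dropped before matching.
    """
    match = LOG_LINE.match(line.strip().rstrip("| "))
    return match.groupdict() if match else None


def injector_summary(lines):
    """Collect the integer-valued 'name: value' messages logged by crawl.Injector."""
    summary = {}
    for line in lines:
        record = parse_line(line)
        if record is None or record["logger"] != "crawl.Injector":
            continue
        message = record["message"]
        # Most Injector messages repeat the class name as a prefix.
        if message.startswith("Injector: "):
            message = message[len("Injector: "):]
        name, sep, value = message.rpartition(": ")
        if sep and value.isdigit():
            summary[name] = int(value)
    return summary


if __name__ == "__main__":
    sample = [
        "14/09/29 16:31:57 INFO crawl.Injector: Injector: Total number of urls rejected by filters: 0",
        "14/09/29 16:31:57 INFO crawl.Injector: Injector: Total number of urls after normalization: 1",
        "14/09/29 16:31:58 INFO crawl.Injector: Injector: URLs merged: 1",
        "14/09/29 16:31:58 INFO crawl.Injector: Injector: Total new urls injected: 0",
    ]
    for name, value in injector_summary(sample).items():
        print(name, "=", value)
```

Non-numeric Injector messages (paths, `overwrite: false`, the start and finish timestamps) are skipped on purpose, so the result holds only the counters.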