Python 2.7.14 (default, Sep 23 2017, 22:06:14)
[GCC 7.2.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> foo = "hi"
>>> bar = "bye"
>>> (foo, bar,)
('hi', 'bye')
>>> (foo,)
('hi',)
>>> (foo)
2018-04-03 23:30:14,574 ERROR [Executor task launch worker for task 571523] org.apache.spark.executor.Executor: Exception in task 254.0 in stage 107.0 (TID 571523)
java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3332)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:124)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:596)
    at java.lang.StringBuffer.append(StringBuffer.java:367)
    at java.io.BufferedReader.readLine(BufferedReader.java:370)
    at java.io.BufferedReader.readLine(BufferedReader.java:389)
    at org.apache.commons.io.IOUtils.readLines(IOUtils.java:1033)
    at org.apache.commons.io.IOUtils.readLines(IOUtils.java:987)
milesc@torre 0 tmp-docker ↠ cat Dockerfile
FROM library/ubuntu:16.04
ENTRYPOINT ["head", "-n", "1"]
milesc@torre 0 tmp-docker ↠ cat bigfile | docker run -i example && echo "sucesss"
paperid paper_title publisher doi field pdf_processed viewable users_28days users_7days frac_users_28days frac_users_7days
read unix @->/var/run/docker.sock: read: connection reset by peer
milesc@torre 0 tmp-docker ↠ head bigfile | docker run -i example && echo "sucesss"
paperid paper_title publisher doi field pdf_processed viewable users_28days users_7days frac_users_28days frac_users_7days
Creative Commons Attribution License Http://creativecommons.org/licenses/by/3.0
Community-associated Methicillin-resistant Staphylococcus Aureus CA-MRSA
Alpha-amino-3-hydroxy-5-methyl-4-isoxazolepropionic Acid AMPA Receptor
Community-associated Methicillin-resistant Staphylococcus Aureus
Endobronchial Ultrasound-guided Transbronchial Needle Aspiration
Matrix-assisted Laser Desorption/ionization Mass Spectrometry
Chromatin Immunoprecipitation Sequencing ChIP-seq Experiments
Reverse Transcription Loop-mediated Isothermal Amplification
National Polar-orbiting Operational Environmental Satellite
Creative Commons Attribution-NonCommercial-NoDerivs License
Miriam Blatt Dennis Chen Scott Cooke Piyush Desai Manjunath Doreswamy Mark Elgood Gary Feierbach Tim Goldsbury Dale Greenley Raju Joshi Mike Khosraviani Robert Kwong Manish Motwani Chitresh Narasimhaiah Sam J. Nicolino Jr. Tooru Ozeki Gary Peterson Chris Salzmann Nas James Gateley
A. Adalal J. Bauman P. Delisle P. Dedood P. Donehue M. Dell'OcaKhouja T. Doan M. Doreswamy P. Ferolito O. Geva D. Greenhill S. Gopaladhine J. Irwin L. Lev J. MacDonald M. Ma S. Mitra P. Patel A. Prabhu R. Puranik S. Rozanski N. Ross P. Saggurti S. Simovich R. Sunder A. Cao
Elena Biasibetti Alberto Valazza Maria T Capucchio Laura Annovazzi Luigi Battaglia Daniela Chirio Marina Gallarate Marta Mellai Elisabetta Muntoni Elena Peira Chiara Riganti Davide Schiffer Pierpaolo Panciani And Michele Lanotte
Galit H Frydman Robert P Marini Vasudevan Bakthavatchalu Kathleen E Biddle Sureshkumar Muthupalani Charles R Vanderburg Barry Lai Pavan K Bendapudi Ronald G Tompkins And James G Fox
En Representacion Del Grupo Colaborativo Para El Estudio
1. what's the news launch loves warm
Recording technology improves and makes television easier to edit, satellite technology continues to get better
2. what's the news 1 plus 1
See http://www.nytimes.com/2010/12/03/science/03arsenic.html?pagewanted=1&_r=3 and http://science.nasa.gov/science-news/science-at-nasa/2010/02dec_monolake/ for further information on this controversial finding.
3. how big is earth
5th largest planet in the solar system
4. what is the find mean orbit the sky
planets; sun
5. where does photosynthesis take place
In a plant's leaves
2017-03-23 20:25:00 [scrapy.extensions.logstats] INFO: Crawled 2631 pages (at 556 pages/min), scraped 96 items (at 25 items/min)
2017-03-23 20:25:06 [anansi.dao.frontier] INFO: ['infocenter.arm.com', 'landmark.cs.cornell.edu', 'dblp.uni-trier.de', 'aclanthology.info', 'events.cornell.edu']
2017-03-23 20:25:06 [anansi.dao.frontier] INFO: Dequeuing batch of frontier URIs; frontier size 912644; select using sample {TABLESAMPLE BERNOULLI(0.10957174977318648)}, dominant {AND uri NOT LIKE '%infocenter.arm.com%' AND uri NOT LIKE '%landmark.cs.cornell.edu%' AND uri NOT LIKE '%dblp.uni-trier.de%' AND uri NOT LIKE '%aclanthology.info%' AND uri NOT LIKE '%events.cornell.edu%'}
2017-03-23 20:25:07 [anansi.dao.frontier] INFO: Populated cache with 737 frontier URIs
2017-03-23 20:26:00 [scrapy.extensions.logstats] INFO: Crawled 3055 pages (at 424 pages/min), scraped 113 items (at 17 items/min)
2017-03-23 20:26:28 [anansi.dao.frontier] INFO: ['infocenter.arm.com', 'aclanthology.info', 'events.cornell.edu', 'dblp.uni-trier.de
Time: 361.734 ms
s2crawler=> explain analyze SELECT frontier_uri_id FROM frontier_uri TABLESAMPLE BERNOULLI(0.01) WHERE ( started IS NULL OR (completed IS NULL AND started < now() - interval '3 hours'));
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Sample Scan on frontier_uri (cost=0.00..235476.19 rows=148 width=4) (actual time=3.139..349.858 rows=93 loops=1)
  Sampling: bernoulli ('0.01'::real)
  Filter: ((started IS NULL) OR ((completed IS NULL) AND (started < (now() - '03:00:00'::interval))))
  Rows Removed by Filter: 1105
Planning time: 0.056 ms
Execution time: 350.009 ms
2017-03-21 22:04:37 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2017-03-21 22:04:59 [anansi.dao.frontier] INFO: Dequeuing batch of frontier URIs; frontier size 91134; select using {TABLESAMPLE BERNOULLI(1.0972853161278997)}
2017-03-21 22:04:59 [anansi.dao.frontier] INFO: Populated cache with 1005 frontier URIs
2017-03-21 22:04:59 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2017-03-21 22:05:07 [anansi.dao.frontier] INFO: Dequeuing batch of frontier URIs; frontier size 89157; select using {TABLESAMPLE BERNOULLI(1.1216169229561335)}
2017-03-21 22:05:08 [anansi.dao.frontier] INFO: Populated cache with 943 frontier URIs
2017-03-21 22:05:42 [anansi.dao.frontier] INFO: Dequeuing batch of frontier URIs; frontier size 81513; select using {TABLESAMPLE BERNOULLI(1.2267981794315017)}
2017-03-21 22:05:42 [anansi.dao.frontier] INFO: Populated cache with 985 frontier URIs
2017-03-21 22:06:06 [scrapy.extensions
milesc@torre 0 tmp-datasets ↠ grep -h 8677da2812971b454c26c29994cabd8dbf72aebf * | jq -S .
{
  "facets": [
    {
      "facetType": "dataset",
      "values": [
        "Penn Treebank",
        "QuestionBank"
      ]
    }