curl -g --location --header 'Accept: application/x-bibtex' "http://dx.doi.org/10.1651/0278-0372(2005)025[0159:GR]2.0.CO;2" > test.txt
RETURNS
<h1>Internal Server Error</h1>
(I've encountered about 91 DOIs that appear to give this error)
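The DOI above contains parentheses, square brackets, a colon and a semicolon, which is why the curl call needs `-g`. A minimal Python sketch of the same request is below. It percent-encodes the DOI before appending it to the resolver URL, which rules out client-side quoting as a cause. The helper names `doi_url` and `bibtex_request` are hypothetical, and this is not known to fix the error: the 500 response may well come from the server side.

```python
from urllib.parse import quote
from urllib.request import Request

RESOLVER = "http://dx.doi.org/"  # same resolver as the curl command above


def doi_url(doi, resolver=RESOLVER):
    """Build a resolver URL, percent-encoding everything except '/'."""
    return resolver + quote(doi, safe="/")


def bibtex_request(doi):
    """Prepare (but do not send) a content-negotiation request for BibTeX."""
    return Request(doi_url(doi), headers={"Accept": "application/x-bibtex"})


doi = "10.1651/0278-0372(2005)025[0159:GR]2.0.CO;2"
url = doi_url(doi)
request = bibtex_request(doi)
print(url)
# To actually fetch: urllib.request.urlopen(request).read().decode("utf-8")
```

Running the prepared request against each of the failing DOIs and logging the HTTP status would show whether the encoded form behaves any differently from the raw one.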
I know I'm doing all types of wrong here:
Source HTML file here: http://mdpi.com/1420-3049/19/4/5150/htm
I want the text for the dc.source:
Molecules 2014, Vol. 19, Pages 5150-5162
I'm using Beautiful Soup, so it's probably best to do it in that, BUT it should also be regex-able. I can do this in bash no problem!
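One way to pull that value out, sketched with only the standard library so it runs without extra installs. The `SAMPLE` markup is an assumption about how the page declares the tag (a `<meta name="dc.source" content="...">` element) and has not been checked against the actual MDPI page. `MetaContentParser` and `extract_meta` are hypothetical names.

```python
from html.parser import HTMLParser


class MetaContentParser(HTMLParser):
    """Collect the content attribute of <meta> tags with a given name."""

    def __init__(self, wanted="dc.source"):
        super().__init__()
        self.wanted = wanted.lower()
        self.values = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <meta ... />
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == self.wanted:
            self.values.append(attr.get("content"))


def extract_meta(html, name="dc.source"):
    """Return the first matching meta content, or None if absent."""
    parser = MetaContentParser(name)
    parser.feed(html)
    return parser.values[0] if parser.values else None


# Assumed markup, for illustration only
SAMPLE = (
    '<html><head><meta name="dc.source" '
    'content="Molecules 2014, Vol. 19, Pages 5150-5162" /></head></html>'
)
print(extract_meta(SAMPLE))
```

With Beautiful Soup the equivalent lookup would be `soup.find("meta", attrs={"name": "dc.source"})` followed by reading the tag's `content` attribute. A regex can work too, but it breaks if the attribute order or quoting changes.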
Thanks for your feedback Rod. I really value it.
I don't pretend to have all the answers. All of the academic content discovery
services are fairly murky about how they actually index things,
as I'm sure you know (Google Scholar perhaps being the most open-ish about how it does things?).
> how comparable are PLoS and Zootaxa from the perspective of search engines?
I am not a search engine. I am a human researcher. Whether a paper is
published in Nature, Science, PLOS ONE or Zootaxa, it is the same to me -
img1,Actinokineospora_fastidiosa,Amycolatopsis_alba_DSM_44262,Amycolatopsis_albidoflavus,Amycolatopsis_azurea,Amycolatopsis_balhimycina,Amycolatopsis_benzoatilytica,Amycolatopsis_coloradensis,Amycolatopsis_decaplanina_DSM_44594,Amycolatopsis_echigonensis,Amycolatopsis_kentuckyensis,Amycolatopsis_keratiniphila,Amycolatopsis_keratiniphila_subsp._nogabecina,Amycolatopsis_lexingtonensis,Amycolatopsis_lurida,Amycolatopsis_marina,Amycolatopsis_mediterranei,Amycolatopsis_methanolica_239,Amycolatopsis_nigrescens_CSC17Ta-90,Amycolatopsis_orientalis,Amycolatopsis_palatopharyngis,Amycolatopsis_plumensis,Amycolatopsis_regifaucium,Amycolatopsis_rifamycinica,Amycolatopsis_rubida,Amycolatopsis_saalfeldensis,Amycolatopsis_sacchari,Amycolatopsis_sp.,Amycolatopsis_sulphurea,Amycolatopsis_taiwanensis,Amycolatopsis_thermoflava_N1165,Amycolatopsis_tolypomycina,Amycolatopsis_vancoresmycina,Prauserella_rugosa
img2,Antarctobacter_heliothermus,Donghicola_eburneus,Jannaschia_helgolandensis,Ketogulonicigenium_vulgare,Loktanella_salsila
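The rows in this file have different numbers of fields (34 on the first line, 6 on the second), so tools that expect a rectangular table will reject it. A minimal sketch for reading it anyway is below. It assumes that the first field on each line is a group identifier and the rest are its members. The function name `read_ragged` and the shortened `SAMPLE` are illustrative only.

```python
import csv
import io


def read_ragged(text):
    """Map the first field of each row to the list of remaining fields."""
    groups = {}
    for row in csv.reader(io.StringIO(text)):
        if not row:  # skip blank lines
            continue
        groups[row[0]] = row[1:]
    return groups


# Shortened sample taken from the two rows above
SAMPLE = (
    "img1,Actinokineospora_fastidiosa,Amycolatopsis_azurea\n"
    "img2,Antarctobacter_heliothermus,Donghicola_eburneus,"
    "Jannaschia_helgolandensis\n"
)

groups = read_ragged(SAMPLE)
for name, members in groups.items():
    print(name, len(members))
```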
line_number | msg | _id | _full_text | occurrenceID | catalogNumber | scientificName | scientificNameAuthorship | typeStatus | locality | country | waterBody | expedition | recordedBy | collectionCode | kingdom | phylum | class | order | family | genus | subgenus | specificEpithet | infraspecificEpithet | higherClassification | taxonRank | stateProvince | continent | island | islandGroup | higherGeography | habitat | decimalLongitude | decimalLatitude | geodeticDatum | georeferenceProtocol | maxError | verbatimLongitude | verbatimLatitude | minimumElevationInMeters | maximumElevationInMeters | minimumDepthInMeters | maximumDepthInMeters | recordNumber | individualCount | lifeStage | sex | preparations | identifiedBy | dateIdentified | identificationQualifier | eventTime | day | month | year | earliestEonOrLowestEonothem | latestEonOrHighestEonothem | earliestEraOrLowestErathem | latestEraOrHighestErathem | earliestPeriodOrLowestSystem | latestPeriodOrHighestSystem | earliestEpochOrLowestSeries | latestEpochOrHighestSeries | earliestAgeOrLowestStage | latestAgeOrHighestStage | lowestBiostratigraphicZone | highestBiostratigraphicZone | group |
---|
http://rsbl.royalsocietypublishing.org/content/suppl/2008/12/08/2.1.101.DC1/rsbl20050379supp1.avi
http://rsbl.royalsocietypublishing.org/content/suppl/2008/12/08/2.1.101.DC1/rsbl20050379supp2.avi
http://rsbl.royalsocietypublishing.org/content/suppl/2008/12/08/2.1.113.DC1/rsbl20050374supp.pdf
http://rsbl.royalsocietypublishing.org/content/suppl/2008/12/08/2.1.116.DC1/rsbl20050406supp.pdf
http://rsbl.royalsocietypublishing.org/content/suppl/2008/12/08/2.1.125.DC1/rsbl20050378supp1.pdf
http://rsbl.royalsocietypublishing.org/content/suppl/2008/12/08/2.1.128.DC1/rsbl20050386supp.pdf
http://rsbl.royalsocietypublishing.org/content/suppl/2008/12/08/2.1.131.DC1/rsbl20050388supp.pdf
http://rsbl.royalsocietypublishing.org/content/suppl/2008/12/08/2.1.140.DC1/rsbl20050390supp.pdf
http://rsbl.royalsocietypublishing.org/content/suppl/2008/12/08/2.1.148.DC1/rsbl20050397supp.pdf
http://rsbl.royalsocietypublishing.org/content/suppl/2008/12/08/2.1.152.DC1/rsbl20050401supp.pdf
(((((EU379932:195.0,EU840722:135.0)NT1.25:86.0,EU840723:301.0)NT1.17:131.0,(NC_000853:251.0,NC_000918:364.0)NT1.15:106.0)NT1.5:12.0,(NC_009525:223.0,NC_004307:333.0)NT1.19:158.0)NT1.2:10.0,((NC_010571:441.0,(((Desulfo:295.0,((((ES:140.0,U32697:164.0)NT1.28:63.0,NC_OO2516:165.0)NT1.22:24.0,(NC_002929:183.0,NC_002946:185.0)NT1.26:55.0)NT1.21:78.0,(NC_009667:186.0,NC_002696:194.0)NT1.23:105.0)NT1.14:56.0)NT1.10:21.0,(NC_001218:279.0,NC_002967:302.0)NT1.16:107.0)NT1.8:8.0,((AEO15924:189.0,NC_003228:191.0)NT1.27:187.0,NC_OO2932:343.0)NT1.12:55.0)NT1.6:11.0,NC_005027:497.0)NT1.3:6.0,((NC_006576:348.0,(AJ307978:124.0,AJ307974:167.0)NT1.24:205.0)NT1.7:12.0,(NC_000912:484.0,((M94261:260.0,NC_010376:320.0)NT1.13:33.0,((Lactoba:157.0,NC_009785:181.0)NT1.20:21.0,NC_00964:152.0)NT1.18:91.0)NT1.11:22.0)NT1.9:32.0)NT1.4:11.0)NT1.1:9.0)NT1.60;
(((AACO7677:305.0,AAD36645:324.0)NT1.5:25.0,DU723069:293.0)NT1.2:11.0,(BAD79413:350.0,((ABY33776:363.0,(((((AAGO7791:238.0,(NP_285794:178.0,ABROO457:214.0)NT1.22:53.0)NT1.20:37.0,(AAW8
Apache Maven 3.0.5
Maven home: /usr/share/maven
Java version: 1.7.0_79, vendor: Oracle Corporation
Java home: /usr/lib/jvm/java-7-openjdk-amd64/jre
Default locale: en_GB, platform encoding: UTF-8
OS name: "linux", version: "3.19.0-25-generic", arch: "amd64", family: "unix"
[INFO] Error stacktraces are turned on.
[DEBUG] Reading global settings from /usr/share/maven/conf/settings.xml
[DEBUG] Reading user settings from /home/ross/.m2/settings.xml
[DEBUG] Using local repository at /home/ross/.m2/repository
FileName TipsInFile EmptyTips ExactMatchEGIDs SpacesInParentheses noEGID
02565-0-001 16 7 9 0 7
025668-0-000 12 2 7 3 2
025668-0-001 12 2 0 0 12
02567-0-000 21 21 0 0 21
025700-0-000 0 0 0 0 0
025718-0-000 30 7 18 5 8
025742-0-001 26 1 23 1 1
02576-0-000 21 0 0 0 21
02576-0-001 17 1 13 3 1
Apache Maven 3.0.5
Maven home: /usr/share/maven
Java version: 1.7.0_79, vendor: Oracle Corporation
Java home: /usr/lib/jvm/java-7-openjdk-amd64/jre
Default locale: en_GB, platform encoding: UTF-8
OS name: "linux", version: "3.19.0-26-generic", arch: "amd64", family: "unix"
[INFO] Error stacktraces are turned on.
[DEBUG] Reading global settings from /usr/share/maven/conf/settings.xml
[DEBUG] Reading user settings from /home/ross/.m2/settings.xml
[DEBUG] Using local repository at /home/ross/.m2/repository