Gene Moo Lee gmlee7

View GitHub Profile
We can't make this file beautiful and searchable because it's too large.
24.0 29.0 29.0 28.0 28.0 28.0 28.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 2.0 2.0 2.0 2.0 2.0 2.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 6.0 14.0 14.0 14.0 14.0 14.0 14.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 2.0 2.0 2.0 2.0 2.0 1.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 0.0 0.0 0.0 4.0 4.0 4.0 4.0 4.0 4.0 4.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0
gmlee7 / gist:d3cf475a065685d786e188a82173327d
Created September 28, 2017 23:46
corpus.raw.txt.1995-2015-73-partI.doc2vec.config.json
{
  "embeddings": [
    {
      "tensorName": "1995-2015 SIC73 PartI",
      "tensorShape": [
        9067,
        100
      ],
      "tensorPath": "https://gist.githubusercontent.com/gmlee7/c61ca2d7a7d59dc05842511a74502a04/raw/c433041ab4262110fb0faa4a88ddae4c4d4d0a97/corpus.raw.txt.1995-2015-73-partI.doc2vec.dim100_tensor.tsv",
      "metadataPath": "https://gist.githubusercontent.com/gmlee7/09085e93bedf6bd118c058f80aeb97e8/raw/428eb1bfc431540150fa214314473e342c5aaa42/corpus.raw.txt.1995-2015-73-partI.doc2vec.dim100_metadata.tsv"
CONFORMED-NAME	SIC	CIK	YEAR
AUTOMATIC DATA PROCESSING INC-95	7374	8670	1995
BI INC-95	7380	716629	1995
AMPLICON INC-95	7377	803016	1995
HENRY JACK & ASSOCIATES INC-95	7373	779152	1995
AMSERV HEALTHCARE INC-95	7363	78302	1995
CPT HOLDINGS INC-95	7373	25360	1995
BESTWAY RENTAL INC-95	7359	4344	1995
DATAPOINT CORP-95	7373	205239	1995
STERLING SOFTWARE INC-95	7372	716714	1995
1.86794 -1.51905 2.31601 0.602272 -1.03752 0.713474 -0.643293 1.75996 0.626688 0.886203 2.6459 1.42209 -1.49806 -0.079178 1.32911 -0.679573 2.33063 -0.379173 2.17297 2.24872 -0.572018 -0.793121 1.29554 -0.092854 1.65064 3.12232 4.09422 0.050085 0.255998 -0.892773 1.07452 0.294613 -0.4051 0.542353 -0.72601 -3.18963 2.71017 -0.601664 -1.5819 -2.88906 3.00663 -1.60644 -0.77362 1.32142 3.52996 -3.31732 1.14948 -0.137356 0.118646 0.436632 -2.02639 1.49054 -3.94762 5.71068 0.603852 0.7984 3.91732 5.1063 1.5562 -2.31397 1.17455 -0.498265 -1.388 -0.798582 1.37716 -0.452186 -2.31728 4.41573 -0.39246 -1.15122 1.03436 -2.22796 -1.85845 -3.73352 -0.037556 -0.236724 2.54365 0.345185 -0.319269 -2.9165 -4.19491 -0.791766 -0.444429 -0.914013 -1.11973 -0.129124 -4.13767 5.48107 -0.19368 0.717122 0.344019 2.95637 -1.64424 -3.91504 1.67852 0.446105 -0.948455 2.57533 -2.97522 -0.46017
-2.68952 -3.37097 -1.05213 -2.07974 -1.00746 2.06481 3.60173 -2.92695 -0.669212 -4.58963 0.201339 0.02189 -0.91823 -0.725174 0.466488 0.346949 -0.
asn	org
1	Level 3 Communications-AS1
2	University of Delaware-AS2
3	Massachusetts Institute of Technology-AS3
4	University of Southern California-AS4
5	Symbolics Inc.-AS5
8	Rice University-AS8
9	Carnegie Mellon University-AS9
10	CSNET Coordination and Information Center (CSNET-CIC)-AS10
12	New York University-AS12
{
  "embeddings": [
    {
      "tensorName": "ASN CBL VOLUME 2015",
      "tensorShape": [
        5443,
        365
      ],
      "tensorPath": "https://gist.githubusercontent.com/gmlee7/57328cb9b21072d2eeed68d1a6420a72/raw/75a8a9f42700376cdc31af28e3a72473e346076c/asn_date_volume.tsv",
      "metadataPath": "https://gist.githubusercontent.com/gmlee7/e936ce805bdaf7996357cc877f22b977/raw/0783c44a128fcd2da002f2e98a716b0afd4c6b80/asn_meta.tsv"
0 26 26 2 25 1 1 1 2 1 1 1 1 1 2 2 2 1 1 2 1 1 2 1 2 2 1 1 1 1 2 5 2 2 2 3 3 3 2 12 12 12 1 1 2 1 2 1 1 1 2 2 2 2 1 31 1 1 1 1 1 1 1 1 1 229 1 1 1 1 5 5 1 2 2 1 2 2 2 5 3 1 1 1 1 1 1 1 1 2 2 2 1 2 2 5 1 1 1 2 5 4 4 1 1 2 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 7 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 4 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 135 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 0 0 0 0 0 0 0 0 0 0 0 0 56 0 25 0 25 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 377 377 77 57 57 57 59 0 3 20 165 31 33 2 2
0 2 1 1 1 4 144 130 3 2 2 2 2 12 1 1 3 3 2 1 2 1 1 1 1 1 1 2 8 1 1 9 1 3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 2 2 1 1 1 2 262 218 216 308 220 269 269 2 1 1 1 1 1 1 1 2 2 2 2 2 2 2 4 6 6 6 9 8 7 7 7 6 6 8 20 18 18 17 4 3 3 4 3 3 4 4 4 5 5 6 6 6 6 6 5 5 5 5 5 8 5 5 8
{
  "embeddings": [
    {
      "tensorName": "riskfactors.firm.year2016",
      "tensorShape": [
        5320,
        50
      ],
      "tensorPath": "https://gist.githubusercontent.com/gmlee7/4543ae027ff2119060b735be6595a82b/raw/5bc9b97744aeef26fe944b33768af6997ee55ae4/gistfile1.txt",
      "metadataPath": "https://gist.githubusercontent.com/gmlee7/109ce810b9ed0d0aa2b7be88217cb38c/raw/8cec8156e3e8c2de7cacd972fc41f3daa0884b07/gistfile1.txt"
DOC-ID	NAME	CIK	YEAR
FERRO CORP-2016	FERRO CORP	35214	2016
MYLAN N.V.-2016	MYLAN N.V.	1623613	2016
GAIN CAPITAL HOLDINGS, INC.-2016	GAIN CAPITAL HOLDINGS, INC.	1444363	2016
CATCHMARK TIMBER TRUST, INC.-2016	CATCHMARK TIMBER TRUST, INC.	1341141	2016
CAPSTONE TURBINE CORP-2016	CAPSTONE TURBINE CORP	1009759	2016
MOBILE MINI INC-2016	MOBILE MINI INC	911109	2016
CANANDAIGUA BRANDS INC-2016	CANANDAIGUA BRANDS INC	16918	2016
OGL HOLDINGS LTD.-2016	OGL HOLDINGS LTD.	1634421	2016
AMERICAN FARMLAND CO-2016	AMERICAN FARMLAND CO	1474777	2016
-1.13075 -0.580338 1.03365 0.956848 0.689389 -2.56415 -0.693136 -3.04627 -0.49323 -1.63306 1.67732 -0.982112 -0.127652 -2.21246 -1.8522 -0.130637 -1.45359 0.966819 3.02773 1.24529 2.85043 -1.44317 0.05786 -0.746715 -0.542937 0.323487 -0.23958 -0.961994 1.51797 -0.086841 -1.66017 -1.31595 0.4865 0.167041 -0.014215 -1.56639 1.17892 -1.67358 0.100089 0.446065 0.485189 -0.953124 -1.84499 -0.771531 -1.71042 -0.339266 1.35656 1.34879 2.94956 -1.70595
-0.783131 -0.765064 1.20543 -2.19484 0.234648 -2.59824 -0.173639 0.623266 -1.06685 -1.76689 0.139137 -0.116113 -2.21292 -3.38141 -1.08288 -2.93424 1.21643 2.25994 0.125711 0.02393 -0.197829 -3.64882 -0.474027 0.141369 -2.27086 -1.35961 -0.975697 2.64426 1.71782 1.15025 -3.85508 -3.19708 1.85639 2.37079 -0.493929 0.285395 -0.411832 -1.47355 0.527195 1.10925 0.260974 -1.5217 -2.71264 -1.37408 -0.395547 0.306541 1.24305 0.496051 1.49822 -3.55819