Gene Moo Lee gmlee7

View GitHub Profile
@gmlee7
gmlee7 / config.dim50.json
Last active August 18, 2017 21:08
edgar-demo-data
{
"embeddings": [
{
"tensorPath": "https://gist.githubusercontent.com/gmlee7/59c3341ec8754640fe5959a7fef2b2bd/raw/66abca972fda925996be3aad5d34d54a3386879f/corpus.raw.txt.2016-2016-partI.doc2vec.dim50_tensor.tsv",
"tensorName": "2016-2016-partI",
"tensorShape": [
5797,
50
],
"metadataPath": "https://gist.githubusercontent.com/gmlee7/59c3341ec8754640fe5959a7fef2b2bd/raw/66abca972fda925996be3aad5d34d54a3386879f/corpus.raw.txt.2016-2016-partI.doc2vec.dim50_metadata.tsv"
This file has been truncated, but you can view the full file.
-1.13075 -0.580338 1.03365 0.956848 0.689389 -2.56415 -0.693136 -3.04627 -0.49323 -1.63306 1.67732 -0.982112 -0.127652 -2.21246 -1.8522 -0.130637 -1.45359 0.966819 3.02773 1.24529 2.85043 -1.44317 0.05786 -0.746715 -0.542937 0.323487 -0.23958 -0.961994 1.51797 -0.086841 -1.66017 -1.31595 0.4865 0.167041 -0.014215 -1.56639 1.17892 -1.67358 0.100089 0.446065 0.485189 -0.953124 -1.84499 -0.771531 -1.71042 -0.339266 1.35656 1.34879 2.94956 -1.70595
-0.783131 -0.765064 1.20543 -2.19484 0.234648 -2.59824 -0.173639 0.623266 -1.06685 -1.76689 0.139137 -0.116113 -2.21292 -3.38141 -1.08288 -2.93424 1.21643 2.25994 0.125711 0.02393 -0.197829 -3.64882 -0.474027 0.141369 -2.27086 -1.35961 -0.975697 2.64426 1.71782 1.15025 -3.85508 -3.19708 1.85639 2.37079 -0.493929 0.285395 -0.411832 -1.47355 0.527195 1.10925 0.260974 -1.5217 -2.71264 -1.37408 -0.395547 0.306541 1.24305 0.496051 1.49822 -3.55819
-1.01449 -2.53211 1.11097 -2.0256 1.95391 -0.992454 -2.29612 -1.07925 0.163381 -0.938077 0.522775 -1.44567 -0.754829 -3.59687 1.
DOC-ID	NAME	CIK	YEAR
FERRO CORP-2016	FERRO CORP	35214	2016
MYLAN N.V.-2016	MYLAN N.V.	1623613	2016
GAIN CAPITAL HOLDINGS, INC.-2016	GAIN CAPITAL HOLDINGS, INC.	1444363	2016
CATCHMARK TIMBER TRUST, INC.-2016	CATCHMARK TIMBER TRUST, INC.	1341141	2016
CAPSTONE TURBINE CORP-2016	CAPSTONE TURBINE CORP	1009759	2016
MOBILE MINI INC-2016	MOBILE MINI INC	911109	2016
CANANDAIGUA BRANDS INC-2016	CANANDAIGUA BRANDS INC	16918	2016
OGL HOLDINGS LTD.-2016	OGL HOLDINGS LTD.	1634421	2016
AMERICAN FARMLAND CO-2016	AMERICAN FARMLAND CO	1474777	2016
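The config entry above declares a `tensorShape` of rows by dimensions, and pairs a tensor TSV with a metadata TSV that starts with a header row. The following is a minimal sketch, assuming both files are tab-separated and that the metadata has one row per vector after its header, of how that declared shape could be checked against the data. The function name `check_embedding`, the toy three-dimensional vectors, and the inline sample strings are hypothetical stand-ins for the real files, which are not downloaded here.

```python
import csv
import io
import json


def check_embedding(entry, tensor_text, metadata_text):
    """Compare a tensor TSV and a metadata TSV against one config entry.

    Returns the metadata header and a list of human-readable problems
    (an empty list means the files agree with the declared tensorShape).
    """
    rows, dims = entry["tensorShape"]

    # One vector per non-empty line, values separated by tabs.
    vectors = [
        [float(x) for x in line.split("\t")]
        for line in tensor_text.splitlines()
        if line.strip()
    ]

    # Assumption: multi-column metadata begins with a header row.
    reader = csv.reader(
        io.StringIO(metadata_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    header = next(reader)
    labels = [row for row in reader if row]

    problems = []
    if len(vectors) != rows:
        problems.append(f"expected {rows} vectors, found {len(vectors)}")
    bad = [i for i, v in enumerate(vectors) if len(v) != dims]
    if bad:
        problems.append(f"rows with wrong dimension: {bad}")
    if len(labels) != len(vectors):
        problems.append(
            f"{len(labels)} metadata rows for {len(vectors)} vectors"
        )
    return header, problems


# Hypothetical toy config: same keys as the gist, but a 2 x 3 tensor.
config = json.loads(
    '{"embeddings": [{"tensorName": "toy", "tensorShape": [2, 3]}]}'
)
tensor_text = "0.1\t0.2\t0.3\n-1.0\t0.5\t2.0\n"
metadata_text = (
    "DOC-ID\tNAME\tCIK\tYEAR\n"
    "FERRO CORP-2016\tFERRO CORP\t35214\t2016\n"
    "MYLAN N.V.-2016\tMYLAN N.V.\t1623613\t2016\n"
)

header, problems = check_embedding(
    config["embeddings"][0], tensor_text, metadata_text
)
print(header, problems)
```

For the real data, the two strings would come from the files named by `tensorPath` and `metadataPath`; a mismatch between the declared shape and the actual row count shows up as an entry in `problems` rather than as an exception.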
{
"embeddings": [
{
"tensorName": "riskfactors.firm.year2016",
"tensorShape": [
5320,
50
],
"tensorPath": "https://gist.githubusercontent.com/gmlee7/4543ae027ff2119060b735be6595a82b/raw/5bc9b97744aeef26fe944b33768af6997ee55ae4/gistfile1.txt",
"metadataPath": "https://gist.githubusercontent.com/gmlee7/109ce810b9ed0d0aa2b7be88217cb38c/raw/8cec8156e3e8c2de7cacd972fc41f3daa0884b07/gistfile1.txt"
0 26 26 2 25 1 1 1 2 1 1 1 1 1 2 2 2 1 1 2 1 1 2 1 2 2 1 1 1 1 2 5 2 2 2 3 3 3 2 12 12 12 1 1 2 1 2 1 1 1 2 2 2 2 1 31 1 1 1 1 1 1 1 1 1 229 1 1 1 1 5 5 1 2 2 1 2 2 2 5 3 1 1 1 1 1 1 1 1 2 2 2 1 2 2 5 1 1 1 2 5 4 4 1 1 2 1 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 7 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 4 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 1 135 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 0 0 0 0 0 0 0 0 0 0 0 0 56 0 25 0 25 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 377 377 77 57 57 57 59 0 3 20 165 31 33 2 2
0 2 1 1 1 4 144 130 3 2 2 2 2 12 1 1 3 3 2 1 2 1 1 1 1 1 1 2 8 1 1 9 1 3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 2 2 1 1 1 2 262 218 216 308 220 269 269 2 1 1 1 1 1 1 1 2 2 2 2 2 2 2 4 6 6 6 9 8 7 7 7 6 6 8 20 18 18 17 4 3 3 4 3 3 4 4 4 5 5 6 6 6 6 6 5 5 5 5 5 8 5 5 8
{
"embeddings": [
{
"tensorName": "ASN CBL VOLUME 2015",
"tensorShape": [
5443,
365
],
"tensorPath": "https://gist.githubusercontent.com/gmlee7/57328cb9b21072d2eeed68d1a6420a72/raw/75a8a9f42700376cdc31af28e3a72473e346076c/asn_date_volume.tsv",
"metadataPath": "https://gist.githubusercontent.com/gmlee7/e936ce805bdaf7996357cc877f22b977/raw/0783c44a128fcd2da002f2e98a716b0afd4c6b80/asn_meta.tsv"
asn	org
1	Level 3 Communications-AS1
2	University of Delaware-AS2
3	Massachusetts Institute of Technology-AS3
4	University of Southern California-AS4
5	Symbolics Inc.-AS5
8	Rice University-AS8
9	Carnegie Mellon University-AS9
10	CSNET Coordination and Information Center (CSNET-CIC)-AS10
12	New York University-AS12
1.86794 -1.51905 2.31601 0.602272 -1.03752 0.713474 -0.643293 1.75996 0.626688 0.886203 2.6459 1.42209 -1.49806 -0.079178 1.32911 -0.679573 2.33063 -0.379173 2.17297 2.24872 -0.572018 -0.793121 1.29554 -0.092854 1.65064 3.12232 4.09422 0.050085 0.255998 -0.892773 1.07452 0.294613 -0.4051 0.542353 -0.72601 -3.18963 2.71017 -0.601664 -1.5819 -2.88906 3.00663 -1.60644 -0.77362 1.32142 3.52996 -3.31732 1.14948 -0.137356 0.118646 0.436632 -2.02639 1.49054 -3.94762 5.71068 0.603852 0.7984 3.91732 5.1063 1.5562 -2.31397 1.17455 -0.498265 -1.388 -0.798582 1.37716 -0.452186 -2.31728 4.41573 -0.39246 -1.15122 1.03436 -2.22796 -1.85845 -3.73352 -0.037556 -0.236724 2.54365 0.345185 -0.319269 -2.9165 -4.19491 -0.791766 -0.444429 -0.914013 -1.11973 -0.129124 -4.13767 5.48107 -0.19368 0.717122 0.344019 2.95637 -1.64424 -3.91504 1.67852 0.446105 -0.948455 2.57533 -2.97522 -0.46017
-2.68952 -3.37097 -1.05213 -2.07974 -1.00746 2.06481 3.60173 -2.92695 -0.669212 -4.58963 0.201339 0.02189 -0.91823 -0.725174 0.466488 0.346949 -0.
CONFORMED-NAME	SIC	CIK	YEAR
AUTOMATIC DATA PROCESSING INC-95	7374	8670	1995
BI INC-95	7380	716629	1995
AMPLICON INC-95	7377	803016	1995
HENRY JACK & ASSOCIATES INC-95	7373	779152	1995
AMSERV HEALTHCARE INC-95	7363	78302	1995
CPT HOLDINGS INC-95	7373	25360	1995
BESTWAY RENTAL INC-95	7359	4344	1995
DATAPOINT CORP-95	7373	205239	1995
STERLING SOFTWARE INC-95	7372	716714	1995