Laurent Mazouer TheFifthFreedom

# -*- coding: utf-8 -*-
import csv
import json
import time
import requests
class Scraper:
    def __init__(self):
        self.output_file = csv.writer(open('venmo_output.csv', 'a'))
www.forbes.com
www.lhw.com
www.telegraph.co.uk
www.tripadvisor.com
amp.cnn.com
www.slh.com
www.businessinsider.com
travel.usnews.com
www.cntraveler.com
www.therichest.com
-9.02873e-05 0.000460889 0.00070722 -6.03205e-06 -0.00128882 -0.000153795 -0.00113259 0.000370917 -3.74205e-05 0.000721245 -0.000355268 -1.1218e-05 -0.000897583 0.00103905 -0.00063548 -0.00146889 0.00100931 0.00135221 0.000167402 0.000124683 0.000799039 -0.00113171 8.81382e-06 0.00111264 5.82179e-05 -1.23595e-05 0.00164752 4.91173e-05 0.00086248 0.000845138 -0.000185453 0.000206807 -0.000127143 0.00101563 0.000810542 0.000661416 -0.000618637 -0.000691165 -0.000531557 0.000461022 0.000717136 -0.000543157 -0.00161385 0.00130947 0.00135343 -0.000894406 -0.000710941 -0.00102623 -0.000441068 -0.00139023 -0.0014104 -0.00135954 -0.000556536 -0.00056319 -0.000807459 0.00160377 -0.000269869 0.00146824 0.000372397 0.000344375 0.000634506 0.000730831 0.00165912 -0.00126886 -0.000586241 -0.00163661 0.000729063 0.000384274 -0.000636633 -0.00114241 -0.000122818 -0.00131895 -0.000508028 0.000963727 0.00048409 -0.000598821 0.000358109 -0.000901013 0.00144163 -0.00153063 -0.00126841 0.00163606 4.94332e-06 0.000785541 -0.00051
{
"embeddings": [
{
"tensorName": "My tensor",
"tensorShape": [
1000,
50
],
"tensorPath": "https://cdn.rawgit.com/TheFifthFreedom/305f2cd7ced49b19fa5e68acf75f465a/raw/e0abd97902e7b203e304b58f3d6ec77e42372f68/luxury_hotels_docvec_6338_300d_tensors.tsv",
"metadataPath": "https://cdn.rawgit.com/TheFifthFreedom/c236966932dc129638197ba30a881d6b/raw/78e1f77a6cadf6233dc778a3b45c0a86e1a66822/luxury_hotels_docvec_6338_300d_labels.tsv"
www.forbes.com
www.telegraph.co.uk
www.lhw.com
www.tripadvisor.com
amp.cnn.com
www.slh.com
www.businessinsider.com
travel.usnews.com
www.cntraveler.com
www.therichest.com
-0.000369644 -0.000288173 0.00134783 0.00165578 0.000235155 0.000408636 0.000440371 -0.000243116 0.000756231 0.00121602 0.00161962 -0.00162696 0.000500214 -0.00136171 -0.000106975 0.000311875 -0.00138724 0.000167888 -0.0014182 0.00146102 0.00153973 0.000516921 0.000585931 0.0010963 -0.00039045 0.00052218 0.000741193 0.00152278 0.000661572 -0.000955456 -0.000210887 0.000248254 0.00075336 0.000533114 -0.000609706 -1.24527e-05 0.0011874 0.00061945 -0.000578053 -0.000762837 -0.000775256 -0.000832863 0.000754967 7.2874e-05 -0.00153767 0.00138385 0.00090131 -0.000254997 0.00133746 -0.000810086 0.000509438 -0.000653424 0.00144265 0.00103896 -0.00142288 0.000306969 -0.00117945 -0.0016188 0.00122307 -0.00131256 0.000699665 0.00127923 -0.000440024 0.000663851 -0.000891284 0.000746215 -0.00139226 0.000295705 -0.00088486 0.00139131 -0.00103494 0.000780231 -0.000573236 -0.000779618 -0.00115155 -0.0014142 -0.000392127 -4.80331e-05 -0.000199087 -0.00116402 -0.00161901 -0.000284641 0.00011943 0.000463086 -0.000493529 -0.0012
{
"embeddings": [
{
"tensorName": "Luxury hotels (reduced)",
"tensorShape": [
1000,
50
],
"tensorPath": "https://cdn.rawgit.com/TheFifthFreedom/2c1b7e663f8bd4edeab9f6f83de0f523/raw/546b47fc563658cd0e2cd4978ae0474d2abf7a34/luxury_hotels_docvec_reduced_tensors.tsv",
"metadataPath": "https://cdn.rawgit.com/TheFifthFreedom/d24b7ca2fb7ddba83f17df47d232946a/raw/b5ac26d5e3d71e02ea787dbee1934254495f193b/luxury_hotels_docvec_reduced_labels.tsv"
www.lhw.com
www.forbes.com
www.tripadvisor.com
www.telegraph.co.uk
www.slh.com
travel.usnews.com
www.therichest.com
super-top10.blogspot.com
www.tablethotels.com
www.cntraveler.com
-0.0122734 -0.173377 -0.0453558 -0.00932097 0.0855966 -0.107086 0.138757 -0.0558036 0.106451 -0.0324574 -0.0238968 0.0936562 -0.129498 0.0650235 -0.0892725 -0.015831 0.129788 -0.0365706 -0.054816 0.0414325 -0.111601 -0.0769558 0.0231551 -0.0158742 -0.131847 -0.0142021 0.170007 -0.117543 0.173883 0.0749348 -0.046554 0.0903936 -0.0905629 -0.0953943 0.1089 0.126874 0.0235336 0.0978232 0.040172 -0.0356636 0.00540761 -0.0823187 0.190579 0.0643644 0.116926 0.0897044 -0.0604489 -0.0529965 0.0791858 0.0528981 -0.0122716 0.0566183 0.0962952 -0.152099 -0.102654 0.0038059 0.0119423 -0.152239 -0.0471654 -0.168128 0.130222 0.0654599 0.0507467 0.0516565 -0.00125144 -0.0404749 -0.109948 0.200288 -0.0178853 0.0234887 -0.0647822 0.16049 0.0178571 -0.0130833 0.177415 0.025955 -0.0603214 -0.00307175 -0.0420683 -0.191315 -0.0380536 0.12837 -0.0569703 -0.0775312 -0.0251834 0.0895255 0.00505476 0.00971237 0.03072 -0.0207849 -0.0753884 0.0304028 -0.10805 0.102407 0.139847 0.0210223 -0.0715144 0.0821832 -0.0371126 0.00635285 0.02637
{
"embeddings": [
{
"tensorName": "Luxury hotels Top 44",
"tensorShape": [
1000,
50
],
"tensorPath": "https://cdn.rawgit.com/TheFifthFreedom/8034d11d65c801cf44ea15391614a3ac/raw/f8e53b1ae5a788ebbb8af2694c66d1e27e8bfe3a/luxury_hotels_top44_tensors.tsv",
"metadataPath": "https://cdn.rawgit.com/TheFifthFreedom/f5b80735db0f87081d597d645bbaa96a/raw/be7f161ecb6091dd1fa429bd1cce3a7342d30b69/luxury_hotels_top44_labels.tsv"
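
The config previews in this listing share one shape: an `embeddings` list whose entries carry `tensorName`, `tensorShape`, `tensorPath` and `metadataPath`. Below is a minimal sketch of generating such a file in Python. The helper name `build_projector_config` and the `example.com` URLs are placeholders and do not come from the gists. Because the previews are cut off, the sketch also assumes each entry and the list simply close after `metadataPath`.

```python
import json


def build_projector_config(tensor_name, tensor_shape, tensor_path, metadata_path):
    """Return a dict with the same keys as the config previews in this listing.

    Hypothetical helper for illustration; not part of the original gists.
    """
    return {
        "embeddings": [
            {
                "tensorName": tensor_name,
                # [rows, columns] of the tensor TSV file
                "tensorShape": list(tensor_shape),
                "tensorPath": tensor_path,
                "metadataPath": metadata_path,
            }
        ]
    }


# Placeholder values; substitute the raw URLs of your own TSV files.
config = build_projector_config(
    "Example tensor",
    (1000, 50),
    "https://example.com/tensors.tsv",
    "https://example.com/labels.tsv",
)
config_json = json.dumps(config, indent=2)
print(config_json)
```

Building the dict first and serializing it with `json.dumps` keeps the brackets balanced automatically, which is easy to get wrong when editing these files by hand.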