@widoyo
Last active March 12, 2017 05:45
MarketPlace Stats
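# -*- coding: utf-8 -*-
# Python 2 script: fetch each product page listed in produk.txt and print
# its title followed by the values from the statistics <dl> block.
from urllib2 import urlopen
from HTMLParser import HTMLParser

# One product URL per line; strip newlines and drop blank lines.
products = [line.strip() for line in open('produk.txt') if line.strip()]

STATS_START = '<dl class=\'c-deflist\''
STATS_END = '</dl>'


class StatParser(HTMLParser):
    """Collect every text node fed to the parser."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.mydata = []  # per-instance list, not shared between parsers

    def handle_data(self, data):
        self.mydata.append(data)

    def out(self):
        # Return the collected text nodes and reset for the next page.
        out = self.mydata[:]
        self.mydata = []
        return out


class TitleParser(HTMLParser):
    """Remember the most recent text node (the <h1> title)."""

    mydata = ''  # default so out() works even if no <h1> was seen

    def handle_data(self, data):
        self.mydata = data

    def out(self):
        out = self.mydata[:]
        return out


def main():
    title_parser = TitleParser()
    stat_parser = StatParser()
    for product in products:
        page = urlopen(product).read()
        page = page.split('\n')
        stats_feed = ''
        baca = False  # "baca" = "read": True while inside the stats block
        for line in page:
            if '<h1' in line:
                title_parser.feed(line)
            if line.startswith(STATS_START):
                baca = True
            if line.endswith(STATS_END):
                baca = False
            if baca:
                stats_feed += line
        stat_parser.feed(stats_feed)
        stat_parser.close()
        # Keep only the odd-indexed text nodes (apparently the values,
        # with their labels at the even indices).
        print title_parser.out(), '\t', '\t'.join([a for i, a in enumerate(stat_parser.out()) if i % 2])


if __name__ == '__main__':
    main()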
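A network-free sketch of the parsing step, for anyone who wants to try the idea without fetching pages. It is written for Python 3 (`html.parser` instead of `HTMLParser`) and takes a different approach from the script above: instead of keeping every odd-indexed text node, it tracks `<dt>`/`<dd>` tags and pairs each label with its value. The `SAMPLE` fragment, its labels, and the names `DefListParser` and `parse_stats` are made up for illustration; the real page markup may differ.

```python
from html.parser import HTMLParser


class DefListParser(HTMLParser):
    """Collect (label, value) pairs from <dt>/<dd> elements."""

    def __init__(self):
        super().__init__()
        self.pairs = []     # finished (label, value) tuples
        self._tag = None    # 'dt' or 'dd' while inside one, else None
        self._label = ''    # text of the most recent <dt>
        self._text = []     # text chunks of the element being read

    def handle_starttag(self, tag, attrs):
        if tag in ('dt', 'dd'):
            self._tag = tag
            self._text = []

    def handle_data(self, data):
        if self._tag:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return  # closing tag of a nested or unrelated element
        # Join the chunks and collapse runs of whitespace.
        text = ' '.join(''.join(self._text).split())
        if tag == 'dt':
            self._label = text
        else:
            self.pairs.append((self._label, text))
        self._tag = None


# Hypothetical fragment, not copied from a real product page.
SAMPLE = """
<dl class='c-deflist'>
  <dt>Kondisi</dt><dd>Baru</dd>
  <dt>Terjual</dt><dd> 20 </dd>
  <dt>Dilihat</dt><dd><span>100</span></dd>
</dl>
"""


def parse_stats(html):
    """Return the list of (label, value) pairs found in html."""
    parser = DefListParser()
    parser.feed(html)
    parser.close()
    return parser.pairs


if __name__ == '__main__':
    # One tab-separated row of values, in the same spirit as the script.
    print('\t'.join(value for _, value in parse_stats(SAMPLE)))
```

Pairing labels with values avoids relying on the position of text nodes, which can shift when the page contains extra whitespace or nested markup.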
https://www.bukalapak.com/p/rumah-tangga/furniture-interior/kursi-sofa/54he0t-jual-stabil-kaki-meja-kursi-d41
https://www.bukalapak.com/p/rumah-tangga/furniture-interior/meja/563sjb-jual-karet-kaki-kompor-d23_d20_t25
https://www.bukalapak.com/p/rumah-tangga/furniture-interior/meja/5hkrot-jual-karet-kaki-meja-kursi-1-1-2-in-f_dop_ry_11p2
https://www.bukalapak.com/p/rumah-tangga/furniture-interior/meja/563msq-jual-karet-kaki-kompor-d22_d16_t24-besi
https://www.bukalapak.com/p/rumah-tangga/furniture-interior/kursi-sofa/7gt93b-jual-karet-kaki-kursi-dop-2-inch-f-cop-et-2
https://www.bukalapak.com/p/rumah-tangga/home-stuff/7ld9c1-jual-1000-pcs-seal-sil-karet-tabung-gas-lpg-elpiji-hitam
https://www.bukalapak.com/p/rumah-tangga/furniture-interior/lemari-1645/7ls62j-jual-plastik-kotak-1p5x3p5-f_pktk-am-2x4
https://www.bukalapak.com/p/rumah-tangga/furniture-interior/kursi-sofa/7gtsie-jual-karet-kaki-kursi-dop-3-4-f-dop-enh-3p4
https://www.bukalapak.com/p/rumah-tangga/dapur/nvj0u-jual-seal-sil-karet-tabung-gas-lpg-elpiji-hitam-1-pack-berisi-1000-pcs
https://www.bukalapak.com/p/rumah-tangga/furniture-interior/lemari-1645/5nqoxn-jual-karet-holo-2-5-cm-x-5-cm-f_ktk_ry_2p5x5
https://www.bukalapak.com/p/rumah-tangga/furniture-interior/kursi-sofa/6boksq-jual-karet-kaki-kursi-1-in-bergaris
https://www.bukalapak.com/p/rumah-tangga/furniture-interior/meja/57nzuk-jual-stabil-kaki-kursi-dengan-nut-m6
https://www.bukalapak.com/p/rumah-tangga/furniture-interior/lain-lain-1669/3aidgb-jual-plastik-lubang-untuk-paku
https://www.bukalapak.com/p/rumah-tangga/furniture-interior/meja/4mi7vb-jual-pengaman-kaki-meja-kursi-warna-transparan-f_blt_t_d19_d13_t7
https://www.bukalapak.com/p/rumah-tangga/furniture-interior/lain-lain-1669/4rcl5q-jual-karet-kaki-holo-3-5-x-3-5
https://www.bukalapak.com/p/rumah-tangga/home-stuff/7b4vmo-jual-4-pcs-mur-nanas-m6x20-dan-baut-m6x30
https://www.bukalapak.com/p/rumah-tangga/furniture-interior/kursi-sofa/6jcehb-jual-akaret-alas-kaki-meja-atau-kursi-dari-besi-bundar-5-8-in
https://www.bukalapak.com/p/rumah-tangga/home-stuff/5iidut-jual-mur-nanas-dan-baut-m6-obeng-l
https://www.bukalapak.com/p/rumah-tangga/furniture-interior/lain-lain-1669/563gfk-jual-karet-bulat-d23_d20_t15
https://www.bukalapak.com/p/rumah-tangga/home-stuff/5ihf3k-jual-karet-kaki-tangga
https://www.bukalapak.com/p/rumah-tangga/furniture-interior/meja/7hz718-jual-karet-kaki-kursi-dop-7-8
https://www.bukalapak.com/p/rumah-tangga/furniture-interior/dekorasi-rumah/524me2-jual-karet-kaki-rak-besi-holo-10mm-x-33-mm
stabil kaki meja kursi D41 Baru 20 100 6 2017-03-12 05:31:38
karet kaki kompor D23_D20_T25 Baru 4 37 4 2017-03-12 05:31:39
karet kaki meja kursi 1 1/2 in F_DOP_RY_11P2 Baru 12 158 9 2017-03-12 09:53:34
karet kaki kompor D22_D16_T24 Besi Baru 4 122 8 2017-03-12 05:31:39
karet kaki kursi DOP 2 inch F-COP-ET-2 Baru 0 31 2 2017-03-12 05:31:46
1000 pcs Seal / Sil karet tabung gas LPG/elpiji hitam Baru 0 6 1 2017-03-12 05:31:46
plastik kotak 1p5x3p5 F_PKTK-AM-2x4 Baru 0 1 0 2017-03-12 05:31:46
karet kaki kursi DOP 3/4 F-DOP-ENH-3P4 Baru 4 52 3 2017-03-12 05:31:46
Seal / Sil karet tabung gas LPG/elpiji hitam 1 pack berisi 1000 pcs Baru 28 1314 83 2017-03-12 05:31:20
karet holo 2,5 cm x 5 cm F_KTK_RY_2P5x5 Baru 16 57 3 2017-03-12 05:31:40
karet kaki kursi 1 in bergaris Baru 7 87 6 2017-03-12 05:31:45
stabil kaki kursi dengan Nut M6 Baru 3 106 13 2017-03-12 05:31:39
plastik lubang untuk paku Baru 3 103 7 2017-03-12 05:31:34
F_BLT_T_D19_D13_T7 Baru 12 143 6 2017-03-12 05:31:35
karet kaki holo 3.5 x 3.5 Baru 258 240 14 2017-03-12 05:31:35
4 pcs Mur nanas M6x20 dan Baut M6x30 Baru 0 50 3 2017-03-12 05:31:46
akaret alas kaki meja atau kursi dari besi bundar 5/8 in Baru 0 40 2 2017-03-12 05:31:45
Mur Nanas dan baut M6 obeng L Baru 4 174 12 2017-03-12 05:31:40
karet bulat D23_D20_T15 Baru 0 35 2 2017-03-12 05:31:39
karet kaki tangga Baru 3 58 5 2017-03-12 05:31:40
karet kaki kursi DOP 7/8 Baru 0 8 0 2017-03-12 05:31:46
karet kaki rak besi holo 10mm x 33 mm Baru 143 169 12 2017-03-12 05:31:38