@javierprovecho
Forked from jorgesancha/python_code_test_carto.md
Last active May 25, 2017 20:11
Python code test - CARTO

Build the following and make it run as fast as you possibly can using Python 3 (vanilla). The faster it runs, the more you will impress us!

Your code should:

All of that in the most efficient way you can come up with.

That's it. Make it fly!


Results

Code for downloading file

import urllib.request

# Python 3: urlretrieve lives in urllib.request (urllib.URLopener is the Python 2 API)
urllib.request.urlretrieve(
    "https://s3.amazonaws.com/carto-1000x/data/yellow_tripdata_2016-01.csv",
    "yellow_tripdata_2016-01.csv")

(NOTE: I left the download block out of the benchmark because its duration depends on network speed.)

Time for requirements 2 and 3

root@ubuntu-1gb-fra1-01:~# time python3 main.py
10906858 1.7506631158122512

real	0m40.243s
user	0m37.004s
sys	0m2.140s
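
As a rough reading of the timing above, the figures can be turned into a throughput and a CPU-utilisation estimate. This is only a back-of-the-envelope sketch; the variable names are illustrative and the numbers are copied from the run shown above.

```python
# Figures copied from the `time python3 main.py` output above
rows = 10906858      # first number printed by main.py
real_s = 40.243      # wall-clock time
user_s = 37.004      # CPU time in user space
sys_s = 2.140        # CPU time in the kernel

rows_per_second = rows / real_s
cpu_fraction = (user_s + sys_s) / real_s

print(f"{rows_per_second:,.0f} rows/s")            # roughly 271,000 rows per second
print(f"{cpu_fraction:.1%} of wall time on CPU")   # roughly 97%
```

With user plus sys time so close to the wall-clock time, the run appears to be limited by CPU work rather than by waiting on disk.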

Box details

OS

root@ubuntu-1gb-fra1-01:~# uname -a
Linux ubuntu-1gb-fra1-01 4.4.0-78-generic #99-Ubuntu SMP Thu Apr 27 15:29:09 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

CPU

root@ubuntu-1gb-fra1-01:~# cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 79
model name	: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
stepping	: 1
microcode	: 0x1
cpu MHz		: 2199.998
cache size	: 30720 KB
physical id	: 0
siblings	: 1
core id		: 0
cpu cores	: 1
apicid		: 0
initial apicid	: 0
fpu		: yes
fpu_exception	: yes
cpuid level	: 13
wp		: yes
flags		: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon rep_good nopl eagerfpu pni pclmulqdq vmx ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch vnmi ept fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm xsaveopt arat
bugs		:
bogomips	: 4399.99
clflush size	: 64
cache_alignment	: 64
address sizes	: 40 bits physical, 48 bits virtual
power management:

RAM

root@ubuntu-1gb-fra1-01:~# cat /proc/meminfo
MemTotal:        1016156 kB
MemFree:           68416 kB
MemAvailable:     818396 kB
Buffers:            1152 kB
Cached:           872720 kB
SwapCached:            0 kB
Active:           457612 kB
Inactive:         441676 kB
Active(anon):      28232 kB
Inactive(anon):     2728 kB
Active(file):     429380 kB
Inactive(file):   438948 kB
Unevictable:        3656 kB
Mlocked:            3656 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:         29072 kB
Mapped:            12020 kB
Shmem:              3124 kB
Slab:              30380 kB
SReclaimable:      18772 kB
SUnreclaim:        11608 kB
KernelStack:        1840 kB
PageTables:         2160 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:      508076 kB
Committed_AS:     202332 kB
VmallocTotal:   34359738367 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB
HardwareCorrupted:     0 kB
AnonHugePages:      4096 kB
CmaTotal:              0 kB
CmaFree:               0 kB
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB
DirectMap4k:       53240 kB
DirectMap2M:      995328 kB
DirectMap1G:           0 kB

main.py

lines_counter = 0
average_sum = 0
tick = b","  # field separator, as bytes because the file is read in binary mode
with open("yellow_tripdata_2016-01.csv.2", "rb") as f:
    f.readline()  # skip the header row
    for line in f:
        # Walk backwards over the last four commas; the value of interest
        # is the fourth field from the end of the row.
        fields_remaining = 4
        tick_end = len(line) - 1
        tick_start = 0
        while True:
            tick_start = line.rfind(tick, 0, tick_end)
            fields_remaining -= 1
            if fields_remaining == 0:
                break
            tick_end = tick_start
        average_sum += float(line[tick_start+1:tick_end])
        lines_counter += 1
print(lines_counter, average_sum/lines_counter)
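
The right-to-left comma scan in the script can also be written with `bytes.rsplit` and a split limit, which makes it easy to try on a small in-memory sample. This is a minimal sketch, not the benchmarked code: the function name `count_and_average`, the `field_from_end` parameter, and the sample column names are all made up for illustration, and its speed relative to the `rfind` loop has not been measured here.

```python
import io


def count_and_average(stream, field_from_end=4):
    """Return (row_count, mean) of one numeric column, counted from the right.

    `stream` is a binary file-like object whose first line is a header.
    """
    stream.readline()  # skip the header row
    rows = 0
    total = 0.0
    for line in stream:
        # Split on at most `field_from_end` commas, starting from the right,
        # so the leading fields are never split apart.
        fields = line.rsplit(b",", field_from_end)
        total += float(fields[-field_from_end])
        rows += 1
    mean = total / rows if rows else 0.0
    return rows, mean


# Hypothetical sample: "tip" is the fourth column from the end
sample = io.BytesIO(
    b"id,fare,tip,tolls,surcharge,total\n"
    b"1,10.0,2.0,0.0,0.3,12.3\n"
    b"2,20.0,4.0,0.0,0.3,24.3\n"
)
print(count_and_average(sample))  # (2, 3.0)
```

To run it against the real file, pass an object opened with `open(path, "rb")` instead of the `io.BytesIO` sample.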