Timeseries data

Most indicators will be time series (Gini, GDP, temperature, salary, population, light pollution). The date needs to be stored, at least in raw storage, even if the queryable data is only the "last year". Fortunately, a single year is easier to display or compute as a layer when doing geographic work, but ....
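A minimal sketch of that idea, assuming raw records are kept as (indicator, region, year, value) tuples and the queryable view keeps only the most recent year per region. The record layout, the `latest_per_region` name and the values are all made up for illustration.

```python
# Hypothetical raw storage: every observation keeps its year.
# The values are placeholders, not real statistics.
RAW = [
    ("gdp", "FR", 2015, 1.0),
    ("gdp", "FR", 2016, 2.0),
    ("gdp", "DE", 2016, 5.0),
    ("gini", "FR", 2014, 9.0),
]


def latest_per_region(records, indicator):
    """Return {region: (year, value)} with only the most recent year kept."""
    latest = {}
    for ind, region, year, value in records:
        if ind != indicator:
            continue
        # Keep the record only if it is newer than what we already have.
        if region not in latest or year > latest[region][0]:
            latest[region] = (year, value)
    return latest


if __name__ == "__main__":
    print(latest_per_region(RAW, "gdp"))
```

The raw list stays untouched, so a later "pick a year" feature could be added without re-importing the data.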

Right data at Right scale

(Let's say the application starts from a world map to which we can add layers.)

At some scales the data is hard to evaluate. For example, the exact position of every hospital at country level (even for France) might be difficult to interpret. Synthesizing the data as a heat map at certain zoom levels might help. https://www.utc.fr/ic05/resources/EDA-cours2-3-p2010.pdf (page 17)
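One way to picture the heat map idea is to bin exact points into grid cells whose size depends on the zoom level. This is only a sketch: the rule "cell size = 360 / 2^zoom degrees" and the function names are assumptions, not something taken from the linked course.

```python
import math
from collections import Counter


def cell_size_deg(zoom):
    """Grid cell size in degrees; assumed to halve with each zoom level."""
    return 360.0 / (2 ** zoom)


def bin_points(points, zoom):
    """Count (lat, lon) points per grid cell for the given zoom level."""
    size = cell_size_deg(zoom)
    counts = Counter()
    for lat, lon in points:
        cell = (math.floor(lat / size), math.floor(lon / size))
        counts[cell] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical points: two close together, one far away.
    pts = [(48.85, 2.35), (48.86, 2.34), (43.30, 5.37)]
    for zoom in (0, 3, 10):
        print(zoom, dict(bin_points(pts, zoom)))
```

At a low zoom all points fall into one cell, and as the zoom increases the cells split, which is the point where switching back to exact markers starts to make sense.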

Grouping data by zoom level might also help: since the current zoom level is known, geo layers can be added or removed accordingly.
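A rough sketch of tying layers to zoom levels, assuming each layer declares the zoom range where it is meaningful. The layer names and ranges below are hypothetical.

```python
# Hypothetical layer registry: each layer is only shown within its zoom range.
LAYERS = [
    {"name": "hospital_heatmap", "min_zoom": 0, "max_zoom": 8},
    {"name": "hospital_markers", "min_zoom": 9, "max_zoom": 18},
    {"name": "gdp_by_country", "min_zoom": 0, "max_zoom": 5},
]


def visible_layers(zoom, layers=LAYERS):
    """Return the names of the layers that should be shown at this zoom."""
    return [
        layer["name"]
        for layer in layers
        if layer["min_zoom"] <= zoom <= layer["max_zoom"]
    ]


if __name__ == "__main__":
    for zoom in (2, 7, 12):
        print(zoom, visible_layers(zoom))
```

Keeping the ranges in data rather than in code means a layer can be re-tuned without touching the display logic.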

from bs4 import BeautifulSoup
import requests
import sys
import uuid
import json
data = {
    'url': sys.argv[1],
    'links': [],
    'images': []
}
totetmatt / dot.keras.py
Created March 24, 2018 10:15
Dot export with Keras
from keras.applications import NASNetMobile
from keras.utils import plot_model
# [..]
# model = ...
# Get your own model here
# [..]
model = NASNetMobile()  # Example with NASNetMobile
# plot_model needs pydot and Graphviz installed
plot_model(model, show_shapes=False, to_file='model.dot')
totetmatt / organization_graph.py
Created January 26, 2018 07:09
Script to generate the contributor <-> project graph from an organization
from github import Github
import csv
GITHUB_TOKEN=""
ORGANIZATION=''
# or using an access token
g = Github(GITHUB_TOKEN)
with open('git_network.node.csv', 'w') as nodes:
    writer_node = csv.DictWriter(nodes, fieldnames=['id', 'label', 'type'])
half = len(CHALLENGE_DATA) // 2
print(sum(int(r) for l, r in zip(CHALLENGE_DATA, CHALLENGE_DATA[half:] + CHALLENGE_DATA[:half]) if l == r))
35.812409, 127.1471211;Times Square (Seoul);37.517130, 126.903111
2;Suwon World Cup Stadium;37.286278, 127.036889
3;Suwon World Cup Stadium;37.286278, 127.036889
4;Gwacheon National Science Museum;37.4388, 127.0047
5;Gwacheon National Science Museum;37.4388, 127.0047
6;N Seoul Tower;37.551425, 126.988
7;Sejong Center for the Performing Arts;37.5725, 126.9756
8;Seoul Museum of History, Gyeonghui Palace;37.570475, 126.970658
9;Lotte World;37.51, 127.1
10;National Museum of Contemporary Art;37.430952, 127.019811
import requests
import re
# Get the links of a video youtube.com/watch?v=id
def link(id):
    # A regex is used because the link service does not return valid JSON, so this is quicker
    prog = re.compile('"videoId":"(.*?)"', re.MULTILINE)
    # Service that returns the links embedded in a video
    req = requests.get('https://www.youtube.com/get_endscreen?v=%s' % id)
davidbowie,omid
davidbowie,kim
kim,torsten
torsten,omid
brendan,torsten
ziggy,davidbowie
mick,ziggy
https://pbs.twimg.com/tweet_video/Ck26OzdWEAAmj7z.mp4
https://pbs.twimg.com/tweet_video/Ck2lEe2WgAEr4oS.mp4
https://pbs.twimg.com/tweet_video/Ck2lEe2WgAEr4oS.mp4
https://pbs.twimg.com/tweet_video/Ck2-mDGXEAA9btW.mp4
https://video.twimg.com/ext_tw_video/742459195954434048/pu/vid/1280x720/LW_rl9jNYFvjv6hq.mp4
https://pbs.twimg.com/tweet_video/Ck3AReIXIAIl6c4.mp4
https://video.twimg.com/ext_tw_video/742466142493626368/pu/vid/1280x720/-lpZLO9KW0w5Qpbm.mp4
https://video.twimg.com/ext_tw_video/742452947334930432/pu/vid/1280x720/SQE6o1K2bGuJc0Cx.mp4
https://pbs.twimg.com/tweet_video/Ck2_KglUgAAuIsH.mp4
https://video.twimg.com/ext_tw_video/742459195954434048/pu/vid/1280x720/LW_rl9jNYFvjv6hq.mp4
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD
val r = scala.util.Random
class LayoutData(
    xp: Double = 0,
    yp: Double = 0,
    dxp: Double = 0,
    dyp: Double = 0,