Created
April 20, 2016 21:05
Meetup PyData Berlin 20/04 - April 2016 Meetup
EyeQuant - market tool for understanding how pages work. heatmaps. how clean or cluttered your design seems. collects data about human reactions to try to predict the perception of a new design
PyData Berlin Conference is open. May 21 | |
Frankfurter Tor | |
BERLIN.PYDATA discount code for the conference. 20% discount | |
## (Jose Quesada) Distributed processing of large graphs in Python | |
Director @ Data Science Retreat | |
5 people already attended DSR | |
need to know how to ask a good question
business case - someone needs to want to make a company out of this | |
data available | |
technology exists
we need to be able to know when the solution works (or not) | |
DS is a creative activity | |
try to start even with a bad question. something will break on the way. you may need to start from scratch
sometimes the stakeholder will give the solution, not a question. try to work around that | |
when to tweet to influence the right account?
find the time when everyone in the chain leading to the expected viewer will be online
algorithms: shortest-path and pagerank | |
personalized pagerank | |
being used by twitter (paper “the who to follow…”) | |
use it to find the most influential people around the user you want to influence | |
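
A minimal sketch of personalized PageRank on a toy follower graph, assuming edges point from follower to followed account. The account names, the `personalized_pagerank` helper and the damping value are illustrative assumptions, not taken from the talk or from the Twitter paper.

```python
from collections import defaultdict

# Hypothetical toy graph: (follower, followed) pairs.
EDGES = [
    ("target", "alice"), ("target", "bob"),
    ("alice", "carol"), ("bob", "carol"),
    ("carol", "target"), ("dave", "target"),
]

def personalized_pagerank(edges, source, damping=0.85, iterations=50):
    """Power iteration where every teleport jumps back to `source`."""
    out_links = defaultdict(list)
    nodes = set()
    for follower, followed in edges:
        out_links[follower].append(followed)
        nodes.update((follower, followed))

    # Start with all of the probability mass on the source account.
    rank = {node: 0.0 for node in nodes}
    rank[source] = 1.0
    for _ in range(iterations):
        new_rank = {node: 0.0 for node in nodes}
        for node in nodes:
            targets = out_links.get(node)
            if targets:
                # Split the damped rank evenly among followed accounts.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Dangling node: send its damped rank back to the source.
                new_rank[source] += damping * rank[node]
        # Teleport step: the remaining mass always returns to the source.
        new_rank[source] += 1.0 - damping
        rank = new_rank
    return rank

ranks = personalized_pagerank(EDGES, "target")
most_influential = sorted(ranks, key=ranks.get, reverse=True)
```

Accounts that the source cannot reach (here `dave`) end up with rank zero, which is what makes the ranking "personalized" to the user you want to influence.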
igraph | |
jung | |
neo4j (created the Cypher language to query graphs)
dato | |
graphx | |
spark | |
running on multiple machines is much harder
graphframes wrap graphx | |
graphframes similar to pandas | |
graphx similar to spark(?) | |
uses Cypher language
graph was 25 GB. too large for 1 machine
3 machines in a cluster running on Amazon. no devops or docker. just gimme this cluster. using Elastic MapReduce
pyspark couldn't be used. too high level. used spark-shell
just 5 euros per day. in the old days, only Google and Twitter could do such a thing
finding a good question is half of the problem | |
egograph - graph just around who you want to influence. complicated to generate. random walking it. alternative to not need 25gb of memory. you could calculate smaller graphs separately and compare results. the same = go with ego | |
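
A rough sketch of building an ego graph by random walks, assuming the graph is available as an adjacency dictionary. The `sample_ego_graph` helper, its parameters and the toy node names are hypothetical, and the talk's actual sampling procedure may have differed.

```python
import random

# Hypothetical adjacency list: node -> accounts reachable in one hop.
NEIGHBORS = {
    "ego": ["a", "b"],
    "a": ["c"],
    "b": ["c"],
    "c": ["d"],
    "d": ["e"],
    "e": ["far"],
}

def sample_ego_graph(neighbors, ego, walks=200, walk_length=3, seed=0):
    """Collect nodes seen on short random walks that all start at `ego`."""
    rng = random.Random(seed)
    visited = {ego}
    for _walk in range(walks):
        current = ego
        for _step in range(walk_length):
            options = neighbors.get(current, [])
            if not options:
                break  # dead end: start the next walk
            current = rng.choice(options)
            visited.add(current)
    # Keep only the edges whose endpoints were both visited.
    return {
        node: [n for n in neighbors.get(node, []) if n in visited]
        for node in visited
    }

ego_graph = sample_ego_graph(NEIGHBORS, "ego")
```

Because each walk is capped at `walk_length` hops, the sample stays small no matter how large the full graph is, which is the point of working around the 25 GB memory requirement.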
project done in 1 week(?) | |
nobody has the entire graph of Twitter | |
larger cluster for less time would cost less (comment from participant) | |
answer questions that can be answered with existing technology, otherwise it will take a lot of time. academia will solve your problem first. if you're the only one with a huge problem and it's important for the business, it may be worth solving
### | |
Check Box2D Physics. Genetic algorithms | |
## (Sylvain Bellemare) Bitcoin: Some Nuts & Bolts | |
@sbellem | |
sylvain@ascribe.io | |
software engineer | |
work with django | |
SPOOL blockchain protocol and API | |
BigchainDB scalable blockchain database | |
2 papers | |
they use blockchain, not bitcoin | |
pros and cons | |
consensus is the coolest thing | |
nuts and bolts: many pieces. try to make something out of it | |
you don’t have to understand everything
babylonian math = understand everything starting from any piece. like developing software. no right place to start | |
paper by Satoshi Nakamoto defining Bitcoin | |
paper by Garay. Bitcoin Backbone Protocol | |
everyone can mine, be a node. requires lots of computing power solving a crypto puzzle
Davies-Meyer one-way compression function. no way to return | |
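
A small sketch of the Davies-Meyer construction, where the message block is used as the cipher key and the previous state is fed forward with XOR. The `toy_encrypt` Feistel cipher is a made-up stand-in used only to keep the example self-contained; it is not secure and is not what Bitcoin uses.

```python
import hashlib

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """Toy 4-round Feistel cipher on 8-byte blocks (illustration only)."""
    left, right = block[:4], block[4:]
    for round_no in range(4):
        # Round function: truncated hash of key, round number and right half.
        mixed = hashlib.sha256(key + bytes([round_no]) + right).digest()[:4]
        left, right = right, xor_bytes(left, mixed)
    return left + right

def davies_meyer(previous: bytes, message_block: bytes) -> bytes:
    """H_i = E_{m_i}(H_{i-1}) XOR H_{i-1}."""
    return xor_bytes(toy_encrypt(message_block, previous), previous)

# Chain the compression function over a few fixed-size message blocks.
state = bytes(8)  # all-zero initial value, an arbitrary choice
for message_block in (b"block-01", b"block-02"):
    state = davies_meyer(state, message_block)
```

The feed-forward XOR is what removes the "way back": the cipher alone can be decrypted with the key, but the output no longer tells you which state was encrypted.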
bitcoin address = public key, in the end | |
private key is for signing messages | |
pycoin is one library | |
bitcoin | |
mainnet = doing transactions | |
testnet = testing stuff | |
regtest = to run locally
in testnet you can add money to your wallet | |
transaction creation, transaction signature and transaction broadcast | |
transactions.create => transaction id | |
transactions.sign(id, secret_key) | |
transactions.push | |
pushing sends to a node. may be yours or another through API | |
transactions.decode # shows “json” of transaction based on crazy hash | |
can pay a greater fee to get the transaction approved first | |
you’re spending “remaining transactions” (unspent outputs of earlier transactions). there’s no actual balance stored in your wallet
transaction is related to a block | |
blockchain = group/aggregation of blocks | |
for validating you need only a part of the tree. someone can verify by the history
hash you calculate, merkle root, is a hash of all the transactions in the block. changing the block will change the merkle root | |
the first bitcoin block is called the "genesis block". "pre-defined"
genesis block has just one transaction. that transaction's hash is the same as the block's merkle root
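
A sketch of a Bitcoin-style Merkle root, assuming double SHA-256 and the rule that an odd level repeats its last hash. The transactions are placeholder byte strings and the byte-order conventions of real Bitcoin hashes are ignored.

```python
import hashlib

def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()

def merkle_root(tx_hashes):
    """Pair up hashes level by level until a single root remains."""
    if not tx_hashes:
        raise ValueError("a block needs at least one transaction")
    level = list(tx_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: repeat the last hash
        level = [
            double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]

# Placeholder transactions, hashed to get their ids.
tx_hashes = [double_sha256(tx) for tx in (b"tx-a", b"tx-b", b"tx-c")]
root = merkle_root(tx_hashes)

# A block with a single transaction: the root is that transaction's hash.
single_root = merkle_root(tx_hashes[:1])
```

With one transaction the loop never runs, so the root equals the transaction hash, matching the genesis block note above. Changing any transaction changes its hash and therefore every hash on the path up to the root.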
double spending attack = publishing fake blocks | |
people need distributed things
how to ensure people maintaining the blockchain won’t change everything. the longest chain is the most valid | |
people sending invalid hashes can’t (almost) be prevented because it’s a hash function. time-consuming to calculate, quick to verify
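
A simplified proof-of-work sketch showing the "time-consuming to calculate, quick to verify" asymmetry. The header bytes, the 8-byte nonce encoding and the `difficulty_bits` parameter are assumptions for illustration; a real Bitcoin header has a fixed 80-byte layout and a different difficulty encoding.

```python
import hashlib

def block_hash(header: bytes, nonce: int) -> int:
    """Double SHA-256 of header plus nonce, read as a big integer."""
    data = header + nonce.to_bytes(8, "little")
    digest = hashlib.sha256(hashlib.sha256(data).digest()).digest()
    return int.from_bytes(digest, "big")

def verify(header: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Quick: one hash and one comparison against the target."""
    return block_hash(header, nonce) < (1 << (256 - difficulty_bits))

def mine(header: bytes, difficulty_bits: int) -> int:
    """Slow: try nonces one by one until the hash falls below the target."""
    nonce = 0
    while not verify(header, nonce, difficulty_bits):
        nonce += 1
    return nonce

header = b"previous-hash|merkle-root|timestamp"  # placeholder header
difficulty_bits = 12  # about 4096 attempts expected
nonce = mine(header, difficulty_bits)
```

Each extra difficulty bit doubles the expected mining work, while verification stays at a single hash.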
miners are incentivized to behave well because they spend money on processing farms.
25 bitcoins for each block accepted in the network. they win money for doing so | |
every node receives different transactions, depending on your position. different people will build different blocks | |
Decentralization efforts | |
- bigchaindb (database) | |
- eris industries | |
- ethereum (processing) | |
- ipfs (storage) | |
- tendermint | |
- ascribe (applications) | |
"bitcointech" coursera course | |
in depth: "bitcoin backbone protocol: analysis and applications" paper
##### | |
after the meetup, there are cards where you can put what you want to hear about. or you can take one and choose to speak