CREATE TABLE `financial_report` (
`id` int(10) NOT NULL AUTO_INCREMENT,
`company_id` int(10) NOT NULL,
`cik_id` int(10) NOT NULL,
`segment_id` int(10) NOT NULL,
`segment_name` varchar(50) NOT NULL,
`year` varchar(10) NOT NULL,
`quarter` enum('Q1','Q2', 'Q3', 'Q4', 'Y') NOT NULL,
`total_revenue` decimal(15,2) DEFAULT NULL,
`cost_of_revenue` decimal(15,2) DEFAULT NULL,
CREATE TABLE `company` (
`id` int(10) NOT NULL AUTO_INCREMENT,
`name` varchar(50) NOT NULL,
`industry` varchar(50) NOT NULL,
`ceo` varchar(50) DEFAULT NULL,
`founders` varchar(150) DEFAULT NULL,
`founded` varchar(50) DEFAULT NULL,
`headquarters` varchar(150) DEFAULT NULL,
`country` varchar(50) DEFAULT NULL,
`num_of_employees` int(10) DEFAULT NULL,
@ivanliu
ivanliu / stock_price.py
Created July 30, 2017 23:47
fetch stock price in batch mode
'''
Scrape all historical stock data
'''
import datetime
import pandas_datareader.data as web
from dateutil.relativedelta import relativedelta
import mysql.connector
import xml.etree.ElementTree as ET
import pandas as pd
import os
ivanliu / options.py
Created July 30, 2017 23:46
fetch option data in batch mode
from pandas_datareader.data import Options
import mysql.connector
import datetime
import os
# Convert the multi-index DataFrame into a flat DataFrame and apply the necessary data processing
def option_convert(symbol):
    stock_option = Options(symbol, "yahoo")
    data = stock_option.get_all_data()
    data.reset_index(inplace=True)
ivanliu / RAS_vs_Isolation
Last active July 21, 2017 07:24
Storm scheduler notes
1. Problems with isolation scheduler
- Hands out dedicated machines to each topology, which is a problem in a heterogeneous cluster.
- Low overall cluster resource utilization: users do not make good use of their isolated resources.
- Unbalanced resource usage: some machines sit idle while others are overused.
- Per-topology scheduling strategy.
2. Resource aware scheduling
Refer to http://storm.apache.org/releases/current/Resource_Aware_Scheduler_overview.html
This scheduler takes into account resource availability on machines and the resource requirements of workloads when scheduling a topology.
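A minimal sketch of the resource-aware idea, for illustration only: it is not Storm's actual algorithm or API. The `Node`, `Task`, and `schedule` names, the CPU/memory units, and the greedy "largest task first, most headroom wins" rule are all assumptions made for this example. It only shows what it means to match per-task requirements against per-machine availability instead of handing out whole machines.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A machine with remaining capacity (hypothetical units)."""
    name: str
    cpu: float      # remaining CPU points
    mem_mb: float   # remaining memory in MB
    assigned: list = field(default_factory=list)


@dataclass
class Task:
    """A unit of work with declared resource requirements."""
    name: str
    cpu: float
    mem_mb: float


def schedule(tasks, nodes):
    """Greedily place tasks on nodes that can satisfy their requirements.

    Tasks are handled largest first. Each task goes to the fitting node
    that would have the most CPU (then memory) left after placement.
    Returns the names of tasks that fit on no node.
    """
    unplaced = []
    for task in sorted(tasks, key=lambda t: (t.cpu, t.mem_mb), reverse=True):
        candidates = [
            n for n in nodes if n.cpu >= task.cpu and n.mem_mb >= task.mem_mb
        ]
        if not candidates:
            unplaced.append(task.name)
            continue
        best = max(
            candidates,
            key=lambda n: (n.cpu - task.cpu, n.mem_mb - task.mem_mb),
        )
        # Reserve the resources on the chosen node.
        best.cpu -= task.cpu
        best.mem_mb -= task.mem_mb
        best.assigned.append(task.name)
    return unplaced


# Usage: a heterogeneous two-node cluster (made-up numbers).
nodes = [Node("big", 400, 8192), Node("small", 100, 2048)]
tasks = [
    Task("spout", 50, 512),
    Task("bolt-a", 300, 4096),
    Task("bolt-b", 100, 2048),
    Task("bolt-c", 200, 4096),
]
unplaced = schedule(tasks, nodes)
for node in nodes:
    print(node.name, node.assigned, node.cpu, node.mem_mb)
print("unplaced:", unplaced)
```

Unlike the isolation approach, a small task here can share a large machine with other work, and a task that does not fit anywhere is reported rather than silently overloading a node.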
ivanliu / cool_presentation
Last active July 8, 2017 22:07
Impressive presentation/demo
1) A live python demo
http://pyvideo.org/pycon-us-2015/python-concurrency-from-the-ground-up-live.html
code -> https://github.com/ivanliu/concurrencylive
ivanliu / py_pkg.howto
Created July 8, 2017 19:29
Packaging python
http://jtushman.github.io/blog/2013/06/17/sharing-code-across-applications-with-python/#3
https://hynek.me/articles/sharing-your-labor-of-love-pypi-quick-and-dirty/
http://python-notes.curiousefficiency.org/en/latest/index.html
ivanliu / compitetion
Created June 5, 2017 07:30
competitor analysis
1. The Future of Investing? AI-Run Hedge Funds
https://futurism.com/the-future-of-investing-ai-run-hedge-funds/
aidyia, sentient, rebellionresearch
from flask import Flask, render_template
app = Flask(__name__)
@app.route('/')
@app.route('/index')
def index(chartID='chart_ID', chart_type='bar', chart_height=350):
    chart = {"renderTo": chartID, "type": chart_type, "height": chart_height}
    series = [{"name": 'Label1', "data": [1, 2, 3]}, {"name": 'Label2', "data": [4, 5, 6]}]
    title = {"text": 'My Title'}
ivanliu / website_scrapy_options
Last active January 28, 2018 07:11
How to scrape a website
1. Spynner
https://github.com/makinacorpus/spynner
a) Install libpng
http://ethan.tira-thompson.com/Mac_OS_X_Ports.html
b)
2. Mechanize
http://www.pythonforbeginners.com/mechanize/browsing-in-python-with-mechanize/
http://www.pythonforbeginners.com/cheatsheet/python-mechanize-cheat-sheet