
View Freidman-Gradient-Boosting-Machines.py
import numpy as np
import pandas as pd
class Node:
'''
This class defines a node which creates a tree structure by recursively calling itself
whilst checking a number of stopping parameters such as depth and min_leaf. It uses an exact greedy method
to exhaustively scan every possible split point. The algorithm is based on Friedman's 2001 Gradient Boosting Machines.
Input
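The "exact greedy method" the docstring mentions can be sketched as follows. This is a minimal illustration, not code from the gist: every unique value of one feature is tried as a threshold, and the split that minimises the summed squared error around the two child means is kept (`min_leaf` is the same stopping parameter the docstring names).

```python
import numpy as np

def best_split(x, y, min_leaf=1):
    """Exhaustively scan every candidate threshold of feature x."""
    best_score, best_thresh = float('inf'), None
    for thresh in np.unique(x):
        left, right = y[x <= thresh], y[x > thresh]
        # respect the min_leaf stopping parameter
        if len(left) < min_leaf or len(right) < min_leaf:
            continue
        # squared error of each child around its own mean
        score = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if score < best_score:
            best_score, best_thresh = score, thresh
    return best_thresh, best_score
```

This is O(n) candidate thresholds per feature, each scored in O(n), which is why the docstring calls it exhaustive.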
@Sandy4321
Sandy4321 / Naive-Gradient-Boosting.py
Created May 7, 2020 — forked from Ekeany/Naive-Gradient-Boosting.py
A naive gradient boosting implementation which I want to share on medium.com
View Naive-Gradient-Boosting.py
import numpy as np
import pandas as pd
from math import e
class Node:
'''
This class defines a node which creates a tree structure by recursively calling itself
whilst checking a number of stopping parameters such as depth and min_leaf. It uses an exact greedy method
to exhaustively scan every possible split point. The gain metric of choice is conservation of variance.
This is a naive solution and does not compare to Friedman's 2001 Gradient Boosting Machines.
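The variance-based gain the docstring refers to is usually computed as the drop from the parent's variance to the size-weighted variance of the two children. A sketch of that metric (my reading of "conservation of variance", not code from the gist), where `mask` marks the rows sent left:

```python
import numpy as np

def variance_gain(y, mask):
    """Parent variance minus the size-weighted variance of the two children."""
    left, right = y[mask], y[~mask]
    n = len(y)
    weighted_child_var = (len(left) / n) * left.var() + (len(right) / n) * right.var()
    return y.var() - weighted_child_var
```

A split that separates the targets perfectly drives the child variances to zero, so the gain equals the parent variance.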
@Sandy4321
Sandy4321 / XGBoost-from-scratch-python.py
Created May 7, 2020 — forked from Ekeany/XGBoost-from-scratch-python.py
A numpy/pandas implementation of XGBoost
View XGBoost-from-scratch-python.py
import numpy as np
import pandas as pd
from math import e
class Node:
'''
A node object that is recursively called within itself to construct a regression tree. Based on Tianqi Chen's XGBoost,
the internal gain used to find the optimal split value uses both the gradient and hessian. A weighted quantile sketch
and optimal leaf values also follow Chen's description in "XGBoost: A Scalable Tree Boosting System"; the only thing not
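The gradient/hessian gain and the optimal leaf weight mentioned here are the standard formulas from Chen's paper: gain = ½[G_L²/(H_L+λ) + G_R²/(H_R+λ) − (G_L+G_R)²/(H_L+H_R+λ)] − γ, and leaf weight w* = −G/(H+λ). A small sketch of both (parameter names `lam` and `gamma` are the usual regularisers, not identifiers from the gist):

```python
def xgb_gain(gl, hl, gr, hr, lam=1.0, gamma=0.0):
    """Split gain from summed gradients (gl, gr) and hessians (hl, hr)."""
    def score(g, h):
        return g * g / (h + lam)
    # improvement of the two children over the unsplit parent, minus
    # the complexity penalty gamma for adding a leaf
    return 0.5 * (score(gl, hl) + score(gr, hr) - score(gl + gr, hl + hr)) - gamma

def leaf_weight(g, h, lam=1.0):
    """Optimal leaf value w* = -G / (H + lambda)."""
    return -g / (h + lam)
```

Unlike the variance gain of the naive version above, this criterion is loss-agnostic: any twice-differentiable loss supplies the per-row gradients and hessians.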
View criteo_ffm.md

Data preparation

-- set mapred.max.split.size=128000000;
set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
set hive.tez.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
set hive.mapjoin.smalltable.filesize=30000000;
-- set hive.optimize.s3.query=true;
set hive.exec.dynamic.partition.mode=nonstrict; 
set hive.optimize.sort.dynamic.partition=false;
View rolling mean and volatility.py
bitcoin = cryptos[0]
bitcoin_cash = cryptos[1]
dash = cryptos[2]
ethereum_classic = cryptos[3]
bitconnect = cryptos[4]
litecoin = cryptos[5]
monero = cryptos[6]
nem = cryptos[7]
neo = cryptos[8]
numeraire = cryptos[9]
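The gist's title promises a rolling mean and volatility, though the fragment above only unpacks the list of DataFrames. A hedged sketch of the likely next step, on a synthetic price series (the real gist presumably uses each crypto DataFrame's close-price column, which is not shown):

```python
import numpy as np
import pandas as pd

# synthetic stand-in for one coin's daily closing prices
rng = np.random.default_rng(0)
prices = pd.Series(100 + np.cumsum(rng.standard_normal(200)),
                   index=pd.date_range('2017-01-01', periods=200))

# 30-day rolling mean of the price level
rolling_mean = prices.rolling(window=30).mean()

# 30-day rolling volatility: std of daily percentage returns
volatility = prices.pct_change().rolling(window=30).std()
```

The first 29 entries of each rolling series are NaN because a full 30-observation window is not yet available.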
View chi.py
#!/usr/bin/env python
# coding: utf-8
# ## Perform a Chi-Square test for Bank Churn prediction (to find patterns in which customers leave the bank). Here I consider only a few columns to keep things clear
# ### Import libraries
# In[2]:
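The test the notebook header describes can be run with `scipy.stats.chi2_contingency` on a churn-by-category contingency table. The table below is made up for illustration; the actual gist builds it from the bank-churn columns it selects:

```python
import pandas as pd
from scipy.stats import chi2_contingency

# hypothetical counts: rows are a customer attribute, columns are churn outcome
table = pd.DataFrame({'stayed': [180, 150], 'left': [20, 50]},
                     index=['group_a', 'group_b'])

chi2, p_value, dof, expected = chi2_contingency(table)
# a small p_value rejects independence: churn rate differs across the groups
```

For a 2x2 table `chi2_contingency` applies Yates' continuity correction by default; pass `correction=False` to get the plain Pearson statistic.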
View unique_comb_3.sql
SELECT m.* FROM #matches m
INNER JOIN #matches m1 ON m.fromid = m1.toid AND m.toid = m1.fromid AND m1.fromid <=m1.toid
ORDER BY m.toteam
View unique_comb_2.sql
SELECT t.id fromid,t.Team fromteam,t1.id toid,t1.Team toteam
INTO #matches
FROM #Team t
INNER JOIN #Team t1 ON t.id <> t1.id
SELECT * FROM #matches
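The two SQL snippets above first cross-join the team table against itself and then keep one row per unordered pair by matching each row with its mirror. The same deduplication can be sketched in pandas (my translation of the intent, not code from the gists), where filtering on `id_from < id_to` both drops self-matches and keeps a single orientation of each pairing:

```python
import pandas as pd

teams = pd.DataFrame({'id': [1, 2, 3], 'Team': ['A', 'B', 'C']})

# all ordered pairs, like the INNER JOIN on t.id <> t1.id
matches = teams.merge(teams, how='cross', suffixes=('_from', '_to'))

# keep one row per unordered pair, like the mirror-join with fromid <= toid
matches = matches[matches['id_from'] < matches['id_to']]
```

For n teams this yields n(n-1)/2 rows, one per fixture.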
View chi1.ipynb
View incremental_recommender_conf_1.py
# ratings greater than 3 -> 1, otherwise (3 or less) -> 0
data_df['preference'] = np.where(data_df['rating'] > 3, 1, 0)
data_df.head()
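The fragment above assumes a pre-loaded `data_df`. A self-contained version of the same implicit-feedback binarisation, with stand-in data (the column names `user`, `item`, `rating` are assumptions about the gist's schema):

```python
import numpy as np
import pandas as pd

# hypothetical ratings data standing in for the gist's data_df
data_df = pd.DataFrame({'user': [1, 1, 2],
                        'item': [10, 11, 10],
                        'rating': [5, 2, 4]})

# ratings greater than 3 become a positive preference (1), the rest 0
data_df['preference'] = np.where(data_df['rating'] > 3, 1, 0)
```

This collapses explicit ratings into the binary signal that implicit-feedback recommenders consume.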