guerbai

🎯
Focusing
View GitHub Profile
@guerbai
guerbai / bigquery_safequery.py
Created June 2, 2019 05:10
bigquery safequery #BigQuery
from google.cloud import bigquery

# Query to select comments that received more than 10 replies
query_popular = """
SELECT parent, COUNT(id)
FROM `bigquery-public-data.hacker_news.comments`
GROUP BY parent
HAVING COUNT(id) > 10
"""
# Set up the query (cancel the query if it would use too much of
# your quota, with the limit set to 1 GB, i.e. 10**9 bytes as an integer)
safe_config = bigquery.QueryJobConfig(maximum_bytes_billed=10**9)
@guerbai
guerbai / init_bigquery.py
Created June 2, 2019 05:09
BigQuery initialization #Jupyter
from google.cloud import bigquery
# Create a "Client" object
client = bigquery.Client()
# Construct a reference to the "hacker_news" dataset
dataset_ref = client.dataset("hacker_news", project="bigquery-public-data")
# API request - fetch the dataset
dataset = client.get_dataset(dataset_ref)
@guerbai
guerbai / mongo_time_search.py
Created June 2, 2019 05:09
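Search Mongo records by time in Python
import datetime

# `request` and `search_dict` are assumed to be defined by the surrounding code.
if getattr(request, 'start_date', None):
    start_date = datetime.datetime.strptime(request.start_date, '%Y-%m-%d')
    search_dict['applicant_time__gte'] = start_date
if getattr(request, 'end_date', None):
    # Add one day so that records from the whole end date are matched
    end_date = datetime.datetime.strptime(request.end_date, '%Y-%m-%d') + datetime.timedelta(days=1)
    search_dict['applicant_time__lte'] = end_date


class ValueProcessUtils(object):
    @classmethod
    def pop_none(cls, **kwargs):
        """Return only the keyword arguments whose value is not None."""
        params = dict()
        # items() works on both Python 2 and 3; iteritems() is Python 2 only
        for key, value in kwargs.items():
            if value is not None:
                params.update({key: value})
        return params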
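A minimal sketch of how the date filter and `pop_none` might be combined, assuming `request` is a simple object with optional `start_date` and `end_date` strings. The `SimpleNamespace` stand-in and the `build_time_filter` helper are hypothetical and not part of the original gist.

```python
import datetime
from types import SimpleNamespace


def pop_none(**kwargs):
    """Return only the keyword arguments whose value is not None."""
    return {key: value for key, value in kwargs.items() if value is not None}


def build_time_filter(request):
    """Build a filter dict covering the requested dates (hypothetical helper)."""
    start = getattr(request, 'start_date', None)
    end = getattr(request, 'end_date', None)
    start_date = datetime.datetime.strptime(start, '%Y-%m-%d') if start else None
    # Add one day so that records from the whole end date are matched.
    end_date = (
        datetime.datetime.strptime(end, '%Y-%m-%d') + datetime.timedelta(days=1)
        if end else None
    )
    # Bounds that were not supplied are dropped instead of being stored as None.
    return pop_none(applicant_time__gte=start_date, applicant_time__lte=end_date)


# Stand-in for the real request object (assumption for illustration).
request = SimpleNamespace(start_date='2019-06-01', end_date=None)
search_dict = build_time_filter(request)
```

With only `start_date` set, `search_dict` holds a single `applicant_time__gte` key, so the query stays open-ended on the upper side.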
@guerbai
guerbai / categorical_max_min.py
Created June 2, 2019 05:07
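List the minimum and maximum value for each category #Pandas
# String names avoid the warning newer pandas versions emit for the builtins min and max
price_extremes = reviews.groupby('variety').price.agg(['min', 'max'])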
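A small worked example of the per-group minimum and maximum, assuming a toy `reviews` DataFrame with `variety` and `price` columns. The values are made up for illustration.

```python
import pandas as pd

# Toy data standing in for the real reviews table (hypothetical values).
reviews = pd.DataFrame({
    'variety': ['Riesling', 'Merlot', 'Riesling', 'Merlot'],
    'price': [12.0, 30.0, 20.0, 18.0],
})

# One row per variety, with a 'min' and a 'max' column for the price.
price_extremes = reviews.groupby('variety').price.agg(['min', 'max'])
```

The result is indexed by `variety`, so a single value can be read with `price_extremes.loc['Merlot', 'min']`.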
@guerbai
guerbai / col_num.py
Created June 2, 2019 05:07
Count the occurrences of each value in a DataFrame column #Pandas
iris["Species"].value_counts()
# or
reviews_written = reviews.groupby('taster_twitter_handle').size()
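The two approaches give the same counts but order them differently. A short sketch on made-up data (the handles below are hypothetical):

```python
import pandas as pd

# Toy data standing in for the real reviews table (hypothetical values).
reviews = pd.DataFrame({
    'taster_twitter_handle': ['@a', '@b', '@a', '@a', '@b', '@c'],
})

# value_counts() sorts by count, most frequent first.
by_value_counts = reviews['taster_twitter_handle'].value_counts()

# groupby(...).size() sorts by the group key instead.
by_groupby = reviews.groupby('taster_twitter_handle').size()
```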
@guerbai
guerbai / score_to_star.py
Created June 2, 2019 05:06
Convert the points column to a star rating #Pandas
def stars(row):
    # Wines from Canada always get 3 stars, regardless of points
    if row.country == 'Canada':
        return 3
    elif row.points >= 95:
        return 3
    elif row.points >= 85:
        return 2
    else:
        return 1
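A sketch of how `stars` could be applied row by row, assuming a `reviews` DataFrame with `country` and `points` columns. The sample rows are invented for illustration.

```python
import pandas as pd


def stars(row):
    # Wines from Canada always get 3 stars, regardless of points.
    if row.country == 'Canada':
        return 3
    elif row.points >= 95:
        return 3
    elif row.points >= 85:
        return 2
    else:
        return 1


# Toy data standing in for the real reviews table (hypothetical values).
reviews = pd.DataFrame({
    'country': ['Canada', 'Italy', 'France', 'US'],
    'points': [80, 96, 88, 82],
})

# axis='columns' passes each row to stars() as a Series.
star_ratings = reviews.apply(stars, axis='columns')
```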
@guerbai
guerbai / get_fruity_num.py
Created June 2, 2019 05:06
Count the descriptions that contain "fruity" #Pandas
n_trop = reviews.description.map(lambda desc: "tropical" in desc).sum()
n_fruity = reviews.description.map(lambda desc: "fruity" in desc).sum()
descriptor_counts = pd.Series([n_trop, n_fruity], index=['tropical', 'fruity'])
@guerbai
guerbai / get_best_item.py
Created June 2, 2019 05:05
Find the item with the best value for money #Pandas
bargain_idx = (reviews.points / reviews.price).idxmax()
bargain_wine = reviews.loc[bargain_idx, 'title']
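A worked example of the points-to-price ratio on made-up data, assuming `reviews` has `title`, `points`, and `price` columns. The wine names and numbers are hypothetical.

```python
import pandas as pd

# Toy data standing in for the real reviews table (hypothetical values).
reviews = pd.DataFrame({
    'title': ['Wine A', 'Wine B', 'Wine C'],
    'points': [90, 85, 88],
    'price': [45.0, 10.0, 22.0],
})

# Points per unit of price: 2.0, 8.5, 4.0
ratio = reviews.points / reviews.price

# idxmax() returns the index label of the largest ratio.
bargain_idx = ratio.idxmax()
bargain_wine = reviews.loc[bargain_idx, 'title']
```

Because `idxmax()` returns an index label and not a position, `loc` is the matching accessor here.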
@guerbai
guerbai / country_in.py
Created June 2, 2019 05:05
Rows where country is Australia or New Zealand #Pandas
reviews[reviews.country.isin(['Australia', 'New Zealand'])]