Koba Khitalishvili KobaKhit

@KobaKhit
KobaKhit / sql-spines.md
Created February 8, 2024 02:58
Examples of generating spines/dates in SQL. Assisted by Caleb Kassa.

Spines in SQL

Given a starting date of 2024-02-01, I would like to generate every date for the next 7 days, through February 8th (2024-02-08), e.g.

dt
2024-02-01
2024-02-02
2024-02-03
2024-02-04
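The gist's own SQL is cut off in this preview, so the following is only a minimal sketch of one common way to build such a spine: a recursive CTE, run here against an in-memory SQLite database through Python's `sqlite3` module. The function name `date_spine` and the choice of SQLite are assumptions made for illustration. Other dialects have their own date arithmetic and series generators.

```python
import sqlite3


def date_spine(start, end):
    """Return every date from start to end (inclusive) as ISO strings.

    Hypothetical helper: it illustrates the recursive-CTE approach in
    SQLite and is not taken from the original gist.
    """
    query = """
        WITH RECURSIVE spine(dt) AS (
            SELECT date(:start)               -- anchor row: the first date
            UNION ALL
            SELECT date(dt, '+1 day')         -- step forward one day
            FROM spine
            WHERE dt < date(:end)             -- stop once the end date is emitted
        )
        SELECT dt FROM spine
    """
    conn = sqlite3.connect(":memory:")
    try:
        rows = conn.execute(query, {"start": start, "end": end}).fetchall()
    finally:
        conn.close()
    return [row[0] for row in rows]


spine = date_spine("2024-02-01", "2024-02-08")
for dt in spine:
    print(dt)
```

Because both endpoints are included, the range 2024-02-01 through 2024-02-08 yields eight rows: the starting date plus the seven days after it.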
@KobaKhit
KobaKhit / repartition_pyspark_dataframe.py
Last active August 27, 2020 23:59
Repartition skewed pyspark dataframes.
from pyspark.sql.functions import monotonically_increasing_id, row_number
from pyspark.sql import Window
from functools import reduce
def partitionIt(size, num):
    '''
    Create a list of partition indices, each of size num, where the number of groups is ceiling(size/num)
    Args:
        size (int): number of rows/elements
@KobaKhit
KobaKhit / tableau_server_export.py
Last active October 18, 2023 18:36
A simple class that lets you download workbooks, or CSVs of views, from a Tableau Server.
import tableauserverclient as TSC
import pandas as pd
from io import StringIO
class Tableau_Server(object):
    """docstring for ClassName"""
    def __init__(self, username, password, site_id, url, https=False):
        super().__init__()  # http://stackoverflow.com/questions/576169/understanding-python-super-with-init-methods
@KobaKhit
KobaKhit / visualforce_embed_with_user.html
Last active June 13, 2019 21:13
Create a dynamic embed in Visualforce that displays information by user
<apex:page >
<html>
<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.7.2/jquery.min.js"></script>
<!-- User Id in a span -->
<span id = 'user' style = 'display: none;'>
<apex:outputText label="Account Owner" value="{!$User.Id}"></apex:outputText>
</span>
<!-- Embed placeholder -->
@KobaKhit
KobaKhit / reddit_posts_and_comments.py
Created October 24, 2018 19:42
A class that enables a user to download posts and comments from a subreddit
class Reddit():
    def __init__(self, client_id, client_secret, user_agent='My agent'):
        self.reddit = praw.Reddit(client_id=client_id,
                                  client_secret=client_secret,
                                  user_agent=user_agent)

    def get_comments(self, submission):
        # get comments information using the Post as a starting comment
        comments = [RedditComment(author=submission.author,
                                  commentid=submission.postid,
@KobaKhit
KobaKhit / unnest_byseat.R
Last active August 3, 2018 14:27
Example of how to unnest rows by seat or any other array in a cell.
library(tidyr)
setwd("~/Desktop/unnest")
fname = "file-name" # base name only; the ".csv" extension is appended below
df = read.csv(paste0(fname, '.csv'), stringsAsFactors = F)
df$seats =
sapply(1:nrow(df), function(x) {
seats = c(df[x,]$first_seat,df[x,]$last_seat)
@KobaKhit
KobaKhit / stubhub_inventory_v2.py
Last active November 14, 2017 14:57
Example of using the StubHub inventory v2 API to download all listings for a given event id.
import requests
import base64
import pprint
import pandas as pd
import json
from tqdm import tqdm
# https://stubhubapi.zendesk.com/hc/en-us/articles/220922687-Inventory-Search
@KobaKhit
KobaKhit / hmtl_table_parser.py
Last active July 18, 2022 07:25
Parse all html tables on a page and return them as a list of pandas dataframes. Modified from @srome
# http://srome.github.io/Parsing-HTML-Tables-in-Python-with-BeautifulSoup-and-pandas/
class HTMLTableParser:
    @staticmethod
    def get_element(node):
        # for XPATH we have to count only for nodes with same type!
        length = len(list(node.previous_siblings)) + 1
        if (length) > 1:
            return '%s:nth-child(%s)' % (node.name, length)
        else:
            return node.name
@KobaKhit
KobaKhit / Large dataframe to csv in chunks in R
Last active September 7, 2017 19:56
Write a large dataframe to csv in chunks
df = read.csv("your-df.csv")
# Number of items in each chunk
elements_per_chunk = 100000
# List of vectors [1] 1:100000, [2] 100001:200000, ...
l = split(1:nrow(df), ceiling(seq_along(1:nrow(df))/elements_per_chunk))
# Write large data frame to csv in chunks
fname = "inventory-cleaned.csv"
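The chunking step above splits the row indices into consecutive groups of `elements_per_chunk`, with a shorter final group when the row count is not an exact multiple. As a rough illustration of that index arithmetic only (not of the R file-writing code, which is cut off here), this is a small Python sketch. The function name `chunk_ranges` is hypothetical, and the ranges are 1-based and inclusive to match the R comment.

```python
import math


def chunk_ranges(n_rows, elements_per_chunk):
    """Return 1-based inclusive (start, end) row ranges, one per chunk.

    Hypothetical helper mirroring the split() call in the R snippet:
    row i lands in chunk ceiling(i / elements_per_chunk).
    """
    n_chunks = math.ceil(n_rows / elements_per_chunk)
    return [
        (i * elements_per_chunk + 1, min((i + 1) * elements_per_chunk, n_rows))
        for i in range(n_chunks)
    ]


ranges = chunk_ranges(250000, 100000)
for start, end in ranges:
    print(start, end)
```

Each range can then be written out on its own, appending to the same file after the first chunk so that the header is written only once.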
@KobaKhit
KobaKhit / reddit-posts.html
Last active August 29, 2015 14:11
A list of top ten posts from a subreddit using redditjs api. Working jsfiddle http://jsfiddle.net/KobaKhit/t42zkbnk/
<!-- Produces a responsive list of top ten posts from a subreddit /worldnews. Working jsfiddle http://jsfiddle.net/KobaKhit/t42zkbnk/ -->
<div id="posts">
<h2> Today's top ten news <small>from <a href = '//reddit.com/r/worldnews' target = '_blank'>/r/worldnews</a></small></h2>
<hr>
<ul class="list-unstyled"></ul>
</div>
<!-- JS -->
<script src="https://rawgit.com/sahilm/reddit.js/master/reddit.js"></script>
<script src="https://code.jquery.com/jquery-2.1.3.min.js"></script>