Koba Khitalishvili KobaKhit

KobaKhit /
Last active August 27, 2020 23:59
Repartition skewed PySpark DataFrames.
from pyspark.sql.functions import monotonically_increasing_id, row_number
from pyspark.sql import Window
from functools import reduce
def partitionIt(size, num):
Create a list of partition indices, each repeated num times, where the number of groups is ceiling(size/num)
size (int): number of rows/elements
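The preview above is cut off, so the following is only a minimal sketch of what the docstring describes, not the gist's actual code. It assumes `size` is the row count and `num` is the desired rows per partition; the name `partition_indices` is hypothetical.

```python
import math


def partition_indices(size, num):
    """Return `size` partition indices where each index repeats up to `num` times.

    Hypothetical helper: the number of distinct indices is ceil(size / num).
    """
    n_groups = math.ceil(size / num)
    indices = []
    for group in range(n_groups):
        # The last group is shorter when size is not a multiple of num.
        group_len = min(num, size - group * num)
        indices.extend([group] * group_len)
    return indices
```

A list like this could be attached to a row-numbered DataFrame as a partition key, so that each partition receives roughly `num` rows regardless of skew in the original data.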
KobaKhit /
Last active October 10, 2022 11:49
A simple class that lets you download workbooks, or views as CSV, from a Tableau Server.
import tableauserverclient as TSC
import pandas as pd
from io import StringIO
class Tableau_Server(object):
    """Download workbooks, or views as CSV, from a Tableau Server."""
    def __init__(self, username, password, site_id, url, https=False):
        super().__init__()
KobaKhit / visualforce_embed_with_user.html
Last active June 13, 2019 21:13
Create a dynamic embed in Visualforce that displays information by user
<apex:page >
<script src=""></script>
<!-- User Id in a span -->
<span id = 'user' style = 'display: none;'>
<apex:outputText label="Account Owner" value="{!$User.Id}"></apex:outputText>
<!-- Embed placeholder -->
KobaKhit /
Created October 24, 2018 19:42
A class that lets a user download posts and comments from a subreddit
class Reddit():
def __init__(self,client_id, client_secret,user_agent='My agent'):
self.reddit = praw.Reddit(client_id=client_id,
def get_comments(self, submission):
# get comments information using the Post as a starting comment
comments = [RedditComment(,
commentid = submission.postid,
KobaKhit / unnest_byseat.R
Last active August 3, 2018 14:27
Example of how to unnest rows by seat or any other array in a cell.
fname = "file-name"  # base name only; the ".csv" extension is appended below
df = read.csv(paste0(fname, '.csv'), stringsAsFactors = FALSE)
df$seats =
sapply(1:nrow(df), function(x) {
seats = c(df[x,]$first_seat,df[x,]$last_seat)
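The R preview stops before the unnesting step, so here is a rough Python sketch of the idea rather than a port of the gist. It assumes `first_seat` and `last_seat` are integers bounding a consecutive, inclusive seat range; the function name `unnest_by_seat` and the output column `seat` are hypothetical.

```python
def unnest_by_seat(rows):
    """Expand each row into one row per seat in [first_seat, last_seat].

    `rows` is a list of dicts; every other column is copied unchanged.
    """
    out = []
    for row in rows:
        # Assumption: seats are consecutive integers, both ends inclusive.
        for seat in range(row["first_seat"], row["last_seat"] + 1):
            new_row = dict(row)
            new_row["seat"] = seat
            out.append(new_row)
    return out
```

For a listing covering seats 3 through 5, this yields three rows that differ only in their `seat` value.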
KobaKhit /
Last active November 14, 2017 14:57
Example of using the StubHub Inventory v2 API to download all listings for a given event ID.
import requests
import base64
import pprint
import pandas as pd
import json
from tqdm import tqdm
KobaKhit /
Last active July 18, 2022 07:25
Parse all HTML tables on a page and return them as a list of pandas DataFrames. Modified from @srome
class HTMLTableParser:
def get_element(node):
# for XPath we must count only sibling nodes of the same type
length = len(list(node.previous_siblings)) + 1
if (length) > 1:
return '%s:nth-child(%s)' % (, length)
KobaKhit / Large dataframe to csv in chunks in R
Last active September 7, 2017 19:56
Write a large dataframe to csv in chunks
df = read.csv("your-df.csv")
# Number of items in each chunk
elements_per_chunk = 100000
# List of vectors [1] 1:100000, [2] 100001:200000, ...
l = split(1:nrow(df), ceiling(seq_along(1:nrow(df))/elements_per_chunk))
# Write large data frame to csv in chunks
fname = "inventory-cleaned.csv"
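The R preview ends before the write loop. As a hedged illustration of the same chunking idea in Python, the sketch below splits row positions into fixed-size ranges and writes them to a single open file handle, emitting the header once. The helper names `chunk_ranges` and `write_in_chunks` are hypothetical, and it uses the standard `csv` module instead of a data frame library.

```python
import csv
import math


def chunk_ranges(n_rows, per_chunk):
    """Split row positions 0..n_rows-1 into consecutive ranges of per_chunk rows."""
    n_chunks = math.ceil(n_rows / per_chunk)
    return [
        range(i * per_chunk, min((i + 1) * per_chunk, n_rows))
        for i in range(n_chunks)
    ]


def write_in_chunks(header, rows, handle, per_chunk):
    """Write the header once, then the rows one chunk at a time."""
    writer = csv.writer(handle)
    writer.writerow(header)
    for idx in chunk_ranges(len(rows), per_chunk):
        # Only one chunk of rows is passed to the writer at a time.
        writer.writerows(rows[i] for i in idx)
```

With `per_chunk = 100000`, as in the R snippet, the first range covers positions 0 to 99999, which corresponds to R's 1:100000.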
KobaKhit / reddit-posts.html
Last active August 29, 2015 14:11
A list of the top ten posts from a subreddit using the redditjs API. Working jsfiddle
<!-- Produces a responsive list of the top ten posts from the subreddit /r/worldnews. Working jsfiddle -->
<div id="posts">
<h2> Today's top ten news <small>from <a href = '//' target = '_blank'>/r/worldnews</a></small></h2>
<ul class="list-unstyled"></ul>
<!-- JS -->
<script src=""></script>
<script src=""></script>