Haridas N haridas

@haridas
haridas / find_error_lines_in_json.py
Created Nov 23, 2018
Ensure that a particular field in a big JSON file doesn't include null. Helpful as part of a data cleanup process.
View find_error_lines_in_json.py
import json


def read_json_lines(fname, filed_name):
    num = 0
    doc_size = []
    error_docs = []
    with open(fname) as f:
        while True:
            line = f.readline()
            if not line:
                break
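The preview above is cut off before the null check itself. As a minimal sketch of the idea in the description, and not the gist's actual code, the function below assumes the input holds one JSON document per line and treats a missing field the same as a null one. The name `find_null_field_lines` and its return shape are hypothetical.

```python
import json


def find_null_field_lines(fname, field_name):
    """Return (number of non-blank lines, line numbers with a bad field).

    Hypothetical helper: assumes one JSON document per line.
    """
    total = 0
    bad_lines = []
    with open(fname, encoding="utf-8") as f:
        for line_no, line in enumerate(f, start=1):
            line = line.strip()
            if not line:
                continue  # blank lines are skipped and not counted
            total += 1
            try:
                doc = json.loads(line)
            except json.JSONDecodeError:
                # Unparseable lines are reported together with null fields.
                bad_lines.append(line_no)
                continue
            # A missing field and an explicit null are both flagged.
            if not isinstance(doc, dict) or doc.get(field_name) is None:
                bad_lines.append(line_no)
    return total, bad_lines
```

Iterating over the file object reads one line at a time, so memory use stays flat even for a large file.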
haridas / fix_unicode.py
Created Nov 2, 2018
NLP pre-processing: remove non-ASCII characters from text
View fix_unicode.py
import glob
import pandas as pd

files = glob.glob('out-*.json')


def remove_unicode_char(file_name):
    f = open(file_name, 'rb').read()
    with open(file_name, 'w') as nf:
        nf.write(f.decode(encoding="ascii", errors="ignore"))
    print("=> ", file_name)
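The snippet above overwrites each file in place, so an interruption between reading and writing could leave it truncated. A hedged variant of the same technique is sketched below: it decodes as ASCII while ignoring undecodable bytes, then writes to a temporary file and swaps it in. The function names `strip_non_ascii_bytes` and `rewrite_as_ascii` are hypothetical, not part of the gist.

```python
import os
import tempfile


def strip_non_ascii_bytes(raw):
    """Decode bytes as ASCII, silently dropping every non-ASCII byte."""
    return raw.decode("ascii", errors="ignore")


def rewrite_as_ascii(file_name):
    """Rewrite file_name so that it contains only ASCII characters."""
    with open(file_name, "rb") as src:
        cleaned = strip_non_ascii_bytes(src.read())
    # Write next to the original first, then replace it in one step,
    # so the original is never left half-written.
    dir_name = os.path.dirname(os.path.abspath(file_name))
    with tempfile.NamedTemporaryFile(
        "w", dir=dir_name, delete=False, encoding="ascii", newline=""
    ) as tmp:
        tmp.write(cleaned)
    os.replace(tmp.name, file_name)
```

Note that dropping bytes this way also removes accented letters and other legitimate text, which may or may not be acceptable for a given NLP pipeline.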
haridas / testing.py
Created Oct 8, 2018
Python debugger in the IPython shell
View testing.py
import os
import ipdb; ipdb.set_trace()
# other codes..
View bash-trap-usage.sh
END=""
trap 'increment && END=1' 2
increment() {
    ls /
    echo "Cleaned up"
}
haridas / Compile-python-from-source.md
Last active May 8, 2017
Compile Python from source: compile flags and other settings for data science work.
View Compile-python-from-source.md
  1. Ensure all the development files required to build the bindings are installed; the bzip2 and sqlite3 bindings are the important ones.
  2. Build Python with the Unicode flag set to ucs4.
$ sudo apt-get install libbz2-dev libsqlite3-dev
$ ./configure --enable-unicode=ucs4
$ make
$ make install
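After installing, it can be worth confirming that the new interpreter actually picked up the optional modules. The sketch below is an assumption about how one might check this and is not part of the original notes; `check_build` is a hypothetical helper. Note that `--enable-unicode` is a Python 2 configure option; from Python 3.3 onward every build covers the full Unicode range, so the width check will always report `True` there.

```python
import importlib
import sys


def check_build(modules=("bz2", "sqlite3")):
    """Report which optional modules import and whether Unicode is wide."""
    report = {}
    for name in modules:
        try:
            importlib.import_module(name)
            report[name] = True
        except ImportError:
            # Usually means the matching -dev package was missing at build time.
            report[name] = False
    # 0x10FFFF is the full range (UCS-4); 0xFFFF indicates a narrow build.
    report["wide_unicode"] = sys.maxunicode == 0x10FFFF
    return report


if __name__ == "__main__":
    for key, ok in check_build().items():
        print(key, "OK" if ok else "MISSING")
```

Run it with the freshly built interpreter rather than the system one, otherwise it reports on the wrong installation.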
haridas / reduce_git_repo_size.sh
Last active Jan 6, 2017
Remove old files completely from git history
View reduce_git_repo_size.sh
## How to remove old files/folders from all git commits.
#1. Clone the repo freshly
git clone <repo.git>
#2. Use the --index-filter option to walk the index of every commit, look for the given match,
# and remove the matching objects.
git filter-branch \
--prune-empty \
--index-filter \
haridas / android_sdk_cmd.md
Last active Nov 24, 2016
Manage Android sdk from command line
View android_sdk_cmd.md

Sometimes it is handy to check or update the Android SDK from the command line, and for automation pipelines it certainly helps. Here are a few commands for doing so.

List Installed SDK details

haridas@haridas-HP-ProBook-4440s:~$ android list sdk
Refresh Sources:
  Fetching https://dl.google.com/android/repository/addons_list-2.xml
  Validate XML
  Parse XML
  Fetched Add-ons List successfully
View add_exif_record.py
import os
import sys
import subprocess
from optparse import OptionParser
from datetime import datetime
def run_shell_script(shell_script):
    """
    Assuming that the script is coming from a trusted source.
View json_unmarshalling.go
package main

import (
	"encoding/json"
	"fmt"
)

type Response struct {
	Action string
	Node   Nodes
haridas / new_memcached.py
Created Mar 24, 2015
Ketama-based consistent hashing implementation for the python-memcache library.
View new_memcached.py
"""
To test this script, start 8 memcached servers using this command:
$ memcached -d -p {PortNumber}
PortNumber:
11211
11212
11213
11214
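The preview stops before any hashing code. To illustrate the general technique, here is a simplified Ketama-style hash ring. It is a sketch under stated assumptions rather than the gist's implementation or the exact libketama algorithm: the class name `HashRing`, the `"{server}-{i}"` point-naming scheme, and the default of 160 points per server are all choices made for this example.

```python
import bisect
import hashlib


class HashRing:
    """Simplified consistent-hash ring (illustrative, not libketama-exact)."""

    def __init__(self, servers, points_per_server=160):
        ring = []
        for server in servers:
            # Each MD5 digest is 16 bytes, which yields four 32-bit points.
            for i in range(points_per_server // 4):
                digest = hashlib.md5(f"{server}-{i}".encode("utf-8")).digest()
                for j in range(4):
                    point = int.from_bytes(digest[4 * j:4 * j + 4], "little")
                    ring.append((point, server))
        ring.sort()
        self._ring = ring
        self._points = [point for point, _ in ring]

    def get_server(self, key):
        """Return the server owning the first ring point at or after the key."""
        if not self._ring:
            raise ValueError("the ring has no servers")
        digest = hashlib.md5(key.encode("utf-8")).digest()
        point = int.from_bytes(digest[:4], "little")
        # Wrap around to the first point when the key hashes past the last one.
        idx = bisect.bisect_left(self._points, point) % len(self._points)
        return self._ring[idx][1]


if __name__ == "__main__":
    ring = HashRing([f"127.0.0.1:{port}" for port in range(11211, 11215)])
    print(ring.get_server("user:42"))
```

Because each server's points depend only on its own name, removing one server remaps only the keys that server owned; every other key keeps its original server.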