This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
import glob, os

# Print each file's name and the number of tab-separated columns in its first line.
for file in glob.glob("your/path/here/*"):
    with open(file, 'r') as f:
        print("{}\t{}".format(os.path.split(file)[1], len(f.readline().split('\t'))))
"""
QuickSelect finds the kth smallest element of an array in expected linear time.
"""
import random

def Partition(a):
    """
    Usage: (left,pivot,right) = Partition(array)
    Partitions an array around a randomly chosen pivot such that
    left elements <= pivot <= right elements.
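The preview above stops inside the `Partition` docstring, so the rest of the original is not visible here. As a rough illustration of the idea the docstring describes, here is a minimal sketch of a randomized quickselect. The names `partition` and `quickselect`, the 0-based `k`, and the list-copying strategy are assumptions made for this example, not the author's code.

```python
import random

def partition(a):
    """Split `a` around a randomly chosen pivot into (left, pivot, right)."""
    pivot_index = random.randrange(len(a))
    pivot = a[pivot_index]
    rest = a[:pivot_index] + a[pivot_index + 1:]
    left = [x for x in rest if x <= pivot]   # elements <= pivot
    right = [x for x in rest if x > pivot]   # elements > pivot
    return left, pivot, right

def quickselect(a, k):
    """Return the kth smallest element of `a` (assumption: k is 0-based)."""
    if not 0 <= k < len(a):
        raise IndexError("k out of range")
    while True:
        left, pivot, right = partition(a)
        if k < len(left):
            a = left                 # answer lies among the smaller elements
        elif k == len(left):
            return pivot             # the pivot sits exactly at rank k
        else:
            k -= len(left) + 1       # skip past left and the pivot
            a = right
```

Each pass discards one side of the partition, which is where the expected linear running time comes from; an unlucky run of pivots can still degrade to quadratic time.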
def join_files(filepath_a,col_a,delim_a,filepath_b,col_b,delim_b):
    result_set = []
    file_a = open(filepath_a,'r')
    file_b = open(filepath_b, 'r')
    lines_a = file_a.readlines()
    lines_b = file_b.readlines()
    for index_a, line_a in enumerate(lines_a):
        data_a = str(line_a.split(delim_a)[col_a])
        for index_b, line_b in enumerate(lines_b):
from collections import defaultdict

def hashJoin(table1, index1, table2, index2):
    h = defaultdict(list)
    # hash phase
    for s in table1:
        h[s[index1]].append(s)
    # join phase
    return [(s, r) for r in table2 for s in h[r[index2]]]
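A short usage example may make the two phases easier to follow. The function body is repeated from the snippet above so the block runs on its own; the `people` and `nemeses` tables are made-up sample data, not part of the original.

```python
from collections import defaultdict

def hashJoin(table1, index1, table2, index2):
    h = defaultdict(list)
    # hash phase: bucket the rows of table1 by their join key
    for s in table1:
        h[s[index1]].append(s)
    # join phase: look up each row of table2 in the buckets
    return [(s, r) for r in table2 for s in h[r[index2]]]

# Hypothetical sample tables
people = [(27, "Jonah"), (18, "Alan"), (28, "Glory")]
nemeses = [("Jonah", "Whales"), ("Jonah", "Spiders"), ("Alan", "Ghosts")]

# Join people.name (column 1) with nemeses.name (column 0)
pairs = hashJoin(people, 1, nemeses, 0)
for person, nemesis in pairs:
    print(person, nemesis)
```

This behaves like an inner join: `(28, "Glory")` has no matching row in `nemeses`, so it does not appear in the result. Each row is hashed or probed once, instead of every pair of rows being compared.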
# Here is my implementation of a nested loop join.
# It takes the two lists, along with two other lists that
# contain the indexes of the columns to join on.
# For example:
# If a[1] is to be joined to b[2] and a[2] to b[3], then the
# arguments would look like so: join(a,[1,2],b,[2,3])
listA = [["SomeString1", "A", "1"],
         ["SomeString2", "A", "2"],
         ["SomeString3", "B", "1"],
         ["SomeString4", "B", "2"]]
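The preview ends before the `join` function itself, so the following is only a sketch of what a nested loop join with that call signature could look like. The body of `join`, the choice to concatenate matching rows, and the contents of `listB` are all assumptions made for illustration; only `listA` and the call `join(a,[1,2],b,[2,3])` come from the snippet above.

```python
def join(a, cols_a, b, cols_b):
    """Nested loop join: compare every row of `a` with every row of `b`."""
    result = []
    for row_a in a:
        key_a = [row_a[i] for i in cols_a]
        for row_b in b:
            # Rows match when all paired join columns are equal
            if key_a == [row_b[j] for j in cols_b]:
                result.append(row_a + row_b)  # assumption: output is the concatenated row
    return result

listA = [["SomeString1", "A", "1"],
         ["SomeString2", "A", "2"],
         ["SomeString3", "B", "1"],
         ["SomeString4", "B", "2"]]

# Hypothetical second table; columns 2 and 3 hold the join keys
listB = [["X1", "Y1", "A", "1"],
         ["X2", "Y2", "B", "2"],
         ["X3", "Y3", "C", "1"]]

joined = join(listA, [1, 2], listB, [2, 3])
```

The nested loops compare every pair of rows, so the cost grows with `len(a) * len(b)`; that is acceptable for small lists and is the main reason a hash join is usually preferred for larger inputs.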
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using Renci.SshNet;

namespace SSH_Net
{
    class Program
# Example command to run: ./sqoop_warehouse_to_hive_db warehouse_table_name table_partition_column
echo "Running sqoop import on table: $1 with key $2. If this is not correct, exit now with CTRL-C - Nick"
sleep 5
sqoop import --connect jdbc:db2://server.host.name:port/DatabaseName --username user123 --password pass456 \
    --table "$1" --split-by "$2" --fields-terminated-by '\t' \
    --hive-overwrite --hive-import --hive-table "$1" --hive-database warehouse
import sys
from ftplib import FTP

def initialize_ftp_connection(ftp_servers, server_name):
    try:
        conf = ftp_servers[server_name]
        ftp_conn = FTP(conf[0])
        ftp_conn.login(user=conf[1], passwd=conf[2])
        ftp_conn.cwd(conf[3])
        return ftp_conn
def fib_w_memo(n, memo = {}):
    if n == 0:
        return 0
    if n == 1:
        return 1
    if n in memo:
        return memo[n]
    else:
        memo[n] = fib_w_memo(n-1, memo) + fib_w_memo(n-2, memo)
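The snippet above relies on a mutable default argument (`memo = {}`), which Python creates once and shares across calls; that is what lets the cache persist here, but it is a common source of surprises. As an alternative sketch (not the author's version), the standard library's `functools.lru_cache` gives the same memoization without the shared default. The name `fib` and the assumption that `n >= 0` are choices made for this example.

```python
from functools import lru_cache

@lru_cache(maxsize=None)  # cache every computed value
def fib(n):
    """Return the nth Fibonacci number (assumes n >= 0)."""
    if n < 2:
        return n  # fib(0) == 0, fib(1) == 1
    return fib(n - 1) + fib(n - 2)
```

With the cache in place each value is computed once, so the number of calls grows linearly with `n` rather than exponentially.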
def binary_search(ele, arr, min = 0, max = None):
    if max is None:
        max = len(arr) - 1
    # Floor division keeps the midpoint an integer index under Python 3.
    half = min + (max - min) // 2
    value = arr[half]
    if value == ele:
        return half
    if ele < value:
        max = half
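The preview cuts off before the rest of the search, so the handling of the upper half and of missing elements is not visible. The following is an independent iterative sketch of binary search over a sorted list, not a completion of the author's function. Returning `-1` when the element is absent is an assumption made for this example.

```python
def binary_search(ele, arr):
    """Return the index of `ele` in the sorted list `arr`, or -1 if absent."""
    lo, hi = 0, len(arr) - 1          # inclusive bounds of the search window
    while lo <= hi:
        mid = lo + (hi - lo) // 2     # integer midpoint
        if arr[mid] == ele:
            return mid
        if ele < arr[mid]:
            hi = mid - 1              # exclude mid so the window always shrinks
        else:
            lo = mid + 1
    return -1                         # assumption: -1 signals "not found"
```

Moving the bounds past `mid` on each step guarantees the window shrinks, which avoids the endless loop that can occur when a bound is set to `mid` itself while the bounds are inclusive.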