Slides: https://docs.google.com/presentation/d/1SMtBILSrqt9SWK5BnEKe4ivGTHItrVjgCEB0AUhGeyM
$ python -m venv venv
$ venv/bin/pip install sqlalchemy colorama
#!/usr/bin/env sh
# . "$(dirname -- "$0")/_/husky.sh" # - uncomment if using husky

# Get the current commit message file
COMMIT_MSG_FILE=$1
SOURCE_MSG=$2

# Ref: https://git-scm.com/docs/githooks#_prepare_commit_msg
# SOURCE_MSG is the source where the commit message is taken from
# - EMPTY -> No source commit message present `git commit -a`
""" | |
Simple benchmark to check if withColumn() is faster or select() is faster | |
Conflusion: select() is faster than withColumn() in a for loop as lesser dataframes are created | |
""" | |
import datetime | |
import findspark; findspark.init(); import pyspark | |
spark = pyspark.sql.SparkSession.builder.getOrCreate() | |
for ncol in [10, 100, 1000, 2000, 5000]: |
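The loop body is cut off here. As a sketch of the timing pattern the `datetime` import suggests (wall-clock timing of each approach per `ncol`), with plain-Python stand-ins for the Spark calls — the function names and bodies below are hypothetical, not from the original:

```python
import datetime


def timed(fn, *args):
    """Return (result, elapsed_seconds) for one call."""
    start = datetime.datetime.now()
    result = fn(*args)
    elapsed = (datetime.datetime.now() - start).total_seconds()
    return result, elapsed


# Hypothetical stand-ins: the real benchmark would call
# df.withColumn(...) ncol times vs. one df.select(...) with ncol expressions.
def with_column_loop(ncol):
    return ["col_%d" % i for i in range(ncol)]


def select_once(ncol):
    return ["col_%d" % i for i in range(ncol)]


for ncol in [10, 100]:
    _, t_wc = timed(with_column_loop, ncol)
    _, t_sel = timed(select_once, ncol)
    print(ncol, t_wc, t_sel)
```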
The three common ways for developers to document information about their work are:
# Example:
#   PYJAVA_LIB=jpype venv/bin/python pyjava.py
import os
from datetime import datetime

from jpmml_evaluator import _package_classpath

lib = os.environ.get('PYJAVA_LIB')
assert lib is not None, 'Set env var PYJAVA_LIB to py4j/jnius/jpype'
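The env-var check above implies a dispatch on the chosen Java bridge. A minimal sketch of that validation step, assuming the three backend names listed in the assert message (`pick_backend` is a hypothetical helper, not from the original):

```python
import os

SUPPORTED_BACKENDS = ('py4j', 'jnius', 'jpype')


def pick_backend(environ=os.environ):
    """Read and validate PYJAVA_LIB, mirroring the assert above."""
    lib = environ.get('PYJAVA_LIB')
    assert lib in SUPPORTED_BACKENDS, 'Set env var PYJAVA_LIB to py4j/jnius/jpype'
    return lib


# Usage, matching the example invocation above:
print(pick_backend({'PYJAVA_LIB': 'jpype'}))  # prints jpype
```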
# Write a Spark DataFrame into a single CSV file (to open easily with Excel/other tools)
# Save the file to S3
import s3fs
import pyspark.sql.functions as F  # noqa: N812


def spark_to_csv(spark_df, out_path):
    """
    Save the file in part files with Spark, then append them together
name: Python Tests

jobs:
  build:
    runs-on: ubuntu-18.04
    strategy:
      max-parallel: 2
      matrix:
        python-version: [3.5, 3.6, 3.7]
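A matrix job like this typically finishes with a `steps` block that checks out the repo and installs the matrix interpreter. A minimal sketch — the step names and test command below are assumptions, not from the original workflow:

```yaml
    steps:
      - uses: actions/checkout@v2
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v2
        with:
          python-version: ${{ matrix.python-version }}
      - name: Run tests
        run: |
          python -m pip install --upgrade pip
          python -m pytest
```

With `max-parallel: 2`, the three matrix entries run at most two at a time.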