@RyanJulyan
RyanJulyan / generic_reconcile.py
Created December 5, 2023 17:57
This script reconciles two pandas DataFrames based on specified criteria. It compares DataFrame rows on key columns, handling both numeric and textual differences within specified tolerances. Textual differences are measured with the Levenshtein distance between strings.
from functools import lru_cache
from typing import Any, List
import pandas as pd
def lev_dist(a: str, b: str) -> float:
    """
    Calculate the Levenshtein distance between two input strings.
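
The preview above is cut off, so the following is only a minimal sketch of the idea in the description, not the gist's actual code. It assumes a plain integer edit distance (the original `lev_dist` is annotated as returning `float`) and leaves pandas out; `levenshtein`, `values_match`, `numeric_tolerance`, and `max_edit_distance` are hypothetical names chosen for illustration.

```python
from typing import Any


def levenshtein(a: str, b: str) -> int:
    """Return the number of single-character edits needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(
                    previous[j] + 1,         # deletion
                    current[j - 1] + 1,      # insertion
                    previous[j - 1] + cost,  # substitution
                )
            )
        previous = current
    return previous[-1]


def values_match(
    left: Any,
    right: Any,
    numeric_tolerance: float = 0.01,  # hypothetical default
    max_edit_distance: int = 1,       # hypothetical default
) -> bool:
    """Compare two cell values: numbers by tolerance, everything else as text."""
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return abs(left - right) <= numeric_tolerance
    return levenshtein(str(left), str(right)) <= max_edit_distance
```

A reconciliation pass could then apply `values_match` to each pair of cells that share the same key columns, for example `values_match("color", "colour")` for text or `values_match(1.0, 1.004)` for numbers.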
@RyanJulyan
RyanJulyan / control_point_decorator.py
Last active December 9, 2023 06:51
Python decorator with control points to validate data before processing. It features an abstract base class for defining controls and subclasses for specific checks, such as positive and even value verification. The decorator is applied to a function, ensuring data meets criteria before processing.
from typing import Callable, List
def control_point_decorator(control_points: List['BaseControl']) -> Callable:
    """
    Decorator function to apply control point checks before executing a function.
    Args:
        control_points (List[BaseControl]): A list of control point objects which will be used to check the data
            before passing it to the decorated function.
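
Since the preview stops inside the docstring, here is a rough sketch of how such a decorator might be put together. Only `control_point_decorator` and `BaseControl` appear in the original; the `check` method, the `PositiveControl` and `EvenControl` class names, and the choice to raise `ValueError` on failure are assumptions made for this example.

```python
from abc import ABC, abstractmethod
from functools import wraps
from typing import Any, Callable, List


class BaseControl(ABC):
    """Abstract control point; subclasses decide whether data is acceptable."""

    @abstractmethod
    def check(self, data: Any) -> bool:  # hypothetical method name
        ...


class PositiveControl(BaseControl):
    def check(self, data: Any) -> bool:
        return data > 0


class EvenControl(BaseControl):
    def check(self, data: Any) -> bool:
        return data % 2 == 0


def control_point_decorator(control_points: List[BaseControl]) -> Callable:
    """Run every control against the first argument before calling the function."""

    def decorator(func: Callable) -> Callable:
        @wraps(func)
        def wrapper(data: Any, *args: Any, **kwargs: Any) -> Any:
            for control in control_points:
                if not control.check(data):
                    # Assumption: a failed control aborts the call with ValueError.
                    raise ValueError(f"{type(control).__name__} rejected {data!r}")
            return func(data, *args, **kwargs)

        return wrapper

    return decorator


@control_point_decorator([PositiveControl(), EvenControl()])
def halve(value: int) -> int:
    return value // 2
```

With this sketch, `halve(8)` runs normally, while `halve(3)` and `halve(-2)` are stopped before the function body executes.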
@RyanJulyan
RyanJulyan / get_file_metdata.py
Created September 5, 2023 18:11
This Python script provides a comprehensive utility for file metadata extraction and categorization. It uses various libraries like Tika, NLTK, and YAKE for text parsing, keyword extraction, and file type identification. The script also handles both local files and web URLs.
import pprint
from typing import Any, Dict, Iterable, List, Optional, Union
import re
import mimetypes
import pathlib
from tika import parser
import requests
import nltk
import inflect
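
As a small illustration of the file type identification step only, the sketch below uses nothing but the standard library. It deliberately skips the Tika, NLTK, and YAKE parts of the original script, and `basic_metadata` with its returned keys is a hypothetical helper, not a function from the gist.

```python
import mimetypes
import pathlib
from typing import Dict, Optional
from urllib.parse import urlparse


def basic_metadata(location: str) -> Dict[str, Optional[str]]:
    """Guess name, extension, and MIME type for a local path or a web URL."""
    parsed = urlparse(location)
    is_url = parsed.scheme in ("http", "https")
    # For URLs only the path component is relevant to the file name.
    path = pathlib.PurePosixPath(parsed.path) if is_url else pathlib.Path(location)
    mime_type, _encoding = mimetypes.guess_type(path.name)
    return {
        "name": path.name,
        "extension": path.suffix.lower() or None,
        "mime_type": mime_type,
        "source": "url" if is_url else "local",
    }
```

The guess is based on the file name alone, so nothing is downloaded or opened; content-based detection would need the parsing libraries the description mentions.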
@RyanJulyan
RyanJulyan / extract_info.py
Last active June 16, 2024 19:28
This Python script uses decorators to automatically extract and store metadata about functions and classes. It captures details like function names, docstrings, parameter types, and return types, making it easier to document and understand code.
from functools import wraps
from typing import Any, List, Dict, Union, Callable, Type, Optional, get_type_hints
from dataclasses import dataclass
import inspect
import re
import json
import attr
# Global dictionary to store function and class details grouped by namespace
DETAILS: Dict[str, List[Dict[str, Any]]] = {}
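
One way the decorator described above could fill the `DETAILS` dictionary is sketched below. The decorator name `extract_info` is borrowed from the file name, and the `namespace` parameter, the record keys, and the `_type_name` helper are assumptions; the original also handles classes and uses `attr`, which this sketch leaves out.

```python
import inspect
from functools import wraps
from typing import Any, Callable, Dict, List, Optional, get_type_hints

# Same shape as the global in the preview: records grouped by namespace.
DETAILS: Dict[str, List[Dict[str, Any]]] = {}


def _type_name(tp: Any) -> Optional[str]:
    """Readable name for an annotation, or None when there is no annotation."""
    if tp is None:
        return None
    return getattr(tp, "__name__", str(tp))


def extract_info(namespace: str = "default") -> Callable:
    """Record a function's name, docstring, and type hints at decoration time."""

    def decorator(func: Callable) -> Callable:
        hints = get_type_hints(func)
        record = {
            "name": func.__name__,
            "docstring": inspect.getdoc(func),
            "parameters": {
                name: _type_name(hints.get(name))
                for name in inspect.signature(func).parameters
            },
            "return_type": _type_name(hints.get("return")),
        }
        DETAILS.setdefault(namespace, []).append(record)

        @wraps(func)
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            return func(*args, **kwargs)

        return wrapper

    return decorator


@extract_info("math")
def add(a: int, b: int) -> int:
    """Add two integers."""
    return a + b
```

After the module is imported, `DETAILS["math"]` holds one record for `add`, and the function itself behaves as before.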
@RyanJulyan
RyanJulyan / Test API curl examples.txt
Last active September 4, 2023 15:18
Dynamic Model CRUD Flask API using Python attrs and DictDatabase to manage various data models. Using only class definitions, it auto-generates API routes for each model, supporting basic CRUD operations.
curl -X POST \
  http://127.0.0.1:5000/api/cars \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/json' \
  -H 'postman-token: e4839424-9e84-ff89-51fe-6202bf7f0c65' \
  -d '{
    "id": 1,
    "make": "renault",
    "model": "cleo"
  }'
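
For readers who would rather script the same call, the curl request above can be approximated with the standard library. This is a sketch that only builds the request object; `build_create_request` is a hypothetical helper, and sending the request assumes the API from the gist is running locally on port 5000.

```python
import json
from typing import Any, Dict
from urllib.request import Request


def build_create_request(base_url: str, model: str, payload: Dict[str, Any]) -> Request:
    """Build a POST request for /api/<model> with a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return Request(
        url=f"{base_url}/api/{model}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_create_request(
    "http://127.0.0.1:5000",
    "cars",
    {"id": 1, "make": "renault", "model": "cleo"},
)
# urllib.request.urlopen(request) would send it once the server is up.
```

The cache-control and postman-token headers from the curl command are left out here on the assumption that they are client residue rather than something the API needs.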
@RyanJulyan
RyanJulyan / bump_version.py
Last active September 4, 2023 15:18
Determine the appropriate version bump (major, minor, patch) for JSON schemas based on changes between versions. Handles attribute additions, deletions, type changes, and required fields, with special handling for 'Optional' types. Can compare multiple schema pairs to find the highest priority bump.
from typing import Any, List, Tuple, Dict, Optional
import subprocess
import sys
from determine_bump import determine_highest_bump
def bump_version(
    schema_pairs: List[Tuple[Optional[Dict[Any, Any]], Optional[Dict[Any, Any]]]]
) -> None:
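
The function body is not shown in the preview, and `determine_highest_bump` lives in a separate module that is not reproduced here. The sketch below is therefore an assumption about what such a comparison might look like, using semver-style rules picked for illustration: removed attributes, type changes, and newly required fields count as major, added attributes as minor, and anything else as patch. The `determine_bump` helper and the treatment of a missing schema are hypothetical.

```python
from typing import Any, Dict, List, Optional, Tuple

Schema = Optional[Dict[str, Any]]
PRIORITY = {"patch": 0, "minor": 1, "major": 2}


def determine_bump(old: Schema, new: Schema) -> str:
    """Classify the change between two JSON schemas as major, minor, or patch."""
    if old is None:
        return "minor"  # assumption: a brand-new schema is an additive change
    if new is None:
        return "major"  # assumption: a deleted schema is a breaking change
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})
    removed = old_props.keys() - new_props.keys()
    added = new_props.keys() - old_props.keys()
    retyped = {
        name
        for name in old_props.keys() & new_props.keys()
        if old_props[name].get("type") != new_props[name].get("type")
    }
    newly_required = set(new.get("required", [])) - set(old.get("required", []))
    if removed or retyped or newly_required:
        return "major"
    if added:
        return "minor"
    return "patch"


def determine_highest_bump(schema_pairs: List[Tuple[Schema, Schema]]) -> str:
    """Return the highest-priority bump across all (old, new) schema pairs."""
    bumps = [determine_bump(old, new) for old, new in schema_pairs]
    return max(bumps, key=PRIORITY.__getitem__, default="patch")
```

The special handling of `Optional` types mentioned in the description is not modelled in this sketch.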
@RyanJulyan
RyanJulyan / fake_data_factory.py
Last active May 31, 2024 05:33
Generates fake data instances using attrs.
from typing import Any, get_args, get_origin, Type, Dict, Optional, List
import random
from datetime import date, datetime
import attr
from faker import Faker
def data_factory(
    cls: Type,
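
The signature is truncated in the preview, so the remaining parameters are unknown. To show the general idea without extra dependencies, the sketch below swaps `attrs` for standard-library dataclasses and Faker for `random`; that substitution, the `GENERATORS` table, the `seed` parameter, and the `Car` class are all assumptions made for this example.

```python
import random
import string
from dataclasses import dataclass, fields
from datetime import date, timedelta
from typing import Any, Callable, Dict, Optional, Type, get_type_hints

# One generator per supported field type; each takes a seeded random.Random.
GENERATORS: Dict[type, Callable[[random.Random], Any]] = {
    int: lambda rng: rng.randint(0, 1000),
    float: lambda rng: round(rng.uniform(0, 1000), 2),
    str: lambda rng: "".join(rng.choices(string.ascii_lowercase, k=8)),
    bool: lambda rng: rng.random() < 0.5,
    date: lambda rng: date(2000, 1, 1) + timedelta(days=rng.randint(0, 9000)),
}


def data_factory(cls: Type, seed: Optional[int] = None) -> Any:
    """Create one instance of a dataclass with a fake value for every field."""
    rng = random.Random(seed)
    hints = get_type_hints(cls)
    values = {f.name: GENERATORS[hints[f.name]](rng) for f in fields(cls)}
    return cls(**values)


@dataclass
class Car:
    id: int
    make: str
    price: float
    registered: date
```

Calling `data_factory(Car, seed=1)` twice gives equal instances, which can be handy in tests; unsupported field types raise `KeyError` in this sketch.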
@RyanJulyan
RyanJulyan / check_port_scanner.py
Last active August 16, 2023 12:27
Async port scanner that reports open ports not listed among the expected ports. The default expected ports are "80,443".
import sys
import socket
import argparse
from datetime import datetime
from typing import Any, List, Set
import asyncio
import pyfiglet
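
The core of such a scanner can be sketched with `asyncio.open_connection`. The gist's own implementation is not visible in the preview, so the function names `is_open` and `unexpected_open_ports`, the timeout value, and the decision to treat any connection error as "closed" are assumptions.

```python
import asyncio
from typing import Iterable, List, Set


async def is_open(host: str, port: int, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        _reader, writer = await asyncio.wait_for(
            asyncio.open_connection(host, port), timeout
        )
    except (OSError, asyncio.TimeoutError):
        return False  # refused, unreachable, or too slow
    writer.close()
    try:
        await writer.wait_closed()
    except OSError:
        pass  # the peer may already have dropped the connection
    return True


async def unexpected_open_ports(
    host: str, ports: Iterable[int], expected: Set[int]
) -> List[int]:
    """Probe all ports concurrently and keep the open ones that are not expected."""
    port_list = list(ports)
    results = await asyncio.gather(*(is_open(host, port) for port in port_list))
    return [
        port
        for port, opened in zip(port_list, results)
        if opened and port not in expected
    ]
```

A call such as `asyncio.run(unexpected_open_ports("127.0.0.1", range(1, 1025), {80, 443}))` would list open low ports other than the defaults. Only scan hosts you are authorised to test.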
@RyanJulyan
RyanJulyan / check_sensitive_information.py
Last active August 14, 2023 20:36
This Python script scans a code repository for sensitive information such as keys and passwords, and can be extended with custom regex patterns. It accepts a root directory and an optional text file listing files or folders to ignore. Specific lines can be excluded with the comment "ignore: security check".
import os
import re
import sys
import argparse
from datetime import datetime
from concurrent.futures import as_completed
# from concurrent.futures.thread import ThreadPoolExecutor as PoolExecutor
from concurrent.futures.process import ProcessPoolExecutor as PoolExecutor
from typing import List
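
The line-level check described above might look roughly like the sketch below. The two regex patterns are simplified examples rather than the gist's actual rules, and `DEFAULT_PATTERNS`, `IGNORE_MARKER`, and `scan_lines` are hypothetical names; file walking and the process pool are left out.

```python
import re
from typing import Dict, Iterable, List, Pattern, Tuple

# Example patterns only; a real scanner would carry a longer, tuned list.
DEFAULT_PATTERNS: Dict[str, Pattern[str]] = {
    "password": re.compile(r"password\s*=\s*['\"][^'\"]+['\"]", re.IGNORECASE),
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
}

IGNORE_MARKER = "ignore: security check"


def scan_lines(
    lines: Iterable[str],
    patterns: Dict[str, Pattern[str]] = DEFAULT_PATTERNS,
) -> List[Tuple[int, str]]:
    """Return (line number, pattern name) for every match not marked as ignored."""
    findings: List[Tuple[int, str]] = []
    for number, line in enumerate(lines, start=1):
        if IGNORE_MARKER in line:
            continue  # the author explicitly opted this line out
        for name, pattern in patterns.items():
            if pattern.search(line):
                findings.append((number, name))
    return findings
```

Custom rules can be added by passing a larger dictionary, for example `{**DEFAULT_PATTERNS, "token": re.compile(r"token_[a-z0-9]+")}`.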
@RyanJulyan
RyanJulyan / class_creator.py
Last active August 21, 2023 17:55
A class to create and manage custom classes, and their data, from JSON definitions that conform to a valid `jsonschema` class definition.
from typing import List, Dict, Tuple, Set, Optional, Union, Any, Type
import json
import attr
import jsonschema
class ClassCreator:
    """A class to create and manage custom classes from JSON definitions."""
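
To give a feel for building a class from a JSON definition, the sketch below uses `dataclasses.make_dataclass` from the standard library instead of `attr`, and it does not validate the schema with `jsonschema` as the original presumably does. The `create_class` function, the `JSON_TO_PYTHON` mapping, and the rule that optional properties default to `None` are assumptions for illustration.

```python
import json
from dataclasses import field, make_dataclass
from typing import Any, Optional

# Assumed mapping from JSON Schema type names to Python types.
JSON_TO_PYTHON = {
    "string": str,
    "integer": int,
    "number": float,
    "boolean": bool,
    "array": list,
    "object": dict,
}


def create_class(schema_json: str) -> type:
    """Build a dataclass whose fields mirror the schema's properties."""
    schema = json.loads(schema_json)
    required = set(schema.get("required", []))
    properties = schema.get("properties", {})
    # Required fields go first: dataclasses reject a field without a default
    # after one that has a default.
    ordered = sorted(properties, key=lambda name: name not in required)
    class_fields = []
    for name in ordered:
        py_type = JSON_TO_PYTHON.get(properties[name].get("type"), Any)
        if name in required:
            class_fields.append((name, py_type))
        else:
            class_fields.append((name, Optional[py_type], field(default=None)))
    return make_dataclass(schema["title"], class_fields)


CAR_SCHEMA = """
{
  "title": "Car",
  "type": "object",
  "properties": {
    "id": {"type": "integer"},
    "make": {"type": "string"},
    "colour": {"type": "string"}
  },
  "required": ["id", "make"]
}
"""

Car = create_class(CAR_SCHEMA)
```

`Car(id=1, make="renault")` then yields an instance whose `colour` is `None`; the type annotations are not enforced at runtime in this sketch.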