

recalde / log_monitor.cs
Created September 26, 2024 14:46
log_monitor
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;
using System.Threading;
using System.Threading.Tasks;
using k8s;
using k8s.Models;
using Npgsql;
recalde / load_balancer.py
Created September 26, 2024 14:43
python_load_balancer
from fastapi import FastAPI, BackgroundTasks
from pydantic import BaseModel
import requests
import random
import time
import boto3
from boto3.dynamodb.conditions import Key
app = FastAPI()
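# Hypothetical continuation of the preview above; the backend URLs, route path, and random
# selection are assumptions, not the gist's actual logic.
BACKENDS = ["http://10.0.0.1:8000", "http://10.0.0.2:8000"]

@app.get("/proxy/{path:path}")
def proxy(path: str):
    # Pick a backend at random and forward the GET request to it.
    target = random.choice(BACKENDS)
    resp = requests.get(f"{target}/{path}", timeout=5)
    return {"backend": target, "status_code": resp.status_code}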
recalde / db_config.json
Created March 13, 2024 01:28
Export Data Script
{
  "sql_server": {
    "driver": "{SQL Server}",
    "server": "your_server",
    "database": "your_database",
    "username": "your_username",
    "password": "your_password"
  },
  "oracle": {
    "dsn": "your_dsn",
import pandas as pd
import pyarrow.csv as pv
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
import calendar
from pathlib import Path
import os
def read_large_csv(filepath):
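    # Hypothetical body (the preview cuts off at the signature): read with Arrow's multithreaded
    # CSV reader and hand back a pandas DataFrame; the gist's actual implementation is not shown.
    table = pv.read_csv(filepath)
    return table.to_pandas()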
recalde / auto_complete_chat.md
Created March 6, 2024 15:30
Auto Complete Chat

Question:

You are an AWS Python developer who is interested in speed, performance, and memory optimization. You have been given a file with 366,000 distinct account names, sorted alphabetically; the file is 10 MB total. You have a front-end website that needs to auto-complete on a text field, using a string-contains filter. There should be a Lambda web API that returns data to the front end. You can use Lambda, S3, DynamoDB, in-memory cache, and browser-side cache to solve this problem. Describe several options to solve this problem.

Answer:

Solving this problem involves efficiently handling a large dataset for auto-completion with constraints on speed, performance, and memory optimization. Here are several approaches using AWS services:

1. Lambda with S3 and In-Memory Cache
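A minimal sketch of option 1, assuming the names file sits in S3 and the Lambda is fronted by API Gateway; the bucket/key names, the q query parameter, and the 50-result cap are assumptions, not the chat's actual answer.

import json
import os
import boto3

s3 = boto3.client("s3")
_names = None  # cached across warm invocations of the same Lambda container

def _load_names():
    global _names
    if _names is None:
        # Bucket and key are placeholders for wherever the 10 MB names file lives.
        obj = s3.get_object(Bucket=os.environ["NAMES_BUCKET"], Key="account_names.txt")
        _names = obj["Body"].read().decode("utf-8").splitlines()
    return _names

def handler(event, context):
    # String-contains filter over the cached list; cap the response size for the front end.
    query = (event.get("queryStringParameters") or {}).get("q", "").lower()
    matches = [n for n in _load_names() if query in n.lower()][:50]
    return {"statusCode": 200, "body": json.dumps(matches)}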

import os
import csv
import boto3
from concurrent.futures import ThreadPoolExecutor, as_completed
from FjaPb import FjaPb # Assuming this is the correct import for FjaPb
# Load AWS credentials and target bucket details from environment variables
ACCESS_KEY = os.getenv("AWS_ACCESS_KEY_ID")
SECRET_KEY = os.getenv("AWS_SECRET_ACCESS_KEY")
BUCKET_URL = os.getenv("S3_BUCKET_URL")
recalde / auto_complete_acct_nm.py
Last active February 28, 2024 23:30
Account name cache
import boto3
import csv
from datetime import datetime, timedelta
from io import StringIO
def scan_dynamodb_table(dynamodb_table_name, last_execution_time=None):
    dynamodb = boto3.resource('dynamodb')
    table = dynamodb.Table(dynamodb_table_name)
    # Calculate the start and end time for the time range (1 month ago from last execution time or now)
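    # Hypothetical continuation (the preview cuts off here); the 'last_updated' attribute and the
    # return shape are assumptions. Requires: from boto3.dynamodb.conditions import Attr
    end_time = datetime.utcnow()
    start_time = last_execution_time or (end_time - timedelta(days=30))
    response = table.scan(
        FilterExpression=Attr('last_updated').between(start_time.isoformat(), end_time.isoformat())
    )
    return response.get('Items', [])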
recalde / summarize.py
Last active February 28, 2024 20:20
s3-stats
import pandas as pd
import matplotlib.pyplot as plt
import pyarrow.parquet as pq
import os
# Reading data using Apache Arrow
data_dir = '/path/to/your/data/directory'
file_paths = [os.path.join(data_dir, file) for file in os.listdir(data_dir) if file.endswith('.parquet')]
dfs = [pq.read_table(file).to_pandas() for file in file_paths]
df = pd.concat(dfs, ignore_index=True)
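# Hypothetical continuation ('last_modified' and 'size' column names are assumptions): aggregate
# object sizes per day and plot the result with matplotlib.
df['date'] = pd.to_datetime(df['last_modified']).dt.date
daily = df.groupby('date')['size'].sum() / 1e9
daily.plot(kind='bar', title='S3 bytes stored per day (GB)')
plt.tight_layout()
plt.savefig('s3_stats.png')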
import os
import boto3
import json
import jwt
from jwt.algorithms import RSAAlgorithm
import requests
import logging
# Configure logging
logger = logging.getLogger()
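# Hypothetical sketch of what these imports suggest: validate a JWT against a JWKS document
# fetched over HTTP. The JWKS URL, audience, and RS256 algorithm are assumptions.
JWKS_URL = os.getenv('JWKS_URL', 'https://example.com/.well-known/jwks.json')

def validate_token(token, audience='my-api'):
    jwks = requests.get(JWKS_URL, timeout=5).json()
    kid = jwt.get_unverified_header(token)['kid']
    key_data = next(k for k in jwks['keys'] if k['kid'] == kid)
    public_key = RSAAlgorithm.from_jwk(json.dumps(key_data))
    # jwt.decode raises if the signature, expiry, or audience check fails.
    return jwt.decode(token, public_key, algorithms=['RS256'], audience=audience)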
import boto3
import requests
import json
import datetime
from dateutil import parser
from requests_ntlm import HttpNtlmAuth
# AWS clients
s3 = boto3.client('s3')
ssm = boto3.client('ssm')
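# Hypothetical continuation: pull NTLM credentials from SSM Parameter Store and call an
# NTLM-protected endpoint; the parameter names and URL below are assumptions.
username = ssm.get_parameter(Name='/report/ntlm_username')['Parameter']['Value']
password = ssm.get_parameter(Name='/report/ntlm_password', WithDecryption=True)['Parameter']['Value']

response = requests.get(
    'https://internal.example.com/api/report',
    auth=HttpNtlmAuth(username, password),
    timeout=30,
)
response.raise_for_status()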