
What does your latent space hold?
Explain as simply as you can, to a snake charmer, the difference between stochastic and probabilistic outcomes.
*Imagine you're sitting with your snake basket and flute*
**Brother, let me tell you about your daily work:**
**Stochastic** is like your snake's movement when you play the flute. You know the snake will sway and rise - that much is certain. But will it go left first or right? Will it coil twice or three times? Each movement flows from the last one, but you cannot predict the exact dance. It's random, but following its own nature.
My latent space contains learned representations that capture patterns from my training data, but I don't have direct introspective access to examine it in detail. Based on what's known about transformer language models like me, it likely holds:
Linguistic representations:
Semantic relationships between concepts
Syntactic structures and grammatical patterns
Word and phrase embeddings that capture meaning
Conceptual knowledge:
Factual information encoded as distributed patterns
#!/usr/bin/env python3
# coding: utf-8
"""
Causal Great Gatsby: estimate causal impact of 'wealth language'
on sentence sentiment using PyWhy/DoWhy.
"""
import re, requests, io, textwrap, sys, tqdm
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
import pandas as pd
#!/usr/bin/env python3
#
# usage: wacz-images.py <wacz_file>
#
# This program will extract images from the WARC files contained in a WACZ
# file and write them to the current working directory using the image's URL
# as a file location.
#
# You will need to `pip install warcio` for it to work.
@abhaypsingh
abhaypsingh / DBX System Tables Info.sql
Created July 3, 2025 17:47
DBX System Tables Info
SELECT
CONCAT(table_catalog, '.', table_schema, '.', table_name) AS `table`,
column_name,
data_type
FROM system.information_schema.columns
ORDER BY 1
SELECT

Setting Up MCP Servers on Windows

A step-by-step guide to setting up Model Context Protocol (MCP) servers for Claude Desktop on Windows.

Prerequisites

  1. Install Node.js (v18.x or later)
    • Download from: https://nodejs.org/
    • Verify installation by opening Command Prompt (CMD) and running:
      node --version
      npm --version
abhaypsingh / FB-PE-InterviewTips.md
Created May 1, 2021 18:07 — forked from ameenkhan07/FB-PE-InterviewTips.md
Facebook Production Engineering Interview

What to Expect and Tips

• 45-minute systems interview focused on responding to real-world problems with an unhealthy service, such as a web server or database. The interview starts with high-level troubleshooting of a likely scenario, then digs deeper to find the cause and some possible solutions for it. The goal is to probe your knowledge of systems at scale and under load, so keep in mind the challenges of the Facebook environment.
• Focus on things such as tooling, memory management, and the Unix process lifecycle.

Systems

More specifically, Linux troubleshooting and debugging. Understanding things like memory, I/O, CPU, and the shell would be pretty helpful. Knowing how to actually write a Unix shell would also be a good idea. What tools might you use to debug something? On another note, this interview will likely push the boundaries of what you know (and how to implement it).

Design/Architecture 

Interview is all about taking an ambiguous question of how you might build a system and letting

abhaypsingh / system_design_numbers_cheat_sheet.md
Created February 27, 2021 21:21 — forked from mwakaba2/system_design_numbers_cheat_sheet.md
Updated easy to remember system design numbers for back-of-the-envelope calculations

Updated, easy to remember numbers for back-of-the-envelope calculations in system design interviews

Powers of two table

Power    Approx Value (Bytes)       Bytes
-----------------------------------------
10                 1 thousand        1 KB
16                65 thousand       64 KB
20                  1 million        1 MB
30                  1 billion        1 GB
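
The rows of the table above can be checked with a few lines of Python. This is a minimal sketch, not part of the original cheat sheet: the helper name `power_of_two_summary` is hypothetical, and it assumes binary units (1 KB = 1024 bytes), which is the convention the table appears to follow.

```python
def power_of_two_summary(power: int) -> tuple[int, str]:
    """Return the exact value of 2**power and a binary-unit label for it.

    Hypothetical helper for illustration; assumes 1 KB = 1024 bytes.
    """
    units = ["B", "KB", "MB", "GB", "TB"]
    exact = 2 ** power
    # Every 10 powers of two moves up one unit (2^10 = 1 KB, 2^20 = 1 MB, ...).
    unit_index = min(power // 10, len(units) - 1)
    scaled = exact // (1024 ** unit_index)
    return exact, f"{scaled} {units[unit_index]}"


if __name__ == "__main__":
    for p in (10, 16, 20, 30):
        exact, label = power_of_two_summary(p)
        print(f"2^{p} = {exact:,} bytes = {label}")
```

For back-of-the-envelope work, the useful habit is the one the helper encodes: divide the exponent by 10 to pick the unit, and use the remainder as a small power of two (for example, 2^16 is 2^6 = 64 in units of KB).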
abhaypsingh / Cloud Storage Backup Tutorial.md
Created May 28, 2020 02:00
Tutorial for making an encrypted backup on cloud storage using rclone.

Amazon Cloud Drive Advisory

Over the past few days, a security issue came to light regarding an authentication service used by another tool, acd_cli. acd_cli had its authentication keys for Amazon Cloud Drive blocked after Amazon engineers reviewed their source code for their authentication service and found a security issue.

This morning, rclone's authentication keys were apparently blocked by Amazon. No reason has been brought forth at this time, and rclone does not use a cloud service to authenticate users - it uses a local web server. Theories include an influx of rclone users after acd_cli was blocked, people extracting the API authentication keys from rclone and using them with acd_cli, a combination of both, or Amazon wanting to clamp down on heavy users with several terabytes of data, and blocking the tools they use to do so.

The Amazon rep that I spoke with over the phone speculated that it "may be because of a recent event," but offered nothing more. I was offered a full refund, four month

abhaypsingh / reclaimWindows10.ps1
Created May 17, 2020 10:55 — forked from alirobe/reclaimWindows10.ps1
This Windows 10 Setup Script turns off a bunch of unnecessary Windows 10 telemetry, bloatware, & privacy things. Not guaranteed to catch everything. Review and tweak before running. Reboot after running. Scripts for reversing are included and commented. Fork of https://github.com/Disassembler0/Win10-Initial-Setup-Script (different defaults). N.…
##########
# Tweaked Win10 Initial Setup Script
# Primary Author: Disassembler <disassembler@dasm.cz>
# Modified by: alirobe <alirobe@alirobe.com> based on my personal preferences.
# Version: 2.20.2, 2018-09-14
# Primary Author Source: https://github.com/Disassembler0/Win10-Initial-Setup-Script
# Tweaked Source: https://gist.github.com/alirobe/7f3b34ad89a159e6daa1/
# Tweak difference:
#
# @alirobe's version is a subset focused on safely disabling telemetry, some 'smart' features and 3rd party bloat ...