kshitizregmi

# Compress a PDF and convert it to grayscale
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -sColorConversionStrategy=Gray -dProcessColorModel=/DeviceGray -dPDFSETTINGS=/ebook -dDownsampleColorImages=true -dColorImageResolution=144 -dDownsampleGrayImages=true -dGrayImageResolution=144 -dNOPAUSE -dQUIET -dBATCH -sOutputFile=cvttranscript.pdf transcripts.pdf
# Compress a PDF while keeping its colors
gs -sDEVICE=pdfwrite -dCompatibilityLevel=1.4 -dPDFSETTINGS=/ebook -dDownsampleColorImages=true -dColorImageResolution=144 -dDownsampleGrayImages=true -dGrayImageResolution=144 -dNOPAUSE -dQUIET -dBATCH -sOutputFile=converted.pdf original.pdf
@kshitizregmi
kshitizregmi / notebook_to_pdf_converter_python_script.txt
Last active August 25, 2024 15:04
Install and Convert Jupyter Notebooks to PDF
This Gist provides a step-by-step guide to installing the necessary tools for converting Jupyter Notebooks (.ipynb) to PDF format using Python. It covers installing the required packages and tools (pip, nbconvert, notebook-as-pdf, pyppeteer, and playwright), and includes commands to convert notebooks either in bulk or individually to PDF format using the webpdf exporter.
Steps included:
1. Upgrade pip.
2. Install notebook-as-pdf and nbconvert.
3. Install pyppeteer and playwright for PDF rendering.
4. Convert all .ipynb files in a directory or a single notebook file to PDF.
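The bulk conversion in step 4 can be sketched in Python as follows. This is a minimal sketch, assuming the `jupyter` command is on the PATH and the webpdf exporter's dependencies are already installed as described in the steps above. The helper names (`find_notebooks`, `build_webpdf_command`, `convert_all`) are hypothetical and not taken from the original script.

```python
import subprocess
from pathlib import Path


def find_notebooks(directory):
    """Return the .ipynb files directly inside `directory`, sorted by name."""
    return sorted(Path(directory).glob("*.ipynb"))


def build_webpdf_command(notebook):
    """Build the nbconvert command line for one notebook (webpdf exporter)."""
    return ["jupyter", "nbconvert", "--to", "webpdf", str(notebook)]


def convert_all(directory, run=subprocess.run):
    """Convert every notebook in `directory` and return the notebooks processed.

    `run` is a hypothetical hook (defaults to subprocess.run) so the loop can
    be dry-run without invoking nbconvert.
    """
    processed = []
    for notebook in find_notebooks(directory):
        # check=True raises CalledProcessError if nbconvert exits with an error
        run(build_webpdf_command(notebook), check=True)
        processed.append(notebook)
    return processed
```

Calling `convert_all(".")` would convert every notebook in the current directory, while `build_webpdf_command("report.ipynb")` covers the single-file case.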
from google.cloud import bigquery
import google.auth


class BigQuerySchema:
    def __init__(self):
        self.schema = []

    def add_fields(self, fields):
        for name, field_type in fields.items():
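The preview of this class is cut off inside the loop, so its real body is unknown. The following is only a sketch of how such a schema builder might continue, assuming each field is stored as a plain dict with `name`, `type`, and `mode` keys (the layout BigQuery accepts in JSON schema files). The class name `SchemaSketch` and the `mode` parameter are hypothetical, and the original may well build `bigquery.SchemaField` objects instead.

```python
class SchemaSketch:
    """Collects BigQuery-style field definitions as plain dicts (illustrative only)."""

    def __init__(self):
        self.schema = []

    def add_fields(self, fields, mode="NULLABLE"):
        # `fields` maps column name -> BigQuery type name, e.g. {"id": "INTEGER"}
        for name, field_type in fields.items():
            self.schema.append(
                {"name": name, "type": field_type.upper(), "mode": mode}
            )
        return self.schema
```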
import pandas as pd
import numpy as np
import plotly.express as px
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans
# Load data
df = pd.read_pickle('all_user_embd-allmpnet.pickle')
# Feature engineering

How to update the disk size in Compute Engine

gcloud compute instances stop <your compute engine instance name>
gcloud compute disks resize <your compute engine name/disk name> --zone <your compute engine zone> --size 30GB
kshitizregmi / bq_read.py
Last active October 18, 2023 11:39
Read BigQuery data into a DataFrame. Don't forget to install pandas-gbq: ``` pip install pandas-gbq -U ```
import pandas as pd
from google.auth import default
# Get default credentials and project ID
credentials, project_id = default()
def read_bigquery_table(query, use_bq_storage_api=True):
    # Read a table from BigQuery using the BigQuery Storage API
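The function body is cut off in the preview. The following is a minimal sketch of one way it could be completed, assuming the query is handed to `pandas_gbq.read_gbq` with its `use_bqstorage_api` flag. The `project_id`, `credentials`, and `reader` parameters are assumptions added for illustration. The original obtains credentials and project ID at module level via `google.auth.default()`.

```python
def read_bigquery_table(query, use_bq_storage_api=True, project_id=None,
                        credentials=None, reader=None):
    """Run `query` against BigQuery and return the result as a DataFrame.

    `reader` is a hypothetical hook for substituting the reader function;
    it defaults to pandas_gbq.read_gbq.
    """
    if reader is None:
        # Imported lazily so this module loads even where pandas-gbq is absent
        import pandas_gbq
        reader = pandas_gbq.read_gbq
    return reader(
        query,
        project_id=project_id,
        credentials=credentials,
        use_bqstorage_api=use_bq_storage_api,
    )
```

With the values returned by `default()` above, a call would look like `read_bigquery_table(sql, project_id=project_id, credentials=credentials)`, where `sql` is any query string.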

Conda Environment

1. Create the Conda Environment:

Open a terminal or command prompt and run the following command to create a Conda environment with Python 3.11 (you can change the Python version to your preferred one):

conda create --name <your_env_name> python=3.11

Create a New Branch: Use the following command to create a new branch. Replace <branch-name> with the name you want for your new branch.

git checkout --orphan <branch-name>

This creates a new branch with no commit history. The files from the previous branch stay in the working tree and remain staged in the index.

Remove Existing Files: Although the branch has no commits, the files from the previous branch are still present in the working tree. To remove them, use the following commands:

kshitizregmi / migrate.sh
Last active August 3, 2023 10:30
Copy the first 2000 objects from one GCS bucket to another (source to destination).
gsutil ls gs://gen-app-src/ | head -n 2000 | xargs -I '{}' gsutil cp '{}' gs://gen-app-dst/
kshitizregmi / rowise_duplicate_compare.py
Last active March 10, 2023 10:27
Compare row-wise whether the values in the first two columns of a pandas DataFrame are duplicates
import numpy as np
a = df.iloc[:, 0].values
b = df.iloc[:, 1].values
# Find the indices of matching elements
matches = np.where(a == b)
# Compare the indices across the two arrays
for index in matches[0]:
    if a[index] == b[index]:
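The snippet is cut off before the loop body, so what it does with each match is unknown. Note that every index returned by `np.where(a == b)` already satisfies `a[index] == b[index]`, so the inner check always passes. The following dependency-free sketch shows the same row-wise comparison on plain Python sequences. The function name `matching_rows` and the sample data are hypothetical.

```python
def matching_rows(first, second):
    """Return the row positions where both columns hold equal values."""
    if len(first) != len(second):
        raise ValueError("both columns must have the same length")
    return [i for i, (x, y) in enumerate(zip(first, second)) if x == y]


# Made-up sample columns standing in for df.iloc[:, 0] and df.iloc[:, 1]
col_a = ["x", "y", "z", "w"]
col_b = ["x", "q", "z", "v"]
print(matching_rows(col_a, col_b))  # prints [0, 2]
```

With NumPy arrays, `np.flatnonzero(a == b)` yields the same positions as a flat index array.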