@kemingy
kemingy / benchmark.md
Last active January 12, 2023 04:44
Tensorflow Serving, TensorRT Inference Server (Triton), Multi Model Server (MXNet)

Environments

  • CPU: Intel(R) Xeon(R) Gold 6130 CPU @ 2.10GHz
  • GPU: NVIDIA V100
  • Memory: 251GiB
  • OS: Ubuntu 16.04.6 LTS (Xenial Xerus)

Docker Images:

  • tensorflow/tensorflow:latest-gpu
  • tensorflow/serving:latest-gpu
@nchaigne
nchaigne / build-gcc-9.2.0-on-centos7.md
Last active June 23, 2024 08:47
Building GCC 9.2.0 on CentOS 7

Introduction

The CentOS 7 distribution (as well as RHEL 7) ships with a somewhat outdated version of the GCC compiler (4.8.5 on CentOS 7.5), which may not suit your compilation requirements. For example, C11 - which supersedes C99 - is fully supported only starting from GCC 4.9.

Additionally, recent versions of GCC (GCC6, GCC7, GCC8, GCC9) come with improvements which help detect issues at build time and offer suggestions on how to fix them. Sometimes, these are even actually helpful!

This note describes how to build the latest GCC (9.2.0 as of October 2019) from source on CentOS 7. The same steps should apply as is on RHEL 7. For other Linux distributions, adapt as needed.

@Mahedi-61
Mahedi-61 / CUDA_Toolkit_10.0_installation_on_CentOS_7.sh
Last active April 10, 2024 03:34
Step by step instructions for installing CUDA Toolkit 10.0 on a CentOS 7 server machine for running deep learning projects
#!/bin/bash
## This gist contains step by step instructions to install cuda v10.1 and cudnn 7.6 in CentOS 7
### steps ####
# verify the system has a cuda-capable gpu
# download and install the nvidia cuda toolkit and cudnn
# setup environmental variables
# verify the installation
###
### to verify that your GPU is CUDA-enabled, check
@gorenje
gorenje / README.md
Last active May 25, 2021 13:38
Retrieving Allocated Resources from a Kubernetes Node

Used to retrieve the allocatable resources of a Kubernetes cluster.

Assumes that this is being executed within the K8s cluster.

Tested with Python 2.7. Requires two pip packages:

pip install pint
pip install kubernetes
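
As a rough illustration of the unit handling involved: the Kubernetes API reports allocatable values as quantity strings such as `3800m` for CPU or `16Gi` for memory, and these have to be converted to plain numbers before they can be summed. The sketch below does that conversion with the standard library only, not with pint as the gist does. The function name `parse_quantity` and the sample node values are hypothetical.

```python
from decimal import Decimal

# Binary (power-of-two) and decimal suffixes used by Kubernetes quantity strings.
_SUFFIXES = {
    "Ki": Decimal(2) ** 10, "Mi": Decimal(2) ** 20, "Gi": Decimal(2) ** 30,
    "Ti": Decimal(2) ** 40, "Pi": Decimal(2) ** 50, "Ei": Decimal(2) ** 60,
    "m": Decimal("0.001"),
    "k": Decimal(10) ** 3, "M": Decimal(10) ** 6, "G": Decimal(10) ** 9,
    "T": Decimal(10) ** 12, "P": Decimal(10) ** 15, "E": Decimal(10) ** 18,
}


def parse_quantity(text):
    """Convert a quantity string such as '500m' or '16Gi' to a Decimal."""
    text = text.strip()
    # Check two-character suffixes first so 'Mi' is matched before 'M' or 'm'.
    for length in (2, 1):
        suffix = text[-length:]
        if suffix in _SUFFIXES and len(text) > length:
            return Decimal(text[:-length]) * _SUFFIXES[suffix]
    # No suffix: a plain number such as '4' or '1e3'.
    return Decimal(text)


# Hypothetical allocatable values, shaped like node.status.allocatable entries.
nodes = [
    {"cpu": "3800m", "memory": "16Gi"},
    {"cpu": "4", "memory": "8388608Ki"},
]

total_cpu = sum(parse_quantity(n["cpu"]) for n in nodes)
total_memory = sum(parse_quantity(n["memory"]) for n in nodes)
print(total_cpu, "cores;", total_memory / 2 ** 30, "GiB")
```

Using `Decimal` instead of `float` keeps millicore values such as `3800m` exact when they are added up across nodes.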
@kauffmanes
kauffmanes / install_anaconda.md
Last active July 18, 2024 21:15
Install Anaconda on Windows Subsystem for Linux (WSL)

Thanks everyone for commenting/contributing! I made this in college for a class and I no longer really use the technology. I encourage you all to help each other, but I probably won't be answering questions anymore.

This article is also on my blog: https://emilykauffman.com/blog/install-anaconda-on-wsl

Note: $ denotes the start of a command. Don't actually type this.

Steps to Install Anaconda on Windows Ubuntu Terminal

  1. Install WSL (Ubuntu for Windows - can be found in Windows Store). I recommend the latest version (I'm using 18.04) because some bugs were worked out during 14/16 (microsoft/WSL#785).
  2. Go to https://repo.continuum.io/archive to find the list of Anaconda releases
  3. Select the release you want. I have a 64-bit computer, so I chose the latest release ending in x86_64.sh. If I had a 32-bit computer, I'd select the x86.sh version. If you accidentally try to install the wrong one, you'll get a warning in the terminal. I chose `Anaconda3-5.2.0-Li
import boto3


def pull_s3_prefix(dst_dir, bucket, prefix):
    client = boto3.client('s3')
    resource = boto3.resource('s3')
    download_dir(client, resource, prefix, prefix, dst_dir, bucket)


def download_dir(client, resource, prefix, start_prefix, local, bucket):
    paginator = client.get_paginator('list_objects')
    for result in paginator.paginate(Bucket=bucket, Delimiter='/', Prefix=prefix):
@eshelman
eshelman / latency.txt
Last active May 7, 2024 17:49 — forked from jboner/latency.txt
HPC-oriented Latency Numbers Every Programmer Should Know
Latency Comparison Numbers
--------------------------
L1 cache reference/hit                       1.5 ns        4 cycles
Floating-point add/mult/FMA operation        1.5 ns        4 cycles
L2 cache reference/hit                         5 ns  12 ~ 17 cycles
Branch mispredict                              6 ns  15 ~ 20 cycles
L3 cache hit (unshared cache line)            16 ns       42 cycles
L3 cache hit (shared line in another core)    25 ns       65 cycles
Mutex lock/unlock                             25 ns
L3 cache hit (modified in another core)       29 ns       75 cycles
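
The nanosecond and cycle columns are linked by the clock frequency: cycles = nanoseconds × GHz. As a quick sanity check, the sketch below derives the clock rate implied by the rows above that list a single cycle count. The names `rows` and `implied_ghz` are hypothetical, and the clock rate is only inferred from the table, not stated by it.

```python
# (label, latency in ns, cycles) for the rows above with a single cycle count.
rows = [
    ("L1 cache reference/hit", 1.5, 4),
    ("L3 cache hit (unshared cache line)", 16, 42),
    ("L3 cache hit (shared line in another core)", 25, 65),
    ("L3 cache hit (modified in another core)", 29, 75),
]


def implied_ghz(latency_ns, cycles):
    """Cycles per nanosecond is numerically the clock frequency in GHz."""
    return cycles / latency_ns


for label, latency_ns, cycles in rows:
    print(f"{label:45s} {implied_ghz(latency_ns, cycles):.2f} GHz")
```

All four rows come out between about 2.59 and 2.67 GHz, which suggests the cycle counts assume a core running at roughly that rate.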
@wangruohui
wangruohui / Install NVIDIA Driver and CUDA.md
Last active June 29, 2024 09:06
Install NVIDIA Driver and CUDA on Ubuntu / CentOS / Fedora Linux OS
@vasanthk
vasanthk / System Design.md
Last active July 23, 2024 11:24
System Design Cheatsheet

Picking the right architecture = Picking the right battles + Managing trade-offs

Basic Steps

  1. Clarify and agree on the scope of the system
  • Use cases (description of sequences of events that, taken together, lead to a system doing something useful)
    • Who is going to use it?
    • How are they going to use it?
@karpathy
karpathy / min-char-rnn.py
Last active July 22, 2024 04:44
Minimal character-level language model with a Vanilla Recurrent Neural Network, in Python/numpy
"""
Minimal character-level Vanilla RNN model. Written by Andrej Karpathy (@karpathy)
BSD License
"""
import numpy as np
# data I/O
data = open('input.txt', 'r').read() # should be simple plain text file
chars = list(set(data))
data_size, vocab_size = len(data), len(chars)
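
To make the data I/O lines above concrete, here is a small sketch of what they compute, run on an in-memory string instead of `input.txt`. The sample text and the `char_to_index` / `index_to_char` lookup tables are hypothetical additions for illustration. `sorted()` is swapped in for `list()` so the character order is reproducible between runs.

```python
# Stand-in for the contents of input.txt (hypothetical sample text).
data = "hello world"

# sorted() fixes the ordering; plain list(set(data)) may differ between runs.
chars = sorted(set(data))
data_size, vocab_size = len(data), len(chars)

# Hypothetical lookup tables mapping characters to integer ids and back.
char_to_index = {ch: i for i, ch in enumerate(chars)}
index_to_char = {i: ch for i, ch in enumerate(chars)}

# Round trip: encode the text as integer ids, then decode it again.
encoded = [char_to_index[ch] for ch in data]
decoded = "".join(index_to_char[i] for i in encoded)
print(data_size, vocab_size, decoded == data)
```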