
Mr. Steve Charlesworth sdcharle

@veekaybee
veekaybee / normcore-llm.md
Last active July 12, 2024 10:47
Normcore LLM Reads

Anti-hype LLM reading list

Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.

Foundational Concepts


Pre-Transformer Models

@rain-1
rain-1 / LLM.md
Last active July 11, 2024 18:17
LLM Introduction: Learn Language Models

Purpose

Bootstrap knowledge of LLMs ASAP, with a bias toward GPT.

Avoid being a link dump. Try to provide only valuable, well-tuned information.

Prelude

Neural network links before starting with transformers.

Some remarks on Large Language Models

Yoav Goldberg, January 2023

Audience: I assume you have heard of ChatGPT, maybe played with it a little, and were impressed by it (or tried very hard not to be). And that you have also heard that it is "a large language model". And maybe that it "solved natural language understanding". Here is a short personal perspective on this (and similar) models, and on where we stand with respect to language understanding.

Intro

Around 2014-2017, right within the rise of neural-network based methods for NLP, I was giving a semi-academic-semi-popsci lecture, revolving around the story that achieving perfect language modeling is equivalent to being as intelligent as a human. Somewhere around the same time I was also asked in an academic panel "what would you do if you were given infinite compute and no need to worry about labour costs" to which I cockily responded "I would train a really huge language model, just to show that it doesn't solve everything!". We

@daniestevez
daniestevez / sprites.grc
Created April 14, 2019 10:02
GNU Radio pre-processing of Dwingeloo Sprites recording
<?xml version='1.0' encoding='utf-8'?>
<?grc format='1' created='3.7.13'?>
<flow_graph>
<timestamp>Tue Apr 2 21:50:17 2019</timestamp>
<block>
<key>options</key>
<param>
<key>author</key>
<value></value>
</param>
@HarshSingh16
HarshSingh16 / Surviving Titanic.R
Created October 15, 2018 20:19
Building a Predictive Model to predict survivals on the Titanic Data Set
# Load the Titanic training data set (train1 must already exist in the workspace)
TitanicTrain <- train1
# Count missing values in each column of the training data set
sapply(TitanicTrain, function(x) sum(is.na(x)))
# Load the Titanic test data set (test11 must already exist in the workspace)
TitanicTest <- test11
# Count missing values in each column of the test data set
@knbknb
knbknb / csv2xts--and-more.R
Last active January 30, 2019 20:06
personal mini cheat sheet: R time series - xts from csv file, and more xts basics that I tend to forget
# see also:
# https://s3.amazonaws.com/assets.datacamp.com/blog_assets/xts_Cheat_Sheet_R.pdf
# Open csv file using read.zoo
my_tsdata <- read.zoo("my_tsdata.csv", sep = ",", FUN = as.Date, header = TRUE, index.column = 1)
my_tsdata <- as.xts(my_tsdata)
@terabyte
terabyte / amazon.md
Created December 6, 2017 02:27
Amazon's Build System

Prologue

I wrote this answer on stackexchange, here: https://stackoverflow.com/posts/12597919/

It was wrongly deleted for containing "proprietary information" years later. I think that's bullshit so I am posting it here. Come at me.

The Question

Amazon is a SOA system with 100s of services (or so says Amazon Chief Technology Officer Werner Vogels). How do they handle build and release?

@5agado
5agado / Pandas and Seaborn.ipynb
Created February 20, 2017 13:33
Data Manipulation and Visualization with Pandas and Seaborn — A Practical Introduction
@max-mapper
max-mapper / datagovmetadata.json
Created February 14, 2017 21:54
EOP-GOV Metadata
{"help": "https://catalog.data.gov/api/3/action/help_show?name=package_search", "success": true, "result": {"count": 48, "sort": "views_recent desc", "facets": {}, "results": [{"license_title": "License not specified", "maintainer": "New Media", "relationships_as_object": [], "private": false, "maintainer_email": "newmedia@whitehouse.gov", "num_tags": 5, "id": "59694770-b6b6-4ae0-a4b9-4ae69c0be2f6", "metadata_created": "2016-07-02T10:06:26.199575", "metadata_modified": "2016-07-02T10:06:26.199575", "author": null, "author_email": null, "state": "active", "version": null, "creator_user_id": "47303a9e-1187-4290-85a3-1fc02dc49e4a", "type": "dataset", "resources": [{"cache_last_updated": null, "package_id": "59694770-b6b6-4ae0-a4b9-4ae69c0be2f6", "webstore_last_updated": null, "id": "3a8a0ad1-19e7-4153-bb2f-d70cf88aaaf8", "size": null, "state": "active", "hash": "", "description": "", "format": "CSV", "tracking_summary": {"total": 32, "recent": 1}, "last_modified": null, "url_type": null, "no_real_name": "True",
@tgh0831
tgh0831 / Simple RODBC Example.r
Last active September 2, 2020 19:58
This is a simple example using the RODBC package to return a query from a Microsoft SQL server to a data frame.
##### This is a simple RODBC example
##### ODBCDriverName is the data source name (DSN) as configured in ODBC Administrator
require(RODBC)
#open the ODBC connection
ch <- odbcConnect("ODBCDriverName")
##### Alternative ODBC connection for Microsoft SQL Server
ch <- odbcDriverConnect(