Paco Nathan ceteri
@veekaybee
veekaybee / normcore-llm.md
Last active April 19, 2024 02:49
Normcore LLM Reads

Anti-hype LLM reading list

Goals: Add links that give reasonable, clear explanations of how things work. No hype and, where possible, no vendor content. Practical first-hand accounts of models in production are eagerly sought.

Foundational Concepts

Pre-Transformer Models

@fperez
fperez / ProgrammaticNotebook.ipynb
Last active April 5, 2024 12:00
Creating an IPython Notebook programmatically
@vrilleup
vrilleup / spark-svd.scala
Last active August 9, 2023 17:32
Spark/mllib SVD example
import org.apache.spark.mllib.linalg.distributed.RowMatrix
import org.apache.spark.mllib.linalg._
import org.apache.spark.{SparkConf, SparkContext}
// To use the latest sparse SVD implementation, please build your spark-assembly after this
// change: https://github.com/apache/spark/pull/1378
// Input tsv with 3 fields: rowIndex(Long), columnIndex(Long), weight(Double), indices start with 0
// Assume the number of rows is larger than the number of columns, and the number of columns is
// smaller than Int.MaxValue
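The comments above describe the expected input: tab-separated lines of rowIndex, columnIndex, weight, with zero-based indices. As a rough illustration of that layout only (in Python, not the gist's Scala, and not part of the gist), the sketch below parses a few such lines into typed triples. The helper name `parse_entry`, the sample data, and the idea of inferring the matrix shape from the largest index are all assumptions made for this example.

```python
def parse_entry(line):
    """Parse one 'rowIndex<TAB>columnIndex<TAB>weight' line into a typed triple."""
    row, col, weight = line.rstrip("\n").split("\t")
    row, col = int(row), int(col)
    if row < 0 or col < 0:
        raise ValueError("indices are zero-based and must be non-negative")
    return row, col, float(weight)


# Hypothetical sample input; a real file would be read line by line.
sample = ["0\t0\t1.5", "0\t2\t-2.0", "3\t1\t4.0"]
entries = [parse_entry(line) for line in sample]

# Assumption: shape is inferred from the largest index seen (zero-based).
n_rows = 1 + max(r for r, _, _ in entries)
n_cols = 1 + max(c for _, c, _ in entries)

print(entries)
print(n_rows, n_cols)
```

With this sample the inferred shape has more rows than columns, which matches the assumption stated in the comments.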
@zeeshanlakhani
zeeshanlakhani / monoid.py
Created October 13, 2011 15:49
Python Monoid
# Code from http://fmota.eu/, great!
class Monoid:
    def __init__(self, null, lift, op):
        self.null = null
        self.lift = lift
        self.op = op
    def fold(self, xs):
        if hasattr(xs, "__fold__"):
            return xs.__fold__(self)
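The preview above is cut off partway through `fold`, so the rest of the original is not shown here. The following is a minimal, self-contained sketch of how a class with this shape might be used. The class name `SketchMonoid` is hypothetical, and the fallback branch (lift every item, then reduce with `op` starting from `null`) is an assumption about what a fold over a plain iterable would do, not the gist's confirmed code.

```python
from functools import reduce


class SketchMonoid:
    """Hypothetical stand-in for the Monoid class previewed above."""

    def __init__(self, null, lift, op):
        self.null = null  # identity element
        self.lift = lift  # maps a raw item into the monoid's carrier type
        self.op = op      # associative binary operation

    def fold(self, xs):
        # Same dispatch as the preview: a container may supply its own fold.
        if hasattr(xs, "__fold__"):
            return xs.__fold__(self)
        # Assumed fallback (not shown in the preview): lift each item, then reduce.
        return reduce(self.op, (self.lift(x) for x in xs), self.null)


# Two example instances: integer addition and list concatenation.
total = SketchMonoid(0, lambda x: x, lambda a, b: a + b)
as_list = SketchMonoid([], lambda x: [x], lambda a, b: a + b)

print(total.fold([1, 2, 3]))  # 6
print(as_list.fold("abc"))    # ['a', 'b', 'c']
print(total.fold([]))         # 0
```

Folding an empty input returns the identity element, which is why `null` is passed to the constructor alongside the operation.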
@jakevdp
jakevdp / Jupyter_vs_Mathematica.ipynb
Created April 8, 2018 05:01
Jupyter vs Mathematica Google Trends
@porterjamesj
porterjamesj / hello_mesos.py
Last active March 6, 2018 20:43
the tiniest mesos scheduler
import logging
import uuid
import time
from mesos.interface import Scheduler
from mesos.native import MesosSchedulerDriver
from mesos.interface import mesos_pb2
logging.basicConfig(level=logging.INFO)
@ccsevers
ccsevers / AvroReadExample.java
Created October 29, 2012 18:27
cascading.avro wordcount example
package cascading.avro.examples;
import java.util.Properties;
import cascading.flow.Flow;
import cascading.flow.FlowDef;
import cascading.flow.hadoop.HadoopFlowConnector;
import cascading.operation.aggregator.Count;
import cascading.operation.regex.RegexFilter;
import cascading.operation.regex.RegexSplitGenerator;
@drewlanenga
drewlanenga / lm.pmml.xml
Created January 7, 2014 23:48
Exploring support for [transformations in PMML](http://www.dmg.org/v4-1/Transformations.html) with Pattern. (Environment notes: Running Vagrant with Cascading SDK 2.2 -- https://github.com/Cascading/vagrant-cascading-hadoop-cluster)
<?xml version="1.0"?>
<PMML version="4.1" xmlns="http://www.dmg.org/PMML-4_1" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.dmg.org/PMML-4_1 http://www.dmg.org/v4-1/pmml-4-1.xsd">
<Header copyright="Copyright (c) 2014 lanenga" description="Linear Regression Model">
<Extension name="user" value="lanenga" extender="Rattle/PMML"/>
<Application name="Rattle/PMML" version="1.4"/>
<Timestamp>2014-01-07 15:33:34</Timestamp>
</Header>
<DataDictionary numberOfFields="4">
<DataField name="sepal_width" optype="continuous" dataType="double"/>
<DataField name="sepal_length" optype="continuous" dataType="double"/>
@ceteri
ceteri / cascalog_build.log
Last active December 14, 2015 22:29
Cascalog testing with Cascading 2.2-wip
bash-3.2$ lein do sub install, deps, compile, repl
Could not find artifact lein-newnew:lein-newnew:pom:0.3.5 in central (http://repo1.maven.org/maven2)
Retrieving lein-newnew/lein-newnew/0.3.5/lein-newnew-0.3.5.pom (3k)
from https://clojars.org/repo/
Could not find artifact stencil:stencil:pom:0.3.0 in central (http://repo1.maven.org/maven2)
Retrieving stencil/stencil/0.3.0/stencil-0.3.0.pom (3k)
from https://clojars.org/repo/
Retrieving org/clojure/clojure/1.3.0/clojure-1.3.0.pom (5k)
from http://repo1.maven.org/maven2/
Retrieving org/sonatype/oss/oss-parent/5/oss-parent-5.pom (4k)
@ceteri
ceteri / Cascalog.log
Last active December 11, 2015 18:28
City of Palo Alto Open Data app in Cascalog
bash-3.2$ lein version
Leiningen 2.0.0-preview10 on Java 1.6.0_43 Java HotSpot(TM) 64-Bit Server VM
bash-3.2$ hadoop version
Warning: $HADOOP_HOME is deprecated.
Hadoop 1.0.3
Subversion https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1335192
Compiled by hortonfo on Tue May 8 20:31:25 UTC 2012
From source with checksum e6b0c1e23dcf76907c5fecb4b832f3be
bash-3.2$ lein clean