Aaron Glover aglove2189

## normcore-llm.md

      
veekaybee / normcore-llm.md (1 file, 218 forks, 38 comments, 2781 stars)

Last active July 25, 2024 19:14

Normcore LLM Reads

Anti-hype LLM reading list

Goals: Add links that are reasonable and good explanations of how stuff works. No hype and no vendor content if possible. Practical first-hand accounts of models in prod eagerly sought.
Foundational Concepts


Pre-Transformer Models


## .gitlab-ci.yml
image: docker:latest

# When using dind, it's wise to use the overlayfs driver for
# improved performance.
variables:
  DOCKER_DRIVER: overlay
  GCP_PROJECT_ID: CHANGE-TO-GCP-PROJECT-ID
  IMAGE_NAME: image_id

services:

## residual_network.py
"""
Clean and simple Keras implementation of network architectures described in:
    - (ResNet-50) [Deep Residual Learning for Image Recognition](https://arxiv.org/pdf/1512.03385.pdf).
    - (ResNeXt-50 32x4d) [Aggregated Residual Transformations for Deep Neural Networks](https://arxiv.org/pdf/1611.05431.pdf).

Python 3.
"""

from keras import layers
from keras import models

## lmdb.tcl
# LVDB - LLOOGG Memory DB
# Copyright (C) 2009 Salvatore Sanfilippo <antirez@gmail.com>
# All Rights Reserved

# TODO
# - cron with cleanup of timedout clients, automatic dump
# - the dump should use array startsearch to write it line by line
#   and may just use gets to read element by element and load the whole state.
# - 'help','stopserver','saveandstopserver','save','load','reset','keys' commands.
# - ttl with milliseconds resolution 'ttl a 1000'. Check ttl in dump!

## Pandas and Seaborn.ipynb

      
5agado / Pandas and Seaborn.ipynb (1 file, 20 forks, 1 comment, 106 stars)

Created February 20, 2017 13:33
            
              
                Data Manipulation and Visualization with Pandas and Seaborn — A Practical Introduction
              
          
    
## README.md

      
pikesley / README.md (3 files, 1 fork, 0 comments, 4 stars)

Created October 22, 2016 15:55

Pivertise your Pi

How to Pivertise your Pi

You know how sometimes you connect your Raspberry Pi to a wifi network and then you have absolutely no way of finding out its IP address (Zeroes are particularly bad for this)? Well, I fixed it. Paste the run-once-at-boot-time systemd script into `/etc/systemd/system/pivertiser.service` and enable it with `sudo systemctl enable pivertiser.service`. Paste the terrible bash hack into `/home/pi/pivertiser.sh` and run `chmod +x /home/pi/pivertiser.sh`. Then reboot your Pi and it should show its face over here.

Those addresses are stored in a hash keyed on hostname, so clearly if you never rename your Pi from `raspberrypi` then this is an astonishingly terrible solution, but it's just solved a little problem for me. And you can of course run your own instance on Heroku if you want to.
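To see why the default hostname is a problem, here is a minimal sketch of a hostname-keyed store. It is not the actual service code: the `registry` dict, the `advertise` helper, and the IP addresses are all hypothetical stand-ins for illustration.

```python
# Hypothetical in-memory stand-in for the service's hostname-keyed hash.
registry: dict[str, str] = {}


def advertise(hostname: str, ip_address: str) -> None:
    """Record the latest IP address reported under a hostname."""
    registry[hostname] = ip_address


advertise("raspberrypi", "192.168.1.20")  # first Pi, default hostname
advertise("raspberrypi", "192.168.1.31")  # second Pi, same default hostname
advertise("kitchen-pi", "192.168.1.42")   # a renamed Pi keeps its own entry

print(registry)
# {'raspberrypi': '192.168.1.31', 'kitchen-pi': '192.168.1.42'}
```

The second `raspberrypi` report silently overwrites the first, so only Pis with unique hostnames can be told apart.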

  
## showbest.js
// Paste in console
var getThings = function(){
    var divs = [...document.querySelectorAll("#manufacture__container > div")]
    var things = divs.map(function(e){
        var result = {e};
        var spans = [...e.getElementsByTagName("span")]
        spans.map(function(s){
            var str = s.innerText.replace(/[^/.0-9]/g, '');
            var a = str.split("/");
            var num = +a[0];

## conway.py
# Run `bin/bokeh serve` and in a new terminal run `python conway.py`.
# Based on https://github.com/thearn/game-of-life.

from bokeh.plotting import figure, curdoc
from bokeh.client import push_session

from numpy.fft import fft2, ifft2, fftshift
import numpy as np

def fft_convolve2d(x,y):

## profile.py
'''
Command line tool that takes a csv as input and exports
a statistical summary of the data points in html format.
'''

import pandas as pd
import pandas_profiling
import argparse
import os

## README.md

      
dannguyen / README.md (2 files, 69 forks, 9 comments, 406 stars)

Last active July 6, 2024 16:36

Using Python 3.x and Google Cloud Vision API to OCR scanned documents to extract structured data

Using Python 3 + Google Cloud Vision API's OCR to extract text from photos and scanned documents

Just a quickie test in Python 3 (using Requests) to see if Google Cloud Vision can be used to effectively OCR a scanned data table and preserve its structure, in the way that products such as ABBYY FineReader can OCR an image and provide Excel-ready output.
The short answer: No. While Cloud Vision provides bounding polygon coordinates in its output, it doesn't provide it at the word or region level, which would be needed to then calculate the data delimiters.
On the other hand, the OCR quality is pretty good if you just need to identify text anywhere in an image, without regard to its physical coordinates. I've included two examples:
###### 1. A low-resolution photo of road signs
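For a photo like this, a request to the API could be sketched roughly as follows. This is not the script from the gist: the helper names `build_ocr_request` and `ocr_image` are hypothetical, and the sketch assumes API-key authentication against the public `images:annotate` REST endpoint with a single `TEXT_DETECTION` feature.

```python
import base64

# Public REST endpoint for Cloud Vision image annotation.
VISION_URL = "https://vision.googleapis.com/v1/images:annotate"


def build_ocr_request(image_bytes: bytes) -> dict:
    """Wrap raw image bytes in a TEXT_DETECTION annotate request body."""
    return {
        "requests": [
            {
                # The API expects the image as a base64-encoded string.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }


def ocr_image(path: str, api_key: str) -> dict:
    """POST one image file to the API and return the parsed JSON response."""
    import requests  # third-party; imported here so the builder works without it

    with open(path, "rb") as f:
        payload = build_ocr_request(f.read())
    response = requests.post(
        VISION_URL, params={"key": api_key}, json=payload, timeout=30
    )
    response.raise_for_status()
    return response.json()
```

Calling `ocr_image("road_signs.jpg", "YOUR_API_KEY")` (both arguments are placeholders) should return a dict whose `responses` list holds the detected text annotations for the image.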