Skip to content

Instantly share code, notes, and snippets.

View jeresig's full-sized avatar

John Resig jeresig

View GitHub Profile
@Asparagirl
Asparagirl / gist:6206247
Last active February 14, 2024 19:56
Have a WARC that you would like to upload to the Internet Archive so that it can eventually be included in their Wayback Machine? Here's how to upload it from the command line.

Do you have a WARC file of a website all downloaded and ready to be added to the Internet Archive? Great! You can do that with the Internet Archive's web-based uploader, but it's not ideal and it can't handle really big uploads. Here's how you can upload your WARC files to the IA from the command line, and without worrying about a size restriction.

First, you need to get your Access Key and Secret Key from the Internet Archive for the S3-like API. Here's where you can get that for your IA account: http://archive.org/account/s3.php Don't share those with other people!

Here's their documentation file about how to use it, if you need some extra help: http://archive.org/help/abouts3.txt

Next, you should copy the following files to a text file and edit them as needed:

export IA_S3_ACCESS_KEY="YOUR-ACCESS-KEY-FROM-THE-IA-GOES-HERE"
@potiuk
potiuk / libmemcached.rb
Last active December 29, 2020 10:07
Install libmemcached with brew in preview (build 13A558) version of OSX Mavericks with XCode 5 Developer Preview 6
require 'formula'
class Libmemcached < Formula
homepage 'http://libmemcached.org'
url 'https://launchpad.net/libmemcached/1.0/1.0.17/+download/libmemcached-1.0.17.tar.gz'
sha1 '1023bc8c738b1f5b8ea2cd16d709ec6b47c3efa8'
depends_on 'memcached'
def install
/*!
* jQuery JavaScript Library v2.1.1pre
* http://jquery.com/
*
* Includes Sizzle.js
* http://sizzlejs.com/
*
* Copyright 2005, 2014 jQuery Foundation, Inc. and other contributors
* Released under the MIT license
* http://jquery.org/license
@sevastos
sevastos / docker-mongo-virtualbox.md
Last active December 3, 2022 10:11
Boot2Docker (VirtualBox) MongoDB volume filesystem issue

Journey

I was using Boot2Docker 1.2 (OSX) and wanted to use volume for MongoDB. First nothing was happening because 1.2 has no Guest Additions and volumes don't work. There is a workaround by making a boot2docker.iso from master which has Guest Additions.

But then Mongo didn't like putting data on VirtualBox's shared folders:

[initandlisten] 	WARNING: This file system is not supported. For further information see:
[initandlisten] http://dochub.mongodb.org/core/unsupported-filesystems
@Asparagirl
Asparagirl / gist:c2f710724232f76187b3
Last active November 25, 2018 21:24
Grab a website with wpull and PhantomJS

Grab a website with wpull and PhantomJS

export USER_AGENT="Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27"
export DOMAIN_NAME_TO_SAVE="http://www.example.com/"
export DOMAINS_TO_INCLUDE="example.com,images.example.com,relatedwebsite.com"
# this one can be regex, or you can leave it out, whatever
export THINGS_TO_IGNORE="ignore-this,other-thing-to-ignore"
export WARC_NAME="Example.com_-_2014-10-15"
# these two are needed in case wpull quits or chokes and we need to restart where we left off
# Hello, and welcome to makefile basics.
#
# You will learn why `make` is so great, and why, despite its "weird" syntax,
# it is actually a highly expressive, efficient, and powerful way to build
# programs.
#
# Once you're done here, go to
# http://www.gnu.org/software/make/manual/make.html
# to learn SOOOO much more.
@dannguyen
dannguyen / README.md
Last active December 28, 2023 15:21
Using Python 3.x and Google Cloud Vision API to OCR scanned documents to extract structured data

Using Python 3 + Google Cloud Vision API's OCR to extract text from photos and scanned documents

Just a quickie test in Python 3 (using Requests) to see if Google Cloud Vision can be used to effectively OCR a scanned data table and preserve its structure, in the way that products such as ABBYY FineReader can OCR an image and provide Excel-ready output.

The short answer: No. While Cloud Vision provides bounding polygon coordinates in its output, it doesn't provide it at the word or region level, which would be needed to then calculate the data delimiters.

On the other hand, the OCR quality is pretty good, if you just need to identify text anywhere in an image, without regards to its physical coordinates. I've included two examples:

####### 1. A low-resolution photo of road signs

@rphlmr
rphlmr / clear-db.ts
Last active April 25, 2024 09:47
Drizzle snippets
// Credits to Louistiti from Drizzle Discord: https://discord.com/channels/1043890932593987624/1130802621750448160/1143083373535973406
import { sql } from "drizzle-orm";
const clearDb = async (): Promise<void> => {
const query = sql<string>`SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public'
AND table_type = 'BASE TABLE';
`;
@khalidx
khalidx / node-typescript-esm.md
Last active April 22, 2024 15:40
A Node + TypeScript + ts-node + ESM experience that works.

The experience of using Node.JS with TypeScript, ts-node, and ESM is horrible.

There are countless guides of how to integrate them, but none of them seem to work.

Here's what worked for me.

Just add the following files and run npm run dev. You'll be good to go!

package.json