John Resig jeresig

## gist:6206247

      
Asparagirl / gist:6206247 (1 file, 0 forks, 4 comments, 26 stars; last active February 14, 2024 19:56)

Have a WARC that you would like to upload to the Internet Archive so that it can eventually be included in their Wayback Machine? Here's how to upload it from the command line.

    Do you have a WARC file of a website all downloaded and ready to be added to the Internet Archive?  Great!  You can do that with the Internet Archive's web-based uploader, but it's not ideal and it can't handle really big uploads.  Here's how you can upload your WARC files to the IA from the command line, and without worrying about a size restriction.
First, you need to get your Access Key and Secret Key from the Internet Archive for the S3-like API.  Here's where you can get that for your IA account: http://archive.org/account/s3.php  Don't share those with other people!
Here's their documentation file about how to use it, if you need some extra help: http://archive.org/help/abouts3.txt
Next, you should copy the following lines to a text file and edit them as needed:
export IA_S3_ACCESS_KEY="YOUR-ACCESS-KEY-FROM-THE-IA-GOES-HERE"
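
As a rough illustration of how those keys are used, the sketch below builds the URL and headers for a PUT upload to the S3-like endpoint, following the conventions described in the abouts3.txt documentation linked above. The item identifier, the file name, the secret-key argument, and the `web` mediatype are assumptions made for this example, not values taken from the gist. No request is sent.

```python
from urllib.parse import quote


def build_ia_upload_request(identifier, filename, access_key, secret_key):
    """Return (url, headers) for a PUT upload to the IA S3-like endpoint.

    Nothing is sent here; pass the result to an HTTP client of your choice.
    """
    # Items are addressed as /<identifier>/<filename> on the S3-like host.
    url = "http://s3.us.archive.org/{}/{}".format(quote(identifier), quote(filename))
    headers = {
        # IA's S3-like API authenticates with "LOW <access>:<secret>".
        "authorization": "LOW {}:{}".format(access_key, secret_key),
        # Ask IA to create the item if it does not exist yet.
        "x-amz-auto-make-bucket": "1",
        # Assumption: "web" is the mediatype wanted for WARC items.
        "x-archive-meta-mediatype": "web",
    }
    return url, headers
```

For example, `build_ia_upload_request("my-example-item", "example.warc.gz", access_key, secret_key)` returns the target URL plus the headers to attach to the upload, with both keys read from your environment rather than hard-coded.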

  
## libmemcached.rb
require 'formula'

class Libmemcached < Formula
  homepage 'http://libmemcached.org'
  url 'https://launchpad.net/libmemcached/1.0/1.0.17/+download/libmemcached-1.0.17.tar.gz'
  sha1 '1023bc8c738b1f5b8ea2cd16d709ec6b47c3efa8'

  depends_on 'memcached'

  def install

## jquery.js
/*!
 * jQuery JavaScript Library v2.1.1pre
 * http://jquery.com/
 *
 * Includes Sizzle.js
 * http://sizzlejs.com/
 *
 * Copyright 2005, 2014 jQuery Foundation, Inc. and other contributors
 * Released under the MIT license
 * http://jquery.org/license

## docker-mongo-virtualbox.md

      
sevastos / docker-mongo-virtualbox.md (1 file, 2 forks, 3 comments, 12 stars; last active December 3, 2022 10:11)

Boot2Docker (VirtualBox) MongoDB volume filesystem issue

    Journey

I was using Boot2Docker 1.2 (OSX) and wanted to use a volume for MongoDB.
At first nothing happened, because 1.2 has no Guest Additions and volumes don't work without them.
There is a workaround: build a boot2docker.iso from master, which has Guest Additions.
But then Mongo didn't like putting its data on VirtualBox's shared folders:
[initandlisten] 	WARNING: This file system is not supported. For further information see:
[initandlisten] http://dochub.mongodb.org/core/unsupported-filesystems


## gist:c2f710724232f76187b3

      
Asparagirl / gist:c2f710724232f76187b3 (1 file, 0 forks, 0 comments, 2 stars; last active November 25, 2018 21:24)

    Grab a website with wpull and PhantomJS

export USER_AGENT="Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/533.20.25 (KHTML, like Gecko) Version/5.0.4 Safari/533.20.27"
export DOMAIN_NAME_TO_SAVE="http://www.example.com/"
export DOMAINS_TO_INCLUDE="example.com,images.example.com,relatedwebsite.com"
# this one can be regex, or you can leave it out, whatever
export THINGS_TO_IGNORE="ignore-this,other-thing-to-ignore"
export WARC_NAME="Example.com_-_2014-10-15"
# these two are needed in case wpull quits or chokes and we need to restart where we left off

  
## Makefile
# Hello, and welcome to makefile basics.
#
# You will learn why `make` is so great, and why, despite its "weird" syntax,
# it is actually a highly expressive, efficient, and powerful way to build
# programs.
#
# Once you're done here, go to
# http://www.gnu.org/software/make/manual/make.html
# to learn SOOOO much more.

## README.md

      
dannguyen / README.md (2 files, 69 forks, 9 comments, 406 stars; last active December 28, 2023 15:21)

Using Python 3.x and Google Cloud Vision API to OCR scanned documents to extract structured data

    Using Python 3 + Google Cloud Vision API's OCR to extract text from photos and scanned documents

Just a quickie test in Python 3 (using Requests) to see if Google Cloud Vision can be used to effectively OCR a scanned data table and preserve its structure, in the way that products such as ABBYY FineReader can OCR an image and provide Excel-ready output.
The short answer: No. While Cloud Vision provides bounding polygon coordinates in its output, it doesn't provide them at the word or region level, which would be needed to then calculate the data delimiters.
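
For context, here is a minimal sketch of the request body such a test would send, assuming the Cloud Vision REST endpoint `https://vision.googleapis.com/v1/images:annotate` and its `TEXT_DETECTION` feature. The function name is hypothetical and this is not the gist's own code. It only builds the JSON payload, which could then be posted with Requests along with an API key.

```python
import base64


def build_vision_ocr_request(image_bytes):
    """Build the JSON body for a Cloud Vision text-detection request.

    Nothing is sent here; the caller posts the returned dict as JSON.
    """
    return {
        "requests": [
            {
                # The image is sent inline as base64-encoded bytes.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                # TEXT_DETECTION asks the API to run OCR on the image.
                "features": [{"type": "TEXT_DETECTION"}],
            }
        ]
    }
```

The bounding polygons mentioned above come back in the response alongside the detected text.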
On the other hand, the OCR quality is pretty good if you just need to identify text anywhere in an image, without regard to its physical coordinates. I've included two examples:
####### 1. A low-resolution photo of road signs

  
## clear-db.ts
// Credits to Louistiti from Drizzle Discord: https://discord.com/channels/1043890932593987624/1130802621750448160/1143083373535973406

import { sql } from "drizzle-orm";

const clearDb = async (): Promise<void> => {
  const query = sql<string>`SELECT table_name
      FROM information_schema.tables
      WHERE table_schema = 'public'
        AND table_type = 'BASE TABLE';
    `;

## node-typescript-esm.md

      
khalidx / node-typescript-esm.md (1 file, 47 forks, 30 comments, 541 stars; last active April 22, 2024 15:40)

A Node + TypeScript + ts-node + ESM experience that works.

The experience of using Node.js with TypeScript, ts-node, and ESM is horrible.
There are countless guides on how to integrate them, but none of them seem to work.
Here's what worked for me.
Just add the following files and run `npm run dev`. You'll be good to go!
package.json