@hubgit
hubgit / deno-web-streams.ts
Last active December 10, 2023 22:31
Reader and Writer web streams for Deno
import { TextLineStream } from 'https://deno.land/std@0.153.0/streams/mod.ts'
// const input = await jsonLinesReader('input.jsonl.gz')
// const output = await jsonLinesWriter('output.jsonl.gz')
// for await (const item of input) {
//   // do something
// await output.write(item)
// }
const escapeHTML = input => input.replace(/[<>&"']/g, char => `&#${char.charCodeAt(0)};`)
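A minimal sketch of how the `escapeHTML` helper above behaves, written with explicit TypeScript types. The sample input string is an assumption chosen for illustration; the replacement logic is the same as in the preview, mapping each of `<`, `>`, `&`, `"` and `'` to a numeric character reference.

```typescript
// Typed variant of the escapeHTML one-liner from the preview.
// Each matched character becomes a decimal character reference, e.g. "<" becomes "&#60;".
const escapeHTML = (input: string): string =>
  input.replace(/[<>&"']/g, (char) => `&#${char.charCodeAt(0)};`)

// Illustrative input only; not taken from the gist.
const escaped = escapeHTML(`<a href="x">Tom & Jerry's</a>`)
console.log(escaped)
// prints: &#60;a href=&#34;x&#34;&#62;Tom &#38; Jerry&#39;s&#60;/a&#62;
```

Numeric references avoid the need for a lookup table of named entities, at the cost of slightly less readable output.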
@hubgit
hubgit / youtube-gif.sh
Last active September 12, 2023 23:28
Convert a section of a YouTube video to an animated GIF
#!/bin/bash
# brew install x265
# brew install ffmpeg
# brew install youtube-dl
# brew install imagemagick
ID='U65_uY5N2WM' # YouTube video ID, i.e. https://www.youtube.com/watch?v={ID}
# fetch the video file with youtube-dl
@hubgit
hubgit / list-files-in-folder.js
Created September 20, 2012 11:20
List all files in a folder (Google Apps Script)
function listFilesInFolder() {
  var folder = DocsList.getFolder("Maudesley Debates");
  var contents = folder.getFiles();
  var file;
  var data;
  var sheet = SpreadsheetApp.getActiveSheet();
  sheet.clear();
@hubgit
hubgit / cache-proxy.php
Last active September 5, 2023 21:19
PHP caching proxy
<?php
if ($_SERVER['REQUEST_METHOD'] == 'OPTIONS') {
  header('Access-Control-Allow-Origin: *');
  header('Access-Control-Allow-Methods: GET, OPTIONS');
  header('Access-Control-Allow-Headers: accept, x-requested-with, content-type');
  exit();
}
$url = $_GET['url'];
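The preflight branch at the top of the PHP preview can be sketched in TypeScript as a pure function. This is an illustration under stated assumptions, not a port of the gist: the `PreflightResponse` shape and the `handlePreflight` name are hypothetical, and the 200 status mirrors PHP's default when no status is set. The header values are copied from the preview.

```typescript
// Hypothetical response shape used only for this sketch.
interface PreflightResponse {
  status: number
  headers: Record<string, string>
}

// Header values copied from the PHP preview.
const CORS_HEADERS: Record<string, string> = {
  'Access-Control-Allow-Origin': '*',
  'Access-Control-Allow-Methods': 'GET, OPTIONS',
  'Access-Control-Allow-Headers': 'accept, x-requested-with, content-type',
}

// Returns a response for OPTIONS requests, or null so the caller
// can continue with the normal proxying path.
function handlePreflight(method: string): PreflightResponse | null {
  if (method.toUpperCase() !== 'OPTIONS') return null
  // Assumption: 200 matches what PHP sends when exit() is called without an explicit status.
  return { status: 200, headers: { ...CORS_HEADERS } }
}
```

Returning `null` for other methods keeps the preflight check separate from whatever the proxy does with the `url` parameter afterwards.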
@hubgit
hubgit / journal-feeds.csv
Created April 24, 2013 15:50
All the journal feeds in JournalTOCs
publisher id,feed id,feed url,journal url,journal title
1050,28092,http://journals.uran.ua/eejet/gateway/plugin/WebFeedGatewayPlugin/rss,http://journals.uran.ua/eejet/,"Східно-Європейський журнал передових технологій : Eastern-European Journal of Enterprise Technologies"
1761,25094,http://feeds.feedburner.com/Archeomatica?format=xml,http://www.archeomatica.it/,Archeomatica
1739,24698,http://cerealchemistry.aaccnet.org/action/showFeed?ui=0&mi=3b39wk&ai=rs&jc=cchem&type=etoc&feed=rss,http://cerealchemistry.aaccnet.org/journal/cchem,"Cereal Chemistry"
1721,27750,http://journals.aau.dk/index.php/MIPO/gateway/plugin/WebFeedGatewayPlugin/rss,http://journals.aau.dk/index.php/MIPO,"Musikterapi i Psykiatrien Online"
1549,26667,http://ojs.statsbiblioteket.dk/index.php/bras/gateway/plugin/WebFeedGatewayPlugin/rss,http://ojs.statsbiblioteket.dk/index.php/bras,"Brasiliana - Journal for Brazilian Studies"
1549,27775,http://ojs.statsbiblioteket.dk/index.php/claw/gateway/plugin/WebFeedGatewayPlugin/rss,http://ojs.statsbiblio
@hubgit
hubgit / textract-pdf-tables.sh
Last active June 15, 2023 13:31
Extract tabular data from a PDF to CSV
# brew install awscli
# aws configure
aws s3 cp your-file.pdf s3://your-bucket/your-file.pdf
# https://pypi.org/project/amazon-textract-helper/
# https://github.com/aws-samples/amazon-textract-textractor/tree/master/helper
# pip install amazon-textract-helper
amazon-textract --input-document s3://your-bucket/your-file.pdf --features TABLES --pretty-print TABLES --pretty-print-table-format=csv
# https://aws.amazon.com/blogs/machine-learning/automatically-extract-text-and-structured-data-from-documents-with-amazon-textract/
@hubgit
hubgit / chat.ts
Last active May 11, 2023 08:35
Vercel Edge Function for an OpenAI API request
import type { NextRequest } from 'next/server'
import { createParser } from 'eventsource-parser'
export const config = {
  runtime: 'edge',
}
export default async function handler(req: NextRequest) {
  const encoder = new TextEncoder()
  const decoder = new TextDecoder()
@hubgit
hubgit / blank-html-template
Created April 5, 2009 15:02
A blank HTML Strict template
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
<title>Title</title>
<link rel="stylesheet" href="style.css"/>
<style></style>
<script src="script.js"></script>
@hubgit
hubgit / json-ld.js
Created June 16, 2020 09:16
Fetch, extract, parse, expand, frame and compact JSON-LD
const { JSDOM } = require('jsdom')
const { compact, expand, frame } = require('jsonld')
const url = 'https://www.bbc.co.uk/schedules/p00fzl6p/2020/06/14'
// fetch and parse HTML
const { window: { document } } = await JSDOM.fromURL(url)
// select the script elements containing JSON-LD
const elements = document.querySelectorAll('script[type="application/ld+json"]')
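The preview stops after selecting the script elements. A plausible next step, given the gist's description ("parse"), is to turn the text of each element into a JavaScript value. The sketch below assumes that step; `parseJsonLd` is a hypothetical helper, not code from the gist, and it takes plain strings so it can run without `jsdom`.

```typescript
// Hypothetical helper: parses the text content of each JSON-LD script element.
// Entries that are not valid JSON are skipped rather than aborting the whole run.
function parseJsonLd(scriptTexts: string[]): unknown[] {
  const items: unknown[] = []
  for (const text of scriptTexts) {
    try {
      items.push(JSON.parse(text))
    } catch {
      // Ignore malformed blocks; pages sometimes embed invalid JSON-LD.
    }
  }
  return items
}

// Illustrative input only; these strings stand in for element.textContent values.
const parsed = parseJsonLd(['{"@type":"Thing","name":"Example"}', 'not json'])
console.log(parsed.length) // prints: 1
```

With the `elements` NodeList from the preview, the input could be built with `Array.from(elements, (el) => el.textContent ?? '')` before the results are passed on to `expand`, `frame` or `compact`.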