Zach Young zacharysyoung

@zacharysyoung
zacharysyoung / README.md
Last active February 26, 2024 19:10
SO-78062176

I wanted to compare the solutions from JonSG and chepner to see whether either ran noticeably faster (chepner's in particular), and whether they only add the BOM without mutating the text along the way.

Both failed, but for different reasons; JonSG's can easily be fixed.

My comparator:

  1. runs and times both functions against a 10 MB UTF-8 encoded file of random text that spans the full range of Unicode, excluding the surrogate code points (which cannot be encoded in UTF-8)
  2. reads the output, asserts that it starts with a BOM, and strips the BOM, leaving what should be the original UTF-8 bytes
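
Step 2 could be sketched in Go roughly like this. It is only an illustration of the BOM check, not the comparator itself: the helper name `stripBOM` and the sample text are hypothetical, and it assumes the BOM in question is the three-byte UTF-8 encoding of U+FEFF.

```go
package main

import (
	"bytes"
	"fmt"
)

// utf8BOM is the three-byte UTF-8 encoding of U+FEFF.
var utf8BOM = []byte{0xEF, 0xBB, 0xBF}

// stripBOM (a hypothetical helper) reports whether b starts with a UTF-8 BOM
// and returns the bytes that follow it. Without a BOM, b is returned as-is.
func stripBOM(b []byte) ([]byte, bool) {
	if !bytes.HasPrefix(b, utf8BOM) {
		return b, false
	}
	return b[len(utf8BOM):], true
}

func main() {
	original := []byte("héllo, wörld")

	// Stand-in for the output of a function under test: BOM + original bytes.
	output := append(append([]byte{}, utf8BOM...), original...)

	rest, ok := stripBOM(output)
	// ok confirms the BOM is present; bytes.Equal confirms nothing else changed.
	fmt.Println(ok, bytes.Equal(rest, original))
}
```

Comparing the remaining bytes with `bytes.Equal` rather than comparing decoded strings keeps the check strict: any re-encoding or normalization along the way would show up as a mismatch.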
zacharysyoung / open-tabs.js
Created November 7, 2023 04:38
Open multiple tabs from JavaScript
/**
 * After running this script, check the tab you ran it from for any
 * notification about pop-ups being blocked, then allow pop-ups for
 * this site/page only.
 *
 * https://stackoverflow.com/questions/63237482/open-multiple-tabs-with-javascript
 */
const anchors = document.getElementsByTagName('a');
zacharysyoung / README.md
Last active December 5, 2023 22:07
Single-byte encodings

Character abbreviations

Abbrev  Description       Decimal  Hex
NUL     null character    0        00
SOH     start of heading  1        01
STX     start of text     2        02
zacharysyoung / main.go
Last active October 24, 2023 16:35
The "PIN code problem": combinatoric iteration, with recursion and attempt at something like a cartesian product
package main

import (
	"fmt"
	"slices"
	"strings"
)

// <https://codereview.stackexchange.com/questions/229042/find-neighboring-pins-on-a-numeric-keypad>
// Your colleague forgot the pin code from the door to the office.
zacharysyoung / README.md
Last active October 19, 2023 00:00
SO-77312927

I recommend restructuring your filters: instead of proceeding (and indenting) only when a criterion passes, skip the row as soon as any criterion fails. This has a few benefits:

  • the code doesn't creep to the right
  • you can add debug messages that print when a row doesn't match
  • you can comment out any single criterion without affecting the others
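
As a rough illustration of that shape, here is a minimal Go sketch. The column layout (participant ID, status, score), the criteria, and the function name `keepRows` are all made up for the example; only the skip-on-failure structure is the point.

```go
package main

import (
	"fmt"
	"strings"
)

// keepRows returns the rows that pass every criterion. Each check skips the
// row as soon as it fails, so nothing nests deeper than the loop body.
// The columns and criteria here are hypothetical.
func keepRows(rows [][]string) [][]string {
	var kept [][]string
	for _, row := range rows {
		if len(row) != 3 {
			fmt.Println("skip (wrong field count):", row)
			continue
		}
		if !strings.HasPrefix(row[0], "P") {
			fmt.Println("skip (bad participant ID):", row)
			continue
		}
		// This check can be commented out without touching the others.
		if row[1] != "complete" {
			fmt.Println("skip (not complete):", row)
			continue
		}
		kept = append(kept, row)
	}
	return kept
}

func main() {
	rows := [][]string{
		{"P001", "complete", "12"},
		{"X002", "complete", "9"},
		{"P003", "partial", "7"},
	}
	fmt.Println(keepRows(rows))
}
```

Each criterion is a self-contained block ending in `continue`, which is what makes the debug messages and the comment-out trick cheap.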

I test the participant IDs differently than you did, but your method of:

zacharysyoung / README.md
Last active August 20, 2023 20:46
SO-76931363

I went for a solution that doesn't presuppose any kind of sorting: it just looks for a value and remembers in which column (on any row) it appeared.

Starting with this input:

a,b,a
c,c,b
d,e,e
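
One possible reading of "remembers in which column it appeared" is sketched below in Go. This is an assumption, not the solution from the answer: it records the first column index in which each value is seen, scanning the rows top to bottom, and the name `firstColumns` is hypothetical.

```go
package main

import "fmt"

// firstColumns maps each value to the index of the first column in which it
// appears, scanning rows top to bottom and fields left to right. No sorting
// of rows or fields is assumed.
func firstColumns(rows [][]string) map[string]int {
	cols := make(map[string]int)
	for _, row := range rows {
		for i, v := range row {
			if _, seen := cols[v]; !seen {
				cols[v] = i
			}
		}
	}
	return cols
}

func main() {
	// The sample input from above, already split into fields.
	rows := [][]string{
		{"a", "b", "a"},
		{"c", "c", "b"},
		{"d", "e", "e"},
	}
	fmt.Println(firstColumns(rows)) // map[a:0 b:1 c:0 d:0 e:1]
}
```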
zacharysyoung / README.md
Last active November 17, 2023 20:22
To Go's encoding/csv: let my data be.

Let my data be

Go's encoding/csv Reader type takes the novel (to me) approach of deciding that carriage return line feeds (CRLFs) should be replaced with newlines (LFs).

It replaces not only the CRLFs that mark the end of one record and the beginning of the next (the encoding of the data), but also every CRLF at the end of any line of text inside the data itself.
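
A minimal way to see this, using a made-up one-record input rather than the CSV shown below, is to read a quoted field that contains a CRLF and print it with `%q`. The helper name `readData` is hypothetical.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"strings"
)

// readData parses the first record of input and returns its second field.
func readData(input string) (string, error) {
	r := csv.NewReader(strings.NewReader(input))
	rec, err := r.Read()
	if err != nil {
		return "", err
	}
	return rec[1], nil
}

func main() {
	// The CRLF inside the quotes is part of the data; the final CRLF only
	// terminates the record.
	input := "1,\"line one\r\nline two\"\r\n"

	data, err := readData(input)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%q\n", data) // "line one\nline two"
}
```

The carriage return inside the quoted field is gone after parsing, so the value read back is not byte-for-byte what was stored.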

The CSV:

ID,Data
zacharysyoung / README.md
Last active June 30, 2023 03:18
SO-76508000

Making it run not so slow

I mocked up a 60 MB XML file by copying every small sample from your original ZIP archive 200 times, which produced over 425k tok elements.

I then profiled your code and found a really bad culprit for chewing up time.

Processing that XML took about 35 seconds:

Thu Jun 29 10:50:59 2023 profile.stats