@DGrady
DGrady / scikit-learn-character-tokenization.ipynb
Created September 18, 2019 17:05
Demonstration of the `char_wb` tokenization strategy in scikit-learn
@Joshua1989
Joshua1989 / Transferring files between Google Colab and Google Drive.md
Last active October 12, 2023 08:38
file transferring between Google Colab VM and Google Drive

There are several approaches

  • Mount Google Drive in local Colab VM
  • Upload and download via browser
  • Use colab_util.py in python script
@turicas
turicas / description.md
Last active January 22, 2018 15:16
lxml deletes data from malformed HTML

I'm extracting data from a website and was testing some XPath expressions in Chrome Developer Tools (using $x(...) in the console). After creating the expressions I needed, I automated the process using lxml to extract this data with Python. Problem: the number of results in lxml is different from the number I got using Developer Tools! It seems lxml deletes some data and adds a lot of </table> tags at the end (loading the HTML into an lxml.html.Element and then serializing it with lxml.html.tostring produces completely different HTML; most of the data is removed). The HTML is attached to this gist (e-SIC.html) and the XPath is the following: //table[@class="padrao"]. I tested the XPath in Developer Tools by executing this code in the console: $x('//table[@class="padrao"]').length, which returns 2496.

javascript:(function () {
//Served by rawgit: https://rawgit.com/
//From https://gist.github.com/theredpea/d08d5918a8c88889dfa26ad72dd17140#file-adapting_showobjectids-js
//To https://cdn.rawgit.com/theredpea/d08d5918a8c88889dfa26ad72dd17140/raw/9b32a0d0ac9e1a9cc85005fcbf83e20275d99fd4/adapting_showObjectIds.js
document.body.appendChild(document.createElement('script')).src = 'https://rawgit.com/abodelot/jquery.json-viewer/master/json-viewer/jquery.json-viewer.js';
var head = document.getElementsByTagName('head')[0];
$(document.createElement('link')).attr({
type: 'text/css',
href: 'https://rawgit.com/abodelot/jquery.json-viewer/master/json-viewer/jquery.json-viewer.css',
rel: 'stylesheet'
@zcaceres
zcaceres / Include-in-Sequelize.md
Last active July 27, 2024 13:21
using Include in sequelize

'Include' in Sequelize: The One Confusing Query That You Should Memorize

When querying your database in Sequelize, you'll often want data associated with a particular model which isn't in the model's table directly. This data is usually associated through join tables (e.g. a 'hasMany' or 'belongsToMany' association), or a foreign key (e.g. a 'hasOne' or 'belongsTo' association).

When you query, you'll receive just the rows you've looked for. With eager loading, you'll also get any associated data. For some reason, I can never remember the proper way to do eager loading when writing my Sequelize queries. I've seen others struggle with the same thing.

Eager loading is confusing because the 'include' that it uses has unfamiliar fields and is set in an array rather than just an object.
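As a rough sketch of that shape (not the author's query): the options object below puts 'include' in an array, and each entry is itself an object that can nest another 'include'. The association aliases 'posts' and 'comments', the 'active' column, and the User model mentioned in the comment are all hypothetical; only the 'where', 'include', 'association' and 'required' option names come from Sequelize.

```javascript
// Options object for an eager-loading query. Nothing here touches a database;
// it only shows the shape that a finder such as findAll would receive.
const findUsersWithPosts = {
  where: { active: true }, // hypothetical column on the queried model
  include: [
    {
      association: 'posts', // hypothetical alias defined on the model
      required: false, // false keeps parent rows that have no matching posts
      include: [{ association: 'comments' }], // nested eager load, again an array
    },
  ],
};

// Assumed usage, given a defined User model with those associations:
//   const users = await User.findAll(findUsersWithPosts);
```

The array is what lets one query pull in several associations at once: each extra association is one more object in 'include'.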

So let's go through the one query that's worth memorizing to handle your eager loading.

The Basic Query

@AloofBuddha
AloofBuddha / associations.md
Last active February 27, 2024 08:25
Sequelize Relationships: hasOne vs belongsTo
@Rich-Harris
Rich-Harris / footgun.md
Last active October 8, 2024 15:14
Top-level `await` is a footgun

Edit — February 2019

This gist had a far larger impact than I imagined it would, and apparently people are still finding it, so a quick update:

  • TC39 is currently moving forward with a slightly different version of TLA, referred to as 'variant B', in which a module with TLA doesn't block sibling execution. This vastly reduces the danger of parallelizable work happening in serial and thereby delaying startup, which was the concern that motivated me to write this gist
  • In the wild, we're seeing (async function main(){...}()) as a substitute for TLA. This completely eliminates the blocking problem (yay!) but it's less powerful, and harder to statically analyse (boo). In other words the lack of TLA is causing real problems
  • Therefore, a version of TLA that solves the original issue is a valuable addition to the language, and I'm in full support of the current proposal, which you can read here.
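A minimal sketch of the async-wrapper pattern mentioned in the list above, assuming a hypothetical loadConfig helper that stands in for any slow asynchronous work. It is only meant to show why the wrapper does not block: the function runs synchronously up to its first await, then yields, so the code after it executes before main finishes.

```javascript
const order = [];

// Hypothetical stand-in for slow async work (e.g. fetching configuration).
const loadConfig = () =>
  new Promise((resolve) => setTimeout(() => resolve({ debug: true }), 10));

// Async IIFE: starts immediately, pauses at the await, and returns a promise.
const ready = (async function main() {
  order.push('main:start');
  const config = await loadConfig();
  order.push('main:done');
  return config;
})();

// This line runs while main is still waiting on loadConfig.
order.push('sibling');

ready.then(() => {
  order.push('ready');
  console.log(order.join(' -> '));
  // prints "main:start -> sibling -> main:done -> ready"
});
```

The trade-off is that anything depending on the result has to go through the ready promise, which is part of what makes this pattern less powerful than TLA.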

I'll leave the rest of this document unedited, for archaeological

@ericelliott
ericelliott / chat-reducer-factories.js
Created August 28, 2016 03:04
Chat reducer test factories
const createChat = ({
  id = 0,
  msg = '',
  user = 'Anonymous',
  timeStamp = 1472322852680
} = {}) => ({
  id, msg, user, timeStamp
});
const createState = ({
@theredpea
theredpea / .block
Last active November 15, 2020 00:50
nate central limit theorem demo
license: mit
height: 2000
scrolling: yes
@evanwill
evanwill / gitBash_windows.md
Last active November 21, 2024 05:28
how to add more utilities to git bash for windows, wget, make

How to add more to Git Bash on Windows

Git for Windows comes bundled with the "Git Bash" terminal, which is incredibly handy for Unix-like commands on a Windows machine. It is missing a few standard Linux utilities, but it is easy to add ones that have a Windows binary available.

The basic idea is that C:\Program Files\Git\mingw64\ is your / directory according to Git Bash (note: depending on how you installed it, the directory might be different. From the Start menu, right-click the Git Bash icon and open the file location. It might be something like C:\Users\name\AppData\Local\Programs\Git; the mingw64 folder in this directory is your root. Find it by using pwd -W). If you go to that directory, you will find the typical Linux root folder structure (bin, etc, lib and so on).

If you are missing a utility, such as wget, track down a binary for Windows and copy the files to the corresponding directories. Sometimes the Windows binaries have funny prefixes, so