Daniel Grady DGrady

## random_seed.py
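# To reproduce a random sample, we need a fixed seed.
# Draw one at random and format it with underscore separators,
# so it can be pasted into code as a readable integer literal.
import numpy as np

"{:_}".format(np.random.randint(np.iinfo(np.uint32).max))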
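As a minimal sketch of how such a seed might then be used (the constant `SEED` below is a made-up value, and `np.random.default_rng` is one reasonable choice of generator, not necessarily the one the author has in mind), fixing the seed makes two independent generators produce the same sample:

```python
import numpy as np

# Hypothetical seed, written in the underscore format the snippet above produces.
SEED = 1_234_567_890

# Two generators built from the same seed yield identical draws.
rng_a = np.random.default_rng(SEED)
rng_b = np.random.default_rng(SEED)

sample_a = rng_a.integers(0, 100, size=5)
sample_b = rng_b.integers(0, 100, size=5)

print(np.array_equal(sample_a, sample_b))
```

The underscore separators are purely cosmetic: Python reads `1_234_567_890` as the same integer as `1234567890`.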

## scikit-learn-character-tokenization.ipynb

      
Demonstration of the `char_wb` tokenization strategy in scikit-learn. Created September 18, 2019 17:05.
    
## template-xgboost.ipynb

      
A template for XGBoost models. Last active June 24, 2019 23:01.
    
## README.org

      
Last active March 11, 2019 19:10.

Pretty printing delimited text files at the command line

Sometimes, you’d like to look at delimited files on the command line:
cat test.csv
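
The gist's own command-line approach is not shown in this excerpt. As a rough illustration of the goal only, here is a sketch in Python that pads each column to a common width; the function name `pretty_print_delimited` and the two-space column gap are assumptions made for this example, not part of the original gist.

```python
import csv
import io


def pretty_print_delimited(text: str, delimiter: str = ",") -> str:
    """Return `text` with its delimited columns aligned by padding (hypothetical helper)."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return ""
    # Pad short rows so every row has the same number of cells.
    n_cols = max(len(row) for row in rows)
    padded = [row + [""] * (n_cols - len(row)) for row in rows]
    # Each column is as wide as its longest cell.
    widths = [max(len(row[i]) for row in padded) for i in range(n_cols)]
    lines = [
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in padded
    ]
    return "\n".join(lines)


print(pretty_print_delimited("name,qty\nwidget,3\ngizmo,12"))
```

Using the `csv` module rather than splitting on the delimiter means quoted fields that contain the delimiter stay in one column.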


## frequency_histogram.py
import numpy as np
import pandas as pd


def frequency_histogram(
    data: pd.DataFrame,
    n_bins=20,
    bins=None,
    log_bins=False,
    normalize=False,

## find_project_dir.py
import cytoolz.curried as tz
from pathlib import Path

def find_project_dir(here: Path = None) -> Path:
    """
    Get the path to the project directory

    “Project directory” means the nearest parent directory of the
    current directory that contains a `.git` directory. If there
    is no such directory, returns this directory.
    """
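
The body of the function is not shown in this excerpt. The behaviour the docstring describes could be sketched with `pathlib` alone, as below. The name `find_project_dir_sketch` is hypothetical, and checking the starting directory itself before its parents is an assumption; the original also imports `cytoolz`, so it may be implemented quite differently.

```python
from pathlib import Path
from typing import Optional


def find_project_dir_sketch(here: Optional[Path] = None) -> Path:
    """Walk upward from `here` until a directory containing `.git` is found."""
    start = (here or Path.cwd()).resolve()
    # Assumption: the starting directory counts as a candidate, then each parent in turn.
    for candidate in (start, *start.parents):
        if (candidate / ".git").is_dir():
            return candidate
    # No `.git` directory anywhere above: fall back to the starting directory.
    return start
```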

## oracle-query.org

      
Example of querying an Oracle database using Python, SQLAlchemy, and Pandas. Last active November 27, 2024 13:53.

Query Oracle databases with Python and SQLAlchemy

N.B. SQLAlchemy now incorporates all of this information in its documentation; I’m leaving this post here, but recommend referring to SQLAlchemy instead of these instructions.
Install requirements

1. We’ll assume you already have SQLAlchemy and Pandas installed; these are included by default in many Python distributions.
2. Install the cx_Oracle package in your Python environment, using either pip or conda, for example:


## flatten_spark_schema.py
"""
The schemas that Spark produces for DataFrames are typically
nested, and these nested schemas are quite difficult to work with
interactively. In many cases, it's possible to flatten a schema
into a single level of column names.
"""

import typing as T

import cytoolz.curried as tz
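
The rest of this module is not shown in the excerpt. To illustrate the idea of flattening, here is a sketch that works on the plain-dict form of a schema (the shape that `StructType.jsonValue()` returns in PySpark) rather than on Spark objects. The function name `flatten_field_names` and the dotted-name convention are assumptions for this example, and arrays or maps of structs are treated as leaf columns instead of being expanded.

```python
from typing import Any, Dict, Iterator


def flatten_field_names(schema: Dict[str, Any], prefix: str = "") -> Iterator[str]:
    """Yield dotted column names for every leaf field of a struct schema dict."""
    for field in schema["fields"]:
        name = prefix + field["name"]
        field_type = field["type"]
        # Nested structs are dicts with type "struct"; simple types are plain strings.
        if isinstance(field_type, dict) and field_type.get("type") == "struct":
            yield from flatten_field_names(field_type, prefix=name + ".")
        else:
            yield name


# Made-up example schema in the dict form described above.
example_schema = {
    "type": "struct",
    "fields": [
        {"name": "id", "type": "long", "nullable": True, "metadata": {}},
        {
            "name": "address",
            "type": {
                "type": "struct",
                "fields": [
                    {"name": "city", "type": "string", "nullable": True, "metadata": {}},
                    {"name": "zip", "type": "string", "nullable": True, "metadata": {}},
                ],
            },
            "nullable": True,
            "metadata": {},
        },
    ],
}

print(list(flatten_field_names(example_schema)))
```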

## remove_input_cells.py
"""
Remove the input cells from an HTML document generated from a Jupyter notebook

Reads from either STDIN or the named file, and writes to STDOUT
"""

import fileinput
from bs4 import BeautifulSoup

text = "".join(fileinput.input())

## 2017-09-23-fonts-for-nerds.org

      
List of coding fonts. Last active October 10, 2017 00:48.

Fonts for nerds

One of the things you end up with when you spend too much time reading Hacker News is a folder of very slick monospaced fonts designed for code editors. Are any of these fonts measurably better than whatever’s already installed on your system? Nope! Here’s my list.
Spark by After the Flood

This one is kind of a gimmick, but an incredibly clever one. It translates sequences of characters like 123{30,60,90}456 into spark lines, using some fancy features of the OTF format. See also their source code repository for the project. I haven’t used this nearly enough to tell if it works well in practice, but I will now be on the constant lookout for use cases.
Consolas by Luc(as) de Groot for Microsoft