"""Log-odds computations."""

from libc.math cimport log, sqrt
from libc.stdint cimport int64_t

ctypedef int64_t int64
/*
 * Tolerance Principle calculator, based on:
 *
 * C. Yang (2005). On productivity. Language Variation Yearbook 5:333-370.
 *
 * Definition:
 *
 * The number of data points consistent with a rule R is given by N, and the
 * number of exceptions to it by m. By Tolerance, R is productive iff:
 *
syn on
set hlsearch
set ruler
" tab stuff
set expandtab
set tabstop=4
" scrolling
set scrolloff=5
" backspace over everything
set backspace=indent,eol,start
"""English function words.

Sets of English function words, based on

    E.O. Selkirk. 1984. Phonology and syntax: The relationship between
    sound and structure. Cambridge: MIT Press. (p. 352f.)

The categories are of my own creation.
"""
#!/usr/bin/env python
"""LNRE calculator.

This script computes a number of statistics characterizing LNRE data:

* N: corpus size
* V: vocabulary size
* V(1): the number of _hapax legomena_ (symbols occurring once)
* V(2): the number of _dis legomena_ (symbols occurring twice)
* V/N: vocabulary growth rate
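
The statistics listed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the script's actual code: the function name `lnre_stats`, its token-list input, and the dictionary it returns are all hypothetical.

```python
from collections import Counter


def lnre_stats(tokens):
    """Computes N, V, V(1), V(2), and V/N for a sequence of tokens.

    Hypothetical helper for illustration; not taken from the original script.
    """
    # Frequency of each distinct symbol.
    freqs = Counter(tokens)
    # Frequency spectrum: how many symbols occur exactly k times.
    spectrum = Counter(freqs.values())
    n = sum(freqs.values())  # N: corpus size
    v = len(freqs)  # V: vocabulary size
    return {
        "N": n,
        "V": v,
        "V(1)": spectrum[1],  # hapax legomena
        "V(2)": spectrum[2],  # dis legomena
        "V/N": v / n if n else 0.0,
    }


if __name__ == "__main__":
    print(lnre_stats("a b b c c c d".split()))
```

Building the frequency spectrum once with a second `Counter` makes V(1) and V(2) simple lookups, and a missing key falls back to 0 rather than raising an error.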
<epsilon> 0
<SOH> 1
<STX> 2
<ETX> 3
<EOT> 4
<ENQ> 5
<ACK> 6
<BEL> 7
<BS> 8
<HT> 9
#!/usr/bin/env python

import fileinput

import nltk


if __name__ == "__main__":
    for line in fileinput.input():
        print(" ".join(nltk.word_tokenize(line)))
# Copyright (c) 2013-2022 Kyle Gorman
#
# Permission is hereby granted, free of charge, to any person obtaining a
# copy of this software and associated documentation files (the
# "Software"), to deal in the Software without restriction, including
# without limitation the rights to use, copy, modify, merge, publish,
# distribute, sublicense, and/or sell copies of the Software, and to
# permit persons to whom the Software is furnished to do so, subject to
# the following conditions:
#
#!/usr/bin/env Rscript
# WALS131A.R
# Kyle Gorman <kylebgorman@gmail.com>
#
# Tests the hypothesis that vigesimal (base-20) number systems are more common
# at tropical latitudes. Thanks to Richard Sproat for suggesting this
# hypothesis.
#
# The data is read directly from WALS (#131A):
#
# autoloess.R: compute loess metaparameters automatically
# Kyle Gorman <gormanky@ohsu.edu>

aicc.loess <- function(fit) {
    # compute AIC_C for a LOESS fit, from:
    #
    # Hurvich, C.M., Simonoff, J.S., and Tsai, C. L. 1998. Smoothing
    # parameter selection in nonparametric regression using an improved
    # Akaike Information Criterion. Journal of the Royal Statistical
    # Society B 60: 271–293.