Johann-Mattis List LinguList

@LinguList
LinguList / README.md
Created Jul 11, 2019
Waterman-Eggert Illustration and Patch for LingPy
View README.md

Waterman-Eggert algorithm for Sentence Alignment

This is a short patch for LingPy's Waterman-Eggert implementation, together with an illustration of how the algorithm can be used to align two sentences given in phonetic transcription. To test this script, make sure LingPy is installed and run the following in your terminal:

$ python code.py
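
The idea behind the alignment can be sketched without LingPy. The block below is a minimal pure-Python local alignment in the Smith-Waterman style, which is the building block that the Waterman-Eggert algorithm extends by also reporting further, non-overlapping local alignments. It is not the code from this gist and does not use LingPy's API; the function name `local_align`, the scoring values, and the toy token sequences are assumptions chosen for illustration.

```python
def local_align(seq_a, seq_b, match=1, mismatch=-1, gap=-1):
    """Return (score, aligned_a, aligned_b) for the best local alignment.

    Hypothetical helper for illustration; scores are arbitrary defaults.
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    # Score matrix; in local alignment no cell may drop below zero.
    matrix = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            matrix[i][j] = max(
                0,
                matrix[i - 1][j - 1] + sub,  # (mis)match
                matrix[i - 1][j] + gap,      # gap in seq_b
                matrix[i][j - 1] + gap,      # gap in seq_a
            )
            if matrix[i][j] > best:
                best, best_pos = matrix[i][j], (i, j)

    # Trace back from the best-scoring cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and matrix[i][j] > 0:
        sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
        if matrix[i][j] == matrix[i - 1][j - 1] + sub:
            out_a.append(seq_a[i - 1])
            out_b.append(seq_b[j - 1])
            i, j = i - 1, j - 1
        elif matrix[i][j] == matrix[i - 1][j] + gap:
            out_a.append(seq_a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(seq_b[j - 1])
            j -= 1
    return best, out_a[::-1], out_b[::-1]


# Toy "sentences" as token lists; only the shared stretch is aligned.
score, left, right = local_align("a b c d e".split(), "x c d y".split())
print(score, left, right)
```

A full Waterman-Eggert run would, after reporting this alignment, exclude the cells it used and search the matrix again for the next best non-overlapping alignment.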
@LinguList
LinguList / Bodth-2019-664.tsv
Created Jun 26, 2019
Checking the intersection of concept lists with `pyconcepticon`
View Bodth-2019-664.tsv
ID	NUMBER	ENGLISH	CONCEPTICON_ID	CONCEPTICON_GLOSS
Bodth-2019-664-1	1	1sg	1209	I
Bodth-2019-664-2	2	2pl.excl	1213	YOU
Bodth-2019-664-3	3	2pl.incl	1131	WE (INCLUSIVE)
Bodth-2019-664-4	4	2sg	1215	THOU
Bodth-2019-664-5	5	3sg	262	HE OR SHE OR IT
Bodth-2019-664-6	6	ablative		
Bodth-2019-664-7	7	above, top	2379	UP OR ABOVE
Bodth-2019-664-8	8	achieve, obtain	694	GET
Bodth-2019-664-9	9	aconite		
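
As a rough illustration of what checking the intersection of concept lists amounts to, the sketch below collects the Concepticon IDs of a tab-separated list like the one above and intersects them with a second set of IDs. It uses only the standard library rather than `pyconcepticon`; the helper name `linked_ids`, the shortened sample, and the contents of the second list are assumptions made for this example.

```python
import csv
import io

# Three rows in the layout of the TSV preview above (tab-separated).
SAMPLE = (
    "ID\tNUMBER\tENGLISH\tCONCEPTICON_ID\tCONCEPTICON_GLOSS\n"
    "Bodth-2019-664-1\t1\t1sg\t1209\tI\n"
    "Bodth-2019-664-6\t6\tablative\t\t\n"
    "Bodth-2019-664-8\t8\tachieve, obtain\t694\tGET\n"
)


def linked_ids(handle):
    """Return the set of non-empty CONCEPTICON_ID values in a TSV file.

    Hypothetical helper; rows without a Concepticon link are skipped.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["CONCEPTICON_ID"] for row in reader if row["CONCEPTICON_ID"]}


ids = linked_ids(io.StringIO(SAMPLE))

# A second concept list, reduced to its Concepticon IDs (invented here).
other = {"1209", "1215"}

shared = ids & other
print(sorted(shared))
```

Comparing on `CONCEPTICON_ID` rather than on the `ENGLISH` column is the point of linking lists to Concepticon: glosses differ between sources, while the IDs are stable.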
@LinguList
LinguList / README.md
Created Mar 26, 2019
A Primer on Automatic Inference of Sound Correspondence Patterns (3): Extended Experiments with Alignments from the Tableaux Phonétiques des Patois Suisses Romands
View README.md

A Primer on Automatic Inference of Sound Correspondence Patterns (3): Extended Experiments with Alignments from the Tableaux Phonétiques des Patois Suisses Romands

To run the script provided here, make sure to download the GIST, and install the requirements for LingRex. Then, simply type:

$ python code.py
@LinguList
LinguList / README.md
Created Feb 27, 2019
A Primer on Automatic Inference of Sound Correspondence Patterns (2): Initial Experiments with Alignments from the Tableaux Phonétiques des Patois Suisses Romands
View README.md

A Primer on Automatic Inference of Sound Correspondence Patterns (2): Initial Experiments with Alignments from the Tableaux Phonétiques des Patois Suisses Romands

To run the script provided here, make sure to download the data from Zenodo, and unpack the folder multiple.zip. Then cd into the folder, and run the script as follows:

$ python to_wordlist.py

To install all requirements, just type:

@LinguList
LinguList / README.md
Created Feb 24, 2019
Automatic morpheme segmentation (Open problems in computational diversity linguistics 1)
View README.md

Automatic morpheme segmentation (Open problems in computational diversity linguistics 1)

This little repository contains the analyses I have done to test the Morfessor software on sparse data. Note that I only used the default settings for the computation, so the results could quite possibly be improved further.

Requirements

To install Morfessor, just type:

$ pip install morfessor
@LinguList
LinguList / README.md
Created Dec 11, 2018
Merging datasets with LingPy and the CLDF curation framework
View README.md
@LinguList
LinguList / README.md
Created Nov 6, 2018
Inferring consonant clusters from CLICS data with LingPy: Data and Code
View README.md

Inferring consonant clusters from CLICS data with LingPy: Data and Code

This GIST accompanies the blog post explaining the code, which you can find here.

To install and run the code, run the following in your terminal:

$ pip install -r pip-requirements.txt
$ git clone https://github.com/clld/concepticon-data.git
$ cd concepticon-data
@LinguList
LinguList / README.md
Last active Jul 16, 2018
Exporting Sublists from a Wordlist with LingPy and Concepticon
View README.md

Exporting Sublists from a Wordlist with LingPy and Concepticon

This gist describes how you can extract sublists from a wordlist in LingPy with the help of the pyconcepticon API. See https://calc.hypotheses.org/date/2018/07 for details on the code and additional explanations.

@LinguList
LinguList / README.md
Created Jun 28, 2016
Vowel Purity and Rhyme Evidence in Old Chinese Reconstruction
View README.md

Vowel Purity and Rhyme Evidence in Old Chinese Reconstruction

Data

The data comprises the rhyme network (in YAML format), the different character readings (missing characters are indicated by a "?"), and the vowel annotations in JSON.

Code

To run the code, make sure you have Python 3 installed, as well as a recent version of NetworkX and the community extension for NetworkX.

@LinguList
LinguList / README.md
Last active Aug 29, 2015
PhylogeneticNetworkApproaches
View README.md

Test Sets for Phylogenetic Network Approaches in Historical Linguistics

This GIST offers test sets for phylogenetic network approaches. All data are provided in several formats. The following formats are distinguished:

  • tree-representation of the underlying taxa using the Newick format (nwk-file)
  • csv-representation of the presence-absence patterns of the data (csv-file)
  • nexus-representation of the presence-absence matrix of the data (nex-file)
  • wordlist representation of the data which is important for additional linguistic analyses (qlc-format)
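
To make the relation between the presence-absence formats concrete, the following sketch renders a small presence-absence matrix as a NEXUS data block. It is an assumption-laden illustration and not the converter used for this GIST: the function name `to_nexus`, the taxon names, and the 0/1 values are invented toy data, and the actual nex-file may be laid out differently.

```python
def to_nexus(matrix):
    """Render {taxon: "0101..."} as a minimal NEXUS data block.

    Hypothetical helper; expects one string of 0/1 states per taxon.
    """
    lengths = {len(states) for states in matrix.values()}
    if len(lengths) != 1:
        raise ValueError("all taxa need the same number of characters")
    nchar = lengths.pop()
    # Pad taxon names so the state columns line up.
    width = max(len(taxon) for taxon in matrix)
    lines = [
        "#NEXUS",
        "",
        "BEGIN DATA;",
        f"DIMENSIONS NTAX={len(matrix)} NCHAR={nchar};",
        'FORMAT DATATYPE=STANDARD SYMBOLS="01" MISSING=? GAP=-;',
        "MATRIX",
    ]
    for taxon, states in matrix.items():
        lines.append(f"{taxon.ljust(width)}  {states}")
    lines += [";", "END;"]
    return "\n".join(lines)


# Invented toy data: 1 = cognate set present, 0 = absent.
toy = {"German": "1101", "English": "1001", "Dutch": "1100"}
nexus = to_nexus(toy)
print(nexus)
```

Each column of the matrix corresponds to one presence-absence character, so the same table can be written row by row as CSV or, as here, wrapped in a NEXUS block.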

At the moment, only one test set is offered in these formats. This test set was the basis of our network analysis of 40 Indo-European languages (see https://gist.github.com/LinguList/7475830). Here, it is offered in the formats specified above. In this dataset, known borrowings have been deliberately reintroduced into the data, in order to see
