Automatic morpheme segmentation (Open problems in computational diversity linguistics 1)
This small repository contains the analyses I ran to test the Morfessor software on sparse data. Note that I used only the default settings for the computation, so the results could quite possibly be improved further.
Requirements
To install Morfessor, just type:
$ pip install morfessor
Run the experiments
To compute the first model, type:
$ morfessor-train --encoding=ISO_8859-15 --traindata-list -s model.bin -d ones baayen.txt
To compute the second model, type:
$ morfessor-train --encoding=ISO_8859-15 --traindata-list -s model2.bin -d ones german.txt
To apply the models to the test data, type:
$ morfessor-segment -l model.bin test.txt
$ morfessor-segment -l model2.bin test.txt
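To compare the two models, it can help to put their segmentations side by side. The following is a minimal sketch, not part of the original analyses. It assumes the default output format of morfessor-segment (one analysis per line, morphs separated by single spaces) and that both outputs come from the same test file. The function names and the example segmentations are invented for illustration and are not real results.

```python
def parse_segmentations(output_text):
    """Turn morfessor-segment output into a list of morph lists.

    Assumes one analysis per line with morphs separated by whitespace;
    blank lines are skipped.
    """
    return [line.split() for line in output_text.splitlines() if line.strip()]


def compare_models(first_output, second_output, sep="+"):
    """Pair up the analyses of two models for the same list of test words.

    Returns one tuple per word: (word, segmentation 1, segmentation 2).
    The word is recovered by joining the morphs of the first analysis.
    """
    rows = []
    pairs = zip(parse_segmentations(first_output),
                parse_segmentations(second_output))
    for morphs_a, morphs_b in pairs:
        rows.append(("".join(morphs_a), sep.join(morphs_a), sep.join(morphs_b)))
    return rows


if __name__ == "__main__":
    # Invented stand-ins for the captured output of the two commands above.
    first = "un break able\nwalk ed\n"
    second = "unbreak able\nwalk ed\n"
    for word, seg1, seg2 in compare_models(first, second):
        print(word, seg1, seg2, sep="\t")
```

To use it on real output, redirect each command into a file (for example `morfessor-segment -l model.bin test.txt > first.txt`, where `first.txt` is a hypothetical file name), read both files, and pass their contents to `compare_models`.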