This is the final report on the work done for our text-extraction project as part of Google Summer of Code 2019 (GSoC 2019).
In this project, I developed a user-friendly desktop GUI that extracts various linguistic features from texts using existing NLP packages. The application was built with Electron, ReactJS, the MaterialUI CSS framework, and MongoDB as the database.
The application's main target groups are students and researchers in computational linguistics who lack programming skills and need an easy-to-use tool for their analyses. Within the application, users can import texts, select the indices they want to calculate, and export the generated results. The application is also flexible and modular: users can add custom scripts to be executed on the selected texts.
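To make the idea of a custom script more concrete, here is a minimal sketch of what such a script might look like. It is an illustration only: the interface the tool actually expects from custom scripts is documented in the project wiki, not here, so the function name `lexical_indices`, the choice of indices, and the JSON-per-file output format are all assumptions.

```python
import json
import re
import sys
from typing import Dict


def lexical_indices(text: str) -> Dict[str, float]:
    """Compute a few simple lexical indices for one text.

    Hypothetical example: the name, the indices, and the return format
    are assumptions, not the tool's actual script interface.
    """
    # Lowercase the text and split it into word tokens.
    tokens = re.findall(r"\w+", text.lower())
    n_tokens = len(tokens)
    n_types = len(set(tokens))
    # Type-token ratio: distinct words divided by total words.
    ttr = n_types / n_tokens if n_tokens else 0.0
    return {"tokens": n_tokens, "types": n_types, "type_token_ratio": ttr}


if __name__ == "__main__":
    # Assumed calling convention: text file paths as command-line arguments,
    # one JSON object printed per file.
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as handle:
            result = lexical_indices(handle.read())
        print(json.dumps({"file": path, **result}))
```

A script of this shape takes the selected texts as input and emits one record of computed indices per text, which is the kind of result the application would then export.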
For more information about the project, the technologies used, and instructions on how to install and operate the tool, visit the project wiki.
All of my work can be found in the project's repository, along with the code of the tool. My commits are here.
Since this project was developed under the GSoC 2019 program, I kept weekly reports on my progress, which can be found on the relevant wiki page.
The current version of the tool should be considered alpha. It is functional but contains many bugs and has many areas for improvement. Planned and suggested future work can be found here.