ckarageorgkaneen/GG-Extraction_FinalReport.md

## GG-Extraction_FinalReport.md

      
Final report of the work done for the GG Extraction Project for GFOSS – Open Technologies Alliance as part of GSoC 2018.

### Abstract

The purpose of this project was to identify Government Directorates and Divisions, together with the responsibilities assigned to them, and to extract them in a machine-readable format along with related metadata.

To accomplish this, three main sets of functionality were implemented:

- [ML RespA Classifiers](https://github.com/eellak/gsoc2018-GG-extraction/wiki/Implementation#respa-classifiers)
- [Unit - RespAs Extraction](https://github.com/eellak/gsoc2018-GG-extraction/wiki/Implementation#extraction-methods)
- [Data / Metadata Extraction](https://github.com/eellak/gsoc2018-GG-extraction/wiki/Implementation#metadata-extraction)

### General info

Repo: https://github.com/eellak/gsoc2018-GG-extraction

A general outline: https://github.com/eellak/gsoc2018-GG-extraction/blob/master/README.md

### Detailed info

A detailed outline regarding Implementation, Usage and more:

https://github.com/eellak/gsoc2018-GG-extraction/wiki

### Progress

My progress can be found at the Projects tab:

https://github.com/eellak/gsoc2018-GG-extraction/projects

### Future Work

As mentioned in the Improvement Ideas wiki page (https://github.com/eellak/gsoc2018-GG-extraction/wiki/Improvement-Ideas):

- Resolve the metadata extractor issues (listed under Issues)
- Add database support
- Debug and fix signee extraction
- Extend RespA section detection in RespA Decision Issues (Government Gazette issues of Decisions that contain assignments of responsibilities/duties)
- Devise a non-manual detection scheme using only the ML classifiers
- Attempt a merge with one or more relevant projects, such as:
  - diavgeiaRedefined
  - 3gm