Organization - Apache Fineract
Project Title - Scorecard for Credit Risk Assessment
Mentors - Lalit Mohan, Aashish Sawhney
Project Summary - Code
- The project provides an AI-powered solution for assessing the credit risk of loans. It covers a range of approaches, from classical statistical models to modern neural networks, and offers several credit modeling techniques for the user to choose from. It also handles data coming from different sources, such as JSON/XML or SQL.
During the previous GSoC, interns worked on configuring three categories of models for credit risk assessment: rule-based, statistical, and machine learning. A more detailed rundown of the previous work can be found here.
For the summer of 2021, we are looking at two major milestones for this project:
- First and most importantly, make the previous implementations production ready.
- Second, improve credit scoring by introducing scoring with H2O.ai.
To bring the project to production, I started an API module built with Django and Django REST framework. The API module consists of four basic models that record and track transactions with the API:
- Endpoint Model: A model to define API endpoints for our various classifiers.
- MLAlgorithm Model: A model to record all implemented models so they can be reused in the system.
- MLAlgorithmStatus Model: This model tracks the status of each implemented algorithm, that is, whether it is in production or in testing.
- MLRequest Model: This model tracks every prediction request to the system and maintains feedback on the outcome of the prediction.
These models are meant to bring a high level of modularity and flexibility to this ML module so future implementations can easily be used in production with minimal effort.
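The relationship between the four models can be sketched as follows. This is a minimal illustration written with plain Python dataclasses rather than Django models, so that it runs without a configured Django project. All field names, the status values `"production"` and `"testing"`, and the helper `production_algorithms` are assumptions made for the example and are not taken from the actual module.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Endpoint:
    """An API endpoint that exposes one kind of classifier."""
    name: str  # hypothetical field, e.g. "credit_scoring"


@dataclass
class MLAlgorithm:
    """A registered algorithm that can be reused behind an endpoint."""
    name: str
    version: str
    endpoint: Endpoint


@dataclass
class MLAlgorithmStatus:
    """Whether an algorithm is in production or in testing."""
    algorithm: MLAlgorithm
    status: str  # assumed values: "production" or "testing"
    active: bool = True


@dataclass
class MLRequest:
    """One prediction request, its response, and optional feedback."""
    algorithm: MLAlgorithm
    input_data: dict
    response: dict
    feedback: Optional[str] = None
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def production_algorithms(statuses, endpoint_name):
    """Return the algorithms that are active and in production for an endpoint."""
    return [
        s.algorithm
        for s in statuses
        if s.active
        and s.status == "production"
        and s.algorithm.endpoint.name == endpoint_name
    ]


# Usage: two algorithms behind one endpoint, only one of them in production.
endpoint = Endpoint(name="credit_scoring")
rf = MLAlgorithm("random_forest", "0.1", endpoint)
mlp = MLAlgorithm("mlp", "0.1", endpoint)
statuses = [
    MLAlgorithmStatus(rf, "production"),
    MLAlgorithmStatus(mlp, "testing"),
]
live = production_algorithms(statuses, "credit_scoring")
```

Keeping the status in its own record, separate from the algorithm, is what would allow a new algorithm to be registered and tested without affecting the one currently serving requests.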
Another major implementation from the previous GSoC was rule-based scoring driven by feature configurations and criteria. At this stage, the rule-based scoring configurations and algorithms have been integrated into the Fineract server application and the web-app UI in the classic loan-product and loan cycle.
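To illustrate what scoring from feature configurations and criteria can look like, here is a minimal sketch. The feature names, weights, thresholds, and points are invented for the example, and the function `rule_based_score` is hypothetical; the actual configuration format used in Fineract may differ.

```python
import operator

# Comparison operators that a criterion may use.
OPS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
}

# Hypothetical feature configuration: each feature has a weight and an
# ordered list of criteria (operator, threshold, points).
FEATURES = {
    "age": {
        "weight": 2,
        "criteria": [(">=", 40, 10), (">=", 25, 6), ("<", 25, 2)],
    },
    "employment_years": {
        "weight": 3,
        "criteria": [(">=", 5, 10), (">=", 1, 5), ("<", 1, 0)],
    },
}


def rule_based_score(applicant, features=FEATURES):
    """Sum weight * points over features, using the first matching criterion."""
    total = 0
    for name, cfg in features.items():
        value = applicant[name]
        for op, threshold, points in cfg["criteria"]:
            if OPS[op](value, threshold):
                total += cfg["weight"] * points
                break  # only the first matching criterion counts
    return total


# Usage: age 30 earns 2 * 6, five or more years of employment earns 3 * 10.
score = rule_based_score({"age": 30, "employment_years": 6})
print(score)  # 42
```

Because criteria are evaluated in order and only the first match counts, the order of the criteria in the configuration matters in this sketch.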
The statistical scoring algorithms are not yet operational, but the basic architecture for using statistical scoring has been developed in the Fineract server and the web-app UI in the classic loan-product and loan cycle.
- More than 80% of the previous GSoC work has been successfully migrated and tested in this new module with unit tests and integration tests.
- The algorithms currently maintained in the system were trained on the German credit dataset, and most of them achieve approximately 67% accuracy on this dataset.
- The API module is at a stage where it can handle basic requests and perform scoring operations based on the algorithms implemented during GSoC'20.
- The algorithm for rule-based scoring is implemented and tested.
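For reference, the accuracy figure quoted above is the fraction of predictions that match the true labels. The sketch below shows that computation on a handful of made-up labels; the data is illustrative only and is not drawn from the project's dataset or results.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the corresponding true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Made-up labels: 1 = good credit, 0 = bad credit.
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]

# Four of the six predictions are correct.
result = accuracy(y_true, y_pred)
print(f"{result:.0%}")  # 67%
```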
After achieving this proof of concept, the next steps will be to:
- Perform updates at the UI level (Mifos web-app & community-app) for credit scoring.
- Research ways to improve the performance of the ML models.
- Extensive documentation of the module.
- Integrate orchestration tools such as Airflow to perform scheduled model training so the implementation remains relevant.
- Research and integrate H2O.ai for better scoring.
- Extend the documentation to cover the new changes with H2O.ai.
- Implement the algorithm for statistical scoring, which is still pending.