Scoring system for MD Finance

fintech

Summary

“I had great experience with Andrii during the project of credit score build for our company. He has provided quick and best quality results with a ready to support approach. He has developed a model with requested AUC level using Python and prepared deployment into our system. I can frankly recommend Andrii as the right data scientist for the project with any level of difficulty and challenge!” - MD Finance.


The project aimed to build a credit scoring model that predicts the likelihood of loan defaults specifically in the Romanian market. The goal was to improve financial decision-making by assessing potential risk for lenders more accurately. By predicting default risks, the model could help optimize loan approval processes, reduce the likelihood of losses, and guide more informed lending strategies, ultimately supporting better risk management and increasing the profitability and stability of the financial institution involved.

Project overview

Data Preprocessing


To ensure the model’s accuracy and robustness, key data preprocessing strategies were implemented:

  • Handling missing values.

  • Filtering continuous features based on missing value thresholds.

  • Generating new features to improve predictive power.
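The first two steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the 50% missing-value threshold, the median imputation, the column names, and the toy records are all hypothetical and are not taken from the project.

```python
from statistics import median


def missing_share(rows, column):
    """Fraction of rows in which `column` is missing (None or absent)."""
    return sum(row.get(column) is None for row in rows) / len(rows)


def filter_and_impute(rows, columns, max_missing=0.5):
    """Drop columns whose missing share exceeds `max_missing` (assumed < 1),
    then fill the remaining gaps with the column median."""
    kept = [c for c in columns if missing_share(rows, c) <= max_missing]
    medians = {
        c: median(r[c] for r in rows if r.get(c) is not None) for c in kept
    }
    cleaned = [
        {c: r[c] if r.get(c) is not None else medians[c] for c in kept}
        for r in rows
    ]
    return kept, cleaned


# Made-up applicant records, for illustration only.
rows = [
    {"income": 3000, "age": 31, "prev_loans": None},
    {"income": None, "age": 45, "prev_loans": None},
    {"income": 5000, "age": 28, "prev_loans": 2},
    {"income": 4000, "age": 52, "prev_loans": None},
]

kept, cleaned = filter_and_impute(rows, ["income", "age", "prev_loans"])
print(kept)                  # columns that survived the threshold
print(cleaned[1]["income"])  # imputed value for the second applicant
```

Here "prev_loans" is missing in three of four records, so it is dropped, while the single gap in "income" is filled with the median of the observed values.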


Tools & Techniques


  • LightGBM: This gradient boosting framework was used for model training, providing both speed and high performance for large datasets, ideal for credit scoring.

  • Bayesian Optimization: Applied to optimize hyperparameters for better model generalization.

  • Python: The main programming language used for preprocessing, model building, and evaluation.

  • Docker & Flask: Docker was used to containerize the model, ensuring scalability and ease of deployment, while Flask was used to create an API for seamless integration into MD Finance’s existing system.

Results

  • Expected Performance: The goal was to achieve a GINI score over 40 and an AUC over 70%, ensuring that the model could distinguish defaulting borrowers from non-defaulting ones with a high degree of confidence.

  • Actual Performance: The model achieved a GINI score around 40 and an AUC of approximately 70%, which was slightly below the expected target but still within a reasonable range for further refinement. This indicates the model was performing well, but additional data or fine-tuning could further enhance its accuracy.
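The two targets describe the same requirement, assuming the GINI score is reported on a 0 to 100 scale and follows the usual definition GINI = 2 × AUC − 1: an AUC of 70% corresponds to a GINI of 40. A minimal sketch of both metrics on made-up labels and scores (not project data):

```python
def auc(labels, scores):
    """Probability that a randomly chosen defaulter (label 1) gets a higher
    score than a randomly chosen non-defaulter (label 0); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def gini(labels, scores):
    """Gini coefficient derived from AUC."""
    return 2 * auc(labels, scores) - 1


# Toy example: 1 = defaulted, 0 = repaid.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

print(auc(labels, scores))   # 0.75
print(gini(labels, scores))  # 0.5
```

The pairwise comparison is quadratic in the number of clients, which is fine for an illustration; on a real portfolio a library implementation would be the more practical choice.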


To evaluate the model's performance, its predictions were sorted by predicted likelihood of default and grouped into bins. The bins rank clients from the highest risk (bad clients) to the lowest risk (good clients).


Why is it good that the proportion of bad clients decreases in each bin?


  1. Risk Separation: In the first bins, bad clients (higher risk) should be concentrated, and as you move to the later bins, the proportion of bad clients should decrease, indicating the model is correctly separating clients by risk.

  2. Improved Accuracy: A decreasing proportion of bad clients as risk decreases shows the model is more accurate in predicting bad clients in high-risk bins and good clients in low-risk bins.

  3. Minimized Errors: A steadily decreasing proportion means fewer good clients land in the high-risk bins (lost business) and fewer bad clients land in the low-risk bins (financial losses).

  4. Better Decision Making: Accurately categorizing clients helps in better decision-making, such as offering loans or restricting services to higher-risk clients.
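The check described above can be sketched as follows. The number of bins, the equal-size split, and the sample data are assumptions made for illustration; the binning scheme actually used in the project is not stated.

```python
def bad_rate_by_bin(labels, scores, n_bins=5):
    """Sort clients from highest to lowest predicted risk, split them into
    near-equal bins, and return the share of bad clients in each bin.
    Assumes there are at least `n_bins` clients."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    n = len(ranked)
    rates = []
    for b in range(n_bins):
        chunk = ranked[b * n // n_bins:(b + 1) * n // n_bins]
        rates.append(sum(label for _, label in chunk) / len(chunk))
    return rates


def is_non_increasing(values):
    """True if no bin has a higher bad rate than the bin before it."""
    return all(a >= b for a, b in zip(values, values[1:]))


# Made-up data: 1 = bad client (defaulted), 0 = good client.
scores = [0.95, 0.90, 0.80, 0.75, 0.60, 0.55, 0.40, 0.35, 0.20, 0.10]
labels = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

rates = bad_rate_by_bin(labels, scores)
print(rates)                     # [1.0, 0.5, 0.5, 0.0, 0.0]
print(is_non_increasing(rates))  # True
```

A bin whose bad rate rises above that of the previous bin would indicate a score range in which the model ranks clients poorly.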

Below is a graph showing the decreasing proportion of bad clients across bins, which visually demonstrates the model’s capability to accurately separate high-risk and low-risk borrowers, helping make better lending decisions:



Project duration:

1 month

Team

1 person

1 Data Scientist

Technologies

Python, Machine Learning, Mathematical Modelling, REST

Tech challenge

  • Scoring users in real time with a fast response

  • Best AUC among all contractors' solutions

  • Easy-to-deploy solution

Solution

We delivered an end-to-end outsourced credit scoring system for the Romanian market and created a REST API for the new scoring system. As a result, our solution showed the best AUC score among the contractors contacted by the company.

Let's talk about your case

Email: andrii.rohovyi@postdata.ai
