Alternative Credit Scoring and Classification Employing Machine Learning Techniques on a Big Data Platform

Date

2019

Authors

Hindistan, Yavuz Selim
Kiyakoğlu, Burhan Yasin
Rezaeinazhad, Arash Mohammadian
Korkmaz, Halil Ergun
Dağ, Hasan

Publisher

Institute of Electrical and Electronics Engineers Inc.

Open Access Color

Green Open Access

No

Publicly Funded

No
Impulse
Average
Influence
Average
Popularity
Top 10%

Abstract

With the growth of financial technology and of innovations aimed at delivering a high standard of financial services, banks, credit service companies, and other financial institutions use the most recent technologies in a variety of ways, from addressing information asymmetry and matching the needs of borrowers and lenders to facilitating transactions through payment services. Among the many FinTech offerings, one of the most attractive is Peer-to-Peer (P2P) lending, which aims to bring investors and borrowers together directly, leaving out traditional intermediaries such as banks. The main purpose of a financial institution as an intermediary is to control risk, and P2P lending platforms innovate by using new methods of risk assessment. In the era of Big Data, combining diverse sources of information, such as customers' spending behavior, social media behavior, and geographic information, with traditional credit scoring methods provides new insights that enable more accurate credit scoring. In this study, we investigate machine learning techniques for credit scoring on big data platforms. We conclude that, in an HDFS (Hadoop Distributed File System) environment, Logistic Regression performs better than Decision Tree and Random Forest for credit scoring and classification with respect to accuracy, precision, recall, and overall run time. Logistic Regression also runs faster in a single-node HDFS configuration than in a non-HDFS configuration.
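The abstract compares classifiers by accuracy, precision, and recall. As a minimal sketch of how these three metrics are computed for a binary credit classifier, the following plain-Python example may help. It is not the authors' code: the function name `binary_metrics`, the `BinaryMetrics` container, and the sample labels are all hypothetical, and the record does not state which tooling the study used.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> BinaryMetrics:
    """Compute accuracy, precision, and recall for binary labels.

    Hypothetical helper for illustration; `positive` marks the class of
    interest (for example, a defaulted loan).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn

    accuracy = (tp + tn) / len(pairs)
    # Fall back to 0.0 when a denominator is zero (no predicted or actual positives).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return BinaryMetrics(accuracy, precision, recall)


# Made-up labels for illustration only: 1 = loan defaulted, 0 = loan repaid.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

With these made-up labels there are 3 true positives, 3 true negatives, 1 false positive, and 1 false negative, so all three metrics equal 0.75. On real credit data the three usually diverge, which is why the study reports them separately alongside run time.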

Keywords

Big data, Credit Risk Scoring, Crowd-funding, Hadoop, Machine Learning, P2P, Peer-to-Peer lending

Fields of Science

0202 electrical engineering, electronic engineering, information engineering, 02 engineering and technology

WoS Q

N/A

Scopus Q

N/A
OpenCitations Citation Count
4

Source

2019 4th International Conference on Computer Science and Engineering (UBMK)

Start Page

731

End Page

734
PlumX Metrics
Citations

CrossRef : 3

Scopus : 7

Captures

Mendeley Readers : 62

SCOPUS™ Citations

7

checked on Feb 07, 2026

Web of Science™ Citations

6

checked on Feb 07, 2026

Page Views

6

checked on Feb 07, 2026

OpenAlex FWCI
2.83073931

Sustainable Development Goals

7

AFFORDABLE AND CLEAN ENERGY

8

DECENT WORK AND ECONOMIC GROWTH

11

SUSTAINABLE CITIES AND COMMUNITIES