Lending Club Loan Data Analysis with Keras and TensorFlow – AI Project for Loans Data

Deep Learning with Keras and TensorFlow
Project – Lending Club Loan Data Analysis
Objective: Create a model that uses historical data to predict whether or not a loan will default.
Problem Statement:
For companies like Lending Club, correctly predicting whether or not a loan will default is very important. In this project, using historical data from 2007 to 2015, you have to build a deep learning model to predict the chance of default for future loans. As you will see later, this dataset is highly imbalanced and includes a lot of features, which makes the problem more challenging.
Domain: Finance
Analysis to be done: Perform data preprocessing and build a deep learning prediction model.
Content: Dataset columns and definitions:
● credit.policy: 1 if the customer meets the credit underwriting criteria of LendingClub.com, and 0 otherwise.
● purpose: The purpose of the loan (takes values “credit_card”, “debt_consolidation”, “educational”, “major_purchase”, “small_business”, and “all_other”).
● int.rate: The interest rate of the loan, as a proportion (a rate of 11% would be stored as 0.11). Borrowers judged by LendingClub.com to be more risky are assigned higher interest rates.
● installment: The monthly installments owed by the borrower if the loan is funded.
● log.annual.inc: The natural log of the self-reported annual income of the borrower.
● dti: The debt-to-income ratio of the borrower (the amount of debt divided by annual income).
● fico: The FICO credit score of the borrower.
● days.with.cr.line: The number of days the borrower has had a credit line.
● revol.bal: The borrower’s revolving balance (the amount unpaid at the end of the credit card billing cycle).
● revol.util: The borrower’s revolving line utilization rate (the amount of the credit line used relative to total credit available).
● inq.last.6mths: The borrower’s number of inquiries by creditors in the last 6 months.
● delinq.2yrs: The number of times the borrower has been 30+ days past due on a payment in the past 2 years.
● pub.rec: The borrower’s number of derogatory public records (bankruptcy filings, tax liens, or judgments).
● not.fully.paid: 0 → The loan was fully paid. 1 → The loan was not fully paid (i.e., defaulted, charged off, or missed payments).
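Two of the definitions above store values in transformed units, which is easy to overlook when reading the raw table. As a small sketch, using the int.rate and log.annual.inc values from the first row of the data preview in the solution, the original quantities can be recovered as follows. The variable names are illustrative only.

```python
import math

# Values copied from the first row of the data preview.
int_rate = 0.1189            # interest rate stored as a proportion
log_annual_inc = 11.350407   # natural log of self-reported annual income

# Convert the proportion to a percentage.
rate_percent = int_rate * 100

# Invert the natural log to recover the income in its original units.
annual_income = math.exp(log_annual_inc)

print(f"Interest rate: {rate_percent:.2f}%")
print(f"Annual income: about {annual_income:,.0f}")
```

The same inversion applies column-wide with `numpy.exp` once the data is loaded into a DataFrame.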
Steps to perform:
Perform exploratory data analysis and feature engineering, then build a deep learning model that uses the historical data to predict whether or not a loan will default.
Tasks:
1. Feature Transformation

Transform categorical values into numerical values (discrete)
2. Exploratory data analysis of different factors in the dataset.
3. Additional Feature Engineering

You will check the correlation between features and, where two features are strongly correlated with each other, drop one of them.

This will help reduce the number of features and leave you with the most relevant features.
4. Modeling

After applying EDA and feature engineering, you are now ready to build the predictive models.

In this part, you will create a deep learning model using Keras with the TensorFlow backend.
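One way Task 4 might look is sketched below. It is not the post's actual solution: the data is a synthetic stand-in (random features and an arbitrary 32-of-200 positive rate), the layer sizes and dropout rate are untuned guesses, and inverse-frequency class weights are just one common response to the imbalance noted in the problem statement.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense, Dropout

# Synthetic stand-in for the scaled feature matrix (NOT the real loan data).
rng = np.random.default_rng(42)
n_samples, n_features = 200, 13
X = rng.normal(size=(n_samples, n_features)).astype("float32")

# Imbalanced binary target: 32 positives, 168 negatives (arbitrary ratio).
y = np.array([1] * 32 + [0] * 168, dtype="int32")
rng.shuffle(y)

# Inverse-frequency class weights: n_samples / (n_classes * class_count).
counts = np.bincount(y)
class_weight = {
    cls: float(n_samples / (len(counts) * counts[cls]))
    for cls in range(len(counts))
}

# Small fully connected network with a sigmoid output for binary prediction.
model = Sequential([
    Input(shape=(n_features,)),
    Dense(32, activation="relu"),
    Dropout(0.3),
    Dense(16, activation="relu"),
    Dense(1, activation="sigmoid"),
])
model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=[tf.keras.metrics.AUC(name="auc")],
)

# Two epochs only, to keep the sketch fast; real training would use more.
history = model.fit(
    X, y, epochs=2, batch_size=32, class_weight=class_weight, verbose=0
)

# Predicted default probabilities, one per row.
probs = model.predict(X, verbose=0)
print(class_weight)
print(probs.shape)
```

With the real data, X would be the scaled feature matrix and y the not.fully.paid column. Precision and recall on a held-out split are more informative than accuracy when the classes are this uneven.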

 

Solution – lending_club_loan_default_prediction.ipynb

# Data handling & visualization
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
# Scikit-learn tools
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.metrics import classification_report, confusion_matrix
# TensorFlow & Keras (modern, clean style)
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Dense, Dropout, BatchNormalization
from tensorflow.keras.utils import plot_model
# Replace with actual file path
data = pd.read_csv("loan_data.csv")
print(data.columns)
data.head()

Index(['credit.policy', 'purpose', 'int.rate', 'installment', 'log.annual.inc', 'dti', 'fico', 'days.with.cr.line', 'revol.bal', 'revol.util', 'inq.last.6mths', 'delinq.2yrs', 'pub.rec', 'not.fully.paid'], dtype='object')

   credit.policy             purpose  int.rate  installment  log.annual.inc    dti  fico  days.with.cr.line  revol.bal  revol.util  inq.last.6mths  delinq.2yrs  pub.rec  not.fully.paid
0              1  debt_consolidation    0.1189       829.10       11.350407  19.48   737        5639.958333      28854        52.1               0            0        0               0
1              1         credit_card    0.1071       228.22       11.082143  14.29   707        2760.000000      33623        76.7               0            0        0               0
2              1  debt_consolidation    0.1357       366.86       10.373491  11.63   682        4710.000000       3511        25.6               1            0        0               0
3              1  debt_consolidation    0.1008       162.34       11.350407   8.10   712        2699.958333      33667        73.2               1            0        0               0
4              1         credit_card    0.1426       102.92       11.299732  14.97   667        4066.000000       4740        39.5               0            1        0               0
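With the data loaded, Tasks 1 and 3 could be approached along the following lines. This is a minimal sketch under stated assumptions rather than the post's own solution: it runs on a small hand-made frame instead of loan_data.csv, the `int.rate.pct` column is a hypothetical redundant feature added only so there is something to drop, and the 0.9 threshold is an arbitrary choice.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hand-made toy rows (illustrative values, NOT a sample of loan_data.csv).
df = pd.DataFrame({
    "purpose": ["debt_consolidation", "credit_card", "debt_consolidation",
                "all_other", "credit_card", "all_other"],
    "int.rate": [0.1189, 0.1071, 0.1357, 0.1008, 0.1426, 0.1100],
    "fico": [737, 707, 682, 712, 667, 720],
    "not.fully.paid": [0, 0, 1, 0, 1, 0],
})
# Hypothetical redundant column, added only so there is something to drop.
df["int.rate.pct"] = df["int.rate"] * 100

# Task 1: map each purpose string to an integer code.
# LabelEncoder assigns codes in sorted order of the class names.
encoder = LabelEncoder()
df["purpose"] = encoder.fit_transform(df["purpose"])

# Task 3: absolute pairwise correlations between features (target excluded).
features = df.drop(columns=["not.fully.paid"])
corr = features.corr().abs()

# Keep only the upper triangle so each pair is inspected once.
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))

# Drop the later column of every pair above the (assumed) threshold.
threshold = 0.9
to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
reduced = df.drop(columns=to_drop)

print(dict(zip(encoder.classes_, range(len(encoder.classes_)))))
print("Dropped:", to_drop)
print(reduced.columns.tolist())
```

Integer codes imply an ordering between loan purposes that does not really exist, so one-hot encoding with `pd.get_dummies` is a reasonable alternative when feeding the column to a neural network.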
