Introduction
Enhancing LendingClub Returns
Motivation:
The main objective of this project is to provide a model that guides a potential investor in selecting funding opportunities on the popular online lending platform LendingClub. Online lending platforms have grown substantially over the past 10 years. They are marketed as a way for borrowers to avoid the excessive fees of the bank market and for investors to gain exposure to an interesting new asset class that was, until now, very difficult for them to access.
Initially, online lending was marketed as a low-cost alternative to bank lending. Unfortunately, the reality has been somewhat different. On the borrower side, fees are quite steep and average about 4% of the total payments made. On the investor side, fees are much more reasonable, with LendingClub fees equal to 1% of payments made. All in, the net fee structure is worse than what a bank would charge for a personal loan but better than a credit card, where APRs (Annual Percentage Rates) are often 20% or higher. LendingClub achieved exceptional growth in its stock price and at one point had a valuation of more than $10B. Obviously, at these valuations someone must be paying fees!
On a very different note, the U.S. banking system has shown clear signs of racial discrimination. It was hoped that the rise of internet-based lenders would offer a way around this type of discrimination, which is often grounded in the physical appearance of the clientele (i.e. the bank is able to see the person’s racial background). During the course of our work we were not able to get to the bottom of this interesting and important issue. Much of the data that would normally have helped us ascertain demographics, if only at a macro level, was not made available. We have, however, been able to broadly scope out some of the regional differences by zip code. Further analysis would be possible with access to the full 5-digit zip codes (we received only 3-digit zip codes) and some basic biographical data.
Problem Statement:
Our objective is to generate a model that an investor can use to select a portfolio of loans that earns the highest overall return, net of credit costs, and, if possible, does so with positive discrimination, i.e. lending to under-served portions of the population. We continue to believe that including under-served portions of the population (who are generally those who have been discriminated against) may prove to be both ethically and commercially rewarding.
A careful analysis of the LendingClub business model and the data we were provided shows that the key to realizing the highest possible return is understanding: 1) how to avoid those borrowers likely to default ("bad apples"), and 2) once the "bad apples" have been removed, how to target the more profitable borrowers.
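The two-step idea above can be illustrated with a minimal Python sketch. It is not the model developed later in this document: the `Loan` fields, the `select_loans` helper, the 10% default-probability cutoff and the sample figures are all hypothetical and serve only to show the order of operations (filter first, then rank).

```python
from dataclasses import dataclass


@dataclass
class Loan:
    loan_id: str
    p_default: float        # hypothetical: predicted probability of default
    expected_return: float  # hypothetical: expected return net of credit costs


def select_loans(loans, max_p_default=0.10, top_n=2):
    """Two-step selection: drop likely defaulters, then keep the best earners."""
    # Step 1: remove the "bad apples" (predicted default risk above the cutoff).
    survivors = [loan for loan in loans if loan.p_default <= max_p_default]
    # Step 2: among the remaining loans, rank by expected return, highest first.
    ranked = sorted(survivors, key=lambda loan: loan.expected_return, reverse=True)
    return ranked[:top_n]


# Made-up candidates, for illustration only.
candidates = [
    Loan("A", p_default=0.03, expected_return=0.055),
    Loan("B", p_default=0.25, expected_return=0.140),  # high return, but risky
    Loan("C", p_default=0.08, expected_return=0.090),
    Loan("D", p_default=0.05, expected_return=0.070),
]

picked = [loan.loan_id for loan in select_loans(candidates)]
print(picked)
```

In this toy data, loan B has the highest headline return but is excluded in the first step because of its default risk, which is the point of screening before ranking.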
Overview of this Document:
This document is composed of five parts:
- Home Page: covers motivation, problem statement and document overview;
- Data Wrangling: covers data cleansing and wrangling;
- EDA: a review of the Exploratory Data Analysis including findings related to discrimination;
- Model Development: an overview of the modeling and analysis that we conducted; and
- Summary of Results: the key lessons learned for the potential investor.