AU2019101189A4 - A financial mining method for credit prediction - Google Patents
- Publication number
- AU2019101189A4
- Authority
- AU
- Australia
- Prior art keywords
- data
- random forest
- forest algorithm
- classification
- default
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Ceased
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/03—Credit; Loans; Processing thereof
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Landscapes
- Business, Economics & Management (AREA)
- Accounting & Taxation (AREA)
- Finance (AREA)
- Engineering & Computer Science (AREA)
- Development Economics (AREA)
- Economics (AREA)
- Marketing (AREA)
- Strategic Management (AREA)
- Technology Law (AREA)
- Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)
Abstract
How to evaluate and identify a borrower's potential default risk, or to estimate the borrower's probability of default before a loan is issued, is the basis of credit risk management in modern financial institutions and a significant link in that process. This paper studies the statistical analysis of the historical loan data of banks and other financial institutions using the ideas of imbalanced data classification, and establishes a loan default prediction model employing the Random Forest algorithm. The results show that the Random Forest algorithm outperforms the decision tree and logistic regression algorithms in predictive performance. In addition, by using the Random Forest algorithm to rank feature importance, the features that have the greatest impact on eventual default can be identified, which enables more effective judgment of lending risk in the financial field.
Index Terms: Random Forest, loan default prediction, data mining
Description
A financial mining method for credit prediction
This invention is in the field of Financial Big Data
With the vigorous development of the world economy and the gradual deepening of China's reform and opening up, loans have become an important way for enterprises and individuals to solve economic problems, whether for enterprise development or because of changing consumption habits. With the introduction of a wide variety of bank loan products and the expansion of growing demand, non-performing loans, that is, defaults, have also proliferated. To avoid defaults, banks and other financial institutions evaluate or score the borrower's credit risk before making a loan, predict the probability of default, and decide whether to lend according to the results. How to effectively evaluate and identify a borrower's potential default risk before granting a loan is the basis of credit risk management in financial institutions and an important link in that process; a scientific model and system for determining the risk of loan default can minimise risk and maximise profit.
This paper mainly studies how to apply the ideas of imbalanced data classification to the analysis of the historical loan data of banks and other financial institutions, and how to predict the likelihood of default with a Random Forest classification model. The first section introduces imbalanced data classification and the Random Forest algorithm; the second section covers data preprocessing and data analysis. The third section constructs a Random Forest classification model to forecast loan defaults, reports the results and AUC value of this model, and compares the Random Forest algorithm with the decision tree and logistic regression models, concluding that the Random Forest algorithm performs better. Finally, the importance of each feature is evaluated to determine which characteristics most influence the final default. The fourth section summarizes the full text.
Table 1 Default classification based on Random Forests

Inputs:
T = training set
Ntree = the number of decision trees
M = the number of variables in each sample
Mtry = the number of variables participating in the split at each tree node
Ssampsize = the sample size of each Bootstrap sample

Computation process:
For (itree = 0; itree < Ntree; itree++) {
1. Generate a Bootstrap sample of size Ssampsize from the training set T.
2. Build an unpruned tree on the Bootstrap sample; at each node, randomly choose Mtry variables and take the best of them as the split variable according to the Gini value.
}

Output:
Regression problems: the predicted result is the average of the values returned by all trees.
Classification problems: the predicted result is the class chosen by the majority of decision trees.
Table 2 Data set variables

| Variable name | Variable description | Type |
|---|---|---|
| SeriousDlqin2yrs | Whether the borrower defaulted | Y/N |
| RevolvingUtilizationOfUnsecuredLines | Total balance on credit cards and personal credit lines (excluding mortgages and instalment debt such as car loans) divided by the sum of credit lines | Percentage |
| age | Borrower age | Integer |
| NumberOfTime30-59DaysPastDueNotWorse | Number of times the borrower has been 30-59 days past due (but no worse) in the past two years | Integer |
| DebtRatio | Monthly debt repayments, alimony, living costs, etc. divided by total monthly income | Percentage |
| MonthlyIncome | Monthly income | Real |
| NumberOfOpenCreditLinesAndLoans | Number of open loans (instalments such as car loans and mortgages) and credit lines (such as credit cards) | Integer |
| NumberOfTimes90DaysLate | Number of times the borrower has been 90 days or more past due in the past two years | Integer |
| NumberRealEstateLoansOrLines | Number of mortgage and real estate loans, including mortgage-backed credit lines | Integer |
| NumberOfTime60-89DaysPastDueNotWorse | Number of times the borrower has been 60-89 days past due (but no worse) in the past two years | Integer |
| NumberOfDependents | Number of dependents in the family (spouse, children, etc.), excluding the borrower | Integer |
Table 3 Frequency distribution of variable age

| Age interval | Number of people | Percentage | Number of defaulters | Default rate within interval |
|---|---|---|---|---|
| Lower than 25 | 3028 | 2.02% | 338 | 11.16% |
| 26-35 | 18458 | 12.30% | 2053 | 11.12% |
| 36-45 | 29819 | 19.90% | 2628 | 8.80% |
| 46-55 | 36690 | 24.50% | 2786 | 7.60% |
| 56-65 | 33406 | 22.30% | 1531 | 4.60% |
| Higher than 65 | 28599 | 19.10% | 690 | 2.40% |
Table 4 Frequency distribution of variable NumberRealEstateLoansOrLines

| NumberRealEstateLoansOrLines | Number of people | Ratio | Number of defaulters | Default rate within range |
|---|---|---|---|---|
| Below 5 | 149207 | 99.47% | 9884 | 6.6% |
| 6-10 | 699 | 0.47% | 121 | 17.3% |
| 11-15 | 70 | 0.05% | 16 | 22.8% |
| 16-20 | 14 | 0.009% | 3 | 21.4% |
| Above 20 | 10 | 0.007% | 2 | 20% |
Table 5 Frequency distribution of variable NumberOfTime30-59DaysPastDueNotWorse

| NumberOfTime30-59DaysPastDueNotWorse | Number of people | Ratio | Number of defaulters | Default rate within interval |
|---|---|---|---|---|
| 0 | 126018 | 84% | 5041 | 4% |
| 1 | 16032 | 10.70% | 2409 | 15% |
| 2 | 4598 | 3.10% | 1219 | 26.50% |
| 3 | 1754 | 1.20% | 618 | 35.20% |
| 4 | 747 | 0.50% | 318 | 42.60% |
| 5 | 342 | 0.23% | 154 | 45% |
| 6 | 140 | 0.09% | 74 | 52.90% |
| 7 or more | 104 | 0.07% | 50 | 48.07% |
Table 6 Comparison of Random Forests and other algorithms

| Algorithm | AUC value |
|---|---|
| Random Forest | 0.86 |
| Decision Tree | 0.8 |
| Logistic Regression | 0.8 |
Table 7 Feature importance of each variable

| Variable | Feature importance |
|---|---|
| RevolvingUtilizationOfUnsecuredLines | 0.3411 |
| NumberOfTime30-59DaysPastDueNotWorse | 0.1694 |
| NumberOfTimes90DaysLate | 0.1594 |
| NumberOfTime60-89DaysPastDueNotWorse | 0.0727 |
| age | 0.0677 |
| DebtRatio | 0.0625 |
| MonthlyIncome | 0.0488 |
| NumberOfOpenCreditLinesAndLoans | 0.0442 |
| NumberRealEstateLoansOrLines | 0.0223 |
| NumberOfDependents | 0.0117 |
Figure 1 Analysis flow chart of credit forecast
Figure 2 Random Forests
Figure 3 Modeling flowcharts
Random Forest Algorithm
Imbalanced data classification
Imbalanced data, in which the number of samples in some classes (the majority) far exceeds the number in others (the minority), exists widely in network intrusion detection, financial transaction fraud, text classification and so on, and most of the time we are more interested in the classification of the minority. Imbalanced data classification can be handled by assigning penalty weights to the positive and negative samples. In detail, the approach is to give different weights to classes of different sample sizes during algorithm implementation, where the small class generally receives a high weight and the large class a low weight; the model is then trained and evaluated with these weights, as in the sketch below.
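The following minimal sketch illustrates this weighting idea with scikit-learn's RandomForestClassifier; the data and the class ratio are synthetic placeholders, not the paper's data set.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic placeholder data: class 0 is the majority, class 1 the minority
rng = np.random.RandomState(0)
y = np.array([0] * 950 + [1] * 50)
X = rng.rand(len(y), 4)

# Penalty weights: the minority class gets a weight inversely proportional
# to its frequency, the majority class a small weight
weights = {0: 1.0, 1: (y == 0).sum() / (y == 1).sum()}

clf = RandomForestClassifier(n_estimators=50, class_weight=weights, random_state=0)
clf.fit(X, y)
```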
Introduction of Random Forest
Random Forest, which builds a forest of trees using random techniques, is an ensemble algorithm based on randomized decision trees. The main method is to randomly select some variables or features to generate each split, to repeat this several times, and thereby to guarantee the independence of the trees. Once the Random Forest has been built, a new sample entering the forest is judged by every decision tree and is assigned to the class that receives the most votes (the process is visualized in Figure 2).
Random Forest algorithm principle and characteristics
The Random Forest algorithm covers both classification and regression problems; its steps are summarized in Table 1, and a minimal sketch of the procedure is given below.
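A minimal sketch of the Table 1 procedure, written with scikit-learn decision trees, might look as follows; the function and parameter names (build_random_forest, mtry, sampsize) are illustrative and not part of the paper.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def build_random_forest(X, y, n_trees=100, mtry=None, sampsize=None, seed=0):
    """Sketch of Table 1: for each tree, draw a bootstrap sample of size
    `sampsize` and grow an unpruned tree that considers only `mtry`
    randomly chosen variables at each split (Gini criterion)."""
    rng = np.random.RandomState(seed)
    n, m = X.shape
    sampsize = sampsize or n
    mtry = mtry or max(1, int(np.sqrt(m)))
    forest = []
    for _ in range(n_trees):
        idx = rng.randint(0, n, size=sampsize)           # bootstrap sample
        tree = DecisionTreeClassifier(criterion="gini",  # unpruned tree
                                      max_features=mtry,
                                      random_state=rng)
        forest.append(tree.fit(X[idx], y[idx]))
    return forest

def predict_majority(forest, X):
    """Classification output: majority vote over all trees."""
    votes = np.stack([tree.predict(X) for tree in forest]).astype(int)
    return np.array([np.bincount(col).argmax() for col in votes.T])
```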
As can be seen from the procedure, the randomness of the Random Forest is mainly manifested in two aspects: randomness in the data space, implemented by Bagging (Bootstrap Aggregating), and randomness in the feature space, implemented by Random Subspace sampling. For classification problems, each decision tree in a Random Forest classifies and predicts new samples, and the decisions of these trees are then aggregated to give the final classification of the sample. Random Forests have the following features:
1. Randomness is introduced in both the rows (data records) and the columns (variables) of the data, so the Random Forest does not easily overfit.
2. Random Forest has good noise resistance.
3. When there are a large number of missing values in the data set, Random Forests can effectively estimate and process them.
4. It adapts well to different data sets: it can process both discrete and continuous data, and the data set does not need to be normalized.
5. It can rank the importance of the variables, which makes the variables easy to interpret. There are two methods for calculating variable importance in Random Forests. The first is based on the average decrease in accuracy on the OOB (Out of Bag) samples: while each decision tree is grown, the OOB samples are classified and the misclassified samples are recorded; then the values of one variable in the OOB samples are randomly permuted, the tree predicts again, and the number of misclassified samples is recorded a second time. The difference between the two error counts divided by the total number of OOB samples is the change in error rate for that tree, and averaging this change over all trees in the Random Forest gives the mean decrease in accuracy. The second method is based on the Gini decrease at splitting: the Random Forest grows its decision trees by splitting nodes according to the decrease in Gini impurity, and summing the Gini decreases over all nodes in the forest where a given variable is selected as the split variable gives that variable's Gini importance. Both measures are sketched below.
Random Forest in imbalanced data classification
By default, the weight of every class in a Random Forest is 1, which assumes that all misclassification costs are equal. In scikit-learn, the Random Forest classifier provides a class_weight parameter (a dict, a list of dicts, or a preset string) that explicitly specifies weights for the different classes. If the parameter is 'balanced', each class weight is inversely proportional to the class frequency in the input data, since the Random Forest automatically adjusts the weights according to the values of y.
The calculation formula is
n _samplesl(n _classes* np.bincount(y)) (1)
'balanced_subsample' is similar to 'balanced', except that the weights are computed from the bootstrap sample drawn for each tree rather than from the total number of samples. Therefore, the imbalanced data classification problem can be solved by this approach, as in the sketch below.
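The following short sketch evaluates formula (1) for a label vector with roughly the class ratio reported in the data set section; the numbers are illustrative.

```python
import numpy as np

# Labels with roughly a 93.3% / 6.7% split, as in the training data described later
y = np.array([0] * 93316 + [1] * 6684)

n_samples = len(y)
n_classes = len(np.unique(y))
weights = n_samples / (n_classes * np.bincount(y))   # formula (1)

print(dict(zip(np.unique(y), weights)))  # the minority (default) class gets the larger weight
```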
Data preprocessing and data analysis
Data Set
The data set used in this paper is a loan default data set of 250000 samples, comprising a 150000-sample training set and a 100000-sample test set.
The training set contains the historical data of 150,000 borrowers, among which 10026 default samples account for 6.684% of the total (a loan default rate of 6.684%) and 139974 non-default samples account for 93.316%. It can be seen that this data set is a typical example of highly imbalanced data. The data set covers the borrower's age, income, family situation and loan conditions, with a total of 11 variables, among which SeriousDlqin2yrs is the label and the other 10 variables are predictive features. Table 2 lists the variable names and data types; the class distribution can be checked as in the sketch below.
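A quick check of the class distribution might look like the following sketch; the file name cs-training.csv is an assumption, since the paper does not name the file.

```python
import pandas as pd

train = pd.read_csv("cs-training.csv")   # assumed file name for the training set

counts = train["SeriousDlqin2yrs"].value_counts()
print(counts)                  # expected: about 139974 non-default vs 10026 default samples
print(counts / counts.sum())   # default rate around 6.7%
```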
Data Analysis
The experimental environment used in this paper is Anaconda3 + Python 3. First, the data were preliminarily analyzed. This experiment mainly analyzed the distribution of the default rate over each independent variable and generated frequency distribution tables such as Table 3 (decimals are rounded).
It can be seen from Table 3 that the default rate of people younger than 25 and of people aged 26-35 is more than 10%, and that default rates fall as age increases. A frequency table of this kind can be produced as in the sketch below.
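A frequency distribution like Table 3 could be produced along the following lines; the bin edges are taken from the table and the file name is assumed.

```python
import pandas as pd

train = pd.read_csv("cs-training.csv")   # assumed file name

# Age bins matching Table 3
bins = [0, 25, 35, 45, 55, 65, 200]
labels = ["<=25", "26-35", "36-45", "46-55", "56-65", ">65"]
age_group = pd.cut(train["age"], bins=bins, labels=labels)

# Count of borrowers and default rate within each age interval
table3 = train.groupby(age_group)["SeriousDlqin2yrs"].agg(["count", "mean"])
table3["default_rate_%"] = 100 * table3["mean"]
print(table3)
```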
Table 4 shows that 99.47% of borrowers have fewer than 5 real estate and mortgage loans, but that the default rate of borrowers with more than 5 such loans rises significantly; among them, the default rate of borrowers with more than 10 loans is above 20%.
It can be seen from Table 5 that the default rate of borrowers who have never been 30-59 days past due is only about 4%, but as the number of delinquencies increases, the default rate rises significantly. For the other two variables, the frequency distribution tables of the number of times 60-89 days overdue and 90 or more days overdue show the same trend as Table 5. Therefore, it can be concluded that the more delinquencies occur, the higher the default rate.
Applying the same statistical analysis to each of the 10 predictive variables in the data set yields frequency distribution tables like those shown above. Apart from the variable NumberOfOpenCreditLinesAndLoans (the number of open loans and credit lines), which shows no obvious correlation with the default rate, all the other variables are related to whether the borrower eventually defaults.
Data Pre-processing
A preliminary exploration of the data reveals missing values in the MonthlyIncome and NumberOfDependents variables: 29731 and 3924 missing values, respectively.
Outliers: the minimum value of the age variable is 0, which is an outlier. In the three past-due variables, NumberOfTime30-59DaysPastDueNotWorse, NumberOfTime60-89DaysPastDueNotWorse and NumberOfTimes90DaysLate, there is a small number of values of 96 and 98, which may be abnormal values or special codes.
Data preprocessing: when reading the data with the pandas library in Python, the na_values parameter of pd.read_csv() is set to a user-defined list so that 0 in the age variable and 96 and 98 in the three past-due variables are treated as NaN values; the sklearn.preprocessing.Imputer class is then used to replace every NaN in the data set with the mean value of the corresponding column, as in the sketch below.
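A sketch of this preprocessing step is shown below. The original text names sklearn.preprocessing.Imputer, which has since been replaced by sklearn.impute.SimpleImputer in current scikit-learn releases; the file name is again an assumption.

```python
import pandas as pd
from sklearn.impute import SimpleImputer  # modern replacement for sklearn.preprocessing.Imputer

# Treat 0 in `age` and the codes 96/98 in the three past-due variables as missing
na_map = {
    "age": [0],
    "NumberOfTime30-59DaysPastDueNotWorse": [96, 98],
    "NumberOfTime60-89DaysPastDueNotWorse": [96, 98],
    "NumberOfTimes90DaysLate": [96, 98],
}
train = pd.read_csv("cs-training.csv", na_values=na_map)   # assumed file name

# Replace every NaN (including MonthlyIncome and NumberOfDependents)
# with the mean of the corresponding column
imputer = SimpleImputer(strategy="mean")
train[train.columns] = imputer.fit_transform(train)
```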
Building models and experiment results
Random Forest Model
In this experiment, we use the Python package scikit-learn (more specifically, sklearn.ensemble.RandomForestClassifier) to build the Random Forest model. The parameters and their settings are as follows (see the sketch after this list):
n_estimators: the number of decision trees, set to 100.
oob_score: whether to use out-of-bag samples to estimate accuracy, set to True.
min_samples_split: the minimum number of samples required to split an internal node, set to 2.
min_samples_leaf: the minimum number of samples required at a leaf node, set to 50.
n_jobs: the number of jobs to run in parallel, set to -1.
class_weight: controls the weight of each class, set to "balanced_subsample".
bootstrap: whether bootstrap samples are used when building trees, set to True.
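A sketch of the model with these settings is shown below; the file name and the simple mean-fill used here (in place of the imputation step described earlier) are assumptions.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

train = pd.read_csv("cs-training.csv")                     # assumed file name
train = train.fillna(train.mean(numeric_only=True))        # stand-in for the imputation step above

X = train.drop(columns=["SeriousDlqin2yrs"])
y = train["SeriousDlqin2yrs"]

rf = RandomForestClassifier(
    n_estimators=100,                    # number of decision trees
    oob_score=True,                      # estimate accuracy on out-of-bag samples
    min_samples_split=2,                 # minimum samples to split an internal node
    min_samples_leaf=50,                 # minimum samples in a leaf node
    n_jobs=-1,                           # build trees in parallel
    class_weight="balanced_subsample",   # per-tree balanced class weights
    bootstrap=True,                      # draw bootstrap samples for each tree
)
rf.fit(X, y)
print(rf.oob_score_)
```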
Model Assessment
We use AUC as the indicator to assess the models in this experiment. AUC is defined as the area under the ROC (Receiver Operating Characteristic) curve, so its value cannot exceed 1. The x-axis of the ROC curve is the FPR (False Positive Rate) and the y-axis is the TPR (True Positive Rate). Because the ROC curve normally lies above the line y = x, the value of AUC is between 0.5 and 1. We use AUC as the evaluation standard because the ROC curves themselves cannot always show clearly which classifier is better, whereas AUC, as a single numerical value, tells us more specifically which classifier is superior; a sketch of the computation is given below.
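A hedged sketch of the AUC computation with scikit-learn follows; the hold-out split, file name and simple NaN handling are assumptions rather than the paper's exact procedure.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

train = pd.read_csv("cs-training.csv").fillna(0)           # assumed file name and NaN handling
X = train.drop(columns=["SeriousDlqin2yrs"])
y = train["SeriousDlqin2yrs"]

# Hold out part of the labelled data for evaluation (split ratio assumed)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

rf = RandomForestClassifier(n_estimators=100, min_samples_leaf=50,
                            class_weight="balanced_subsample", n_jobs=-1, random_state=0)
rf.fit(X_tr, y_tr)

# AUC is computed from the predicted probability of the positive (default) class
scores = rf.predict_proba(X_te)[:, 1]
print("AUC:", roc_auc_score(y_te, scores))
```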
The results of comparing the three models, Random Forest, decision tree and logistic regression, are shown in Table 6. The Random Forest algorithm has the greatest AUC value among the three algorithms; hence its predictive performance is better than that of the other two.
Feature Importances of Variables
We use the feature_importances_ attribute of the sklearn.ensemble.RandomForestClassifier class in this experiment; the feature importance of each variable is listed in Table 7. From Table 7, the three variables RevolvingUtilizationOfUnsecuredLines, NumberOfTime30-59DaysPastDueNotWorse and NumberOfTimes90DaysLate have the top three feature importances and therefore the greatest impact on determining which borrowers may break their contracts and bring economic losses to lenders. Hence, when granting a loan, companies can examine these features of an applicant to lower the risk, for example as in the sketch below.
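A sketch of ranking the variables by feature_importances_ is shown below; as before, the file name and the simple NaN handling are assumptions.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

train = pd.read_csv("cs-training.csv").fillna(0)           # assumed file name and NaN handling
X = train.drop(columns=["SeriousDlqin2yrs"])
y = train["SeriousDlqin2yrs"]

rf = RandomForestClassifier(n_estimators=100, min_samples_leaf=50,
                            class_weight="balanced_subsample", random_state=0).fit(X, y)

# feature_importances_ holds the impurity-based (Gini) importance of each predictor
importances = pd.Series(rf.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False))
```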
Conclusion
This paper studied loan default prediction, a common problem in the financial sector, and established a default prediction model using the Random Forest method for imbalanced data classification. The basic idea of Random Forest is that, while constructing each individual tree, some variables or features are randomly selected to participate in the node splits; repeating this many times ensures the independence of the trees. For the imbalanced data, parameter adjustment allows the Random Forest to adjust the class weights automatically according to the values of y, which effectively solves the imbalanced classification problem. Experiments show that the Random Forest algorithm performs better than the decision tree and logistic regression classification models, which has important reference value for loan default prediction problems in the financial field. In addition, based on the feature importance measurements, this experiment shows that the borrower's revolving utilization of unsecured credit lines and past delinquency counts (30-59 days and 90 or more days overdue) have the greatest influence on the final default, and this feature importance measure also has important reference value for other feature selection problems in data mining.
There is one page of the claims only.
Claims (1)
1. A financial mining method for credit prediction, wherein the experimental environment used in this experiment is Anaconda3 + Python 3; first, the data is preliminarily analyzed: this experiment mainly analyzes the distribution of the default rate over each independent variable and generates a frequency distribution table; data preprocessing: when reading data using the pandas library in Python, the na_values parameter of the function pd.read_csv() is set to a user-defined list so that 0 in the age variable and the values 96 and 98 in the three overdue variables are treated as NaN values, and then the sklearn.preprocessing.Imputer class is used to replace all NaNs in the dataset with the average of the corresponding columns.
Figure 1 (analysis flow chart): introduces unbalanced data classification and the Random Forest algorithm; data preprocessing and data analysis; compares models of three different algorithms; conclusion that the Random Forest algorithm has better performance; summarizes the paper.
Figure 2: Random Forests (classification results of the individual trees combined by voting).
Figure 3 (modeling flow chart): inputs data to Python; uses sklearn to build models based on three different algorithms; calculates the AUC value (area under the ROC curve); compares the AUC values of the three models; conclusion that the Random Forest algorithm has better performance.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| AU2019101189A AU2019101189A4 (en) | 2019-10-02 | 2019-10-02 | A financial mining method for credit prediction |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| AU2019101189A AU2019101189A4 (en) | 2019-10-02 | 2019-10-02 | A financial mining method for credit prediction |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| AU2019101189A4 true AU2019101189A4 (en) | 2020-01-23 |
Family
ID=69160470
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| AU2019101189A Ceased AU2019101189A4 (en) | 2019-10-02 | 2019-10-02 | A financial mining method for credit prediction |
Country Status (1)
| Country | Link |
|---|---|
| AU (1) | AU2019101189A4 (en) |
Cited By (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN113393169A (en) * | 2021-07-13 | 2021-09-14 | 大商所飞泰测试技术有限公司 | Financial industry transaction system performance index analysis method based on big data technology |
| CN113393169B (en) * | 2021-07-13 | 2024-03-01 | 大商所飞泰测试技术有限公司 | Financial industry transaction system performance index analysis method based on big data technology |
| CN113792935A (en) * | 2021-09-27 | 2021-12-14 | 武汉众邦银行股份有限公司 | Small micro enterprise credit default probability prediction method, device, equipment and storage medium |
| CN113792935B (en) * | 2021-09-27 | 2024-04-05 | 武汉众邦银行股份有限公司 | Method, device, equipment and storage medium for predicting credit default probability of small micro-enterprises |
| CN114119211A (en) * | 2021-12-09 | 2022-03-01 | 武汉众邦银行股份有限公司 | Method for screening high-latitude variable of credit variable data |
| CN114328668A (en) * | 2021-12-28 | 2022-04-12 | 浙江惠瀜网络科技有限公司 | Method and device for generating deposit risk control strategy, terminal and storage medium |
| CN115408499A (en) * | 2022-11-02 | 2022-11-29 | 思创数码科技股份有限公司 | Automatic analysis and interpretation method and system for government affair data analysis report chart |
| CN116364178A (en) * | 2023-04-18 | 2023-06-30 | 哈尔滨星云生物信息技术开发有限公司 | Somatic cell sequence data classification method and related equipment |
| CN116364178B (en) * | 2023-04-18 | 2024-01-30 | 哈尔滨星云生物信息技术开发有限公司 | Somatic cell sequence data classification method and related equipment |
Similar Documents
| Publication | Publication Date | Title |
|---|---|---|
| AU2019101189A4 (en) | A financial mining method for credit prediction | |
| AU2020100709A4 (en) | A method of prediction model based on random forest algorithm | |
| Tang et al. | Applying a nonparametric random forest algorithm to assess the credit risk of the energy industry in China | |
| Coşer et al. | PREDICTIVE MODELS FOR LOAN DEFAULT RISK ASSESSMENT. | |
| Sadatrasoul et al. | Credit scoring in banks and financial institutions via data mining techniques: A literature review | |
| CN109949152A (en) | A Personal Credit Default Prediction Method | |
| AU2020101475A4 (en) | A Financial Data Analysis Method Based on Machine Learning Models | |
| WO2012018968A1 (en) | Method and system for quantifying and rating default risk of business enterprises | |
| Chern et al. | A decision tree classifier for credit assessment problems in big data environments | |
| Valavan et al. | Predictive-Analysis-based Machine Learning Model for Fraud Detection with Boosting Classifiers. | |
| Barman et al. | A complete literature review on financial fraud detection applying data mining techniques | |
| Liashenko et al. | Machine learning and data balancing methods for bankruptcy prediction | |
| Naik | Predicting credit risk for unsecured lending: A machine learning approach | |
| Chen et al. | Mixed credit scoring model of logistic regression and evidence weight in the background of big data | |
| Yang et al. | An evidential reasoning rule-based ensemble learning approach for evaluating credit risks with customer heterogeneity | |
| Bambico et al. | Characterizing delinquency and understanding repayment patterns in Philippine microfinance loans | |
| Van Trung et al. | Development of a credit scoring model using machine learning for commercial banks in Vietnam | |
| CN112926989B (en) | A bank loan risk assessment method and equipment based on multi-view integrated learning | |
| Chen et al. | Financial distress prediction using data mining techniques | |
| Wang et al. | Credit Risk Assessment for Small and Microsized Enterprises Using Kernel Feature Selection‐Based Multiple Criteria Linear Optimization Classifier: Evidence from China | |
| Rodrigo et al. | Personal Loan Default Prediction and Impact Analysis of Debt-to-Income Ratio | |
| Jin et al. | Financial credit default forecast based on big data analysis | |
| Desta et al. | Data mining application in predicting bank loan defaulters | |
| Zurada | Rule Induction Methods for Credit Scoring | |
| Hu | Development of a Machine Learning-Based Financial Risk Control System |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| FGI | Letters patent sealed or granted (innovation patent) | ||
| MK22 | Patent ceased section 143a(d), or expired - non payment of renewal fee or expiry |