CN118429080A - Repayment capability assessment method and system based on bank flow data - Google Patents
Repayment capability assessment method and system based on bank flow data
- Publication number
- CN118429080A (application number CN202410581408.6A)
- Authority
- CN
- China
- Prior art keywords
- data
- index
- transaction
- enterprise
- module
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/03—Credit; Loans; Processing thereof
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0639—Performance analysis of employees; Performance analysis of enterprise or organisation operations
- G06Q10/06393—Score-carding, benchmarking or key performance indicator [KPI] analysis
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q40/00—Finance; Insurance; Tax strategies; Processing of corporate or income taxes
- G06Q40/02—Banking, e.g. interest calculation or account maintenance
Landscapes
- Business, Economics & Management (AREA)
- Engineering & Computer Science (AREA)
- Human Resources & Organizations (AREA)
- Strategic Management (AREA)
- Development Economics (AREA)
- Finance (AREA)
- Economics (AREA)
- Accounting & Taxation (AREA)
- General Physics & Mathematics (AREA)
- Marketing (AREA)
- Physics & Mathematics (AREA)
- General Business, Economics & Management (AREA)
- Theoretical Computer Science (AREA)
- Entrepreneurship & Innovation (AREA)
- Educational Administration (AREA)
- Technology Law (AREA)
- Operations Research (AREA)
- Quality & Reliability (AREA)
- Tourism & Hospitality (AREA)
- Game Theory and Decision Science (AREA)
- Financial Or Insurance-Related Operations Such As Payment And Settlement (AREA)
Abstract
The invention belongs to the technical field of financial big data and discloses a repayment capability assessment method and system based on bank flow data. The method comprises the following steps: parsing the text of the bank flow PDF with regular expressions to obtain transaction information and flow data; collecting enterprise business and judicial data according to the transaction information, and cleaning and assembling the flow data and the business and judicial data into a data message; constructing 5 evaluation modules, namely operation capacity, operation stability, profitability, life cycle management and industry adjustment coefficient, and sorting out and developing the indexes of each module; and constructing an expert strategy model by designing index and module weights according to policy trends, business scenarios and expert experience, and calculating the score, rating and repayment capability assessment of the enterprise. Based on enterprise bank flow data, the invention analyzes and scores the enterprise along multiple dimensions and integrates the scores of those dimensions, thereby evaluating the repayment capability of the enterprise scientifically, accurately and efficiently.
Description
Technical Field
The invention belongs to the technical field of financial big data, and particularly relates to a repayment capability assessment method and system based on bank flow data.
Background
Currently, driven by national policies and the development of emerging technologies such as big data, artificial intelligence and cloud computing, various industries are undergoing digital transformation. In the financial industry, financial big data is one of the core technologies: it turns financial activities such as financing and guarantees into intelligent big-data processing, which reduces subjective human interference and improves risk assessment, early warning and repayment capability prediction.
In the field of corporate loans, accurate assessment of an enterprise's repayment capability is critical to the risk control and fund security of a financial institution. The repayment capability assessment methods currently on the market rely mainly on the analysis of finance, tax and invoice data: the operating condition and repayment capability of an enterprise are judged from its profit statements, cash flow statements, balance sheets, tax returns, invoicing information and the like. However, these methods have certain limitations.
First, tax and invoice data cover only invoiced business rather than all business, which limits the scope of evaluation. Second, financial statements reflect only accounts that have already occurred, so it is difficult to accurately predict the future repayment capability of the enterprise. Therefore, the financial market urgently needs a repayment capability assessment method that reflects actual future receivables and offers wide coverage, high accuracy and low risk.
Analyzing the bank flow data of an enterprise with financial big data technology overcomes the limitations of the traditional methods. Bank flow data is updated in real time, and every enterprise has bank flow, so a wider group of enterprises can be covered. In addition, bank flow data can be used to analyze future receivables and provide a more accurate prediction of repayment capability.
In conclusion, a repayment capability assessment method based on bank flow data has broad application prospects in the field of financial big data and can provide financial institutions with a more scientific, accurate and efficient enterprise repayment capability assessment service. Such a method makes full use of financial big data technology, makes up for the shortcomings of the traditional assessment methods, and meets the financial market's demand for an assessment method that reflects actual future receivables with wide coverage, high accuracy and low risk.
Through the above analysis, the problems and defects existing in the prior art are as follows:
At present, the assessment of repayment capability from bank flow data lags technically in the field of financial big data, and the prior art cannot make full use of financial big data technology to assess the repayment capability of enterprises.
Disclosure of Invention
Aiming at the problems existing in the prior art, the invention provides a repayment capability assessment method and a repayment capability assessment system based on bank flow data.
The invention is realized as a repayment capability assessment system based on bank flow data, characterized in that the system comprises: a bank flow parsing module, used for parsing the text of the bank flow PDF with regular expressions and extracting transaction information and flow data; an evaluation module construction unit, used for constructing the evaluation modules of operation capacity, operation stability, profitability, life cycle management and industry adjustment coefficient, and for sorting out and developing the indexes of each module; and a weight and score calculation module, used for designing index and module weights according to policy trends, business scenarios and expert experience and calculating the score, rating and repayment capability assessment of the enterprise.
Further, the system further comprises: a data standardization and storage module, used for standardizing the parsed flow data, eliminating fields whose missing rate exceeds a threshold, and storing the standardized data message in a database in a specific format; and a data interface development module, used for developing a data interface that takes enterprise information as its input parameter and provides data access and interaction functions.
Further, the system further comprises: an index calculation script generation module, used for sorting out the calculation logic of each index according to its business meaning and data characteristics and implementing that logic in a programming language to generate an index calculation script; and an index result calculation module, used for calculating the result value of each index of the enterprise by executing the index calculation script, providing data support for the repayment capability assessment.
Further, the system further comprises: a model weight adjustment module, which dynamically adjusts the weights of the modules and indexes according to policy trends, changes in the business scenario and expert experience, so as to improve the accuracy and adaptability of the repayment capability assessment; and a rating and repayment capability output module, used for outputting the rating and repayment capability assessment result of each enterprise according to the calculated model score and the designed score intervals, providing a scientific basis for the credit decisions of financial institutions.
Another object of the present invention is to provide a repayment capability assessment method and system based on bank flow data, wherein the system performs the following steps:
S1: parsing the text of the bank flow PDF with regular expressions to obtain transaction information and flow data;
S2: collecting enterprise business and judicial data according to the transaction information, and cleaning and assembling the flow data and the business and judicial data into a data message;
S3: constructing 5 evaluation modules, namely operation capacity, operation stability, profitability, life cycle management and industry adjustment coefficient, and sorting out and developing the indexes of each module;
S4: constructing an expert strategy model by designing index and module weights according to policy trends, business scenarios and expert experience, and calculating the score, rating and repayment capability assessment of the enterprise.
Further, the step S1 specifically includes the following steps:
S11: reading and parsing the text content and text structure of the bank flow PDF;
S12: setting a start mark and an end mark;
S13: extracting the data between the marks set in step S12 by cutting;
S14: locating the transaction dates with regular expressions;
S15: a transaction date consists of four digits, a connector, one or two digits, a connector and finally one or two digits, so a regular expression is written according to this rule;
The transaction dates obtained are as follows: date_1, date_2, date_3, ..., date_k, ..., date_n;
S16: using the transaction dates, sequentially matching the transaction time, transaction amount, transaction type, transaction place, balance, debit/credit direction, counterparty account, summary and other data of each flow record, the data being separated by spaces;
S17: taking the transaction date of the k-th flow record as the start mark and the transaction date of the (k+1)-th flow record as the end mark, cutting out the data of all fields of the k-th flow record;
S18: splitting the data of the k-th record on spaces and sequentially matching the transaction time, transaction amount, transaction type, transaction place, balance, debit/credit direction, counterparty account, summary and other data.
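Steps S14 to S18 can be sketched as follows. This is only a minimal illustration, assuming that the connector is a hyphen and that every record carries the nine fields in the order listed in `FIELDS`; the field names and the sample text are hypothetical, and real bank layouts differ.

```python
import re

# Assumed date rule from S15: four digits, connector, 1-2 digits, connector, 1-2 digits.
DATE_RE = re.compile(r"\d{4}-\d{1,2}-\d{1,2}")

# Hypothetical field order of one flow record; real statements vary by bank.
FIELDS = ["date", "time", "amount", "type", "place", "balance",
          "direction", "counterparty", "summary"]

def split_records(text):
    """Cut the text at each transaction date (S17), then split each slice on spaces (S18)."""
    starts = [m.start() for m in DATE_RE.finditer(text)]
    ends = starts[1:] + [len(text)]  # the (k+1)-th date closes the k-th record
    records = []
    for begin, end in zip(starts, ends):
        values = text[begin:end].split()
        records.append(dict(zip(FIELDS, values)))
    return records

sample = ("2024-1-5 10:31:02 5000.00 transfer Chengdu 12000.00 credit ACME-Co payment "
          "2024-01-17 09:12:45 1200.50 fee Chengdu 10799.50 debit Bank-Fee charge")
records = split_records(sample)
```

Splitting on spaces only works when no field value itself contains a space, which is why the sample uses hyphenated names.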
Further, in the step S2:
S21: aggregating the transaction amounts from S1 by counterparty account;
S22: retrieving transaction counterparties whose data contain keywords including, but not limited to, 'company', 'part', 'society' and the like;
S23: sorting the aggregated transaction amounts from largest to smallest and confirming the top five transaction counterparties;
S24: acquiring the business and judicial data of the main enterprise and the top five transaction counterparties from public platforms such as 'Qijiebao' and 'penmanship investigation' by using big data acquisition technology;
S25: the business data includes basic business registration information, shareholder information, business change information and the like;
S26: the judicial data includes court session announcements, case filing information, judgment documents, persons subject to enforcement, dishonest persons subject to enforcement, high-consumption restrictions, trial information and the like;
S27: calculating the missing rate of each field in the S1 data and in the business and judicial data;
S28: the missing rate is calculated as follows:
$$\mathrm{mRate} = \frac{\mathrm{mNum}}{\mathrm{total}}$$
where mRate is the missing rate, mNum is the number of missing values in the field, and total is the total number of values in the field;
S29: setting a missing rate threshold and eliminating fields whose missing rate exceeds the threshold;
S210: replacing the missing values of the remaining fields with special characters or numbers;
S211: storing the standardized data fields in the form of tables; integrating the tables into a client data message in JSON format and storing it in a database;
S212: in the data message, the name of the main enterprise and the names of the top five transaction counterparties are used as keys; the bank flow, business and judicial data of the main enterprise are the value of the main enterprise's key, and the business and judicial data of each of the top five transaction counterparties are the value of that counterparty's key;
S213: developing a data interface that takes enterprise information, such as the enterprise name and the legal representative's name, as its input parameters.
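The aggregation in S21 to S23 and the missing-rate cleaning in S27 to S210 can be illustrated with a short sketch. The keyword tuple, the 0.5 threshold, the "-" fill value and the sample rows are all assumptions made for illustration; the patent does not fix these values.

```python
from collections import defaultdict

# Hypothetical keywords standing in for those named in S22.
KEYWORDS = ("company", "part", "society")

def top_counterparties(records, n=5):
    """S21-S23: sum amounts per counterparty, keep keyword matches, return the n largest."""
    totals = defaultdict(float)
    for r in records:
        name = r["counterparty"]
        if any(k in name for k in KEYWORDS):
            totals[name] += r["amount"]
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:n]]

def is_missing(value):
    return value is None or value == ""

def missing_rates(rows, fields):
    """S27-S28: mRate = mNum / total for every field (rows must be non-empty)."""
    total = len(rows)
    return {f: sum(is_missing(r.get(f)) for r in rows) / total for f in fields}

def clean(rows, fields, threshold=0.5, fill="-"):
    """S29-S210: drop fields above the threshold, fill the remaining gaps."""
    rates = missing_rates(rows, fields)
    kept = [f for f in fields if rates[f] <= threshold]
    return [{f: fill if is_missing(r.get(f)) else r[f] for f in kept} for r in rows]

rows = [
    {"counterparty": "A company", "amount": 100.0, "place": None, "summary": "goods"},
    {"counterparty": "B company", "amount": 300.0, "place": None, "summary": ""},
    {"counterparty": "A company", "amount": 250.0, "place": "Chengdu", "summary": "goods"},
    {"counterparty": "Zhang San", "amount": 900.0, "place": None, "summary": "transfer"},
]
fields = ["counterparty", "amount", "place", "summary"]
top = top_counterparties(rows)
rates = missing_rates(rows, fields)
cleaned = clean(rows, fields)
```

In this sample the `place` field is missing in three of four rows, so it is eliminated, while the single gap in `summary` is filled.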
Further, in the step S3:
S31: constructing 5 evaluation modules: operation capacity, operation stability, profitability, life cycle management and industry adjustment coefficient;
S32: sorting out the characteristic indexes of each module, as shown in attached table 1;
S33: sorting out the calculation logic of each characteristic index according to its business meaning and the data, for example, the flow income of the enterprise over the last 12 months:
First, all flow data of the last year are selected by transaction time; next, the entries credited to the account are selected; then the entries whose transaction type contains 'back', 'borrow', 'lend' and the like are removed; finally, the transaction amounts are summed.
S34: implementing the calculation logic of each index in a programming language to obtain an index calculation script;
S35: calculating the result value of each index of the enterprise with the index calculation script.
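As an illustration of what an index calculation script in S34 might look like, the following sketch computes the 12-month flow income described in S33. The record layout, the `"credit"` direction label, the 365-day window and the excluded keywords are assumptions for the example, not values given by the patent.

```python
from datetime import date, timedelta

# Hypothetical keywords for transaction types that are not operating income.
EXCLUDED = ("back", "borrow", "lend")

def income_last_12_months(records, as_of):
    """Sum credited amounts of the last year, skipping excluded transaction types."""
    start = as_of - timedelta(days=365)
    total = 0.0
    for r in records:
        if not (start < r["date"] <= as_of):
            continue  # outside the 12-month window
        if r["direction"] != "credit":
            continue  # only entries credited to the account count as income
        if any(k in r["type"] for k in EXCLUDED):
            continue  # refunds and loans are removed
        total += r["amount"]
    return total

records = [
    {"date": date(2024, 3, 1), "direction": "credit", "type": "sales", "amount": 1000.0},
    {"date": date(2023, 12, 15), "direction": "credit", "type": "sales", "amount": 250.5},
    {"date": date(2023, 1, 1), "direction": "credit", "type": "sales", "amount": 500.0},
    {"date": date(2024, 2, 1), "direction": "debit", "type": "purchase", "amount": 300.0},
    {"date": date(2024, 4, 1), "direction": "credit", "type": "borrow", "amount": 2000.0},
]
income = income_last_12_months(records, as_of=date(2024, 5, 1))
```

Only the first two records pass all three filters in this sample.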
Further, in the step S4:
S41: designing the weights and score intervals of each index and module according to policy trends, business scenarios and expert experience.
S42: the policy trend refers to the series of financing-support policies issued by the central or local governments, for example:
the Notice on Strengthening Financial Support Measures to Help the Private Economy Develop and Grow mentions support for key fields such as scientific and technological innovation, "specialized, refined, distinctive and innovative" enterprises, green and low-carbon development, and industrial foundation reconstruction projects.
S43: adjusting the relevant module and index weights according to the policy trend allows the repayment capability of the enterprise to be evaluated more accurately.
S44: the business scenario refers to the scenario in which a financial or quasi-financial institution applies the model; risk preferences and customer groups differ between institutions, so the relevant index weights also need to be adjusted to the business scenario to achieve an accurate evaluation.
S45: expert experience means that domain experts assign weight strategies to the model based on years of practical experience, so that each strategy weight is supported by facts and highly interpretable; experts can also detect risks keenly and adjust the model weights, which makes the model iterable.
S46: matching the corresponding index score according to the index value and the score intervals;
S47: calculating the final score of each index from the index score and the index weight;
S48: summing the index scores by module to obtain the score of each module;
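S49: weighting the module scores by the designed module weights and summing them to obtain the final model score, according to the following formulas:
$$\mathrm{moduleVal} = \sum_{i} \mathrm{index}_i \cdot w_{\mathrm{index},i}$$
$$\mathrm{modelVal} = \sum_{j} \mathrm{moduleVal}_j \cdot w_{\mathrm{model},j}$$
where moduleVal is a module score, index is an index score, w_index is an index weight, modelVal is the model score, and w_model is a module weight.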
S410: the rating and repayment capability of each enterprise is obtained through the score intervals, and the relationship is shown in the attached table 2.
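Steps S46 to S410 amount to an interval lookup followed by two weighted sums. The sketch below is one possible reading; the interval bounds, scores, weights and the A/B/C rating bands are invented for illustration, since attached tables 1 and 2 are not reproduced here.

```python
import bisect

def interval_score(value, bounds, scores):
    """S46: map an index value to a score; bounds ascending, len(scores) == len(bounds) + 1."""
    return scores[bisect.bisect_right(bounds, value)]

def model_score(modules):
    """S47-S49: weighted sum of index scores per module, then weighted sum of module scores."""
    total = 0.0
    for m in modules:
        module_val = sum(
            interval_score(i["value"], i["bounds"], i["scores"]) * i["weight"]
            for i in m["indexes"]
        )
        total += module_val * m["weight"]
    return total

def rating(score, bounds=(60, 80), labels=("C", "B", "A")):
    """S410: hypothetical rating bands standing in for attached table 2."""
    return labels[bisect.bisect_right(bounds, score)]

modules = [
    {"weight": 0.6, "indexes": [
        {"value": 1_200_000, "bounds": [500_000, 1_000_000], "scores": [40, 70, 100], "weight": 0.5},
        {"value": 8, "bounds": [6, 12], "scores": [40, 70, 100], "weight": 0.5},
    ]},
    {"weight": 0.4, "indexes": [
        {"value": 0.1, "bounds": [0.05, 0.2], "scores": [40, 70, 100], "weight": 1.0},
    ]},
]
final = model_score(modules)
grade = rating(final)
```

Keeping the weights in the data rather than in the code makes it straightforward to adjust them per policy trend or business scenario, as S43 and S44 describe.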
In combination with the above technical scheme and the technical problems to be solved, the technical scheme claimed by the invention has the following advantages and positive effects:
Firstly, the invention takes the bank flow data of the enterprise as the primary source and business and judicial data as the auxiliary source, constructs 5 analysis dimensions (operation capacity, operation stability, profitability, life cycle management and industry adjustment coefficient), and evaluates the repayment capability of the enterprise.
The invention uses statistical analysis to analyze the bank flow data in depth and constructs evaluation indexes that are highly interpretable, practical, scientific and accurate.
The invention analyzes and scores the enterprise from multiple dimensions based on the enterprise bank flow data, and simultaneously integrates the scoring condition of the multiple dimensions, thereby scientifically, accurately and efficiently evaluating the repayment capability of the enterprise.
(1) The expected benefits and commercial values after the technical scheme of the invention is converted are as follows:
The invention can help financial institutions such as banks, guarantee companies and small-loan companies evaluate the repayment capability of enterprises or individuals, effectively reducing credit risk and supporting rating policy.
(2) The technical scheme of the invention fills the technical blank in the domestic and foreign industries:
The invention takes the bank flow data of the enterprise as the primary source and business and judicial data as the auxiliary source to construct 5 analysis dimensions (operation capacity, operation stability, profitability, life cycle management and industry adjustment coefficient), thereby evaluating the repayment capability of the enterprise scientifically, accurately and efficiently.
(3) The technical scheme of the invention solves a technical problem that people have long wanted to solve but have never succeeded in solving:
The invention uses bank flow data to evaluate the repayment capability of enterprises, providing a new and highly practical risk-control strategy that adds a data evaluation dimension.
Secondly, the invention provides a repayment capability assessment system based on bank flow data, which effectively solves several technical problems existing in the traditional assessment method and makes remarkable technical progress:
Technical problems of the prior art:
1) The data processing efficiency is low: when a large amount of bank flow data is processed by the traditional method, the data processing efficiency is low due to a plurality of manual processing links.
2) The evaluation accuracy is not high: because of the lack of effective data analysis tools and algorithms, the traditional repayment capability assessment method often cannot accurately analyze and utilize information in bank flowing water, and the assessment result has low accuracy.
3) The adaptability is poor: the traditional evaluation model is often relatively fixed, and is difficult to quickly adjust according to market change or specific industry characteristics, so that deviation of an evaluation result and an actual situation occurs.
Significant technical advances of the present invention:
1) Automated data processing: the bank flow analysis module automatically analyzes the bank flow data in the PDF format by using the regular expression, so that the efficiency and accuracy of data processing are greatly improved.
2) Intelligent evaluation module construction: the evaluation module construction unit builds a complex evaluation model based on multidimensional indexes such as operation capacity and profitability, which improves the comprehensiveness and accuracy of the evaluation.
3) Dynamic weight adjustment: the model weight adjustment module can dynamically adjust the weight of the model according to the latest policy trend, the change of the service scene and the experience feedback of the expert, so that the evaluation result is more adaptive and prospective.
4) Standardized data management: the data normalization and storage module performs normalization processing and storage on the acquired data, ensures the data quality and provides reliable data support for evaluation.
5) Automated index calculation: the index calculation script generation module and the index result calculation module execute the index calculations automatically, which reduces manual intervention and improves calculation speed and accuracy.
6) Scientific credit decision support: the rating and repayment capability output module provides scientific rating and repayment capability assessment results according to the model scores and preset scoring areas, and provides effective credit decision support for financial institutions.
In general, the invention obviously improves the efficiency, accuracy and adaptability of repayment capability assessment by means of integrated, automatic and intelligent technical means, and provides more scientific and accurate decision basis for financial institutions.
Drawings
FIG. 1 is a flow chart of a system implementation provided by an embodiment of the present invention;
Fig. 2 is a system design diagram provided by an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the following examples in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The following are two specific application embodiments describing how the repayment capability assessment method based on bank flow data is applied to different business and financial environments:
Example 1: credit assessment for small and medium enterprises
Small and medium enterprises often face credit assessment problems when seeking loans, particularly emerging enterprises lacking traditional financial reports or incomplete historical financial records.
The operation steps are as follows:
1. Data collection:
The enterprise provides the PDF files of its bank flow.
The PDF files are parsed with regular expressions to extract transaction information and flow data.
2. Data processing:
The extracted data is aggregated, with particular attention to transactions with major suppliers and customers.
The business and judicial information of these suppliers and customers is gathered from online platforms such as "enterprise checks".
3. Credit assessment:
The five modules of the evaluation method (operation capacity, operation stability, profitability, life cycle management and industry adjustment coefficient) are used to score the enterprise comprehensively.
Based on the scoring results, the financial institution determines the loan amount and interest rate.
4. Results application:
The enterprise obtains the corresponding loan according to its rating.
Financial institutions reduce credit risk by accurate assessment.
Example 2: risk management and monitoring
The financial institution needs to effectively monitor the operating status of enterprises after a loan is granted, in order to discover potential credit risks in time.
The operation steps are as follows:
1. Continuous monitoring:
The latest bank flow PDF files are obtained from the borrowing enterprise regularly.
The files are parsed to extract key transaction data.
2. Risk assessment:
The trend in the enterprise's operating condition is analyzed by comparing data from consecutive periods.
Attention is paid to signals such as changes in transaction counterparties, reduced transaction frequency and abrupt decreases in large transactions.
3. Risk response:
Credit strategies are adjusted according to changes at the enterprise, for example by adjusting the credit line or increasing guarantee requirements.
The risk early warning system alerts the risk management department to pay attention to the specific enterprise.
4. Results application:
The credit rating of the enterprise is updated in real time to reflect the latest repayment capability and risk status.
Financial institutions can reduce bad loans and optimize capital configurations.
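The period-over-period comparison in the risk assessment step of Example 2 could be sketched as below. The signal names, the 100,000 large-transaction cutoff and the 50% drop ratio are hypothetical choices for illustration; the embodiment does not specify thresholds.

```python
def period_signals(prev, curr, large=100_000, drop=0.5):
    """Compare two consecutive periods of flow records and flag simple risk signals."""
    prev_cp = {r["counterparty"] for r in prev}
    curr_cp = {r["counterparty"] for r in curr}
    prev_large = sum(1 for r in prev if r["amount"] >= large)
    curr_large = sum(1 for r in curr if r["amount"] >= large)
    return {
        # counterparties that traded last period but not this one
        "lost_counterparties": sorted(prev_cp - curr_cp),
        # transaction count fell below the assumed share of the previous period
        "frequency_drop": len(curr) < len(prev) * drop,
        # large transactions fell below the assumed share of the previous period
        "large_txn_drop": curr_large < prev_large * drop,
    }

prev = [
    {"counterparty": "A", "amount": 200_000},
    {"counterparty": "B", "amount": 150_000},
    {"counterparty": "A", "amount": 5_000},
    {"counterparty": "C", "amount": 3_000},
]
curr = [{"counterparty": "A", "amount": 4_000}]
signals = period_signals(prev, curr)
```

A real early warning system would likely tune these cutoffs per industry and customer group rather than use fixed constants.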
These two embodiments demonstrate how the repayment capability assessment method based on bank flow data can be applied to specific financial business scenarios, aiming to improve the accuracy of decisions and the efficiency of risk management through a data-driven decision support tool.
The invention provides a repayment capability assessment method and a repayment capability assessment system based on bank flow data, wherein the repayment capability assessment system comprises the following implementation steps:
S1: parsing the text of the bank flow PDF with regular expressions to obtain transaction information and flow data;
S2: collecting enterprise business and judicial data according to the transaction information, and cleaning and assembling the flow data and the business and judicial data into a data message;
S3: constructing 5 evaluation modules, namely operation capacity, operation stability, profitability, life cycle management and industry adjustment coefficient, and sorting out and developing the indexes of each module;
S4: constructing an expert strategy model by designing index and module weights according to policy trends, business scenarios and expert experience, and calculating the score, rating and repayment capability assessment of the enterprise.
The repayment capability assessment system based on bank flow data provided by the embodiment of the invention processes data as follows:
1) Bank flow parsing module:
Function: the module parses the text of the bank flow PDF file with regular expressions; its main aim is to accurately extract transaction information and flow data from dense bank flow records.
Processing: first, the PDF document is converted to text. Then predefined regular expression patterns are applied to identify and extract key information such as transaction date, transaction amount, transaction counterparty and transaction description.
2) Evaluation module construction unit:
Function: the unit is responsible for constructing the key evaluation modules: operation capacity, operation stability, profitability, life cycle management and industry adjustment coefficient.
Processing: the evaluation modules are developed from the transaction frequency, amounts, transaction counterparties, periodic patterns and the like in the bank flow data. These modules analyze the transaction data with statistical and machine learning methods to extract indexes that measure the operational health of the enterprise.
3) Weight and score calculation module:
Function: designing the indexes and weights of all evaluation modules according to policy trends, business scenarios and expert experience, and finally calculating the score, rating and repayment capability of the enterprise.
Processing: first, each evaluation module is given a weight according to its importance. Then a score is calculated from the enterprise's performance in each module, and finally the module scores are combined as a weighted average according to the weights to obtain the overall score and rating.
The system can provide deep analysis on the repayment capability of enterprises for financial institutions through detailed data processing and complex model calculation, and helps decision makers to make more scientific and accurate judgment in the loan approval process.
Further, the step S1 specifically includes the following steps:
S11: reading and parsing the text content and text structure of the bank flow PDF;
S12: setting a start mark and an end mark;
S13: extracting the data between the marks set in step S12 by cutting;
S14: locating the transaction dates with regular expressions;
S15: a transaction date consists of four digits, a connector, one or two digits, a connector and finally one or two digits, so a regular expression is written according to this rule;
The transaction dates obtained are as follows: date_1, date_2, date_3, ..., date_k, ..., date_n;
S16: using the transaction dates, sequentially matching the transaction time, transaction amount, transaction type, transaction place, balance, debit/credit direction, counterparty account, summary and other data of each flow record, the data being separated by spaces;
S17: taking the transaction date of the k-th flow record as the start mark and the transaction date of the (k+1)-th flow record as the end mark, cutting out the data of all fields of the k-th flow record;
S18: splitting the data of the k-th record on spaces and sequentially matching the transaction time, transaction amount, transaction type, transaction place, balance, debit/credit direction, counterparty account, summary and other data.
Further, in step S2:
S21: aggregate the transaction amounts from S1 by counterparty account;
S22: retrieve the transaction counterparties whose names contain, but are not limited to, 'company', 'part', 'society', and the like;
S23: sort the aggregated transaction amounts in descending order and confirm the top five counterparties;
S24: use big data acquisition technology to collect the industrial-commercial and judicial data of the subject enterprise and the top five counterparties from public platforms such as 'Qijiebao' and 'penmanship investigation';
S25: the industrial-commercial data include basic enterprise information, shareholder information, business change information, and the like;
S26: the judicial data include court hearing announcements, case filing information, judgment documents, judgment debtor records, dishonest judgment debtor records, high-consumption restriction orders, trial information, and the like;
S27: calculate the missing rate of each field in the S1 data and in the industrial-commercial and judicial data;
S28: the missing rate is calculated as follows:
$$\mathrm{mRate} = \frac{\mathrm{mNum}}{\mathrm{total}}$$
where mRate is the missing rate of a field, mNum is the number of missing values in the field, and total is the total number of values in the field;
S29: set a missing-rate threshold and eliminate the fields whose missing rate exceeds it;
S210: replace the missing values of the remaining fields with special characters or numbers;
S211: store the standardized data fields as tables, integrate the tables into a client data message in JSON format, and store the message in a database;
S212: in the data message, the name of the subject enterprise and the names of the top five counterparties are used as keys; the bank flow, industrial-commercial, and judicial data of the subject enterprise form the value of the subject enterprise's key, and the industrial-commercial and judicial data of each counterparty form the value of that counterparty's key;
S213: develop a data interface that takes enterprise information, such as the enterprise name and the legal representative's name, as input parameters.
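The aggregation in S21 to S23 and the missing-rate cleaning in S27 to S210 can be sketched as below. The keywords are the English stand-ins quoted in S22; the threshold of 0.5, the fill value "-1", and the sample rows are assumptions chosen only for illustration.

```python
from collections import defaultdict

KEYWORDS = ("company", "part", "society")  # keywords quoted in S22


def top_counterparties(rows, n=5):
    """Sum amounts per counterparty and return the n largest (S21-S23)."""
    totals = defaultdict(float)
    for row in rows:
        name = row["counterparty"]
        if any(k in name for k in KEYWORDS):
            totals[name] += float(row["amount"])
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:n]


def is_missing(value):
    return value is None or value == ""


def missing_rate(rows, field):
    """mRate = mNum / total for one field (S28)."""
    total = len(rows)
    m_num = sum(1 for row in rows if is_missing(row.get(field)))
    return m_num / total if total else 0.0


def clean(rows, fields, threshold=0.5, fill="-1"):
    """Drop fields above the threshold, then fill remaining gaps (S29-S210)."""
    kept = [f for f in fields if missing_rate(rows, f) <= threshold]
    return [
        {f: (fill if is_missing(row.get(f)) else row[f]) for f in kept}
        for row in rows
    ]


# Made-up rows: "place" is missing in 3 of 4 rows, "summary" in 1 of 4.
rows = [
    {"counterparty": "A company", "amount": "100.0", "place": "", "summary": "goods"},
    {"counterparty": "B company", "amount": "300.0", "place": None, "summary": None},
    {"counterparty": "A company", "amount": "50.0", "place": "Haikou", "summary": "rent"},
    {"counterparty": "Zhang San", "amount": "999.0", "place": "", "summary": "loan"},
]
top = top_counterparties(rows)
cleaned = clean(rows, ["counterparty", "amount", "place", "summary"])
```

Here the individual "Zhang San" is excluded from the ranking because the name contains none of the keywords, and the "place" field is dropped because its missing rate exceeds the assumed threshold.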
Further, in step S3:
S31: construct five evaluation modules: operation capacity, operation stability, profitability, life cycle management, and industry adjustment factors;
S32: sort out the characteristic indexes of each module, as listed in attached Table 1;
S33: sort out the calculation logic of each characteristic index according to its business meaning and the data, for example, the enterprise's flow income over the last 12 months:
First, select all flow records of the last year by transaction time; next, keep the incoming (credited) records; then remove the records whose transaction type contains 'back', 'borrow', 'lend', and the like; finally, sum the transaction amounts.
S34: implement the calculation logic of each index in a programming language to obtain the index calculation scripts;
S35: run the index calculation scripts to compute the result value of each index for the enterprise.
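The example index in S33 can be sketched as a small script of the kind S34 describes. The record layout, the "in"/"out" direction flag, and the 365-day window are assumptions; the excluded type keywords are the ones quoted in S33.

```python
from datetime import date, timedelta

EXCLUDED = ("back", "borrow", "lend")  # type keywords quoted in S33


def income_last_12_months(records, as_of):
    """Sum incoming amounts of the last year, skipping excluded types."""
    start = as_of - timedelta(days=365)  # assumed definition of "last year"
    total = 0.0
    for r in records:
        if not (start < r["date"] <= as_of):
            continue  # outside the 12-month window
        if r["direction"] != "in":
            continue  # not an incoming record
        if any(k in r["type"] for k in EXCLUDED):
            continue  # refund or loan-like inflow, not operating income
        total += r["amount"]
    return total


# Made-up records for illustration.
records = [
    {"date": date(2023, 11, 18), "direction": "in", "type": "payment", "amount": 5500.0},
    {"date": date(2023, 6, 1), "direction": "in", "type": "borrow", "amount": 20000.0},
    {"date": date(2023, 3, 5), "direction": "out", "type": "payment", "amount": 1200.0},
    {"date": date(2022, 1, 10), "direction": "in", "type": "payment", "amount": 9000.0},
    {"date": date(2023, 1, 10), "direction": "in", "type": "payment", "amount": 3000.0},
]
income = income_last_12_months(records, as_of=date(2023, 12, 18))
```

Only the two in-window incoming payments contribute; the borrowed amount, the outgoing payment, and the record older than a year are filtered out.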
Further, in step S4:
S41: design the weights and score intervals of each index and module according to policy trends, business scenarios, and expert experience.
S42: the policy trend refers to the series of financing-promotion policies issued by the central or local governments, for example:
the Notice on Strengthening Financial Support Measures to Help the Private Economy Develop and Grow calls for stronger support for key fields such as scientific and technological innovation, "specialized, refined, distinctive, and innovative" enterprises, green and low-carbon development, and industrial foundation re-engineering projects.
S43: adjusting the relevant module and index weights according to the policy trend allows the enterprise's repayment capability to be evaluated more accurately.
S44: the business scenario refers to the setting in which a financial or quasi-financial institution applies the model; different institutions have different risk preferences and customer groups, so the relevant index weights also need to be adjusted to the business scenario for the evaluation to be accurate.
S45: expert experience means that domain experts assign weight strategies to the model based on years of practical experience, so that each weight is grounded in practice and highly interpretable; experts can also spot emerging risks and adjust the model weights, which makes the model iterable.
S46: match each index value to its score interval to obtain the index score;
S47: calculate the final score of each index from the index score and the index weight;
S48: sum the final index scores within each module to obtain the module score;
S49: weight the module scores by the designed module weights and sum them to obtain the final model score, using the following formulas:
$$\mathrm{moduleVal} = \sum_{i} \mathrm{index}_i \cdot w_{\mathrm{index},i}, \qquad \mathrm{modelVal} = \sum_{j} \mathrm{moduleVal}_j \cdot w_{\mathrm{model},j}$$
where moduleVal is a module score, index is an index score, w_index is an index weight, modelVal is the model score, and w_model is a module weight.
S410: the rating and repayment capability of each enterprise are obtained from the score interval into which the model score falls; the mapping is shown in attached Table 2.
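Steps S46 to S49 amount to an interval lookup followed by two weighted sums. The sketch below assumes half-open intervals of the form (low, high], and every interval boundary, score, and weight in the sample data is hypothetical.

```python
def interval_score(value, intervals):
    """Return the score of the first (low, high] interval containing value (S46)."""
    for low, high, score in intervals:
        if low < value <= high:
            return score
    raise ValueError(f"no score interval covers {value}")


def module_score(indexes):
    """moduleVal: sum of index score times index weight (S47-S48)."""
    return sum(
        interval_score(ix["value"], ix["intervals"]) * ix["weight"]
        for ix in indexes
    )


def model_score(modules):
    """modelVal: sum of module score times module weight (S49)."""
    return sum(module_score(m["indexes"]) * m["weight"] for m in modules)


# Hypothetical score intervals for a growth-rate index.
growth_intervals = [
    (float("-inf"), 0.0, 2),
    (0.0, 0.3, 5),
    (0.3, 1.0, 8),
    (1.0, float("inf"), 10),
]
# Hypothetical two-module model.
modules = [
    {"weight": 0.25, "indexes": [
        {"value": 0.5416, "intervals": growth_intervals, "weight": 1.2},
    ]},
    {"weight": 0.75, "indexes": [
        {"value": 3, "intervals": [(0, 5, 6)], "weight": 1.0},
    ]},
]
total = model_score(modules)
```

Keeping the intervals and weights as plain data makes it straightforward to retune them when the policy trend or business scenario changes, without touching the scoring code.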
As shown in FIG. 1, the implementation flow of the system starts from the analysis of bank flow data, builds the evaluation modules on the structured data, and then builds the model, yielding a repayment capability assessment system for a given enterprise.
As shown in FIG. 2, the system design is divided into five blocks: an acquisition block, a data block, an index block, a model block, and a decision block. The acquisition block collects data such as enterprise bank flow and account information. The data block handles data access, data parsing, data cleaning, and message transmission over the interface. The index block covers index logic, development, testing, operation, and scoring according to the techniques described in the method steps. The model block obtains the enterprise's repayment capability assessment result based on the index weights and module weights from the decision block.
The application embodiment provided by the invention is as follows. First, text analysis is performed on the bank flow PDF using regular expressions to obtain transaction information and flow data. Second, enterprise industrial-commercial and judicial data are collected according to the transaction information, and the flow data and the industrial-commercial and judicial data are cleaned and assembled into a data message. Then, five evaluation modules (operation capacity, operation stability, profitability, life cycle management, and industry adjustment factors) are constructed, and the indexes of each module are sorted out and developed. Finally, an expert strategy model is constructed by designing index and module weights according to policy trends, business scenarios, and expert experience; the enterprise's score and rating are calculated, and the repayment capability assessment result of the enterprise is obtained.
It should be noted that the embodiments of the present invention can be realized in hardware, software, or a combination of software and hardware. The hardware portion may be implemented using dedicated logic; the software portions may be stored in a memory and executed by a suitable instruction execution system, such as a microprocessor or special purpose design hardware. Those of ordinary skill in the art will appreciate that the apparatus and methods described above may be implemented using computer executable instructions and/or embodied in processor control code, such as provided on a carrier medium such as a magnetic disk, CD or DVD-ROM, a programmable memory such as read only memory (firmware), or a data carrier such as an optical or electronic signal carrier. The device of the present invention and its modules may be implemented by hardware circuitry, such as very large scale integrated circuits or gate arrays, semiconductors such as logic chips, transistors, etc., or programmable hardware devices such as field programmable gate arrays, programmable logic devices, etc., as well as software executed by various types of processors, or by a combination of the above hardware circuitry and software, such as firmware.
The embodiments of the present invention achieved positive effects during development and use and offer advantages over the prior art; these are described below with reference to data and tables from the test process.
Step S1: perform text analysis on the bank flow PDF using regular expressions. The PDF contains the bank flow data of a certain enterprise for the most recent year, as shown in the following figure.
Bank flow PDF report:
The summary is set as the start mark and the page number as the end mark, the data are extracted by cutting, and the transaction dates are located with a regular expression. The transaction dates obtained are: 2022-12-13, 2022-12-14, 2022-12-14, ..., 2023-09-16, ..., 2023-12-18.
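The cutting between a start mark and an end mark can be sketched as below. The mark strings "Summary" and "Page" and the sample page text are assumptions about how the marks appear in the extracted PDF text.

```python
def cut_between(text, start_mark, end_mark):
    """Return the text after start_mark and before the next end_mark."""
    begin = text.find(start_mark)
    if begin == -1:
        return ""  # start mark not found: nothing to extract
    begin += len(start_mark)
    end = text.find(end_mark, begin)
    if end == -1:
        end = len(text)  # no end mark: keep everything after the start mark
    return text[begin:end].strip()


# Made-up text of one extracted page.
page = (
    "Account 0001 Date Amount Summary\n"
    "2022-12-13 100.00 goods\n"
    "2022-12-14 50.00 rent\n"
    "Page 1 of 3"
)
body = cut_between(page, "Summary", "Page")
```

For a multi-page statement the same cut would be applied page by page and the bodies concatenated before the transaction dates are located.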
The transaction time, transaction amount, transaction type, transaction place, balance, debit/credit flag, counterparty account, summary, and other data of each flow record are then matched in sequence according to the transaction dates, with the fields separated by spaces, giving the bank flow data and transaction information shown in the following tables.
Parsed flow data:
Parsed transaction data:
| Account name | Transaction amount | Transaction time | Transaction type |
|---|---|---|---|
| Hainanli Limited | 15000.00 | 2022-12-24 | Payment on line |
| Shenzhen Qingx Limited | 3000.00 | 2022-12-26 | Payment on line |
| ... | ... | ... | ... |
| Shenzhen Shu Co., Ltd. | 5500.00 | 2023-11-18 | Payment on line |
Step S2: collect enterprise industrial-commercial and judicial data according to the transaction information, and clean and assemble the flow data and the industrial-commercial and judicial data into a data message.
Specifically, the transaction amounts are aggregated by account name and sorted in descending order, and the top five counterparty accounts whose names contain, but are not limited to, 'company', 'part', 'society', and the like are taken. The industrial-commercial and judicial data of the subject enterprise and these five enterprises are then collected through 'enterprise search', cleaned, and assembled into the data message shown in the following table.
Data message:
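A message with the keyed layout described for the data message (enterprise names as keys, with bank flow data only under the subject enterprise) could look like the sketch below. The inner key names `bank_flow`, `business`, and `judicial`, the enterprise names, and the `lookup` helper are hypothetical; the record contents are empty placeholders.

```python
import json


def build_message(subject, flow, subject_info, counterparties):
    """Assemble one client data message keyed by enterprise name."""
    message = {
        subject: {
            "bank_flow": flow,
            "business": subject_info["business"],
            "judicial": subject_info["judicial"],
        }
    }
    # Counterparties carry industrial-commercial and judicial data only.
    for name, info in counterparties.items():
        message[name] = {
            "business": info["business"],
            "judicial": info["judicial"],
        }
    return message


def lookup(message, enterprise_name):
    """Hypothetical interface: fetch one enterprise's data by name."""
    return message.get(enterprise_name)


message = build_message(
    subject="Subject Co",
    flow=[{"date": "2022-12-24", "amount": "15000.00"}],
    subject_info={"business": {}, "judicial": []},
    counterparties={"Counterparty A": {"business": {}, "judicial": []}},
)
payload = json.dumps(message, ensure_ascii=False)  # JSON text to store
```

`ensure_ascii=False` keeps Chinese enterprise names readable in the stored JSON rather than escaping them.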
Step S3: construct five evaluation modules (operation capacity, operation stability, profitability, life cycle management, and industry adjustment factors), and sort out and develop the indexes of each module.
Specifically, the bank flow data are analyzed: for example, the flow income of the last 12 months is counted, and indexes such as the year-on-year growth rate of flow income over the last 12 months and the year-on-year growth rate of flow expenditure over the last 12 months are obtained by derivation and fusion. The characteristic indexes of each module are obtained on this basis, the calculation logic of each index is sorted out according to its business meaning, and the logic is then implemented in a programming language such as Python to obtain the specific value of each index. The results are shown in the following table.
Index calculation result:
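One of the derived indexes named above, the year-on-year growth rate, can be sketched as follows. The text does not spell out the formula, so the usual definition (current period minus previous period, divided by previous period) is assumed here, and the amounts are made up.

```python
def year_on_year_growth(current, previous):
    """Growth of the current 12-month total relative to the previous one.

    Returns None when the previous total is zero, because the ratio is
    undefined in that case.
    """
    if previous == 0:
        return None
    return (current - previous) / previous


# Made-up totals: 12000 in the last 12 months, 10000 in the 12 months before.
growth = year_on_year_growth(12000.0, 10000.0)
```

Returning `None` for a zero base leaves it to the scoring step to decide how an enterprise with no prior-year flow should be treated.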
Step S4: construct an expert strategy model by designing index and module weights according to policy trends, business scenarios, and expert experience, and calculate the enterprise's score, rating, and repayment capability assessment.
Specifically, the weights and score intervals of each index and module are designed according to policy trends, business scenarios, and expert experience; the scores of each index and module are calculated with the model formula to obtain the enterprise's final score; and the enterprise's rating and repayment capability are obtained by mapping the model score. Taking the operation capacity module as an example, the score interval and weight of each index are designed first, the module score is then calculated, and the module weight is applied to obtain the module's weighted score. The specific process is as follows.
Weights and score intervals of the indexes and modules:
The index calculation gives a year-on-year growth rate of flow income over the last 12 months of 0.5416 for the enterprise. Interval mapping gives an index score of 8, and applying the index weight gives a final index score of 9.6. The final scores of the module's other indexes are calculated in the same way and summed to give 87.25 for the operation capacity module, which multiplied by the module weight gives 21.8125. The scores of the other modules are calculated likewise, and the final model score is 78.2964. According to the mapping in Table 2, the enterprise's rating is C and its repayment capability is fairly strong.
Note: the index and module weights and the score intervals above are constructed data, used only to illustrate the calculation process of the present invention.
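The figures in this example can be checked by hand. The index weight of 1.2 and the module weight of 0.25 are not stated in the text; they are inferred here from the quoted numbers (9.6 / 8 and 21.8125 / 87.25).

$$
\begin{aligned}
\text{final index score} &= 8 \times 1.2 = 9.6 \\
\text{weighted module score} &= 87.25 \times 0.25 = 21.8125 \\
\text{model score} &= 78.2964 \in (70, 80] \;\Rightarrow\; \text{rating C}
\end{aligned}
$$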
According to the repayment capability assessment method and system based on bank flow data provided by the embodiment of the invention, the enterprise's bank flow data serve as the primary source and the industrial-commercial and judicial data as a supplement; five analysis dimensions (enterprise operation capacity, operation stability, profitability, life cycle management, and industry adjustment factors) are constructed, and the enterprise's repayment capability is assessed accurately and efficiently.
The foregoing is merely illustrative of specific embodiments of the present invention, and the scope of protection of the invention is not limited thereto. Any modification, equivalent replacement, or improvement made by a person skilled in the art within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.
Attached tables:
Table 1: Characteristic indexes of each module
Table 2: Rating and repayment capability of each enterprise
| Score interval | Rating | Repayment capability |
|---|---|---|
| (90,100] | A | Extremely strong |
| (80,90] | B | Strong |
| (70,80] | C | Fairly strong |
| (60,70] | D | Average |
| (55,60] | E | Fairly poor |
| (0,55] | F | Poor |
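The lookup in Table 2 uses intervals that are open on the left and closed on the right, which a binary search over the upper bounds reproduces. The function name and the error handling are assumptions; the bounds and rating letters come from the table.

```python
from bisect import bisect_left

# Upper bounds of the score intervals in Table 2, lowest interval first.
UPPER_BOUNDS = [55, 60, 70, 80, 90, 100]
RATINGS = ["F", "E", "D", "C", "B", "A"]


def rating(score):
    """Map a model score in (0, 100] to its rating letter."""
    if not 0 < score <= 100:
        raise ValueError("score must be in (0, 100]")
    # bisect_left returns the first bound >= score, so a score equal to a
    # bound stays in the lower interval, matching the (low, high] convention.
    return RATINGS[bisect_left(UPPER_BOUNDS, score)]


example = rating(78.2964)  # the model score from the worked example
```

A score exactly on a boundary, such as 80, falls in (70,80] and is rated C rather than B.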
Claims (9)
1. A repayment capability assessment system based on bank flow data, the system comprising:
a bank flow analysis module, which uses regular expressions to perform text conversion and analysis on the bank flow PDF file so as to extract key information including the transaction date, transaction amount, transaction counterparty, and transaction description;
an evaluation module construction unit, which constructs operation capacity, operation stability, profitability, life cycle management, and industry adjustment coefficient evaluation modules from the extracted bank flow data, and analyzes the transaction data using statistical and machine learning methods to evaluate the operating health indexes of the enterprise;
and a weight and score calculation module, which designs the indexes and weights of the evaluation modules based on policy trends, business scenarios, and expert experience, calculates the score of each module, and takes a weighted average using the weights to obtain the enterprise's overall score and rating, thereby evaluating the enterprise's repayment capability.
2. The repayment capability assessment system based on bank flow data according to claim 1, further comprising: a data standardization and storage module, which standardizes the parsed flow data, eliminates the fields whose missing rate exceeds the threshold, and stores the standardized data messages in a database in a specific format; and a data interface development module, which develops a data interface that takes enterprise information as input parameters and provides data access and interaction functions.
3. The repayment capability assessment system based on bank flow data according to claim 1, further comprising: an index calculation script generation module, which sorts out the calculation logic of each index according to its business meaning and data characteristics and implements the logic in a programming language to generate index calculation scripts; and an index result calculation module, which executes the index calculation scripts to compute the result values of all indexes of the enterprise and provides data support for the repayment capability assessment.
4. The repayment capability assessment system based on bank flow data according to claim 1, further comprising: a model weight adjustment module, which dynamically adjusts the weights of the modules and indexes according to policy trends, changes in the business scenario, and expert experience so as to improve the accuracy and adaptability of the repayment capability assessment; and a rating and repayment capability output module, which outputs the rating and repayment capability assessment result of each enterprise according to the calculated model score and the designed score intervals, providing a scientific basis for the credit decisions of financial institutions.
5. A repayment capability assessment method based on bank flow data, characterized by comprising the following steps:
S1: perform text analysis on the bank flow PDF using regular expressions to obtain transaction information and flow data;
S2: collect enterprise industrial-commercial and judicial data according to the transaction information, and clean and assemble the flow data and the industrial-commercial and judicial data into a data message;
S3: construct five evaluation modules (operation capacity, operation stability, profitability, life cycle management, and industry adjustment coefficients), and sort out and develop the indexes of each module;
S4: construct an expert strategy model by designing index and module weights according to policy trends, business scenarios, and expert experience, and calculate the enterprise's score, rating, and repayment capability assessment.
6. The repayment capability assessment method and system based on bank flow data according to claim 5, wherein the specific operation of step S1 is as follows:
S11: read and parse the text content and text structure of the bank flow PDF;
S12: set a start mark and an end mark;
S13: extract the data between the marks set in step S12 by cutting;
S14: locate the transaction dates using a regular expression;
S15: a transaction date consists of four digits, a connector, one or two digits, a connector, and finally one or two digits, so the regular expression is written according to this pattern;
The transaction dates obtained are: date1, date2, date3, ..., datek, ..., daten;
S16: match the transaction time, transaction amount, transaction type, transaction place, balance, debit/credit flag, counterparty account, summary, and other data of each flow record in sequence according to the transaction dates; the fields are separated by spaces;
S17: taking the transaction date of the k-th flow record as the start mark and the transaction date of the (k+1)-th flow record as the end mark, cut out the data of all fields of the k-th flow record;
S18: split the k-th record on spaces and match the transaction time, transaction amount, transaction type, transaction place, balance, debit/credit flag, counterparty account, summary, and other fields in sequence.
7. The repayment capability assessment method and system based on bank flow data according to claim 5, wherein in step S2:
S21: aggregate the transaction amounts from S1 by counterparty account;
S22: retrieve the transaction counterparties whose names contain, but are not limited to, 'company', 'part', 'society', and the like;
S23: sort the aggregated transaction amounts in descending order and confirm the top five counterparties;
S24: use big data acquisition technology to collect the industrial-commercial and judicial data of the subject enterprise and the top five counterparties from public platforms such as 'Qijiebao' and 'penmanship investigation';
S25: the industrial-commercial data include basic enterprise information, shareholder information, business change information, and the like;
S26: the judicial data include court hearing announcements, case filing information, judgment documents, judgment debtor records, dishonest judgment debtor records, high-consumption restriction orders, trial information, and the like;
S27: calculate the missing rate of each field in the S1 data and in the industrial-commercial and judicial data;
S28: the missing rate is calculated as follows:
$$\mathrm{mRate} = \frac{\mathrm{mNum}}{\mathrm{total}}$$
where mRate is the missing rate of a field, mNum is the number of missing values in the field, and total is the total number of values in the field;
S29: set a missing-rate threshold and eliminate the fields whose missing rate exceeds it;
S210: replace the missing values of the remaining fields with special characters or numbers;
S211: store the standardized data fields as tables, integrate the tables into a client data message in JSON format, and store the message in a database;
S212: in the data message, the name of the subject enterprise and the names of the top five counterparties are used as keys; the bank flow, industrial-commercial, and judicial data of the subject enterprise form the value of the subject enterprise's key, and the industrial-commercial and judicial data of each counterparty form the value of that counterparty's key;
S213: develop a data interface that takes enterprise information, such as the enterprise name and the legal representative's name, as input parameters.
8. The repayment capability assessment method and system based on bank flow data according to claim 5, wherein in step S3:
S31: construct five evaluation modules: operation capacity, operation stability, profitability, life cycle management, and industry adjustment factors;
S32: sort out the characteristic indexes of each module, as listed in attached Table 1;
S33: sort out the calculation logic of each characteristic index according to its business meaning and the data, for example, the enterprise's flow income over the last 12 months:
First, select all flow records of the last year by transaction time; next, keep the incoming (credited) records; then remove the records whose transaction type contains 'back', 'borrow', 'lend', and the like; finally, sum the transaction amounts.
S34: implement the calculation logic of each index in a programming language to obtain the index calculation scripts;
S35: run the index calculation scripts to compute the result value of each index for the enterprise.
9. The repayment capability assessment method and system based on bank flow data according to claim 5, wherein in step S4:
S41: design the weights and score intervals of each index and module according to policy trends, business scenarios, and expert experience.
S42: the policy trend refers to the series of financing-promotion policies issued by the central or local governments;
S43: adjusting the relevant module and index weights according to the policy trend allows the enterprise's repayment capability to be evaluated more accurately.
S44: the business scenario refers to the setting in which a financial or quasi-financial institution applies the model; different institutions have different risk preferences and customer groups, so the relevant index weights also need to be adjusted to the business scenario for the evaluation to be accurate.
S45: expert experience means that domain experts assign weight strategies to the model based on years of practical experience, so that each weight is grounded in practice and highly interpretable; experts can also spot emerging risks and adjust the model weights, which makes the model iterable.
S46: match each index value to its score interval to obtain the index score;
S47: calculate the final score of each index from the index score and the index weight;
S48: sum the final index scores within each module to obtain the module score;
S49: weight the module scores by the designed module weights and sum them to obtain the final model score, using the following formulas:
$$\mathrm{moduleVal} = \sum_{i} \mathrm{index}_i \cdot w_{\mathrm{index},i}, \qquad \mathrm{modelVal} = \sum_{j} \mathrm{moduleVal}_j \cdot w_{\mathrm{model},j}$$
where moduleVal is a module score, index is an index score, w_index is an index weight, modelVal is the model score, and w_model is a module weight.
S410: the rating and repayment capability of each enterprise are obtained from the score interval into which the model score falls; the mapping is shown in attached Table 2.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410581408.6A CN118429080A (en) | 2024-05-11 | 2024-05-11 | Repayment capability assessment method and system based on bank flow data |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202410581408.6A CN118429080A (en) | 2024-05-11 | 2024-05-11 | Repayment capability assessment method and system based on bank flow data |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN118429080A (en) | 2024-08-02 |
Family
ID=92315491
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202410581408.6A Pending CN118429080A (en) | 2024-05-11 | 2024-05-11 | Repayment capability assessment method and system based on bank flow data |
Country Status (1)
| Country | Link |
|---|---|
| CN (1) | CN118429080A (en) |
Cited By (1)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN120387727A (en) * | 2025-04-14 | 2025-07-29 | 北京仁和诚信科技有限公司 | A supplier service quality assessment management system and method based on dynamic weight optimization |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| PB01 | Publication | ||
| SE01 | Entry into force of request for substantive examination | ||