CN119724545B

CN119724545B - A depression disease management system based on targeted metabolomics and machine learning models

Info

Publication number: CN119724545B
Application number: CN202411781535.7A
Authority: CN
Inventors: 冯冬; 安波; 贲畅
Original assignee: Nanjing Likang Pharmaceutical Technology Co ltd
Current assignee: Nanjing Likang Pharmaceutical Technology Co ltd
Priority date: 2024-12-05
Filing date: 2024-12-05
Publication date: 2025-09-23
Anticipated expiration: 2044-12-05
Also published as: CN119724545A

Abstract

The present invention relates to a depression disease management system based on targeted metabolomics and machine learning models. The system comprises an intelligent diagnostic model module, a differential diagnostic model module, a hierarchical diagnostic model module, a companion diagnostic model module, a treatment endpoint outcome prediction model module, a relapse prediction model module, and a comprehensive judgment module. The system provided by the present invention can address current clinical challenges in depression diagnosis, treatment outcomes, and relapse, enabling accurate diagnosis, timely treatment, and improving the mental health of the general population.

Description

Depression disease management system based on targeted metabolomics and machine learning models

Technical Field

The invention relates to the field of disease management systems, and in particular to a depression disease management system based on targeted metabolomics and machine learning models.

Background

Depression severely disrupts patients' lives and work and places a heavy burden on families and society.

One prominent problem in the diagnosis and treatment of depression is the low rate at which the medical system identifies it, mainly because clinical diagnosis relies primarily on medical history, clinical symptoms, and the course of the disease.

Another prominent problem is the current lack of a complete diagnostic system for depression.

Disclosure of Invention

In view of the above problems in the diagnosis and treatment of depression in current medical systems, the present invention aims to provide combinations of metabolic molecular markers covering management of the whole disease course of depression, together with a depression disease management system.

The technical aim of the invention is achieved by the following technical scheme: a depression disease management system based on targeted metabolomics and machine learning models, comprising an intelligent diagnosis model module, a differential diagnosis model module, a hierarchical diagnosis model module, a companion diagnosis model module, a treatment endpoint outcome prediction model module, a recurrence prediction model module, and a comprehensive judgment module.

The intelligent diagnosis model module is used for identifying depression patients; the differential diagnosis model module is used for distinguishing bipolar depression from unipolar depression; the hierarchical diagnosis model module is used for distinguishing mild depression from moderate-to-severe depression; the companion diagnosis model module is used for evaluating whether treatment of a depression patient is effective; the treatment endpoint outcome prediction model module is used for predicting the clinical treatment outcome of the patient and providing a basis for optimizing the treatment scheme; the recurrence prediction model module is used for predicting disease recurrence in patients in the stable phase so that intervention measures can be taken in time; and the comprehensive judgment module is used for integrating the results of the diagnosis, differentiation, and prediction models of the other modules to construct the depression disease management system.

Further, the method for using the depression disease management system comprises the following steps:

S1, constructing an intelligent diagnosis model through the intelligent diagnosis model module, and identifying patients with depression;

S2, constructing a differential diagnosis model through the differential diagnosis model module, and distinguishing bipolar depression from unipolar depression;

S3, constructing a hierarchical diagnosis model through the hierarchical diagnosis model module, and distinguishing mild depression from moderate-to-severe depression;

S4, constructing a companion diagnosis model through the companion diagnosis model module, and evaluating whether treatment of the depression patient is effective;

S5, constructing a treatment endpoint outcome prediction model through the treatment endpoint outcome prediction model module, predicting the clinical treatment outcome of the patient, and providing a basis for optimizing the treatment scheme;

S6, constructing a recurrence prediction model through the recurrence prediction model module, predicting disease recurrence in patients in the stable phase, and taking intervention measures in time;

S7, integrating the diagnosis, differentiation, and prediction models through the comprehensive judgment module, and constructing the depression disease management system.

Further, the specific operation of constructing the intelligent diagnosis model through the intelligent diagnosis model module in step S1 is as follows: blood samples are collected; the following combination of metabolic molecular markers is detected: proline, betaine, alanine, tryptophan, kynurenine, 5-hydroxytryptamine, creatine, succinic acid, taurine, and 2-hydroxybutyric acid; and models are constructed with different machine learning methods, the optimal model being selected to identify depression patients.

Further, the specific operation of constructing the differential diagnosis model through the differential diagnosis model module in step S2 is as follows: blood samples are collected; the following combination of metabolic molecular markers is detected: proline, ornithine, kynurenine, 5-hydroxytryptamine, succinic acid, 2-hydroxybutyric acid, xanthine, bilirubin, gamma-aminobutyric acid, allantoin, deoxyribose, pyruvic acid, linoleic acid, and uric acid; and models are constructed with different machine learning methods, the optimal model being selected to distinguish patients with unipolar depression from patients with bipolar depression.

Further, the specific operation of constructing the hierarchical diagnosis model through the hierarchical diagnosis model module in step S3 is as follows: blood samples are collected; the following combination of metabolic molecular markers is detected: proline, valine, ornithine, tryptophan, 5-hydroxytryptamine, creatine, glutamine, serine, methionine, guanine, hypoxanthine, androstenedione, cytosine, bilirubin, gamma-aminobutyric acid, uric acid, adipic acid, and pseudouridine; and models are constructed with different machine learning methods, the optimal model being selected to distinguish mild depression from moderate-to-severe depression.

Further, the specific operation of constructing the companion diagnosis model through the companion diagnosis model module in step S4 is as follows: blood samples are collected; the following combination of metabolic molecular markers is detected: ornithine, betaine, alanine, kynurenine, 5-hydroxytryptamine, glutamic acid, lysine, methionine, phenylalanine, guanine, hypoxanthine, gamma-aminobutyric acid, allantoin, and uric acid; and models are constructed with different machine learning methods, the optimal model being selected to evaluate whether treatment of depression patients is effective and to help screen drug treatment schemes for patients at different stages of the disease course.

Further, the specific operation of constructing the treatment endpoint outcome prediction model through the treatment endpoint outcome prediction model module in step S5 is as follows: the treatment and recovery of depression patients under treatment are tracked; the following combination of metabolic molecular markers is detected at different stages: valine, betaine, tryptophan, kynurenine, creatine, taurine, phenylalanine, tyrosine, histidine, aspartic acid, threonine, guanine, gamma-aminobutyric acid, allantoin, and pseudouridine; and models are constructed with different machine learning methods, the optimal model being selected to predict the clinical treatment outcome of the patient and provide a basis for optimizing the treatment scheme.

Further, the specific operation of constructing the recurrence prediction model through the recurrence prediction model module in step S6 is as follows: the condition of patients with depression is tracked; the following combination of metabolic molecular markers is detected at different stages: proline, ornithine, alanine, tryptophan, kynurenine, creatine, 2-hydroxybutyric acid, androstenedione, cytosine, xanthine, bilirubin, gamma-aminobutyric acid, allantoin, deoxyribose, pyruvic acid, and linoleic acid; and models are constructed with different machine learning methods, the optimal model being selected to predict disease recurrence in patients in the stable phase so that intervention measures can be taken in time.
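The six marker combinations of steps S1 to S6 can be written down as a simple lookup table, which makes it easy to see which metabolites are shared between modules. The following is a minimal sketch only: the marker names are transcribed from the description, while the dictionary keys and the helper name `panels_containing` are illustrative assumptions and not part of the disclosed system.

```python
from typing import Dict, List

# Marker panels transcribed from steps S1-S6 of the description.
# The short keys ("diagnosis", "differential", ...) are hypothetical labels.
MARKER_PANELS: Dict[str, List[str]] = {
    "diagnosis": [  # S1
        "proline", "betaine", "alanine", "tryptophan", "kynurenine",
        "5-hydroxytryptamine", "creatine", "succinic acid", "taurine",
        "2-hydroxybutyric acid",
    ],
    "differential": [  # S2
        "proline", "ornithine", "kynurenine", "5-hydroxytryptamine",
        "succinic acid", "2-hydroxybutyric acid", "xanthine", "bilirubin",
        "gamma-aminobutyric acid", "allantoin", "deoxyribose",
        "pyruvic acid", "linoleic acid", "uric acid",
    ],
    "hierarchical": [  # S3
        "proline", "valine", "ornithine", "tryptophan", "5-hydroxytryptamine",
        "creatine", "glutamine", "serine", "methionine", "guanine",
        "hypoxanthine", "androstenedione", "cytosine", "bilirubin",
        "gamma-aminobutyric acid", "uric acid", "adipic acid", "pseudouridine",
    ],
    "companion": [  # S4
        "ornithine", "betaine", "alanine", "kynurenine", "5-hydroxytryptamine",
        "glutamic acid", "lysine", "methionine", "phenylalanine", "guanine",
        "hypoxanthine", "gamma-aminobutyric acid", "allantoin", "uric acid",
    ],
    "outcome": [  # S5
        "valine", "betaine", "tryptophan", "kynurenine", "creatine", "taurine",
        "phenylalanine", "tyrosine", "histidine", "aspartic acid", "threonine",
        "guanine", "gamma-aminobutyric acid", "allantoin", "pseudouridine",
    ],
    "recurrence": [  # S6
        "proline", "ornithine", "alanine", "tryptophan", "kynurenine",
        "creatine", "2-hydroxybutyric acid", "androstenedione", "cytosine",
        "xanthine", "bilirubin", "gamma-aminobutyric acid", "allantoin",
        "deoxyribose", "pyruvic acid", "linoleic acid",
    ],
}


def panels_containing(marker: str) -> List[str]:
    """Return the labels of all panels that include the given marker."""
    return [name for name, markers in MARKER_PANELS.items() if marker in markers]


for name, markers in MARKER_PANELS.items():
    print(f"{name}: {len(markers)} markers")
print(panels_containing("gamma-aminobutyric acid"))
```

Keeping the panels as data rather than hard-coding them into each model makes it straightforward to check that a measured sample covers the markers a given module needs.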

Further, the specific operation in step S7 of integrating the diagnosis, differentiation, and prediction models through the comprehensive judgment module is as follows: peripheral blood metabolite concentration data are collected at different stages by tracking the disease course of patients with depression, and a machine learning algorithm is used to develop a model system, based on multiple metabolic molecular markers, for diagnosis, hierarchical diagnosis, differential diagnosis, companion diagnosis, treatment outcome prediction, and stable-phase recurrence prediction. The disease management system provides an easy-to-use interactive interface that takes peripheral blood metabolite concentration data as input and outputs the corresponding results and disease management decision suggestions according to the intended purpose.
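One way to picture the comprehensive judgment module is as a dispatcher that, for a requested purpose, checks that the required markers are present and forwards them to the model registered for that purpose. The sketch below is purely illustrative: the class name `ComprehensiveJudgment`, its methods, the 0.5 decision threshold, and the toy model are all assumptions, and the description does not specify how the module is implemented.

```python
from typing import Callable, Dict, List

# Assumption: a "model" is any callable that maps {marker: concentration}
# to a probability in [0, 1]. Trained models would be plugged in here.
Model = Callable[[Dict[str, float]], float]


class ComprehensiveJudgment:
    """Hypothetical dispatcher that routes a sample to a purpose-specific model."""

    def __init__(self) -> None:
        self._panels: Dict[str, List[str]] = {}
        self._models: Dict[str, Model] = {}

    def register(self, purpose: str, panel: List[str], model: Model) -> None:
        """Register the marker panel and model used for one purpose."""
        self._panels[purpose] = list(panel)
        self._models[purpose] = model

    def evaluate(self, purpose: str, concentrations: Dict[str, float],
                 threshold: float = 0.5) -> Dict[str, object]:
        """Run the model for `purpose` on the measured concentrations."""
        if purpose not in self._models:
            raise KeyError(f"no model registered for purpose {purpose!r}")
        panel = self._panels[purpose]
        missing = [m for m in panel if m not in concentrations]
        if missing:
            raise ValueError(f"missing markers for {purpose}: {missing}")
        features = {m: concentrations[m] for m in panel}
        probability = self._models[purpose](features)
        return {
            "purpose": purpose,
            "probability": probability,
            "positive": probability >= threshold,
        }


def toy_model(features: Dict[str, float]) -> float:
    """Stand-in for a trained classifier; the cut-off of 70 is made up."""
    return 0.9 if features["tryptophan"] < 70.0 else 0.1


system = ComprehensiveJudgment()
system.register("diagnosis", ["proline", "tryptophan"], toy_model)
print(system.evaluate("diagnosis", {"proline": 80.0, "tryptophan": 55.0}))
```

Registering each purpose separately mirrors the modular structure of steps S1 to S6: a model can be retrained or replaced without touching the others.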

Further, the machine learning models include logistic regression, lasso regression, decision trees, neural networks, support vector machines, extreme gradient boosting, random forests, principal component analysis, Bayesian networks, and linear regression.

Further, the sample to be tested for the combination of metabolic molecular markers is human serum, plasma, or dried blood spots.

In conclusion, the marker combinations and management system provided by the invention can be used for early screening and prediction of depression, prediction of treatment outcome, and prediction of recurrence in the stable phase. The technical scheme of the invention is used for diagnosing depression and guiding medication, and has high specificity, sensitivity, and accuracy. With the invention, the diagnosis of depression no longer depends on the clinician's experience and the subjective judgment of questionnaire scales, which greatly improves diagnostic accuracy. The invention has the following beneficial effects:

(1) The depression disease management system based on targeted metabolomics and machine learning models can address problems in clinical diagnosis, treatment outcome, and recurrence, enabling accurate diagnosis and timely treatment and improving the mental health of the population.

(2) The depression disease management system based on targeted metabolomics and machine learning models can raise the level of depression diagnosis and treatment to a new height.

(3) The invention makes broad use of big-data analysis technologies such as artificial intelligence to develop an iterative artificial intelligence platform that is updated in real time. Using depression cohorts with multi-omics samples and diagnosis and treatment data from multiple time points, it may help reveal the etiology and pathological mechanisms of depression in depth, thereby contributing to the establishment of a system for objective diagnosis, accurate treatment, and evaluation of depression and to the development of new intervention technologies and means, and addressing a frontier problem of current domestic and international research in this field.

(4) For the adolescent population, clarifying how genes, environment, and their interaction influence trends in depression onset provides real-time data support for the formulation of national depression prevention and treatment policies, and meets the urgent national need for prevention and treatment of adolescent depression.

Drawings

FIG. 1 is a general flow chart of an embodiment of the present invention.

FIG. 2 is a schematic diagram of the construction of an intelligent model according to an embodiment of the present invention.

FIG. 3 is a ROC curve of the generalized linear model (GLM) distinguishing healthy people from depressed patients.

FIG. 4 is a ROC curve of logistic regression distinguishing unipolar from bipolar depression patients.

FIG. 5 is a ROC curve of Naive Bayes distinguishing patients with mild depression from patients with moderate-to-severe depression.

FIG. 6 is a ROC curve of random forest (RF) distinguishing depressed patients who responded to treatment from those with no significant response.

FIG. 7 is a ROC curve of an artificial neural network (NNET) distinguishing depressed patients who did and did not achieve a favorable outcome after treatment.

FIG. 8 is a ROC curve of a support vector machine (SVM) distinguishing stable-phase depressed patients who relapsed from those who did not.

Detailed Description

The invention is described in further detail below with reference to fig. 1-8.

Example 1 Metabolic marker screening and model building for depression diagnosis

1.1 Subjects

Depression samples and healthy control samples were enrolled according to the following inclusion and exclusion criteria.

Inclusion criteria for depression samples: (1) all enrolled patients met the ICD-10 diagnostic criteria for depression; (2) the score on the 17-item Hamilton Depression Rating Scale (HDRS) was 18 points or greater; (3) patients had first-episode depression and had not taken any antidepressant; and (4) patients were older than 18 and younger than 60 years. Exclusion criteria: (1) current or past history of other mental disorders; (2) history of organic brain disease or severe brain trauma, or heart, liver, or kidney disease, diabetes, or other severe somatic disorders; (3) abnormal routine laboratory results (routine blood test, liver function, routine urine test); (4) female subjects who were pregnant, lactating, or menstruating; (5) history of drug or substance abuse.

The inclusion criteria for healthy controls were: no history of neuropsychiatric disease, no history of drug abuse or dependence, no systemic somatic disease, and no obvious abnormalities in routine laboratory tests.

1.2 Preparation of dried blood spots and biomarker extraction

(1) Blood was sampled from the enrolled depression patients and healthy controls, and dried blood spot samples were prepared as follows:

Prepare a dried blood spot card that is clean, free of contamination and mold, and sealed before use. Each card should carry independent identification information (name, age, gender, code). The dried blood spot card is divided into two parts: the first is the blood collection area, with an inner ring of radius 3 mm and an outer ring of radius 5 mm, which must not be touched; the second is the holding area, which should be handled only while wearing clean medical gloves. The dried blood spot card contains an antioxidant (VC) and the enzyme inactivating agent 1-aminobenzotriazole (ABT).

Capillary blood or venous blood is used for sampling. For fingertip capillary blood, the fingertip is cleaned beforehand; non-volatile disinfectants such as iodophor must not be used. One drop of blood is then dripped into each inner ring of the dried blood spot card so that the whole ring is covered and the filter paper is fully saturated, while keeping blood off the outer ring. Strong light is avoided during collection so that photosensitive substances are not affected. After collection, prolonged exposure of the blood spots to air is avoided, and the spots are dried quickly by nitrogen blowing or in a vacuum instrument. The collected dried blood spots are rapidly cooled in liquid nitrogen to inactivate enzymes and then air-dried. Cards must not be stacked; each card is packed in an individual sealed bag and stored frozen at -80 °C.

(2) A small disc 10 mm in diameter was cut along the outer ring line of the blood collection area with a punch and placed in a 1.5 mL centrifuge tube, and 500 μL of eluent/precipitant containing an internal standard was added. The tube was vortexed for about 5 minutes and centrifuged at 15000 g for 10 min; the supernatant was collected, centrifuged again (or filtered through a filter membrane), and the supernatant was collected. Then 500 μL of the methanol-water solution (metabolite extraction solvent) from the depression detection kit was added, and the mixture was shaken at 1450 rpm for 45 minutes at 20 °C;

(3) 10 μL of the extract was taken and 90 μL of water was added. Then 200 μL of the internal standard working solution, 50 μL of 1 M NaHCO₃ solution, and 100 μL of 1 mg/mL derivatizing reagent were added in that order, and the mixture was shaken at 1450 rpm for 30 minutes at 30 °C.

1.3 Detection of biomarkers

The detected metabolites include methionine, phenylalanine, guanine, hypoxanthine, gamma-aminobutyric acid, allantoin, uric acid, histidine, aspartic acid, threonine, androstenedione, cytosine, xanthine, bilirubin, pyruvic acid, linoleic acid, glutamine, serine, adipic acid, pseudouridine, tyrosine, proline, valine, ornithine, betaine, alanine, tryptophan, leucine, kynurenine, 5-hydroxytryptamine, creatine, succinic acid, taurine, 2-hydroxybutyric acid, glutamic acid, glucose, lysine, arginine. The method comprises the following specific steps:

(1) Sample preparation: take the sample to be analyzed and ensure that it is within the required concentration range. Dilute the sample as needed (usually with mobile phase). Filter the sample to remove possible particulate matter, typically with a 0.45 μm or 0.2 μm filter.

(2) Mobile phase selection and chromatographic separation: the metabolite extract is injected by an autosampler onto an Extend-C18 column (ZORBAX RR Extend-C18, 80 Å, 4.6 × 105 mm, 3.5 μm, USA) to separate the metabolites in the blood sample. The liquid chromatography conditions are: injection volume 5 μL; flow rate 0.6 mL/min; column temperature 40 °C; autosampler temperature 4 °C; mobile phase A, 0.1% (v/v) formic acid in water; mobile phase B, methanol. The elution program is: 0 to 1 min, linear change from 10% B to 40% B; 1 to 1.5 min, linear change from 40% B to 50% B; 1.5 to 2.0 min, linear change from 50% B to 80% B; 2.0 to 3.2 min, hold at 80% B; 3.2 to 3.6 min, linear change from 80% B to 10% B; 3.7 to 4.5 min, hold at 10% B.

(3) Instrument setup: check each part of the HPLC system, including the pump, injector, chromatographic column, and detector, and ensure that all are operating normally. The chromatographically separated metabolites are introduced into a triple quadrupole mass spectrometer and detected by scanning in multiple reaction monitoring (MRM) mode with an electrospray ionization (ESI) source and fast positive/negative polarity switching; the scan time is 0 to 4.5 min. Instrument parameters: CUR: 20; CAD: Medium; IS: 5000; TEM: 500; GS1: 30; GS2: 50; Interface heater (ihe): on; DP: ±36; CE: ±26; EP: ±15; CXP: ±10; Dwell Time: 20 ms; Resolution Q1: Unit; Resolution Q3: Unit; Pause between mass: 5.007 ms.

(4) Sample injection: the prepared sample is introduced into the system with an injector. Typical injection volumes are 10 to 100 μL. Ensure that the sample liquid flows smoothly and record the injection time.

(5) After data acquisition, each metabolic marker is quantified using its specific ion pair: the peak area of the acquired signal is compared with the working curve of the corresponding standard and corrected using the added internal standard, giving the concentration of the metabolic marker and hence its content in the original blood.

(6) The metabolites in the extract of the dried blood spot paper were quantitatively determined.
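Two parts of the procedure above lend themselves to a short worked sketch: the elution program of step (2), read as a piecewise-linear profile of %B over time, and the internal-standard quantification of step (5). The code below is an illustration under stated assumptions, not the patent's implementation: the 1.5 to 2.0 min segment is assumed to end at 80% B (the value held in the next segment), %B is assumed to stay at 10% between 3.6 and 3.7 min, the working curve is assumed to be an unweighted straight line of peak-area ratio against concentration, and all function names and example numbers are hypothetical.

```python
from typing import List, Sequence, Tuple

# (time in min, %B) breakpoints of the elution program in step (2).
# Assumptions: 1.5-2.0 min ramps to 80% B; 3.6-3.7 min is held at 10% B.
GRADIENT: List[Tuple[float, float]] = [
    (0.0, 10.0), (1.0, 40.0), (1.5, 50.0), (2.0, 80.0),
    (3.2, 80.0), (3.6, 10.0), (4.5, 10.0),
]


def percent_b(t: float) -> float:
    """Linearly interpolate %B at time t (minutes) within the 0-4.5 min run."""
    if not GRADIENT[0][0] <= t <= GRADIENT[-1][0]:
        raise ValueError("time outside the gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t <= t1:
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)
    raise AssertionError("unreachable: breakpoints cover the whole run")


def fit_working_curve(conc: Sequence[float],
                      ratio: Sequence[float]) -> Tuple[float, float]:
    """Least-squares line: ratio = slope * conc + intercept.

    `ratio` is analyte peak area divided by internal-standard peak area.
    """
    n = len(conc)
    mean_x = sum(conc) / n
    mean_y = sum(ratio) / n
    sxx = sum((x - mean_x) ** 2 for x in conc)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc, ratio))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantify(analyte_area: float, istd_area: float,
             slope: float, intercept: float) -> float:
    """Convert an internal-standard-corrected peak-area ratio to a concentration."""
    return (analyte_area / istd_area - intercept) / slope


# Hypothetical calibration standards (concentration, area ratio); not patent data.
standards_conc = [1.0, 5.0, 10.0, 50.0]
standards_ratio = [0.03, 0.11, 0.21, 1.01]
slope, intercept = fit_working_curve(standards_conc, standards_ratio)
print(percent_b(0.5), quantify(4100.0, 10000.0, slope, intercept))
```

Dividing by the internal-standard area before applying the working curve is what corrects for sample-to-sample variation in extraction and ionization.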

1.4 Stability test

After drying, the dried blood spot papers were stored at -80 °C, -20 °C, and room temperature, respectively. The content of substances in the dried blood spots was measured by liquid chromatography-mass spectrometry: samples stored at -80 °C were measured once a month, samples stored at -20 °C once a month, and samples stored at room temperature once every other day. The relative standard deviation (RSD) was calculated, and the recovery rate was calculated taking the mean of the initial measurement as the initial value. The results show that precision was within 10% and recovery was 80% to 110%; the dried blood spot paper can be stored stably for at least one year at -80 °C, for at least 3 months at -20 °C, and for 15 days at room temperature.
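The two stability statistics can be sketched as follows. This is a minimal illustration that assumes the usual definitions (RSD as sample standard deviation over mean, recovery as mean measured value over the initial mean) and treats the reported ranges of 10% and 80% to 110% as acceptance limits; the patent does not spell out the formulas, and the function names and example values are hypothetical.

```python
import statistics
from typing import Sequence, Tuple


def relative_standard_deviation(values: Sequence[float]) -> float:
    """RSD (%) = sample standard deviation / mean * 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100.0


def recovery(values: Sequence[float], initial_mean: float) -> float:
    """Recovery (%) = mean of later measurements / initial mean * 100."""
    return statistics.mean(values) / initial_mean * 100.0


def is_stable(values: Sequence[float], initial_mean: float,
              max_rsd: float = 10.0,
              recovery_range: Tuple[float, float] = (80.0, 110.0)) -> bool:
    """Assumed acceptance rule: RSD within 10% and recovery within 80-110%."""
    low, high = recovery_range
    return (relative_standard_deviation(values) <= max_rsd
            and low <= recovery(values, initial_mean) <= high)


# Hypothetical repeated measurements of one metabolite; not patent data.
measurements = [98.0, 100.0, 102.0]
print(relative_standard_deviation(measurements), recovery(measurements, 100.0))
```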

1.5 Detection results

Dried blood spot samples from 150 healthy people were collected, the metabolites in the samples were quantified, and the normal reference range of each marker was determined. The chromatogram of the metabolic markers is shown in FIG. 1. During dried blood spot sample analysis, one QC sample was inserted after every 10 study samples to assess the variability of the derivatization process and the instrumental analysis. QC samples are critical to ensuring the reproducibility, reliability, accuracy, and robustness of quantitative metabolite analysis.

Samples from 150 healthy people and 60 patients with depression were collected and tested at the same time. The results show that the levels of the following metabolites distinguish healthy people from patients with depression well: proline, betaine, alanine, tryptophan, kynurenine, 5-hydroxytryptamine, creatine, succinic acid, taurine, and 2-hydroxybutyric acid (Table 1). Principal component analysis (PCA) based on the determined plasma metabolites showed that the QC samples clustered tightly relative to the other dried blood spot samples, indicating that the method of this example has good reproducibility. The QC sample here is BSA (bovine serum albumin solution). Orthogonal partial least squares discriminant analysis (OPLS-DA) showed that healthy and depression samples were well separated, indicating a significant metabolic difference between depressed patients and healthy people.

TABLE 1 differential expression of metabolites in depressed versus healthy groups

Metabolite	Patients with depression	Healthy people	FC value	VIP value	P value
Proline (μmol/L)	81.48±15.57	98.71±16.94	0.83	1.56	<0.001
5-Hydroxytryptamine (ng/mL)	43.00±6.71	100.94±9.10	0.43	5.61	<0.001
Creatine (mg/dL)	0.95±0.23	0.81±0.21	1.17	3.15	<0.001
Betaine (ng/mL)	5674±1840	5017±1679	1.13	2.42	0.0175
Alanine (μmol/L)	328.37±105.55	375.79±88.67	0.87	3.71	0.0024
Tryptophan (μmol/L)	56.11±6.87	89.48±9.41	0.63	4.11	<0.001
Kynurenine (μmol/L)	2.75±0.82	2.23±0.45	1.23	1.87	<0.001
Succinic acid (ng/mL)	334.02±100.8	519.31±141.87	0.64	4.30	<0.001
Taurine (ng/mL)	14451±3199	11814±4429	1.22	1.13	<0.001
2-Hydroxybutyric acid (ng/mL)	6490±2624	4946±1883	1.31	1.91	<0.001
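The FC (fold change) column of Table 1 is consistent with the ratio of the depression-group mean to the healthy-group mean. The patent does not state the formula, so the sketch below rests on that inference; the group means and FC values are transcribed from Table 1, and the names `TABLE1` and `fold_change` are illustrative.

```python
from typing import Dict, Tuple

# (depressed mean, healthy mean, reported FC), transcribed from Table 1.
TABLE1: Dict[str, Tuple[float, float, float]] = {
    "proline": (81.48, 98.71, 0.83),
    "5-hydroxytryptamine": (43.00, 100.94, 0.43),
    "creatine": (0.95, 0.81, 1.17),
    "betaine": (5674.0, 5017.0, 1.13),
    "alanine": (328.37, 375.79, 0.87),
    "tryptophan": (56.11, 89.48, 0.63),
    "kynurenine": (2.75, 2.23, 1.23),
    "succinic acid": (334.02, 519.31, 0.64),
    "taurine": (14451.0, 11814.0, 1.22),
    "2-hydroxybutyric acid": (6490.0, 4946.0, 1.31),
}


def fold_change(depressed_mean: float, healthy_mean: float) -> float:
    """Assumed definition: FC = depressed-group mean / healthy-group mean."""
    return depressed_mean / healthy_mean


for name, (depressed, healthy, reported) in TABLE1.items():
    fc = fold_change(depressed, healthy)
    direction = "higher" if fc > 1 else "lower"
    print(f"{name}: FC = {fc:.2f} (reported {reported:.2f}), {direction} in depression")
```

An FC below 1 (for example 5-hydroxytryptamine and tryptophan) means the metabolite is lower in the depression group; an FC above 1 (for example kynurenine) means it is higher.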

1.6 Machine learning and feature selection

(1) Model construction method

The inventors aimed to identify metabolite features using machine learning techniques and to predict physiological states from metabolome data in order to better distinguish depressed patients. Metabolite biomarkers for distinguishing depressed patients were screened using the following machine learning methods: generalized linear model, linear discriminant analysis, K-nearest neighbor algorithm, logistic regression, lasso regression, decision tree, artificial neural network, support vector machine, extreme gradient boosting, random forest, principal component analysis, Bayesian network, and linear regression. A DBB strategy was employed to compensate for missing values. To evaluate model performance, receiver operating characteristic (ROC) curves and balanced accuracy were calculated. ROC curves were constructed and area under the curve (AUC) values were calculated using the pROC software package. The probability statistics for the ROC were calculated from the predictive score formula obtained by PLS analysis. The ROC curves for the whole metabolite profile can be compared across models according to the corresponding AUC values. In addition, variable importance in projection (VIP) values were analyzed, reflecting the relative importance of each metabolite in the projection model. A total of 10 metabolites were identified by combining machine learning (ML) with bioinformatics. To verify whether the selected metabolites can build a good classifier that distinguishes the different populations, ROC curves were used to quantify their diagnostic performance.
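The evaluation metrics named above can be made concrete with a small sketch. The source computes ROC curves with the pROC package (an R package); the Python below is a stand-in that uses the standard definitions of sensitivity, specificity, accuracy, and balanced accuracy, and computes AUC as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one (ties counted as one half). Function names and example numbers are hypothetical.

```python
from typing import Dict, Sequence


def confusion_metrics(tp: int, fn: int, tn: int, fp: int) -> Dict[str, float]:
    """Standard metrics from the four confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "balanced_accuracy": (sensitivity + specificity) / 2.0,
    }


def roc_auc(pos_scores: Sequence[float], neg_scores: Sequence[float]) -> float:
    """AUC via pairwise comparison of positive and negative scores."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Illustrative counts and scores only; not taken from the patent's data.
print(confusion_metrics(tp=9, fn=1, tn=18, fp=2))
print(roc_auc([0.9, 0.4], [0.5, 0.1]))
```

Balanced accuracy averages sensitivity and specificity, so it is not inflated when one class is much larger than the other, which matters for cohorts such as 150 controls against 60 patients.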

(2) Analysis of experimental results

Machine learning was used to identify plasma biomarkers and screen potential specific biomarkers that better distinguish depressed patients, and the disease symptoms were found to correlate to some extent with objective clinical scales. The inventors used machine learning to determine promising metabolite markers and predict physiological states from metabolomic data. First, different ML methods for classification were constructed (Table 2), and the performance of the different models was evaluated by analysis of the confusion matrix and ROC curves. Among the 13 machine learning methods, the best model was the generalized linear model (GLM): on the receiver operating characteristic (ROC) curve built from the significantly changed metabolites, the area under the curve (AUC) was 1, the accuracy was 99.30%, the sensitivity was 100%, and the specificity was 98.96% (see FIG. 3). This shows that the model has good diagnostic value. The combination of the metabolites proline, betaine, alanine, tryptophan, kynurenine, 5-hydroxytryptamine, creatine, succinic acid, taurine, and 2-hydroxybutyric acid shows excellent specificity and sensitivity.

TABLE 2 comparative analysis of results of different models for depression diagnosis

Model	AUC	Accuracy (%)	Sensitivity (%)	Specificity (%)
Generalized linear model	1.000	99.30	100.00	98.96
Linear discriminant analysis	0.9982	98.60	97.92	100.00
K-nearest neighbor algorithm	0.9576	94.41	97.92	87.23
Logistic regression	0.9976	97.90	97.92	97.87
Lasso regression	0.9981	97.20	98.96	93.62
Decision tree	0.9983	97.20	96.87	97.87
Artificial neural network	0.9749	93.71	93.75	93.62
Support vector machine	0.9978	97.20	97.92	95.74
Extreme gradient boosting	0.9989	99.30	98.96	100.00
Random forest	0.9996	97.90	96.87	100.00
Principal component analysis	0.9980	97.90	97.92	97.87
Bayesian network	0.9844	97.90	96.87	100.00
Linear regression	0.9953	99.30	98.96	100.00
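Selecting the optimal model from Table 2 can be sketched as a simple ranking. The values below are transcribed from Table 2; ranking by AUC first and accuracy second is an assumption made for illustration, since the description does not state the exact selection rule, and the names `TABLE2` and `rank_models` are hypothetical.

```python
from typing import Dict, List, Tuple

# model -> (AUC, accuracy %, sensitivity %, specificity %), from Table 2.
TABLE2: Dict[str, Tuple[float, float, float, float]] = {
    "Generalized linear model": (1.000, 99.30, 100.00, 98.96),
    "Linear discriminant analysis": (0.9982, 98.60, 97.92, 100.00),
    "K-nearest neighbor algorithm": (0.9576, 94.41, 97.92, 87.23),
    "Logistic regression": (0.9976, 97.90, 97.92, 97.87),
    "Lasso regression": (0.9981, 97.20, 98.96, 93.62),
    "Decision tree": (0.9983, 97.20, 96.87, 97.87),
    "Artificial neural network": (0.9749, 93.71, 93.75, 93.62),
    "Support vector machine": (0.9978, 97.20, 97.92, 95.74),
    "Extreme gradient boosting": (0.9989, 99.30, 98.96, 100.00),
    "Random forest": (0.9996, 97.90, 96.87, 100.00),
    "Principal component analysis": (0.9980, 97.90, 97.92, 97.87),
    "Bayesian network": (0.9844, 97.90, 96.87, 100.00),
    "Linear regression": (0.9953, 99.30, 98.96, 100.00),
}


def rank_models(table: Dict[str, Tuple[float, float, float, float]]) -> List[str]:
    """Sort model names by AUC, then accuracy, both descending (assumed rule)."""
    return sorted(table, key=lambda name: (table[name][0], table[name][1]),
                  reverse=True)


ranking = rank_models(TABLE2)
print("best model:", ranking[0])
print("top three:", ranking[:3])
```

Under this rule the generalized linear model ranks first, which agrees with the model reported as best for depression diagnosis.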

Experimental example 1 Evaluation of the diagnostic effect on depression of the metabolic marker kit and diagnostic model established in Example 1

A total of 300 dried blood spot samples from hospital outpatients or inpatients and from recruited healthy controls were randomly selected; for all samples, a definitive diagnosis made by a specialist according to the ICD-10 standard was collected and informed consent was signed. The ratio of the ion abundance of each biomarker (proline, betaine, alanine, tryptophan, kynurenine, 5-hydroxytryptamine, creatine, succinic acid, taurine, 2-hydroxybutyric acid) to that of the internal standard was measured directly by the method of Example 1, and the data were entered into the diagnostic model established in Example 1 to obtain diagnostic results, which were compared with the specialists' ICD-10 depression diagnoses for the 300 samples, as shown in Table 3.

TABLE 3 evaluation of the efficacy of Metabolic marker kits for diagnosis of depression

As can be seen from Table 3, using the marker combination of Example 1 and the diagnostic model established from it, the accuracy of depression diagnosis was 100%, the sensitivity was 100%, and the specificity was 100%.

Example 2 Metabolic marker screening and model establishment for differential diagnosis of unipolar and bipolar depression

Unipolar and bipolar depression patients were differentiated in the same manner as in Example 1, and the test results were as follows:

Samples from 150 unipolar depression patients and 60 bipolar depression patients were collected and tested. The results show that the levels of the following metabolites distinguish unipolar from bipolar depression patients well: proline, ornithine, kynurenine, 5-hydroxytryptamine, succinic acid, 2-hydroxybutyric acid, xanthine, bilirubin, gamma-aminobutyric acid, allantoin, deoxyribose, pyruvic acid, linoleic acid, and uric acid (Table 4). Models were then constructed with different algorithms (Table 5). The best model was based on logistic regression: on the receiver operating characteristic (ROC) curve built from the significantly changed metabolites, the area under the curve (AUC) was 1, with 100% accuracy, 100% sensitivity, and 100% specificity (see FIG. 4). The model has good differential diagnostic value. The combination of the metabolites proline, ornithine, kynurenine, 5-hydroxytryptamine, succinic acid, 2-hydroxybutyric acid, xanthine, bilirubin, gamma-aminobutyric acid, allantoin, deoxyribose, pyruvic acid, linoleic acid, and uric acid shows excellent specificity and sensitivity.

TABLE 4 differential expression of metabolites in bipolar depressed groups compared to unipolar depressed groups

TABLE 5 comparative analysis of results of different models for differential diagnosis of unipolar and bipolar depression

Model	AUC values	Accuracy (%)	Sensitivity (%)	Specificity (%)
Generalized linear model	0.9896	96.50%	97.92%	93.62%
Linear discriminant analysis	0.9991	99.30%	98.96%	100.00%
K-nearest neighbor algorithm	0.9996	99.30%	98.96%	100.00%
Logistic regression	1.000	100.00%	100.00%	100.00%
Lasso regression	0.9856	96.50%	97.92%	93.62%
Decision tree	0.9531	94.40%	98.96%	85.11%
Artificial neural network	0.9894	96.50%	97.92%	93.62%
Support vector machine	0.9802	95.80%	100.00%	87.23%
Extreme gradient boosting	0.9880	95.10%	97.92%	89.36%
Random forest	0.9807	95.10%	97.92%	89.36%
Principal component analysis	0.9957	95.80%	98.96%	89.36%
Bayesian network	0.9914	98.60%	100.00%	95.74%
Linear regression	0.9938	95.80%	98.96%	89.36%
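The comparison in Table 5 amounts to ranking the candidate algorithms by a performance metric. As a minimal sketch, assuming AUC is the ranking criterion (the text names the best model but does not state the selection rule), the rows of Table 5 can be ranked as follows. The variable names are illustrative.

```python
# Rows of Table 5: (model, AUC, accuracy %, sensitivity %, specificity %).
TABLE_5 = [
    ("Generalized linear model", 0.9896, 96.50, 97.92, 93.62),
    ("Linear discriminant analysis", 0.9991, 99.30, 98.96, 100.00),
    ("K-nearest neighbor algorithm", 0.9996, 99.30, 98.96, 100.00),
    ("Logistic regression", 1.000, 100.00, 100.00, 100.00),
    ("Lasso regression", 0.9856, 96.50, 97.92, 93.62),
    ("Decision tree", 0.9531, 94.40, 98.96, 85.11),
    ("Artificial neural network", 0.9894, 96.50, 97.92, 93.62),
    ("Support vector machine", 0.9802, 95.80, 100.00, 87.23),
    ("Extreme gradient boosting", 0.9880, 95.10, 97.92, 89.36),
    ("Random forest", 0.9807, 95.10, 97.92, 89.36),
    ("Principal component analysis", 0.9957, 95.80, 98.96, 89.36),
    ("Bayesian network", 0.9914, 98.60, 100.00, 95.74),
    ("Linear regression", 0.9938, 95.80, 98.96, 89.36),
]

# Assumed selection rule: highest AUC wins.
ranked = sorted(TABLE_5, key=lambda row: row[1], reverse=True)
best_model = ranked[0][0]
top_three = [row[0] for row in ranked[:3]]
```

Under this rule the selected model is logistic regression, which matches the model named in the text.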

Experimental example 2 evaluation of the differential diagnosis Effect of the Metabolic marker combinations and diagnostic models established in example 2 on unipolar depression and bipolar depression

A total of 220 dried blood spot samples were randomly taken from hospital outpatients or inpatients whose initial diagnosis was unipolar or bipolar depression; for every sample, the specialist's differential diagnosis based on medical history was available and informed consent had been signed. The ratio of the ion abundance of each biomarker in the combination (proline, ornithine, kynurenine, 5-hydroxytryptamine, succinic acid, 2-hydroxybutyric acid, xanthine, bilirubin, gamma-aminobutyric acid, allantoin, deoxyribose, pyruvic acid, linoleic acid and uric acid) to that of the internal standard was measured directly by the method of Example 2, and the data were input into the diagnostic model established in Example 2 to obtain a diagnosis result, which was compared with the specialist's differential diagnosis. The results are shown in Table 6.

TABLE 6 evaluation of the Effect of Metabolic marker kit for differential diagnosis of unipolar depression and bipolar depression

As can be seen from table 6, using the marker combination of example 2 and the diagnostic model thus established, the accuracy of identifying unipolar and bipolar depression was 97.27%, the sensitivity was 97.10% and the specificity was 97.56%.
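Accuracy, sensitivity and specificity follow directly from the four cells of a confusion matrix. The sketch below shows the arithmetic. The counts are an assumption: they were chosen to sum to the 220 samples and to reproduce the percentages reported above, but they are not taken from Table 6, and which diagnosis is treated as the positive class is not specified here.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Compute accuracy, sensitivity and specificity from confusion-matrix counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,   # correct calls over all samples
        "sensitivity": tp / (tp + fn),   # true positives over all positives
        "specificity": tn / (tn + fp),   # true negatives over all negatives
    }


# Hypothetical counts consistent with 220 samples (not from Table 6).
metrics = diagnostic_metrics(tp=134, fn=4, tn=80, fp=2)
percent = {name: round(value * 100, 2) for name, value in metrics.items()}
```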

Example 3 Metabolic marker screening and model establishment for hierarchical diagnosis of mild and moderate-to-severe depression

As in Example 1, patients with mild and moderate-to-severe depression were distinguished, and the test results were as follows:

Samples from 120 patients with mild depression and 50 patients with moderate-to-severe depression were collected and tested. The results showed that differences in the levels of the following metabolites distinguished mild from moderate-to-severe depression patients well: proline, valine, ornithine, tryptophan, 5-hydroxytryptamine, creatine, glutamine, serine, methionine, guanine, hypoxanthine, androstenedione, cytosine, bilirubin, gamma-aminobutyric acid, uric acid, adipic acid and pseudouridine (Table 7). Different algorithms were then used to construct models (Table 8); the optimal model was naive Bayes (naive_bayes). A receiver operating characteristic (ROC) curve was plotted using the significantly changed metabolites, giving an area under the curve (AUC) of 0.98, with an accuracy of 98.60%, a sensitivity of 97.87% and a specificity of 98.96% (see Fig. 5). The model therefore has good hierarchical diagnostic value, and the combination of the metabolites proline, valine, ornithine, tryptophan, 5-hydroxytryptamine, creatine, glutamine, serine, methionine, guanine, hypoxanthine, androstenedione, cytosine, bilirubin, gamma-aminobutyric acid, uric acid, adipic acid and pseudouridine shows excellent specificity and sensitivity.

TABLE 7 Differentially expressed metabolites in the mild depression group compared with the moderate-to-severe depression group

TABLE 8 Comparative analysis of results of different models for hierarchical diagnosis of mild and moderate-to-severe depression

Model	AUC values	Accuracy (%)	Sensitivity (%)	Specificity (%)
Generalized linear model	0.9395	88.11%	92.71%	78.72%
Linear discriminant analysis	0.8867	83.22%	83.33%	82.98%
K-nearest neighbor algorithm	0.9357	86.71%	92.71%	74.47%
Logistic regression	0.9153	90.91%	94.79%	82.98%
Lasso regression	0.8991	81.82%	86.46%	72.34%
Decision tree	0.9339	88.11%	90.62%	82.98%
Artificial neural network	0.9388	90.21%	92.71%	85.11%
Support vector machine	0.9457	89.51%	90.62%	87.23%
Extreme gradient boosting	0.9641	90.21%	90.62%	89.36%
Random forest	0.9386	86.71%	92.71%	74.49%
Principal component analysis	0.8204	80.42%	84.37%	72.34%
Bayesian network	0.9800	98.60%	97.87%	98.96%
Linear regression	0.9058	82.52%	88.54%	70.21%
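The three percentage columns in a table of this kind are not independent: accuracy is the average of sensitivity and specificity weighted by the number of positive and negative cases. The sketch below works through that relation. The class sizes of 96 and 47 are an assumption made for this illustration; they reproduce the percentages reported for the generalized linear model row of Table 8 but are not stated in the text.

```python
def accuracy_from_rates(sensitivity, specificity, n_pos, n_neg):
    """Accuracy as the class-size-weighted average of sensitivity and specificity."""
    return (sensitivity * n_pos + specificity * n_neg) / (n_pos + n_neg)


# Assumed class sizes (hypothetical, not given in the source).
N_POS, N_NEG = 96, 47

# Assumed correct calls per class: 89 of 96 positives, 37 of 47 negatives.
sensitivity = 89 / N_POS
specificity = 37 / N_NEG
accuracy = accuracy_from_rates(sensitivity, specificity, N_POS, N_NEG)

as_percent = [round(x * 100, 2) for x in (accuracy, sensitivity, specificity)]
```

With unequal class sizes, a model can show high accuracy while its specificity stays markedly lower, which is why the tables report all three figures.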

Experimental example 3 Evaluation of the effect of the metabolic marker combination and the diagnostic model established in Example 3 on the hierarchical diagnosis of mild and moderate-to-severe depression

Dried blood spot samples (501 in total) were randomly taken from hospital outpatients or inpatients; for every sample, a definitive grading diagnosis made by a specialist according to the ICD-10 standard was available and informed consent had been signed. The ratio of the ion abundance of each biomarker (proline, valine, ornithine, tryptophan, 5-hydroxytryptamine, creatine, glutamine, serine, methionine, guanine, hypoxanthine, androstenedione, cytosine, bilirubin, gamma-aminobutyric acid, uric acid, adipic acid and pseudouridine) to that of the internal standard was measured directly by the method of Example 3, and the data were input into the diagnostic model established in Example 3 to obtain grading diagnosis results, which were compared with the specialists' ICD-10 grading diagnoses. The results are shown in Table 9.

TABLE 9 Evaluation of the effect of the metabolic marker kit in the hierarchical diagnosis of mild and moderate-to-severe depression

As can be seen from Table 9, using the marker combination of Example 3 and the hierarchical diagnostic model established from it, the accuracy of distinguishing mild from moderate-to-severe depression was 100%, the sensitivity was 100% and the specificity was 100%.

Example 4 Metabolic marker screening and model establishment to assess whether treatment of depressed patients is effective

In the same manner as in Example 1, whether treatment of depression patients had an obvious effect was evaluated, and the test results were as follows:

Samples from two groups of treated depression patients, 140 cases and 60 cases, divided according to whether treatment had a significant effect, were collected and tested. The results showed that differences in the levels of the following metabolites distinguished patients in whom treatment was effective from those in whom it was not: ornithine, betaine, alanine, kynurenine, 5-hydroxytryptamine, glutamic acid, lysine, methionine, phenylalanine, guanine, hypoxanthine, gamma-aminobutyric acid, allantoin and uric acid (Table 10). Different algorithms were then used to construct models (Table 11); the best model was a random forest (RF). A receiver operating characteristic (ROC) curve was plotted using the significantly changed metabolites, giving an area under the curve (AUC) of 1, with 100% accuracy, 100% sensitivity and 100% specificity (see Fig. 6). The model therefore has good companion diagnostic value, and the combination of the metabolites ornithine, betaine, alanine, kynurenine, 5-hydroxytryptamine, glutamic acid, lysine, methionine, phenylalanine, guanine, hypoxanthine, gamma-aminobutyric acid, allantoin and uric acid shows excellent specificity and sensitivity.

TABLE 10 Differentially expressed metabolites in the effective-treatment group compared with the ineffective-treatment group

TABLE 11 comparative analysis of results of different models for assessing efficacy after treatment of depression

Model	AUC values	Accuracy (%)	Sensitivity (%)	Specificity (%)
Generalized linear model	0.9969	97.90%	98.96%	95.74%
Linear discriminant analysis	0.9576	94.40%	97.92%	87.23%
K-nearest neighbor algorithm	0.99649	96.50%	98.96%	91.49%
Logistic regression	0.99809	97.20%	98.96%	93.62%
Lasso regression	0.9895	96.50%	98.96%	91.49%
Decision tree	0.9976	96.50%	98.96%	91.49%
Artificial neural network	0.9946	95.10%	96.87%	91.49%
Support vector machine	0.9978	97.90%	98.96%	95.74%
Extreme gradient boosting	0.9980	97.90%	98.96%	95.74%
Random forest	1.000	100.00%	100.00%	100.00%
Principal component analysis	0.9969	97.90%	98.96%	95.74%
Bayesian network	0.9078	83.20%	89.58%	70.21%
Linear regression	0.9302	83.22%	95.83%	57.45%

Experimental example 4 Evaluation of the companion diagnostic effect of the diagnostic model constructed in Example 4 on whether treatment of depression patients is effective

Dried blood spot samples were randomly taken from hospital outpatients or inpatients; for every sample, rating-scale evaluation data were available and informed consent had been signed. The ratio of the ion abundance of each biomarker (ornithine, betaine, alanine, kynurenine, 5-hydroxytryptamine, glutamic acid, lysine, methionine, phenylalanine, guanine, hypoxanthine, gamma-aminobutyric acid, allantoin and uric acid) to that of the internal standard was measured directly by the method of Example 4, and the data were input into the prediction model established in Example 4 to obtain companion diagnosis results, which were compared with the companion diagnosis results of a specialist. The results are shown in Table 12.

TABLE 12 Evaluation of the effect of the metabolic marker kit in assessing whether treatment of depression patients is effective

As can be seen from Table 12, using the marker combination of Example 4 and the companion diagnostic model established from it, the accuracy of evaluating whether treatment for depression was effective was 95.14%, the sensitivity was 96.40% and the specificity was 92.00%.

Example 5 Metabolic marker screening and model establishment for predicting treatment outcome in depressed patients

In the same manner as in Example 1, the treatment outcome of depression patients was predicted, and the test results were as follows:

Samples from 130 depression patients with a favorable treatment outcome and 50 depression patients without a favorable treatment outcome were collected and tested. The results showed that differences in the levels of the following metabolites distinguished the two groups well: valine, betaine, tryptophan, kynurenine, creatine, taurine, phenylalanine, tyrosine, histidine, aspartic acid, threonine, guanine, gamma-aminobutyric acid, allantoin and pseudouridine (Table 13). Different algorithms were then used to construct models (Table 14); the best model was an artificial neural network (NNET). A receiver operating characteristic (ROC) curve was plotted using the significantly changed metabolites, giving an area under the curve (AUC) of 1, with 100% accuracy, 100% sensitivity and 100% specificity (see Fig. 7). The model therefore has good predictive value, and the combination of the metabolites valine, betaine, tryptophan, kynurenine, creatine, taurine, phenylalanine, tyrosine, histidine, aspartic acid, threonine, guanine, gamma-aminobutyric acid, allantoin and pseudouridine shows excellent specificity and sensitivity.

TABLE 13 Differentially expressed metabolites in the favorable-outcome depression group compared with the unfavorable-outcome depression group

TABLE 14 Comparative analysis of results of different models for predicting the treatment outcome of depression patients

Model	AUC values	Accuracy (%)	Sensitivity (%)	Specificity (%)
Generalized linear model	0.9113	86.01%	93.75%	70.21%
Linear discriminant analysis	0.8547	84.61%	89.58%	74.49%
K-nearest neighbor algorithm	0.9071	83.92%	93.75%	63.83%
Logistic regression	0.9281	85.31%	96.87%	61.70%
Lasso regression	0.8970	84.61%	89.58%	74.47%
Decision tree	0.9191	82.52%	93.75%	59.57%
Artificial neural network	1.000	100.00%	100.00%	100.00%
Support vector machine	0.9477	85.31%	94.79%	65.96%
Extreme gradient boosting	0.9410	84.61%	88.54%	76.6%
Random forest	0.9160	84.61%	92.71%	68.08%
Principal component analysis	0.9253	85.31%	95.83%	63.83%
Bayesian network	0.9188	86.71%	92.71%	74.47%
Linear regression	0.9375	85.31%	90.62%	74.47%

Experimental example 5 evaluation of the predictive Effect of the predictive model constructed in example 5 on treatment outcome of patients with depression

Dried blood spot samples were randomly taken from hospital outpatients or inpatients; for every sample, complete rating-scale evaluation data were available and informed consent had been signed. The biomarkers (valine, betaine, tryptophan, kynurenine, creatine, taurine, phenylalanine, tyrosine, histidine, aspartic acid, threonine, guanine, gamma-aminobutyric acid, allantoin and pseudouridine) were detected directly by the method of Example 5, and the ratio of the ion abundance of each biomarker to that of the internal standard was input into the prediction model established in Example 5 to obtain prediction results, which were compared with the rating-scale evaluation results. The results are shown in Table 15.

TABLE 15 evaluation of the effect of Metabolic marker kits to predict treatment outcome for patients with depression

As can be seen from table 15, using the marker combination of example 5 and the predictive model thus established, the accuracy of the prediction of treatment outcome for depressed patients was 93.91%, the sensitivity was 93.12% and the specificity was 95.71%.

Example 6 Metabolic marker screening and model establishment for predicting whether a depressive patient will relapse during the stationary phase

In the same manner as in Example 1, whether depression patients in the stationary phase would relapse was predicted, and the test results were as follows:

Samples from 130 depression patients without relapse in the stationary phase and 58 depression patients with relapse in the stationary phase were collected and tested. The results showed that differences in the levels of the following metabolites distinguished stationary-phase patients with relapse from those without relapse well: proline, ornithine, alanine, tryptophan, kynurenine, creatine, 2-hydroxybutyric acid, androstenedione, cytosine, xanthine, bilirubin, gamma-aminobutyric acid, allantoin, deoxyribose, pyruvic acid and linoleic acid (Table 16). Different algorithms were then used to construct models (Table 17); the optimal model was a support vector machine (SVM). A receiver operating characteristic (ROC) curve was plotted using the significantly changed metabolites, giving an area under the curve (AUC) of 1, with an accuracy of 99.30%, a sensitivity of 100% and a specificity of 98.96% (see Fig. 8). The model therefore has good predictive value, and the combination of the metabolites proline, ornithine, alanine, tryptophan, kynurenine, creatine, 2-hydroxybutyric acid, androstenedione, cytosine, xanthine, bilirubin, gamma-aminobutyric acid, allantoin, deoxyribose, pyruvic acid and linoleic acid shows excellent specificity and sensitivity.

TABLE 16 Differentially expressed metabolites in the relapse group compared with the non-relapse group

TABLE 17 comparative analysis of results of different models for predicting whether a depressed patient will relapse during the stationary phase

Model	AUC values	Accuracy (%)	Sensitivity (%)	Specificity (%)
Generalized linear model	0.9869	96.50%	97.92%	93.62%
Linear discriminant analysis	0.9511	94.40%	97.92%	87.23%
K-nearest neighbor algorithm	0.9867	95.80%	97.92%	91.49%
Logistic regression	0.9836	95.80%	98.96%	89.36%
Lasso regression	0.9815	94.40%	98.96%	85.11%
Decision tree	0.9776	94.40%	98.96%	85.11%
Artificial neural network	0.9742	94.40%	98.96%	85.11%
Support vector machine	1.000	99.30%	100.00%	98.96%
Extreme gradient boosting	0.9894	95.80%	96.87%	93.62%
Random forest	0.9874	97.90%	98.96%	95.74%
Principal component analysis	0.9865	95.80%	97.92%	91.49%
Bayesian network	0.9869	96.50%	97.92%	93.62%
Linear regression	0.9362	86.01%	90.62%	76.59%

Experimental example 6 evaluation of the predictive Effect of the predictive model constructed in example 6 on the recurrence of depression patient in stationary phase

Dried blood spot samples were randomly taken from hospital outpatients or inpatients; for every sample, a definitive diagnosis made by a specialist according to the ICD-10 standard was available and informed consent had been signed. The biomarkers (proline, ornithine, alanine, tryptophan, kynurenine, creatine, 2-hydroxybutyric acid, androstenedione, cytosine, xanthine, bilirubin, gamma-aminobutyric acid, allantoin, deoxyribose, pyruvic acid and linoleic acid) were detected directly by the method of Example 6, and the data were input into the prediction model established in Example 6 to obtain prediction results, which were compared with the specialist's diagnoses, as shown in Table 18.

TABLE 18 evaluation of the effect of Metabolic marker kit to predict the recurrence of patients with depression in stationary phase

As can be seen from table 18, using the marker combination of example 6 and the predictive model thus established, the accuracy of the prediction of the stationary phase recurrence for the depressed patient was 96.63%, the sensitivity was 90.00% and the specificity was 97.34%.
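Examples 1 to 6 each define their own marker panel, so a system that serves all six purposes has to pick the right subset of measured ratios for the requested model. The following is a minimal sketch of that routing step only. The panel contents are taken from the examples above; the purpose keys, function name and input layout are hypothetical, and no trained model is included.

```python
# Marker panels as listed in Examples 1 to 6 (purpose keys are illustrative).
PANELS = {
    "diagnosis": ["proline", "betaine", "alanine", "tryptophan", "kynurenine",
                  "5-hydroxytryptamine", "creatine", "succinic acid", "taurine",
                  "2-hydroxybutyric acid"],
    "differential": ["proline", "ornithine", "kynurenine", "5-hydroxytryptamine",
                     "succinic acid", "2-hydroxybutyric acid", "xanthine",
                     "bilirubin", "gamma-aminobutyric acid", "allantoin",
                     "deoxyribose", "pyruvic acid", "linoleic acid", "uric acid"],
    "grading": ["proline", "valine", "ornithine", "tryptophan",
                "5-hydroxytryptamine", "creatine", "glutamine", "serine",
                "methionine", "guanine", "hypoxanthine", "androstenedione",
                "cytosine", "bilirubin", "gamma-aminobutyric acid", "uric acid",
                "adipic acid", "pseudouridine"],
    "companion": ["ornithine", "betaine", "alanine", "kynurenine",
                  "5-hydroxytryptamine", "glutamic acid", "lysine", "methionine",
                  "phenylalanine", "guanine", "hypoxanthine",
                  "gamma-aminobutyric acid", "allantoin", "uric acid"],
    "outcome": ["valine", "betaine", "tryptophan", "kynurenine", "creatine",
                "taurine", "phenylalanine", "tyrosine", "histidine",
                "aspartic acid", "threonine", "guanine",
                "gamma-aminobutyric acid", "allantoin", "pseudouridine"],
    "relapse": ["proline", "ornithine", "alanine", "tryptophan", "kynurenine",
                "creatine", "2-hydroxybutyric acid", "androstenedione",
                "cytosine", "xanthine", "bilirubin", "gamma-aminobutyric acid",
                "allantoin", "deoxyribose", "pyruvic acid", "linoleic acid"],
}


def select_features(purpose, measured_ratios):
    """Return, in panel order, the ratios needed for the requested purpose."""
    if purpose not in PANELS:
        raise KeyError(f"unknown purpose: {purpose}")
    missing = [m for m in PANELS[purpose] if m not in measured_ratios]
    if missing:
        raise ValueError(f"missing markers for {purpose}: {missing}")
    return [measured_ratios[m] for m in PANELS[purpose]]


# Every distinct marker that one assay would need to cover all six panels.
all_markers = sorted(set().union(*PANELS.values()))

# Hypothetical measurement: every ratio set to 1.0 for illustration.
sample = {marker: 1.0 for marker in all_markers}
grading_input = select_features("grading", sample)
```

Failing loudly on a missing marker avoids silently feeding an incomplete feature vector to a model.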

The embodiments of the present invention have been described in detail above, but the present invention is not limited to the described embodiments. It will be apparent to those skilled in the art that various changes, modifications, substitutions and alterations can be made to these embodiments without departing from the principles and spirit of the invention, and yet fall within the scope of the invention.

Claims

1. A depression disease management system based on targeted metabolomics and machine learning models, characterized by comprising an intelligent diagnostic model module, a differential diagnostic model module, a hierarchical diagnostic model module, a companion diagnostic model module, a treatment endpoint outcome prediction model module, a relapse prediction model module and a comprehensive judgment module;

wherein the intelligent diagnostic model module is used for identifying depression patients; the differential diagnostic model module is used for distinguishing bipolar depression from unipolar depression; the hierarchical diagnostic model module is used for distinguishing mild depression from moderate-to-severe depression; the companion diagnostic model module is used for evaluating whether treatment of a depression patient has taken effect; the treatment endpoint outcome prediction model module is used for intelligently predicting the patient's clinical treatment outcome and providing a basis for optimizing the treatment plan; the relapse prediction model module is used for predicting disease relapse in stationary-phase patients so that intervention measures can be taken in time; and the comprehensive judgment module is used for integrating the results of the diagnosis, differentiation and prediction models of the other modules to construct the depression disease management system;

wherein the method of using the depression disease management system comprises the following steps:

S5, constructing a treatment endpoint outcome prediction model through the treatment endpoint outcome prediction model module, for intelligently predicting the patient's clinical treatment outcome and providing a basis for optimizing the treatment plan;

S6, constructing a relapse prediction model through the relapse prediction model module, for predicting disease relapse in stationary-phase patients so that intervention measures can be taken in time;

S7, integrating the diagnosis, differentiation and prediction models through the comprehensive judgment module to construct the depression disease management system;

the specific operation of constructing the intelligent diagnostic model through the intelligent diagnostic model module in S1 is: collecting a blood sample and detecting the following metabolic molecular marker combination: proline, betaine, alanine, tryptophan, kynurenine, 5-hydroxytryptamine, creatine, succinic acid, taurine and 2-hydroxybutyric acid; constructing models with different machine learning models, and screening the metabolic marker combination under the optimal model for identifying depression patients;

the specific operation of constructing the differential diagnostic model through the differential diagnostic model module in S2 is: collecting a blood sample and detecting the following metabolic molecular marker combination: proline, ornithine, kynurenine, 5-hydroxytryptamine, succinic acid, 2-hydroxybutyric acid, xanthine, bilirubin, gamma-aminobutyric acid, allantoin, deoxyribose, pyruvic acid, linoleic acid and uric acid; and constructing the optimal model with different machine learning models, for distinguishing patients with unipolar depression from patients with bipolar depression;

the specific operation of constructing the hierarchical diagnostic model through the hierarchical diagnostic model module in S3 is: collecting a blood sample and detecting the following metabolic molecular marker combination: proline, valine, ornithine, tryptophan, 5-hydroxytryptamine, creatine, glutamine, serine, methionine, guanine, hypoxanthine, androstenedione, cytosine, bilirubin, gamma-aminobutyric acid, uric acid, adipic acid and pseudouridine; and screening the optimal model from different machine learning models, for distinguishing mild depression from moderate-to-severe depression;

the specific operation of constructing the companion diagnostic model through the companion diagnostic model module in S4 is: collecting a blood sample and detecting the following metabolic molecular marker combination: ornithine, betaine, alanine, kynurenine, 5-hydroxytryptamine, glutamic acid, lysine, methionine, phenylalanine, guanine, hypoxanthine, gamma-aminobutyric acid, allantoin and uric acid; and constructing the optimal model with different machine learning models, for evaluating whether treatment of a depression patient has taken effect and for assisting the screening of therapeutic drug regimens for patients at different stages of the disease course;

the specific operation of constructing the treatment endpoint outcome prediction model through the treatment endpoint outcome prediction model module in S5 is: tracking the treatment and recovery of depression patients under treatment, and detecting at different stages the following metabolic molecular marker combination: valine, betaine, tryptophan, kynurenine, creatine, taurine, phenylalanine, tyrosine, histidine, aspartic acid, threonine, guanine, gamma-aminobutyric acid, allantoin and pseudouridine; and constructing the optimal model with different machine learning models, for intelligently predicting the patient's clinical treatment outcome and providing a basis for optimizing the treatment plan;

the specific operation of constructing the relapse prediction model through the relapse prediction model module in S6 is: tracking the condition of depression patients, and detecting at different stages the following metabolic molecular marker combination: proline, ornithine, alanine, tryptophan, kynurenine, creatine, 2-hydroxybutyric acid, androstenedione, cytosine, xanthine, bilirubin, gamma-aminobutyric acid, allantoin, deoxyribose, pyruvic acid and linoleic acid;

the specific operation of the comprehensive judgment module in S7 is: collecting peripheral blood metabolite concentration data at different stages by tracking the disease progression of depression patients, and developing, with machine learning algorithms, a model system based on multiple metabolic molecular markers for depression diagnosis, hierarchical diagnosis, differential diagnosis, companion diagnosis, treatment outcome prediction and stationary-phase relapse prediction, wherein the disease management system provides an easy-to-operate interactive interface based on the peripheral blood metabolite concentration data and outputs the corresponding results and disease management decision suggestions according to the intended purpose;

the machine learning models comprise a generalized linear model, linear discriminant analysis, the K-nearest neighbor algorithm, logistic regression, lasso regression, a decision tree, an artificial neural network, a support vector machine, extreme gradient boosting, random forest, principal component analysis, a Bayesian network and linear regression.

2. The depression disease management system of claim 1, wherein the sample in which the metabolic molecular marker combination is detected is human serum, plasma or a dried blood spot.