Method for detecting imatinib metabolite in plasma of GIST patient based on non-targeted metabonomics
1. A method for detecting imatinib metabolites in the plasma of GIST patients based on non-targeted metabolomics, comprising the steps of:
S1, collecting peripheral venous blood from a plurality of GIST patients who have not taken IM and from a plurality of GIST patients who have taken IM regularly over a long period, centrifuging the collected peripheral venous blood, and storing at low temperature to obtain plasma samples;
S2, determining the IM blood concentration in the plasma samples by two-dimensional liquid chromatography, and dividing the plasma samples into at least two groups according to the measured blood concentration values;
S3, measuring an appropriate volume of plasma sample from each group, transferring each into a centrifuge tube, adding a volume of methanol larger than the volume of the plasma sample in the tube, vortexing, and then incubating at low temperature; vortexing the incubated sample, centrifuging it, collecting the supernatant, and letting the supernatant stand; centrifuging the supernatant again after standing, and transferring the supernatant into a sample vial to obtain a sample to be tested;
S4, performing non-targeted detection on the sample to be tested to obtain raw data on the metabolites in the plasma;
S5, performing multivariate analysis on the raw data; combining the variable importance in projection (VIP) value of the OPLS-DA model with the P value of the t-test and the fold change (FC) value of the fold change analysis; screening for differential metabolites under the conditions VIP ≥ 1, P < 0.05, and FC ≥ 2 or FC ≤ 0.5; and searching a database to qualitatively identify GIST metabolic markers;
S6, performing KEGG pathway analysis on the metabolic markers to determine the metabolic processes of the GIST patients.
2. The method for detecting imatinib metabolites in the plasma of GIST patients based on non-targeted metabolomics as claimed in claim 1, wherein in step S1 the peripheral venous blood is collected in a centrifuge tube and centrifuged at 5000 r/min for 5 min at room temperature, and the supernatant is collected, transferred to an EP tube, and stored at -80 ℃ to obtain the plasma sample.
3. The method of claim 1, wherein in step S2, based on the measured blood concentration values, samples with an IM blood concentration > 2000 ng/mL are assigned to group A, samples with an IM blood concentration of 1100-2000 ng/mL to group B, samples with an IM blood concentration < 1100 ng/mL to group C, and samples from GIST patients who have not taken IM to group D.
4. The method for detecting imatinib metabolites in the plasma of GIST patients based on non-targeted metabolomics of claim 1, wherein the methanol of step S3 contains 1 ppm of 2-chlorophenylalanine, and the mixture is vortexed and then incubated at -20 ℃ for 0.5 h; the incubated plasma sample is centrifuged at 12000 r/min at 4 ℃ for 10 min, and 200 uL of supernatant is collected, transferred into a new centrifuge tube, and left to stand at -20 ℃ for 0.5 h; and after standing, the supernatant is centrifuged again at 12000 r/min at 4 ℃ for 15 min and transferred into a sample vial to obtain the sample to be tested.
5. The method for detecting imatinib metabolites in the plasma of GIST patients based on non-targeted metabolomics according to claim 1, wherein the acquisition conditions of the UPLC-QTOF/MS combination technique used in step S4 are:
UPLC: Waters T3 C18 column, column temperature 40 ℃, flow rate 0.35 mL/min, injection volume 2 uL, mobile phase A (0.04% acetic acid/water) and mobile phase B (0.04% acetic acid/acetonitrile);
QTOF/MS: an electrospray ion source operated in positive ion mode and negative ion mode, wherein the positive ion mode uses a voltage of 250 V, gas flow 8 L/min, fragmentor voltage 135 V, gas temperature 325 ℃, sheath gas temperature 325 ℃, sheath gas flow 11 L/min and nebulizer 40 psi; the negative ion mode uses a voltage of 1500 V, gas flow 8 L/min, fragmentor voltage 135 V, gas temperature 325 ℃, sheath gas flow 11 L/min and nebulizer 40 psi;
wherein UPLC denotes ultra-high performance liquid chromatography and QTOF/MS denotes quadrupole time-of-flight mass spectrometry.
6. The method for detecting imatinib metabolites in the plasma of GIST patients based on non-targeted metabolomics as claimed in claim 1, wherein before the multivariate analysis is performed on the raw data in step S5, the raw data is converted into mzML format, and peak alignment, retention time correction and peak area extraction are then performed on the spectra of the raw data.
7. The method for detecting imatinib metabolites in the plasma of GIST patients based on non-targeted metabolomics according to claim 6, wherein the multivariate analysis of the raw data in step S5 comprises: after peak alignment, retention time correction and peak area extraction are performed on the spectra of the raw data, the data are processed by autoscaling and subjected to unsupervised principal component analysis, orthogonal partial least squares discriminant analysis, Student's t-test and fold change analysis.
8. The method for detecting imatinib metabolites in the plasma of GIST patients based on non-targeted metabolomics according to claim 1, wherein before the KEGG pathway analysis of the metabolic markers, bioinformatics analysis including cluster analysis, correlation analysis and violin plot analysis is further performed to evaluate the rationality of the differential metabolites.
9. The method for detecting imatinib metabolites in the plasma of GIST patients based on non-targeted metabolomics according to claim 1, wherein the database in step S5 is one or more of the METLIN database, the HMDB database, the metabolites database or the LMSD database.
Background
Gastrointestinal stromal tumors (GIST) are the most common mesenchymal tumors of the digestive tract, accounting for about 0.3%-1% of gastrointestinal tumors, and they are occurring in increasingly younger patients. Imatinib mesylate (IM) is a tyrosine kinase inhibitor (TKI) that effectively inhibits c-Kit activity. It is used clinically as a targeted drug for GIST and has achieved remarkable results in patients at high risk after complete resection and in patients with unresectable, recurrent, metastatic or advanced disease. Studies have shown that IM, as an oral small-molecule targeted drug, has about 75% intra-individual variation and 60% inter-individual variation, and that about 80% of patients develop secondary resistance after 1-3 years of treatment. Screening GIST patients for suitability for IM therapy and predicting sensitivity to IM therapy are therefore clinical challenges.
Currently, the plasma biomarkers applied to digestive tract tumors include carcinoembryonic antigen (CEA), AFP, CA199, CA125 and CA72-4. These plasma markers are mainly used for auxiliary screening and diagnosis of GIST; once GIST has been confirmed, research on biomarkers for IM treatment remains a blank. The genome of cancer cells in a tumor is unstable and prone to new mutations, whereas changes in metabolites more closely reflect the phenotype of the organism. Research on plasma-specific metabolites as biomarkers for IM treatment of GIST is therefore of great social benefit for screening GIST patients suitable for IM treatment.
Metabonomics follows the research approach of genomics and proteomics: it quantitatively analyzes all metabolites in an organism and seeks the relationships between the metabolites and physiological and pathological changes, and it is a component of systems biology. Non-targeted metabonomics detects, without bias, all detectable metabolite molecules in a sample and can comprehensively and effectively reflect differences between organisms. Nuclear magnetic resonance spectroscopy and mass spectrometry are the principal means of characterizing metabolites and can be combined with chromatography to improve sensitivity and accuracy. Because plasma metabolites are complex, conventional analytical approaches struggle to give accurate results. Liquid chromatography-tandem mass spectrometry can overcome background interference, improve the signal-to-noise ratio, and still achieve high sensitivity on complex samples.
Therefore, using non-targeted metabonomics to study biomarkers suitable for IM treatment lays a foundation for evaluating the effectiveness of drug treatment, for targeted and noninvasive screening, and for marker characterization after a digestive tract tumor is diagnosed.
Disclosure of Invention
In view of the above problems, the invention provides a method for detecting imatinib metabolites in the plasma of GIST patients based on non-targeted metabonomics. The method uses metabolites to reflect the phenotype of the organism and identifies biomarkers in the metabolic process, thereby providing technical support for the effective treatment of GIST patients suitable for IM treatment and laying a foundation for evaluating the effectiveness of drug treatment, for targeted and noninvasive screening, and for marker characterization after a digestive tract tumor is diagnosed.
The technical scheme of the invention is as follows:
1. A method for detecting imatinib metabolites in the plasma of GIST patients based on non-targeted metabolomics, comprising the steps of:
S1, collecting peripheral venous blood from a plurality of GIST patients who have not taken IM and from a plurality of GIST patients who have taken IM regularly over a long period, centrifuging the collected peripheral venous blood, and storing at low temperature to obtain plasma samples;
S2, determining the IM blood concentration in the plasma samples by two-dimensional liquid chromatography, and dividing the plasma samples into at least two groups according to the measured blood concentration values;
S3, measuring an appropriate volume of plasma sample from each group, transferring each into a centrifuge tube, adding a volume of methanol larger than the volume of the plasma sample in the tube, vortexing, and then incubating at low temperature; vortexing the incubated sample, centrifuging it, collecting the supernatant, and letting the supernatant stand; centrifuging the supernatant again after standing, and transferring the supernatant into a sample vial to obtain a sample to be tested;
S4, performing non-targeted detection on the sample to be tested to obtain raw data on the metabolites in the plasma;
S5, performing multivariate analysis on the raw data; combining the variable importance in projection (VIP) value of the OPLS-DA model with the P value of the t-test and the fold change (FC) value of the fold change analysis; screening for differential metabolites under the conditions VIP ≥ 1, P < 0.05, and FC ≥ 2 or FC ≤ 0.5; and searching a database to qualitatively identify GIST metabolic markers;
S6, performing KEGG pathway analysis on the metabolic markers to determine the metabolic processes of the GIST patients.
The working principle of the technical scheme is as follows:
IM is a tyrosine kinase inhibitor (TKI) used as a targeted drug for the clinical treatment of GIST. However, patients taking IM show large individual differences, and because the genome of cancer cells in tumors is unstable, new mutations arise easily; as a result, there is currently no technical means for screening GIST patients suitable for IM treatment. Changes in metabolites are close to the phenotype of the organism, so the differential metabolites obtained can reflect pathological changes in the human body. The inventors use the UPLC-QTOF/MS combination technique to measure the blood metabolite data of different patients and group the patients by blood concentration, so that the association between differences in blood concentration and differences in metabolite expression can be seen more clearly and intuitively. The UPLC-QTOF/MS measurement yields fragment information for unknown substances, and metabolite structures are identified by searching a database. Finally, through statistical analysis based on the OPLS-DA results, differential metabolites between groups can be preliminarily screened by VIP value: metabolites with VIP ≥ 1 are generally considered significantly different. GIST metabolic markers are then further screened by combining the P value of the t-test and the FC from the univariate analysis: metabolites whose expression level differs between groups by a factor of 2 or more, or of 0.5 or less, that is, FC ≥ 2 or FC ≤ 0.5, are selected, and P < 0.05 in the t-test is considered significant, so the differential metabolites obtained by screening are more reliable. The method can provide technical support for the effective treatment of GIST patients suitable for IM treatment, and lays a foundation for evaluating the effectiveness of drug treatment, for targeted and noninvasive screening, and for marker characterization after a digestive tract tumor is diagnosed.
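The screening rule of step S5 can be illustrated by the following minimal Python sketch. The class name MetaboliteStats, the function names and the example values are hypothetical and serve only as illustration; the thresholds (VIP ≥ 1, P < 0.05, FC ≥ 2 or FC ≤ 0.5) are those stated in step S5, and the VIP, P and FC values are assumed to have been computed beforehand.

```python
from dataclasses import dataclass


@dataclass
class MetaboliteStats:
    """Per-metabolite statistics (hypothetical container for illustration)."""
    name: str
    vip: float          # VIP value from the OPLS-DA model
    p_value: float      # P value from the t-test
    fold_change: float  # FC between two groups


def is_differential(m, vip_min=1.0, p_max=0.05, fc_up=2.0, fc_down=0.5):
    """Apply the step S5 rule: VIP >= 1, P < 0.05, and FC >= 2 or FC <= 0.5."""
    fc_ok = m.fold_change >= fc_up or m.fold_change <= fc_down
    return m.vip >= vip_min and m.p_value < p_max and fc_ok


def screen(metabolites):
    """Return the names of metabolites that pass all three criteria."""
    return [m.name for m in metabolites if is_differential(m)]


# Made-up example values, not measured data.
candidates = [
    MetaboliteStats("feature_1", vip=1.8, p_value=0.003, fold_change=2.6),
    MetaboliteStats("feature_2", vip=1.2, p_value=0.020, fold_change=1.4),  # FC between 0.5 and 2
    MetaboliteStats("feature_3", vip=0.7, p_value=0.001, fold_change=0.3),  # VIP below 1
    MetaboliteStats("feature_4", vip=1.1, p_value=0.049, fold_change=0.5),
]
print(screen(candidates))  # ['feature_1', 'feature_4']
```

Requiring all three criteria at once means a metabolite is retained only when the multivariate model and the univariate tests agree.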
In a further technical scheme, in step S1, the peripheral venous blood is collected in a centrifuge tube and centrifuged at 5000 r/min for 5 min at room temperature, and the supernatant is collected, transferred to an EP tube, and stored at -80 ℃ to obtain a plasma sample.
With the rotation speed set to 5000 r/min and the time to 5 min, small-molecule compounds can be thoroughly separated in a short time. The supernatant is transferred to a refrigerator for low-temperature storage at -80 ℃ immediately after collection, which prevents protein aggregation and allows a longer storage time.
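In a further embodiment, in step S2, based on the measured blood concentration values, group A is defined as IM blood concentration > 2000 ng/mL, group B is defined as IM blood concentration 1100-2000 ng/mL, group C is defined as IM blood concentration < 1100 ng/mL, and group D is defined as GIST patients who have not taken IM.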
Grouping the patients makes it possible to see, under the same medication conditions, how differences in blood concentration among patients relate to differences in metabolite expression, which facilitates visual analysis and the search for biomarkers.
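The grouping can be illustrated by the following minimal Python sketch, which uses the thresholds stated for groups A to D in claim 3. The function name assign_group and the convention of passing None for a patient who has not taken IM are assumptions made for illustration.

```python
def assign_group(im_ng_per_ml):
    """Assign a plasma sample to group A, B, C or D.

    im_ng_per_ml: measured IM blood concentration in ng/mL, or None for a
    GIST patient who has not taken IM (hypothetical convention).
    """
    if im_ng_per_ml is None:
        return "D"            # patient has not taken IM
    if im_ng_per_ml > 2000:
        return "A"            # > 2000 ng/mL
    if im_ng_per_ml >= 1100:
        return "B"            # 1100-2000 ng/mL
    return "C"                # < 1100 ng/mL


# Made-up concentrations for illustration only.
for value in (2500, 2000, 1100, 850, None):
    print(value, assign_group(value))
```

The boundary values 1100 ng/mL and 2000 ng/mL fall into group B, consistent with group A being strictly above 2000 ng/mL and group C strictly below 1100 ng/mL.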
In a further technical scheme, the methanol in step S3 contains 1 ppm of 2-chlorophenylalanine, and the mixture is vortexed and then incubated at -20 ℃ for 0.5 h; the incubated plasma sample is centrifuged at 12000 r/min at 4 ℃ for 10 min, and 200 uL of supernatant is collected, transferred into a new centrifuge tube, and left to stand at -20 ℃ for 0.5 h; after standing, the supernatant is centrifuged again at 12000 r/min at 4 ℃ for 15 min and transferred into a sample vial to obtain the sample to be tested.
Keeping the plasma sample at -20 ℃ inhibits cellular metabolism and stabilizes temperature-sensitive components, avoids freeze-thaw cycles, and allows long storage. Because heat generated during centrifugation is detrimental to the stability of the analytes, centrifugation is carried out at a controlled temperature of 4 ℃. Because the sample has been held at low temperature, centrifuging it again removes material that could interfere with the measurement.
In a further technical scheme, the acquisition conditions of the UPLC-QTOF/MS combination technique in step S4 are as follows:
UPLC: Waters T3 C18 column, column temperature 40 ℃, flow rate 0.35 mL/min, injection volume 2 uL, mobile phase A (0.04% acetic acid/water) and mobile phase B (0.04% acetic acid/acetonitrile);
QTOF/MS: an electrospray ion source operated in positive ion mode and negative ion mode, wherein the positive ion mode uses a voltage of 250 V, gas flow 8 L/min, fragmentor voltage 135 V, gas temperature 325 ℃, sheath gas temperature 325 ℃, sheath gas flow 11 L/min and nebulizer 40 psi; the negative ion mode uses a voltage of 1500 V, gas flow 8 L/min, fragmentor voltage 135 V, gas temperature 325 ℃, sheath gas flow 11 L/min and nebulizer 40 psi;
wherein UPLC denotes ultra-high performance liquid chromatography and QTOF/MS denotes quadrupole time-of-flight mass spectrometry.
Through repeated creative work, the inventors found that with UPLC-QTOF/MS operated under the above acquisition conditions, background interference can be overcome, the signal-to-noise ratio is improved, and high sensitivity can still be achieved on complex samples; the quantitative performance is good, the fragment information obtained for unknown substances has clear peak shapes, and the results are more accurate.
In a further technical solution, before the multivariate analysis is performed on the raw data in step S5, the raw data is converted into mzML format, and peak alignment, retention time correction and peak area extraction are then performed on the spectra of the raw data.
Because the file format of the mass spectrometer is fixed, its output is not suitable for analysis software that supports other data formats. ProteoWizard performs the format conversion, offers a wide choice of formats, and supports mzML, which can be processed directly by XCMS software, so the processing workflow is smooth and broadly applicable.
In a further technical solution, the multivariate analysis of the raw data in step S5 specifically includes: after peak alignment, retention time correction and peak area extraction are performed on the spectra of the raw data, the data are processed by autoscaling and subjected to unsupervised principal component analysis, orthogonal partial least squares discriminant analysis, Student's t-test and fold change analysis.
Unsupervised principal component analysis (PCA) is an effective method for compressing data and reducing its dimensionality based on the covariance matrix of the variables. PCA of the samples gives a preliminary view of the overall metabolic differences between groups and of the degree of variation between samples within a group. For samples with small between-group differences, a regression model is established by orthogonal partial least squares discriminant analysis (OPLS-DA), which divides the data set into a part related to the experimental purpose and a part unrelated to it; orthogonal signal correction, a noise-filtering technique, removes the metabolic changes caused by factors unrelated to the experimental purpose, so that the resulting OPLS-DA model captures between-group differences more accurately and ignores random within-group differences. Fold change analysis calculates the difference in metabolite expression between GIST patients taking IM and GIST patients not taking IM and selects the metabolites with a fold change ≥ 2 or ≤ 0.5; the statistically significant differential metabolites are finally determined in combination with Student's t-test. Used together, these analysis methods yield a comprehensive and accurate set of differential metabolites.
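The univariate quantities mentioned above can be illustrated by the following minimal Python sketch for a single metabolite. It assumes that autoscaling means mean-centering followed by division by the sample standard deviation, and that Student's t-test is the pooled-variance two-sample form; the function names and the peak-area values are hypothetical. Converting the t statistic into a P value requires the t distribution with n_a + n_b - 2 degrees of freedom, which is left to a statistics library.

```python
from math import sqrt
from statistics import mean, stdev, variance


def autoscale(values):
    """Mean-center and scale to unit variance (assumed meaning of autoscaling)."""
    mu, sd = mean(values), stdev(values)
    return [(v - mu) / sd for v in values]


def fold_change(treated, untreated):
    """FC as the ratio of mean abundance, IM-treated over untreated."""
    return mean(treated) / mean(untreated)


def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


# Made-up peak areas for one metabolite, for illustration only.
treated = [8.0, 10.0, 12.0]
untreated = [3.0, 4.0, 5.0]

print(autoscale(treated))                # [-1.0, 0.0, 1.0]
print(fold_change(treated, untreated))   # 2.5
print(round(student_t(treated, untreated), 4))
```

In this made-up example the fold change of 2.5 meets the FC ≥ 2 criterion; whether the metabolite is retained would still depend on its P value and VIP value.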
In a further technical scheme, before the KEGG pathway analysis is performed on the metabolic markers, bioinformatics analysis is performed to evaluate the rationality of the differential metabolites, wherein the bioinformatics analysis comprises cluster analysis, correlation analysis and violin plot analysis.
After the differential metabolites have been screened out by multivariate analysis, their rationality is evaluated from different dimensions by methods such as cluster analysis, correlation analysis and violin plot analysis, and the relationships among samples are displayed more intuitively and comprehensively. This helps to screen marker metabolites accurately and ensures the accuracy of the screened metabolic markers. Finally, KEGG pathway analysis is performed to obtain a standard, complete and accurate metabolic pathway map.
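The correlation analysis can be illustrated by the following minimal Python sketch, which computes pairwise correlation coefficients between the abundance profiles of differential metabolites across samples. The use of the Pearson coefficient is an assumption, since the text does not name the coefficient; the function names, metabolite labels and values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_matrix(profiles):
    """Map each ordered pair of metabolite names to its correlation."""
    names = list(profiles)
    return {(a, b): pearson(profiles[a], profiles[b]) for a in names for b in names}


# Made-up abundance profiles across four samples, for illustration only.
profiles = {
    "met_1": [1.0, 2.0, 3.0, 4.0],
    "met_2": [2.0, 4.0, 6.0, 8.0],   # rises together with met_1
    "met_3": [4.0, 3.0, 2.0, 1.0],   # falls as met_1 rises
}
matrix = correlation_matrix(profiles)
print(round(matrix[("met_1", "met_2")], 3))
print(round(matrix[("met_1", "met_3")], 3))
```

A matrix of this kind is what a correlation heat map visualizes: values near 1 indicate metabolites that rise and fall together, and values near -1 indicate opposite trends.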
In a further embodiment, the database in step S5 is one or more of the METLIN database, the HMDB database, the metabolites database, or the LMSD database.
Searching the database reveals the types of the differential metabolites. The databases cover a rich and comprehensive range of detectable substances and support both simple and complex searches, so the differential metabolites can be identified qualitatively and effectively.
The invention has the beneficial effects that:
1. the invention uses non-targeted metabonomics to detect metabolites in the plasma of GIST patients, uses the metabolites to reflect the phenotype of the organism, and identifies biomarkers in the metabolic process, thereby providing technical support for the effective treatment of GIST patients suitable for IM treatment and laying a foundation for evaluating the effectiveness of drug treatment, for targeted and noninvasive screening, and for marker characterization after a digestive tract tumor is diagnosed;
2. compared with conventional LC-MS, GC-MS or single-technique detection, the UPLC-QTOF/MS used in the invention can overcome background interference, improve the signal-to-noise ratio, and still achieve high sensitivity on complex samples; its quantitative performance is good, the fragment information obtained for unknown substances has clear peak shapes, and the results are more accurate;
3. by testing the blood samples of patients and grouping them according to blood concentration, the invention makes it possible to see, under the same medication conditions, how differences in blood concentration among patients relate to differences in metabolite expression, which facilitates visual analysis and the search for biomarkers;
4. the data processing combines univariate analysis with multivariate analysis, which gives more accurate statistical results and more comprehensive coverage of substances, is simple to operate and broadly applicable, and is of clear social value;
5. the invention evaluates the rationality of the differential metabolites from different dimensions by methods such as cluster analysis, correlation analysis and violin plot analysis, and displays the relationships among samples more intuitively and comprehensively, which helps to screen marker metabolites accurately and ensures the accuracy of the screened metabolic markers. Finally, KEGG pathway analysis is performed, and the resulting metabolic pathway map is standard, complete and accurate.
Drawings
FIG. 1 is an overlay of the total ion chromatograms (TIC) from mass spectrometric detection of the QC samples according to an embodiment of the invention.
FIG. 2 is a PCA score plot according to an embodiment of the invention.
FIG. 3 is an OPLS-DA score plot and model validation plot according to an embodiment of the invention.
FIG. 4 is a volcano plot of the differential metabolites according to an embodiment of the invention.
FIG. 5 is a plot of the VIP values of the differential metabolites according to an embodiment of the invention.
FIG. 6 is a bar graph of the top 20 differential metabolites by fold change according to an embodiment of the invention.
FIG. 7 is a cluster analysis heat map of the differential metabolites according to an embodiment of the invention.
FIG. 8 is a correlation heat map of the differential metabolites according to an embodiment of the invention.
FIG. 9 is a combined violin plot of the differential metabolites according to an embodiment of the invention.
FIG. 10 is an enrichment bubble plot of the differential metabolites according to an embodiment of the invention.
FIG. 11 is a functional annotation diagram of the differential metabolites according to an embodiment of the invention.
Detailed Description
The following describes a specific embodiment of the present invention.
The technical scheme of the invention is as follows:
1. A method for detecting imatinib metabolites in the plasma of GIST patients based on non-targeted metabolomics, comprising the steps of:
S1, collecting peripheral venous blood from a plurality of GIST patients who have not taken IM and from a plurality of GIST patients who have taken IM regularly over a long period, centrifuging the collected peripheral venous blood, and storing at low temperature to obtain plasma samples;
S2, determining the IM blood concentration in the plasma samples by two-dimensional liquid chromatography, and dividing the plasma samples into at least two groups according to the measured blood concentration values;
S3, measuring an appropriate volume of plasma sample from each group, transferring each into a centrifuge tube, adding a volume of methanol larger than the volume of the plasma sample in the tube, vortexing, and then incubating at low temperature; vortexing the incubated sample, centrifuging it, collecting the supernatant, and letting the supernatant stand; centrifuging the supernatant again after standing, and transferring the supernatant into a sample vial to obtain a sample to be tested;
S4, performing non-targeted detection on the sample to be tested to obtain raw data on the metabolites in the plasma;
S5, performing multivariate analysis on the raw data; combining the variable importance in projection (VIP) value of the OPLS-DA model with the P value of the t-test and the fold change (FC) value of the fold change analysis; screening for differential metabolites under the conditions VIP ≥ 1, P < 0.05, and FC ≥ 2 or FC ≤ 0.5; and searching a database to qualitatively identify GIST metabolic markers;
S6, performing KEGG pathway analysis on the metabolic markers to determine the metabolic processes of the GIST patients.
The working principle of the technical scheme is as follows:
IM is a tyrosine kinase inhibitor (TKI) used as a targeted drug for the clinical treatment of GIST. However, patients taking IM show large individual differences, and because the genome of cancer cells in tumors is unstable, new mutations arise easily; as a result, there is currently no technical means for screening GIST patients suitable for IM treatment. Changes in metabolites are close to the phenotype of the organism, so the differential metabolites obtained can reflect pathological changes in the human body. The inventors use the UPLC-QTOF/MS combination technique to measure the blood metabolite data of different patients and group the patients by blood concentration, so that the association between differences in blood concentration and differences in metabolite expression can be seen more clearly and intuitively. The UPLC-QTOF/MS measurement yields fragment information for unknown substances, and metabolite structures are identified by searching a database. Finally, through statistical analysis based on the OPLS-DA results, differential metabolites between groups can be preliminarily screened by VIP value: metabolites with VIP ≥ 1 are generally considered significantly different. GIST metabolic markers are then further screened by combining the P value of the t-test and the FC from the univariate analysis: metabolites whose expression level differs between groups by a factor of 2 or more, or of 0.5 or less, that is, FC ≥ 2 or FC ≤ 0.5, are selected, and P < 0.05 in the t-test is considered significant, so the differential metabolites obtained by screening are more reliable. The method can provide technical support for the effective treatment of GIST patients suitable for IM treatment, and lays a foundation for evaluating the effectiveness of drug treatment, for targeted and noninvasive screening, and for marker characterization after a digestive tract tumor is diagnosed.
In another embodiment, in step S1, the peripheral venous blood is collected in a centrifuge tube and centrifuged at 5000 r/min for 5 min at room temperature, and the supernatant is transferred to an EP tube and stored at -80 ℃ to obtain a plasma sample.
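In another embodiment, in step S2, based on the measured blood concentration values, group A is defined as IM blood concentration > 2000 ng/mL, group B is defined as IM blood concentration 1100-2000 ng/mL, group C is defined as IM blood concentration < 1100 ng/mL, and group D is defined as GIST patients who have not taken IM.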
In another embodiment, the methanol in step S3 contains 1 ppm of 2-chlorophenylalanine, and the mixture is vortexed and then incubated at -20 ℃ for 0.5 h; the incubated plasma sample is centrifuged at 12000 r/min at 4 ℃ for 10 min, and 200 uL of supernatant is collected, transferred into a new centrifuge tube, and left to stand at -20 ℃ for 0.5 h; after standing, the supernatant is centrifuged again at 12000 r/min at 4 ℃ for 15 min and transferred into a sample vial to obtain the sample to be tested.
In another embodiment, the acquisition conditions of UPLC-QTOF/MS in step S4 are:
UPLC: Waters T3 C18 column, column temperature 40 ℃, flow rate 0.35 mL/min, injection volume 2 uL, mobile phase A (0.04% acetic acid/water) and mobile phase B (0.04% acetic acid/acetonitrile);
QTOF/MS: an electrospray ion source operated in positive ion mode and negative ion mode, wherein the positive ion mode uses a voltage of 250 V, gas flow 8 L/min, fragmentor voltage 135 V, gas temperature 325 ℃, sheath gas temperature 325 ℃, sheath gas flow 11 L/min and nebulizer 40 psi; the negative ion mode uses a voltage of 1500 V, gas flow 8 L/min, fragmentor voltage 135 V, gas temperature 325 ℃, sheath gas flow 11 L/min and nebulizer 40 psi;
wherein UPLC denotes ultra-high performance liquid chromatography and QTOF/MS denotes quadrupole time-of-flight mass spectrometry.
In another embodiment, before the multivariate analysis is performed on the raw data in step S5, the raw data is converted into mzML format using ProteoWizard, and the spectra of the raw data are then subjected to peak alignment, retention time correction and peak area extraction by XCMS software.
In another embodiment, the multivariate analysis performed on the raw data in step S5 specifically includes: after the XCMS software processing, the data are processed by autoscaling and subjected to unsupervised principal component analysis, orthogonal partial least squares discriminant analysis, Student's t-test and fold change analysis.
In another embodiment, before the KEGG pathway analysis is performed on the metabolic markers, bioinformatics analysis including cluster analysis, correlation analysis and violin analysis is also performed to evaluate the rationality of the differential metabolites.
In another embodiment, the database in step S5 is one or more of the METLIN database, the HMDB database, the metabolites database, or the LMSD database.
The embodiments of the present invention will be further described with reference to the accompanying drawings.
Example 1:
a method for non-targeted metabolomics based detection of imatinib metabolites in plasma of GIST patients comprising the steps of:
1. sample collection and classification
The study included 40 patients with GIST who had not taken IM and who had taken IM regularly for a long time, wherein the number of patients who had not taken IM and who had taken IM regularly for a long time was 1: 3, respectively drawing 5mL of peripheral venous blood into an anticoagulation tube, immediately and softly turning upside down for 8 times to uniformly mix the anticoagulation agent with the blood (the action is softly to avoid hemolysis), smoothly and quickly transferring to a laboratory, centrifuging for 5min at room temperature of 5000r/min, taking 300uL of supernatant, transferring to a clean EP tube, and storing in a refrigerator at-80 ℃ for testing.
As shown in Table 1, the plasma samples were numbered 1 to 40, and the IM blood concentration in each numbered plasma sample was analyzed in turn by two-dimensional liquid chromatography. Based on the measured blood concentration values, the samples were divided into four groups: group A, IM blood concentration > 2000 ng/mL; group B, IM blood concentration from 1100 ng/mL to 2000 ng/mL; group C, IM blood concentration < 1100 ng/mL; and group D, GIST patients who had not taken IM.
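The grouping rule can be written as a short function. This is only a sketch: it assumes that group B covers the range between the two stated cut-offs (1100 to 2000 ng/mL, both inclusive), which the text does not spell out, and the function name `assign_group` and the example concentrations are hypothetical.

```python
def assign_group(takes_im, im_ng_per_ml=None):
    """Assign a plasma sample to group A, B, C or D.

    takes_im: True if the patient takes IM regularly, False otherwise.
    im_ng_per_ml: measured IM blood concentration in ng/mL
                  (ignored for patients who do not take IM).
    """
    if not takes_im:
        return "D"  # control group: GIST patients not taking IM
    if im_ng_per_ml is None:
        raise ValueError("a concentration is required for patients taking IM")
    if im_ng_per_ml > 2000:
        return "A"  # high concentration
    if im_ng_per_ml >= 1100:
        return "B"  # medium concentration (assumed 1100 to 2000 ng/mL inclusive)
    return "C"      # low concentration


# Illustrative concentrations only, not measured values from Table 1.
examples = [assign_group(True, 2500.0), assign_group(True, 1500.0),
            assign_group(True, 800.0), assign_group(False)]
```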
Table 1 shows the grouping of the subjects according to the IM concentration in plasma:
Table 1:
2. Characterization of the test
2.1 Instruments and reagents
Instruments: mass spectrometer (QTOF/MS-6545, Agilent), ultra-high performance liquid chromatograph (1290 Infinity LC, Agilent), vortex mixer (MIX-200, Shanghai Jingxin), centrifuge (5427R, Eppendorf, Germany), fully automatic multistage two-dimensional liquid chromatography coupler (FLC-2701, MLC2420, Hunan Demiter Instruments, Inc.), ultra-low temperature refrigerator (Forma 900 series, Thermo Scientific), ultrapure water machine (ULUP, Sichuan super pure science, Inc.).
Reagents: methanol (chromatographically pure, Merck), formic acid (chromatographically pure, Thermo Fisher), acetonitrile (chromatographically pure, Merck), 2-chlorophenylalanine (Thermo Fisher).
2.2 Sample processing
100 μL of plasma sample from each of groups A, B, C and D was measured and transferred into a 1.5 mL centrifuge tube; 300 μL of methanol containing 1 ppm 2-chlorophenylalanine was added, and the mixture was vortexed for 2 min and incubated in a refrigerator at -20 ℃ for 0.5 h. The incubated sample was vortexed for 2 min and centrifuged at 12000 r/min and 4 ℃ for 10 min; 200 μL of the supernatant was collected, transferred into a new 1.5 mL centrifuge tube, and left to stand at -20 ℃ for 0.5 h. After standing, the supernatant was centrifuged again at 12000 r/min and 4 ℃ for 15 min and transferred into a sample vial to obtain the sample to be detected. The vortex mixer was used for vortexing, the centrifuge for centrifugation, and the ultra-low temperature refrigerator for the low-temperature steps.
2.3 Non-targeted detection
The samples to be detected were subjected to non-targeted detection by UPLC-QTOF/MS and injected sequentially to obtain the raw metabolite data of the plasma samples.
2.3.1 Liquid chromatography conditions
Column: Waters T3 C18 (2.1 mm i.d. × 100 mm, 1.8 μm; C18 MWM-13); column temperature: 40 ℃; flow rate: 0.35 mL/min; injection volume: 2 μL; mobile phase A: 0.04% acetic acid in water; mobile phase B: 0.04% acetic acid in acetonitrile. From 0 min to 10 min, the proportion of mobile phase A (pump A) is decreased at a constant rate from 95% to 5% while the proportion of mobile phase B (pump B) is increased at a constant rate from 5% to 95%; this ratio is maintained from 10 min to 11 min; from 11 min to 11.1 min, the proportion of mobile phase A is increased at a constant rate from 5% to 95% and the proportion of mobile phase B is decreased at a constant rate from 95% to 5%; this ratio is then held until 14 min.
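The gradient program can be expressed as a piecewise-linear function of time. The sketch below assumes that the run starts at 0 min and that the final ratio is held until 14 min (the text's "keep for 14 min" is read that way here); the names `GRADIENT` and `percent_b` are hypothetical.

```python
# (time in min, % mobile phase B) breakpoints of the gradient described above.
# Assumption: the final 5% B plateau lasts until 14 min.
GRADIENT = [(0.0, 5.0), (10.0, 95.0), (11.0, 95.0), (11.1, 5.0), (14.0, 5.0)]


def percent_b(t_min):
    """Return the percentage of mobile phase B at time t_min,
    linearly interpolating between neighbouring breakpoints."""
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time is outside the gradient program")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise ValueError("no gradient segment found")  # not reached for valid input


def percent_a(t_min):
    """Mobile phase A makes up the remainder of the binary mixture."""
    return 100.0 - percent_b(t_min)
```

For instance, halfway through the first ramp (5 min) this gives 50% B and 50% A, and any time between 11.1 min and 14 min gives the re-equilibration composition of 5% B.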
2.3.2 Mass Spectrometry conditions
In positive ion mode: voltage (Voltage): 250 V; gas flow rate (Gas Flow): 8 L/min; fragmentor voltage (Fragmentor): 135 V; gas temperature (Gas Temperature): 325 ℃; sheath gas temperature (Sheath Temperature): 325 ℃; sheath gas flow (Sheath Flow): 11 L/min; nebulizer pressure (Nebulizer): 40 psi.
In negative ion mode: voltage (Voltage): 1500 V; gas flow rate (Gas Flow): 8 L/min; fragmentor voltage (Fragmentor): 135 V; gas temperature (Gas Temperature): 325 ℃; sheath gas temperature (Sheath Temperature): 325 ℃; sheath gas flow (Sheath Flow): 11 L/min; nebulizer pressure (Nebulizer): 40 psi.
2.3.3 Quality control analysis
(a) Preparation of quality control (QC) samples: before instrument analysis, 15 μL was taken from each sample and mixed thoroughly to obtain the QC sample.
(b) Verification of the stability of the instrument and the analytical method: during instrument analysis, one QC sample was inserted after every 15 samples to check the repeatability of the analytical process.
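The QC insertion scheme in (b) can be sketched as a small helper that builds the injection order. It assumes that a QC injection follows each completed block of 15 study samples and that no additional QC injections are placed at the start or end of the run, which the text does not state; `build_injection_sequence` and the sample labels are hypothetical.

```python
def build_injection_sequence(sample_ids, qc_interval=15, qc_label="QC"):
    """Return the injection order with one pooled QC sample inserted
    after every `qc_interval` study samples (assumed placement)."""
    sequence = []
    for count, sample_id in enumerate(sample_ids, start=1):
        sequence.append(sample_id)
        if count % qc_interval == 0:
            sequence.append(qc_label)
    return sequence


# 40 plasma samples numbered 1 to 40, as in the example.
samples = [f"S{i:02d}" for i in range(1, 41)]
run_order = build_injection_sequence(samples)
```

With 40 samples and an interval of 15, this yields two QC injections, one after sample 15 and one after sample 30.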
3. Multivariate data analysis
First, the raw data were converted into mzML format using ProteoWizard, and peak alignment, retention time correction, and peak area extraction were then performed on the spectra of the raw data using XCMS software. After XCMS processing, the data were auto-scaled and subjected to unsupervised principal component analysis (PCA), orthogonal partial least squares discriminant analysis (OPLS-DA), Student's t-test, and fold change (FC) analysis. Differential metabolites were screened by combining the variable importance in projection (VIP) value of the OPLS-DA model (VIP ≥ 1) with the FC value (FC ≥ 2 or FC ≤ 0.5) and the P value of the t-test (P < 0.05), and the GIST metabolites were identified by searching databases. Here, the fold change (FC) describes the difference in metabolite abundance between GIST patients taking IM and those not taking IM, expressed as the ratio of the metabolite content after administration to that before administration: FC ≥ 2 indicates that the content increased after administration, and FC ≤ 0.5 indicates that it decreased. Metabolites with FC ≥ 2 or FC ≤ 0.5 were therefore classified as differential metabolites.
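The three screening thresholds combine as a simple logical filter, sketched below. The VIP, P and FC values are taken as already computed by the upstream analysis; the `Feature` class, the function `classify`, the feature names, and all numeric values are hypothetical and serve only to illustrate the rule.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str           # hypothetical feature label
    vip: float          # VIP value from the OPLS-DA model
    p_value: float      # P value of the t-test
    fold_change: float  # FC value (IM group relative to control)


def classify(feature, vip_min=1.0, p_max=0.05, fc_up=2.0, fc_down=0.5):
    """Apply VIP >= 1, P < 0.05 and (FC >= 2 or FC <= 0.5)."""
    if feature.vip < vip_min or feature.p_value >= p_max:
        return "not differential"
    if feature.fold_change >= fc_up:
        return "up"
    if feature.fold_change <= fc_down:
        return "down"
    return "not differential"


# Illustrative values only, not data from the study.
features = [
    Feature("feature_1", vip=1.8, p_value=0.003, fold_change=3.1),
    Feature("feature_2", vip=1.2, p_value=0.020, fold_change=0.4),
    Feature("feature_3", vip=0.7, p_value=0.001, fold_change=5.0),
    Feature("feature_4", vip=1.5, p_value=0.200, fold_change=2.5),
    Feature("feature_5", vip=1.1, p_value=0.010, fold_change=1.3),
]
labels = {f.name: classify(f) for f in features}
```

Here only the first two features pass all three criteria; the others fail on VIP, on the P value, or on the fold change, respectively.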
Finally, the METLIN and HMDB databases were searched in parallel, and the metabolic pathways involving the plasma metabolic markers were compiled. The plausibility of the differential metabolites was evaluated by cluster analysis, correlation analysis, and violin plot analysis; KEGG pathway analysis was then used to find the metabolic markers related to GIST and to analyze their metabolic network.
Results:
UPLC-QTOF/MS is a precision instrument, and many factors during use, such as temperature, humidity, and instrument cleanliness, can introduce systematic errors into data acquisition. High instrument stability is therefore an important guarantee of the repeatability and reliability of the data. A total ion chromatogram (TIC) describes the change of the total ion current over time; its ordinate represents the total intensity of the ion current and its abscissa represents the time at which the ions are generated. By overlaying the TICs of different QC samples, the repeatability of metabolite extraction and detection can be judged. A and B in FIG. 1 show the overlaid TICs of the QC samples in positive ion and negative ion modes, respectively (FIGS. 1 to 11 refer to the original color drawings filed concurrently with the application documents). As shown in FIG. 1, the response intensities and retention times of the chromatographic peaks in ESI+ and ESI- modes substantially overlap, and in the PCA plots of FIG. 2 the QC samples cluster together, which indicates that the variation caused by instrument error was small throughout the experiment.
FIG. 2 shows the two-dimensional PCA score plots, in which green represents the high concentration group (group A), orange the medium concentration group (group B), blue the low concentration group (group C), red the control group (group D), and MIX the quality control samples described above. A and B in FIG. 2 are the PCA score plots of all groups and the quality control samples in ESI+ and ESI- modes, respectively; C and D are the PCA score plots of D vs A in ESI+ and ESI- modes, respectively; E and F are the PCA score plots of D vs B in ESI+ and ESI- modes, respectively; and G and H are the PCA score plots of D vs C in ESI+ and ESI- modes, respectively. The abscissa PC1 represents the first principal component and the ordinate PC2 represents the second principal component. The PCA results in ESI+ mode show a tendency for the metabolomes of the groups to separate, indicating that the metabolomes differ between groups. The PCA score plots in ESI- mode show some overlap between the metabolomes of groups A, C and D, so the following analyses were performed in ESI+ mode only.
In FIG. 3, A and B are the OPLS-DA score plot and the OPLS-DA model validation plot of D vs A in ESI+ mode, respectively; C and D are the OPLS-DA score plot and the OPLS-DA model validation plot of D vs B, respectively; and E and F are the OPLS-DA score plot and the OPLS-DA model validation plot of D vs C, respectively. The abscissa of the OPLS-DA score plot represents the score of the predictive principal component in the orthogonal signal correction (OSC) process, so the difference between groups can be read along the abscissa; the ordinate represents the score of the orthogonal component in the OSC process, so the difference between samples within a group can be read along the ordinate. Each point in the plot represents the position of the metabolome of one sample projected onto a two-dimensional plane after dimension reduction, and different groups are distinguished by color, with green representing the control group (group D) and red representing the experimental group. The plots show that each group clusters within a relatively compact range and that the groups are clearly separated.
The abscissa of the OPLS-DA model validation plot represents the accuracy of the model, the ordinate represents the frequency of the accuracies of the 200 models generated in 200 permutation tests, and the arrow marks the position of the accuracy of the actual OPLS-DA model. R2X and R2Y represent the proportion of the X and Y matrices, respectively, explained by the established model, and Q2 represents the predictive ability of the model; R2Y and Q2 values greater than 0.5, and the closer to 1 the better, indicate a good model fit. As the plots show, in ESI+ mode the R2Y and Q2 values for the control group versus each experimental group were 0.979 and 0.874 (D vs A), 0.977 and 0.858 (D vs B), and 0.971 and 0.820 (D vs C), all close to 1 with P < 0.05. These data indicate that the models are not overfitted, have predictive ability for the grouping, and can be used to compare the differences between groups.
A, B and C in FIG. 4 are the volcano plots of D vs A, D vs B and D vs C, respectively, in ESI+ mode. In a volcano plot, each point represents one metabolite: red represents a significantly up-regulated metabolite, green a significantly down-regulated metabolite, and gray a metabolite without significant change. The size of a point represents its VIP value, a larger point indicating a larger VIP value. The abscissa is the log2 of the fold change of the metabolite, with log2(fold change) ≥ 1 for up-regulated metabolites and log2(fold change) ≤ -1 for down-regulated metabolites; the ordinate is the -log10 of the P value of the univariate t-test, a larger value indicating greater significance. As can be seen from the figure, the numbers of significantly up-regulated and down-regulated metabolites in ESI+ mode were 125 and 168 for D vs A, 113 and 168 for D vs B, and 84 and 195 for D vs C, respectively.
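The coordinates of a point in such a volcano plot follow directly from the FC and P values. The sketch below assumes that a point is colored as significant when P < 0.05 and the log2 fold change is at least 1 in absolute value; whether the VIP value also enters the coloring is not stated in the text. The function name `volcano_point` and the input values are hypothetical.

```python
import math


def volcano_point(fold_change, p_value, p_max=0.05):
    """Return (x, y, label) for one metabolite in a volcano plot:
    x = log2(fold change), y = -log10(P value)."""
    x = math.log2(fold_change)
    y = -math.log10(p_value)
    if p_value < p_max and x >= 1:
        label = "up"               # drawn in red
    elif p_value < p_max and x <= -1:
        label = "down"             # drawn in green
    else:
        label = "not significant"  # drawn in gray
    return x, y, label


# Illustrative values only.
x_up, y_up, label_up = volcano_point(4.0, 0.001)
x_down, y_down, label_down = volcano_point(0.25, 0.01)
_, _, label_ns = volcano_point(3.0, 0.2)
```

The cut-offs log2(FC) ≥ 1 and log2(FC) ≤ -1 are the same thresholds as FC ≥ 2 and FC ≤ 0.5, expressed on the logarithmic axis.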
After the multivariate analysis of the raw data, differential metabolites were screened by combining the VIP value of the OPLS-DA model with the P value of the t-test and the FC value of the fold change analysis, under the conditions VIP ≥ 1, P < 0.05, and FC ≥ 2 or FC ≤ 0.5; part of the differential metabolites found are shown in Table 2.
Table 2 shows the results of differential metabolite screening between the two groups of each comparison:
Table 2:
In ESI+ mode, 874 variables in the OPLS-DA model of D vs A had VIP ≥ 1 and thus contributed markedly to separating the high IM blood concentration group from the control group. Further t-test and fold change analysis of these variables found 150 metabolites meeting the screening criteria, of which 64 were up-regulated and 86 down-regulated. For unknown metabolites, once the molecular ion peak had been determined, high-resolution mass spectrometry was used to determine the accurate molecular weight and derive the elemental composition; possible identities were then queried by reference to secondary mass spectra and other information, in combination with online databases such as HMDB and METLIN and a laboratory in-house database. In this way 29 metabolites were identified, of which 8 were up-regulated and 21 down-regulated. In the same manner, 863 variables in the OPLS-DA model of D vs B in ESI+ mode had VIP ≥ 1; 149 metabolites met the screening criteria (56 up-regulated and 93 down-regulated), and 34 of them were identified (9 up-regulated and 25 down-regulated). In the OPLS-DA model of D vs C in ESI+ mode, 848 variables had VIP ≥ 1; 148 metabolites met the screening criteria (42 up-regulated and 106 down-regulated), and 33 of them were identified (11 up-regulated and 22 down-regulated).
Table 2 lists, for each pairwise comparison, the top 20 differential metabolites identified at the secondary mass spectrometry level, ranked by VIP value in descending order.
FIGS. 5 and 6 rank the known differential metabolites in descending order of VIP value and log2FC value, respectively. A, B and C of FIG. 5 show the VIP values of the known differential metabolites of D vs A, D vs B and D vs C, respectively, in ESI+ mode, with the abscissa representing the VIP value and the ordinate representing the differential metabolite. A, B and C of FIG. 6 show the FC bar charts of the known differential metabolites of D vs A, D vs B and D vs C, respectively, in ESI+ mode (each showing the top 20 differential metabolites), with the abscissa representing the log2FC value and the ordinate representing the differential metabolite.
A, B and C of FIG. 7 show the clustering heat maps of the known differential metabolites of D vs A, D vs B and D vs C, respectively, in ESI+ mode, from which it can be seen that metabolite expression differs greatly between groups and only slightly within groups. Cluster analysis is commonly used to determine the metabolic patterns of metabolites under different conditions. Hierarchical cluster analysis was performed using the relative values of the metabolites under the different conditions as the metabolic levels, and the result is presented as a heat map. The color gradient visualizes the differences between data, and data scaling retains the larger differences while highlighting the smaller ones. The abscissa of the heat map represents the sample information, with green representing the control group (group D) and red representing the experimental group, and the ordinate represents the differential metabolites screened in the experiment.
A, B and C in FIG. 8 show the correlation heat maps of the identified differential metabolites of D vs A, D vs B and D vs C, respectively, in ESI+ mode (each showing the 50 differential metabolites with the largest VIP values). The abscissa represents the metabolites, and the scale of the correlation coefficient is shown at the upper right of the figure. As the linear relationship between two metabolites becomes stronger, the correlation coefficient approaches 1 or -1; red represents a stronger positive correlation and green a stronger negative correlation. The correlation results provide information for narrowing down the large number of metabolites to the differential metabolites with the greatest potential.
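A correlation coefficient that approaches 1 or -1 as the linear relationship strengthens can be computed as follows. This sketch assumes the Pearson correlation coefficient, which the text does not name explicitly; the function `pearson` and the abundance values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long
    lists of metabolite abundances (one value per sample)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length >= 2")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(ss_x * ss_y)


# Illustrative abundances of three hypothetical metabolites in four samples.
met_a = [1.0, 2.0, 3.0, 4.0]
met_b = [2.0, 4.0, 6.0, 8.0]   # rises with met_a: positive correlation
met_c = [8.0, 6.0, 4.0, 2.0]   # falls as met_a rises: negative correlation
r_ab = pearson(met_a, met_b)
r_ac = pearson(met_a, met_c)
```

Computing this coefficient for every pair of metabolites gives the symmetric matrix that a correlation heat map displays.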
A, B and C in FIG. 9 show the overall violin plots of the known differential metabolites of D vs A, D vs B and D vs C, respectively (only the 20 differential metabolites with the largest VIP values are shown). The box in a violin plot represents the range from the first quartile to the third quartile, the black horizontal line in the middle represents the median, the thin black line extending from the box represents the 95% confidence interval, and the outer shape represents the distribution density of the data; ideally, the flatter the box in the violin plot, the more concentrated the data distribution. As shown in FIG. 9, the differential metabolites show different distribution patterns between the groups, which indicates that the control group not taking IM and the experimental groups can be distinguished by these metabolites.
A, B and C in FIG. 10 are the KEGG enrichment bubble plots of the known differential metabolites of D vs A, D vs B and D vs C, respectively, in positive ion mode. The names of the metabolic pathways (Pathway) are listed on the left of the bubble plot; the horizontal axis represents the enrichment factor (Rich factor), calculated by dividing the number of differential metabolites hit in the pathway (the "in set" value of the enrichment result) by the total number of metabolites annotated to that pathway; the size of a bubble represents the number of differential metabolites participating in the pathway, and its color represents the hypergeometric P value of the pathway. The lower the P value and the greater the number of metabolite hits in a pathway, the more significant the pathway. Pathways with an impact value greater than 0.1 were taken as the most affected metabolic pathways. The results show that the effects mainly involve important pathways such as sphingolipid metabolism, P450 metabolism, caffeine metabolism, and vitamin digestion and absorption; the less important pathways were filtered out to obtain the KEGG pathway map of the most probable differential metabolites. As shown in FIG. 11, D-sphingosine, sphingosine-1-phosphate, phytosphingosine, and phosphoethanolamine belong to the sphingolipid metabolic pathway, where red indicates that the metabolite content is significantly up-regulated in the experimental group and green indicates that it is significantly down-regulated.
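The two quantities plotted in the bubble plot can be sketched as follows. The sketch assumes the usual definitions: the rich factor is the number of differential metabolites hit in a pathway divided by the number of metabolites annotated to that pathway, and the hypergeometric P value is the upper-tail probability of observing at least that many hits by chance. The function names and the counts in the example are hypothetical.

```python
from math import comb


def rich_factor(hits, pathway_size):
    """Differential metabolites hit in the pathway divided by all
    metabolites annotated to the pathway (assumed definition)."""
    return hits / pathway_size


def hypergeom_p(hits, pathway_size, n_differential, n_background):
    """Upper-tail hypergeometric probability P(X >= hits), where
    n_differential metabolites are drawn from n_background annotated
    metabolites, of which pathway_size belong to the pathway."""
    max_hits = min(pathway_size, n_differential)
    total = comb(n_background, n_differential)
    tail = sum(
        comb(pathway_size, k) * comb(n_background - pathway_size, n_differential - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total


# Illustrative counts only: 10 background metabolites, a pathway of 4,
# 3 differential metabolites in total, 2 of them in the pathway.
rf = rich_factor(2, 4)
p = hypergeom_p(2, 4, 3, 10)
```

A pathway with many hits relative to its size has a high rich factor and, other things being equal, a low P value, which is what places it toward the right of the bubble plot with a color at the significant end of the scale.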
The above embodiments merely illustrate specific implementations of the present invention; although they are described in specific detail, they should not be construed as limiting the scope of the present invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the inventive concept, all of which fall within the scope of protection of the present invention.