Cell nucleus image segmentation method based on pixel classification and distance regression
1. A cell nucleus image segmentation method based on pixel classification and distance regression, characterized by comprising the following steps:
(1) extracting features from an input image:
inputting an input image into a backbone network to extract image level features with different resolutions;
(2) constructing an upsampling double-branch decoding network:
constructing an upsampling double-branch decoding network, and using it to upsample the image level features with different resolutions from step (1) so as to restore the image resolution, thereby obtaining pixel classification features and distance regression features at different levels;
(3) constructing a global information perception module:
constructing a global information perception module, which processes the pixel classification features and the distance regression features from step (2) and screens them, through an attention mechanism, using the image level features with different resolutions from step (1);
(4) constructing a feature aggregation module:
constructing, on the double-branch decoding network of step (2), a double-branch feature aggregation module based on pixel classification and distance regression, which aggregates the pixel classification features and the distance regression features located at the same feature level, the final pixel classification output result and the final distance regression output result being obtained from the last feature aggregation module;
(5) training the algorithm network:
on a training data set, training the algorithm network under a supervised learning mechanism by minimizing a cross-entropy loss function on the pixel classification output result and a mean square error loss function on the distance regression output result of step (4), thereby obtaining network model parameters;
(6) testing the algorithm network:
on a test data set, using the network model parameters obtained in step (5) to produce the pixel classification output result and the distance regression output result of step (4), and applying marker-controlled watershed post-processing to them to obtain the final cell nucleus image segmentation result.
2. The method of claim 1, wherein the input image in step (1) is an original cell nucleus image.
3. The method of claim 2, wherein the backbone network in step (1) is a ResNet-50 network, and the backbone network parameters are shared.
4. The method of claim 3, wherein, in step (1), five image level features $F_0, F_1, F_2, F_3, F_4$ with different resolutions are extracted from the cell nucleus image by the ResNet-50 network, the numbers of channels of the features being 64, 256, 512, 1024 and 2048, respectively.
5. The method of claim 4, wherein the upsampling double-branch decoding network in step (2) takes the highest-level feature $F_4$ of step (1) as input, restores the image resolution, and obtains pixel classification features and distance regression features at different levels, respectively.
6. The method of claim 5, wherein the global information perception module in step (3) takes the hierarchical features $F_1, F_2, F_3$ obtained in step (1) and the double-branch upsampled features of step (2) as input, and obtains pixel classification global attention features and distance regression global attention features, respectively.
7. The method of claim 6, wherein step (3) further comprises the following steps:
(31) denoting the features extracted by the encoding network, the ResNet-50 network, as high-level features $F_i$, where $i = 1, 2, 3$;
(311) performing global average pooling on the high-level features $F_1, F_2, F_3$ of step (31);
(312) upsampling the high-level features $F_1, F_2, F_3$ of step (31);
(313) applying a Sigmoid function to the result of step (311), expressed as follows:
$\beta_i = S(G(F_i))$,
where $S(\cdot)$ is the Sigmoid function, $G(\cdot)$ denotes global average pooling, and $i = 1, 2, 3$ indexes the high-level features with different resolutions;
(32) taking as low-level features the upsampled features obtained by the upsampling double-branch decoding network in step (2), namely the pixel classification features and the distance regression features at levels $i = 1, 2, 3$;
(321) applying a spatially separable convolution to the low-level features of step (32);
(322) multiplying, element-wise, the features generated in step (321) by the result of step (313) to obtain new features;
(323) adding, element-wise, the result of step (322) and the result of step (312);
for each level $i = 1, 2, 3$, the global information perception module of the pixel classification branch and of the distance regression branch in step (3) thus outputs the element-wise sum of $\mathrm{Upsample}(F_i)$ and the element-wise product of $\mathrm{Spconv}(\cdot)$, applied to the low-level feature of that branch, with $S(G(F_i))$, where $\mathrm{Upsample}(\cdot)$ is an upsampling operation, $S(\cdot)$ is the Sigmoid function, $G(\cdot)$ denotes global average pooling, and $\mathrm{Spconv}(\cdot)$ denotes a spatially separable convolution.
8. The method of claim 7, wherein constructing the dual-branch feature aggregation module based on pixel classification and distance regression in step (4) comprises the following steps:
(41) adding, pixel-wise, the result of step (3) and the result of step (2), the sum serving as the input of step (4);
(42) convolving the pixel classification feature map and the distance regression feature map obtained in step (41) separately, and concatenating the convolved feature maps;
(43) convolving the result of step (42) and adding it, pixel-wise, to the result of step (41);
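in each branch, the feature aggregation module of step (4) thus outputs the pixel-wise sum of that branch's result of step (41) and $\mathrm{Conv}(\mathrm{Cat}(\cdot,\cdot))$ applied to the two convolved feature maps of step (42), where $\mathrm{Cat}(\cdot)$ denotes concatenation along the feature channel dimension and $\mathrm{Conv}(\cdot)$ denotes a $3 \times 3$ convolution.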
9. The method of claim 8, wherein supervision during the network training in step (5) is divided into two parts: the loss for pixel classification is a cross-entropy loss function, and the loss for distance regression is a mean square error loss function on normalized coordinates.
10. The method of claim 9, wherein step (6) tests images using the network model parameters trained in step (5) to generate a pixel classification result image and a distance regression result image; in post-processing, the distance regression result image is negated and used as the marker for the pixel classification result image, and a marker-controlled watershed algorithm then yields the final cell nucleus image segmentation result image.
Background
Cell nucleus image segmentation refers to processing a cell nucleus image with computer vision methods so as to extract the nuclei from a complex background region. The task is a basic prerequisite of the digital pathology workflow and is of great significance for cancer diagnosis, grading and prediction. Existing cell nucleus image segmentation methods fall into two main categories: traditional methods and methods based on deep learning.
However, traditional cell nucleus image segmentation methods rely mainly on manually extracted features such as pixel values and shapes, and therefore depend excessively on hand-selected features. Most existing cell nucleus image segmentation methods merely construct a double-branch sub-network by pixel classification to extract foreground classification features and boundary classification features of the cell nucleus image separately, and apply simple post-processing to the extracted features to generate the final segmentation result. Because slice thickness varies during the preparation of cell nucleus images, nucleus boundaries are often blurred, and in complex adhesion scenes methods based on pixel classification tend to under-segment densely adhered nuclei.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and provides a cell nucleus image segmentation method based on pixel classification and distance regression.
In order to achieve the purpose, the invention adopts the following technical scheme:
A cell nucleus image segmentation method based on pixel classification and distance regression, comprising the following steps:
(1) extracting features from an input image:
inputting an input image into a backbone network to extract image level features with different resolutions;
(2) constructing an upsampling double-branch decoding network:
constructing an upsampling double-branch decoding network, and using it to upsample the image level features with different resolutions from step (1) so as to restore the image resolution, thereby obtaining pixel classification features and distance regression features at different levels;
(3) constructing a global information perception module:
constructing a global information perception module, which processes the pixel classification features and the distance regression features from step (2) and screens them, through an attention mechanism, using the image level features with different resolutions from step (1);
(4) constructing a feature aggregation module:
constructing, on the double-branch decoding network of step (2), a double-branch feature aggregation module based on pixel classification and distance regression, which aggregates the pixel classification features and the distance regression features located at the same feature level, the final pixel classification output result and the final distance regression output result being obtained from the last feature aggregation module;
(5) training the algorithm network:
on a training data set, training the algorithm network under a supervised learning mechanism by minimizing a cross-entropy loss function on the pixel classification output result and a mean square error loss function on the distance regression output result of step (4), thereby obtaining network model parameters;
(6) testing the algorithm network:
on a test data set, using the network model parameters obtained in step (5) to produce the pixel classification output result and the distance regression output result of step (4), and applying marker-controlled watershed post-processing to them to obtain the final cell nucleus image segmentation result.
Preferably, the input image in step (1) is an original cell nucleus image.
Preferably, the backbone network in step (1) is a ResNet-50 network, and the backbone network parameters are shared.
Preferably, in step (1), five image level features $F_0, F_1, F_2, F_3, F_4$ with different resolutions are extracted from the cell nucleus image by the ResNet-50 network, the numbers of channels of the features being 64, 256, 512, 1024 and 2048, respectively.
Preferably, the upsampling double-branch decoding network in step (2) takes the highest-level feature $F_4$ of step (1) as input, restores the image resolution, and obtains pixel classification features and distance regression features at different levels, respectively.
Preferably, the global information perception module in step (3) takes the hierarchical features $F_1, F_2, F_3$ obtained in step (1) and the double-branch upsampled features of step (2) as input, and obtains pixel classification global attention features and distance regression global attention features, respectively.
Preferably, step (3) further comprises the following steps:
(31) denoting the features extracted by the encoding network, the ResNet-50 network, as high-level features $F_i$, where $i = 1, 2, 3$;
(311) performing global average pooling on the high-level features $F_1, F_2, F_3$ of step (31);
(312) upsampling the high-level features $F_1, F_2, F_3$ of step (31);
(313) applying a Sigmoid function to the result of step (311), expressed as follows:
$\beta_i = S(G(F_i))$,
where $S(\cdot)$ is the Sigmoid function, $G(\cdot)$ denotes global average pooling, and $i = 1, 2, 3$ indexes the high-level features with different resolutions;
(32) taking as low-level features the upsampled features obtained by the upsampling double-branch decoding network in step (2), namely the pixel classification features and the distance regression features at levels $i = 1, 2, 3$;
(321) applying a spatially separable convolution to the low-level features of step (32);
(322) multiplying, element-wise, the features generated in step (321) by the result of step (313) to obtain new features;
(323) adding, element-wise, the result of step (322) and the result of step (312);
for each level $i = 1, 2, 3$, the global information perception module of the pixel classification branch and of the distance regression branch in step (3) thus outputs the element-wise sum of $\mathrm{Upsample}(F_i)$ and the element-wise product of $\mathrm{Spconv}(\cdot)$, applied to the low-level feature of that branch, with $S(G(F_i))$, where $\mathrm{Upsample}(\cdot)$ is an upsampling operation, $S(\cdot)$ is the Sigmoid function, $G(\cdot)$ denotes global average pooling, and $\mathrm{Spconv}(\cdot)$ denotes a spatially separable convolution.
Preferably, constructing the dual-branch feature aggregation module based on pixel classification and distance regression in step (4) comprises the following steps:
(41) adding, pixel-wise, the result of step (3) and the result of step (2), the sum serving as the input of step (4);
(42) convolving the pixel classification feature map and the distance regression feature map obtained in step (41) separately, and concatenating the convolved feature maps;
(43) convolving the result of step (42) and adding it, pixel-wise, to the result of step (41);
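in each branch, the feature aggregation module of step (4) thus outputs the pixel-wise sum of that branch's result of step (41) and $\mathrm{Conv}(\mathrm{Cat}(\cdot,\cdot))$ applied to the two convolved feature maps of step (42), where $\mathrm{Cat}(\cdot)$ denotes concatenation along the feature channel dimension and $\mathrm{Conv}(\cdot)$ denotes a $3 \times 3$ convolution.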
Preferably, supervision during the network training in step (5) is divided into two parts: the loss for pixel classification is a cross-entropy loss function, and the loss for distance regression is a mean square error loss function on normalized coordinates.
Preferably, step (6) tests images using the network model parameters trained in step (5) to generate a pixel classification result image and a distance regression result image; in post-processing, the distance regression result image is negated and used as the marker for the pixel classification result image, and a marker-controlled watershed algorithm then yields the final cell nucleus image segmentation result image.
In summary, due to the adoption of the technical scheme, the invention has the beneficial effects that:
1) the method achieves complete and consistent segmentation of cell nucleus images without manually designed feature extraction, and simulation results show that its segmentation results are essentially unaffected by adhesion and overlapping;
2) the invention consists of an encoding network for feature extraction and a double-branch decoding network for generating a pixel classification map and a distance regression map; its core is to obtain a pixel classification map carrying boundary information by pixel classification and a distance map carrying localization information by distance regression, and effectively combining the two segmentation principles improves the segmentation of cell nucleus images;
3) on the features extracted by the encoding network, the invention constructs a global information perception module which uses the high-level features to screen the low-level features through an attention mechanism, so as to better guide the low-level features and thereby enhance the feature contrast between cell nuclei and background;
4) on the features extracted by the decoding network, the invention constructs, on the decoding branches, a feature aggregation module based on pixel classification and distance regression which aggregates the pixel classification maps and distance regression maps of the same feature level, so that the semantic and spatial correlation between pixel classification and distance regression is better exploited, capturing the correlation between the two tasks while preserving their differences.
Drawings
FIG. 1 is a flow chart of the training of the present invention;
FIG. 2 is a flow chart of the test of the present invention;
FIG. 3 is a first (horizontal) version of the overall network framework of the present invention;
FIG. 4 is a second (vertical) version of the overall network framework of the present invention;
FIG. 5 is a schematic diagram of a global information awareness module according to the present invention;
FIG. 6 is a schematic view of a feature aggregation module of the present invention;
FIG. 7 is a schematic diagram of the post-processing technique of the present invention.
Detailed Description
An embodiment of the cell nucleus image segmentation method based on pixel classification and distance regression according to the present invention is further described below with reference to FIGS. 1 to 7. The method is not limited to the description of the following embodiment.
Embodiment:
This embodiment provides a specific implementation of a cell nucleus image segmentation method based on pixel classification and distance regression, as shown in FIGS. 1 to 7, including the following steps:
(1) extracting features from an input image:
inputting an input image into a backbone network to extract image level features with different resolutions;
(2) constructing an upsampling double-branch decoding network:
constructing an upsampling dual-branch decoding network, and using it to upsample the image level features with different resolutions from step (1) so as to restore the image resolution, thereby obtaining pixel classification features and distance regression features at different levels;
(3) constructing a global information perception module:
constructing a global information perception module, which processes the pixel classification features and the distance regression features from step (2) and screens them, through an attention mechanism, using the image level features with different resolutions from step (1);
(4) constructing a feature aggregation module:
constructing, on the dual-branch decoding network of step (2), a dual-branch feature aggregation module based on pixel classification and distance regression, which aggregates the pixel classification features and the distance regression features located at the same feature level, the final pixel classification output result and the final distance regression output result being obtained from the last feature aggregation module;
(5) training the algorithm network:
on the training data set, training the algorithm network under a supervised learning mechanism by minimizing a cross-entropy loss function on the pixel classification output result and a mean square error loss function on the distance regression output result of step (4), thereby obtaining network model parameters;
(6) testing the algorithm network:
on a test data set, using the network model parameters obtained in step (5) to produce the pixel classification branch output result and the distance regression branch output result of step (4), and applying marker-controlled watershed post-processing to them to obtain the final cell nucleus image segmentation result.
Specifically, the input image in step (1) is an original cell nucleus image.
Specifically, the backbone network in step (1) is a ResNet-50 network, and the backbone network parameters are shared.
Specifically, in step (1), five image level features $F_0, F_1, F_2, F_3, F_4$ with different resolutions are extracted from the cell nucleus image by the ResNet-50 network, the numbers of channels of the features being 64, 256, 512, 1024 and 2048, respectively.
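By way of non-limiting illustration, the five feature levels can be tabulated as follows. The channel counts are those stated above; the downsampling strides (2, 4, 8, 16, 32) are the usual ResNet-50 values and are an assumption here, since the text does not state them. The helper name `feature_shapes` is hypothetical, and the input size is assumed to be divisible by 32.

```python
# Channel counts come from the description; the strides are an assumption
# (standard ResNet-50 stage strides), not something the text specifies.
CHANNELS = [64, 256, 512, 1024, 2048]   # F0 .. F4
ASSUMED_STRIDES = [2, 4, 8, 16, 32]     # assumed downsampling factor per level


def feature_shapes(height, width):
    """Return (name, channels, h, w) for F0..F4 for a given input size."""
    shapes = []
    for i, (channels, stride) in enumerate(zip(CHANNELS, ASSUMED_STRIDES)):
        shapes.append((f"F{i}", channels, height // stride, width // stride))
    return shapes


for name, channels, h, w in feature_shapes(256, 256):
    print(name, channels, h, w)
```

Under these assumptions the highest-level feature $F_4$ is the smallest map with the most channels, which is why the decoder of step (2) has to upsample it to restore the image resolution.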
Specifically, the upsampling dual-branch decoding network in step (2) takes the highest-level feature $F_4$ of step (1) as input, restores the image resolution, and obtains pixel classification features and distance regression features at different levels, respectively.
Further, the global information perception module in step (3) takes the hierarchical features $F_1, F_2, F_3$ obtained in step (1) and the dual-branch upsampled features of step (2) as input, and obtains pixel classification global attention features and distance regression global attention features, respectively.
Further, step (3) comprises the following steps:
(31) denoting the features extracted by the encoding network, the ResNet-50 network, as high-level features $F_i$, where $i = 1, 2, 3$;
(311) performing global average pooling on the high-level features $F_1, F_2, F_3$ of step (31);
(312) upsampling the high-level features $F_1, F_2, F_3$ of step (31);
(313) applying a Sigmoid function to the result of step (311), expressed as follows:
$\beta_i = S(G(F_i))$,
where $S(\cdot)$ is the Sigmoid function, $G(\cdot)$ denotes global average pooling, and $i = 1, 2, 3$ indexes the high-level features with different resolutions;
(32) taking as low-level features the upsampled features obtained by the upsampling dual-branch decoding network in step (2), namely the pixel classification features and the distance regression features at levels $i = 1, 2, 3$;
(321) applying a spatially separable convolution to the low-level features of step (32);
(322) multiplying, element-wise, the features generated in step (321) by the result of step (313) to obtain new features;
(323) adding, element-wise, the result of step (322) and the result of step (312);
for each level $i = 1, 2, 3$, the global information perception module of the pixel classification branch and of the distance regression branch in step (3) thus outputs the element-wise sum of $\mathrm{Upsample}(F_i)$ and the element-wise product of $\mathrm{Spconv}(\cdot)$, applied to the low-level feature of that branch, with $S(G(F_i))$, where $\mathrm{Upsample}(\cdot)$ is an upsampling operation, $S(\cdot)$ is the Sigmoid function, $G(\cdot)$ denotes global average pooling, and $\mathrm{Spconv}(\cdot)$ denotes a spatially separable convolution.
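By way of non-limiting illustration, steps (31) to (323) can be sketched for a single branch and a single level in NumPy. Several points are assumptions not found in the text: the low-level and high-level features are taken to have the same number of channels, the spatial size of the low-level feature is taken to be an integer multiple of that of $F_i$, upsampling is nearest-neighbour, and the spatially separable convolution is modelled as a fixed $k \times 1$ kernel followed by a $1 \times k$ kernel shared across channels, with no learned weights. All function names are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def global_avg_pool(f):
    # (C, H, W) -> (C, 1, 1): one scalar per channel
    return f.mean(axis=(1, 2), keepdims=True)


def upsample_nearest(f, factor):
    # Nearest-neighbour upsampling (assumption; the text does not name the method)
    return f.repeat(factor, axis=1).repeat(factor, axis=2)


def spatially_separable_conv(x, k_col, k_row):
    # k x 1 pass along the height axis, then 1 x k pass along the width axis,
    # zero padded so the spatial size is preserved
    conv_h = np.apply_along_axis(lambda v: np.convolve(v, k_col, mode="same"), 1, x)
    return np.apply_along_axis(lambda v: np.convolve(v, k_row, mode="same"), 2, conv_h)


def global_information_perception(low, high, k_col, k_row):
    """low: branch feature from step (2), shape (C, H, W);
    high: encoder feature F_i from step (1), shape (C, h, w)."""
    factor = low.shape[1] // high.shape[1]
    beta = sigmoid(global_avg_pool(high))                 # steps (311) and (313)
    up = upsample_nearest(high, factor)                   # step (312)
    conv = spatially_separable_conv(low, k_col, k_row)    # step (321)
    return conv * beta + up                               # steps (322) and (323)


low = np.ones((2, 4, 4))
high = np.zeros((2, 2, 2))
identity = np.array([0.0, 1.0, 0.0])
out = global_information_perception(low, high, identity, identity)
print(out.shape)
```

The channel weights $\beta_i$ broadcast over the spatial dimensions, so each channel of the convolved low-level feature is rescaled by a single attention value before the upsampled high-level feature is added.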
Further, constructing the dual-branch feature aggregation module based on pixel classification and distance regression in step (4) comprises the following steps:
(41) adding, pixel-wise, the result of step (3) and the result of step (2), the sum serving as the input of step (4);
(42) convolving the pixel classification feature map and the distance regression feature map obtained in step (41) separately, and concatenating the convolved feature maps;
(43) convolving the result of step (42) and adding it, pixel-wise, to the result of step (41);
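in each branch, the feature aggregation module of step (4) thus outputs the pixel-wise sum of that branch's result of step (41) and $\mathrm{Conv}(\mathrm{Cat}(\cdot,\cdot))$ applied to the two convolved feature maps of step (42), where $\mathrm{Cat}(\cdot)$ denotes concatenation along the feature channel dimension and $\mathrm{Conv}(\cdot)$ denotes a $3 \times 3$ convolution.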
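By way of non-limiting illustration, steps (41) to (43) can be written out as the following sketch. The symbols are assumptions introduced only for this illustration: $A_i^{p}, A_i^{d}$ stand for the outputs of step (3) in the pixel classification and distance regression branches, $U_i^{p}, U_i^{d}$ for the corresponding decoder features of step (2), $\oplus$ for pixel-wise addition, and each $\mathrm{Conv}$ is assumed to have its own weights.

```latex
\begin{aligned}
X_i^{p} &= A_i^{p} \oplus U_i^{p}, \qquad X_i^{d} = A_i^{d} \oplus U_i^{d}
  && \text{step (41)} \\
C_i &= \mathrm{Cat}\bigl(\mathrm{Conv}(X_i^{p}),\ \mathrm{Conv}(X_i^{d})\bigr)
  && \text{step (42)} \\
Y_i^{p} &= \mathrm{Conv}(C_i) \oplus X_i^{p}, \qquad Y_i^{d} = \mathrm{Conv}(C_i) \oplus X_i^{d}
  && \text{step (43)}
\end{aligned}
```

In this reading the concatenated term $C_i$ carries information shared by the two tasks, while the residual additions of $X_i^{p}$ and $X_i^{d}$ keep each branch's own features intact.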
Further, supervision during the network training in step (5) is divided into two parts: the loss for pixel classification is a cross-entropy loss function, and the loss for distance regression is a mean square error loss function on normalized coordinates.
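By way of non-limiting illustration, the two supervision terms can be sketched in plain Python over flattened pixel lists. The sketch assumes binary (foreground/background) classification and combines the two losses as a weighted sum; the text states only that both losses are minimized, so the summation and the `weight` parameter are hypothetical.

```python
import math


def cross_entropy(probs, labels, eps=1e-12):
    """Mean binary cross-entropy; probs are predicted foreground probabilities."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)   # clamp to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)


def mean_squared_error(pred, target):
    """Mean square error between predicted and target normalized distances."""
    return sum((a - b) ** 2 for a, b in zip(pred, target)) / len(pred)


def total_loss(probs, labels, dist_pred, dist_target, weight=1.0):
    # weight is a hypothetical balancing factor, not taken from the text
    return cross_entropy(probs, labels) + weight * mean_squared_error(dist_pred, dist_target)


print(total_loss([0.9, 0.2], [1, 0], [0.8, 0.1], [1.0, 0.0]))
```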
Further, step (6) tests images using the network model parameters trained in step (5) to generate a pixel classification result image and a distance regression result image; in post-processing, the distance regression result image is negated and used as the marker for the pixel classification result image, and a marker-controlled watershed algorithm then yields the final cell nucleus image segmentation result image.
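By way of non-limiting illustration, the post-processing can be sketched as a small marker-controlled watershed in plain Python. This is one possible reading, not the exact procedure of the invention: seed markers are obtained by thresholding the predicted distance map (the threshold `marker_thresh` is hypothetical), and the flooding follows the negated distance map inside the foreground given by the pixel classification result.

```python
import heapq

NEIGHBOURS = ((-1, 0), (1, 0), (0, -1), (0, 1))


def label_components(binary):
    """4-connected component labelling; 0 means unlabelled/background."""
    h, w = len(binary), len(binary[0])
    labels = [[0] * w for _ in range(h)]
    current = 0
    for r in range(h):
        for c in range(w):
            if binary[r][c] and labels[r][c] == 0:
                current += 1
                labels[r][c] = current
                stack = [(r, c)]
                while stack:
                    y, x = stack.pop()
                    for dy, dx in NEIGHBOURS:
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and labels[ny][nx] == 0:
                            labels[ny][nx] = current
                            stack.append((ny, nx))
    return labels


def marker_watershed(foreground, distance, marker_thresh=0.5):
    """Flood from seed markers over the negated distance map, inside the foreground."""
    h, w = len(foreground), len(foreground[0])
    seeds = [[1 if foreground[r][c] and distance[r][c] >= marker_thresh else 0
              for c in range(w)] for r in range(h)]
    labels = label_components(seeds)
    # Priority is the negated distance, so nucleus centres are flooded first.
    heap = [(-distance[r][c], r, c) for r in range(h) for c in range(w) if labels[r][c]]
    heapq.heapify(heap)
    while heap:
        _, y, x = heapq.heappop(heap)
        for dy, dx in NEIGHBOURS:
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and foreground[ny][nx] and labels[ny][nx] == 0:
                labels[ny][nx] = labels[y][x]
                heapq.heappush(heap, (-distance[ny][nx], ny, nx))
    return labels


# Two touching nuclei on one row: the distance map has two peaks.
foreground = [[1, 1, 1, 1, 1, 1, 1]]
distance = [[0.2, 0.9, 0.4, 0.1, 0.3, 0.9, 0.2]]
print(marker_watershed(foreground, distance))
```

In this toy example the foreground mask alone would merge the two nuclei into one region; the two distance peaks provide separate markers, so the flooding assigns each pixel to one of two instance labels.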
With the above technical scheme:
first, an encoding network is constructed: the input image is fed into the backbone network, and image level features with different resolutions are extracted;
then, a two-branch upsampling decoding network is constructed, and the features of the encoding network are upsampled to restore the image resolution, yielding pixel classification features and distance regression features at different levels;
then, a global information perception module is constructed, and the pixel classification features and the distance regression features are screened, through an attention mechanism, using the features of the different levels of the encoding network;
then, a feature aggregation module is constructed, which aggregates the pixel classification features and the distance regression features located at the same feature level in the decoding network, the pixel classification result and the distance regression result being obtained from the last feature aggregation module;
then, under a supervised learning mechanism, the pixel classification branch is supervised by minimizing a cross-entropy loss function and the distance regression branch by minimizing a mean square error loss function;
finally, the algorithm network is trained to obtain the model parameters, the trained model is tested to obtain a pixel classification map and a distance regression map, and a marker-controlled watershed post-processing algorithm yields the final segmentation result.
The invention has the following advantages:
The invention achieves complete and consistent segmentation of cell nucleus images without manually designed feature extraction, and simulation results show that its segmentation results are essentially unaffected by adhesion and overlapping.
The invention consists of an encoding network for feature extraction and a dual-branch decoding network for generating a pixel classification map and a distance regression map. Its core is to obtain a pixel classification map carrying boundary information by pixel classification and a distance map carrying localization information by distance regression; effectively combining the two segmentation principles improves the segmentation of cell nucleus images.
On the features extracted by the encoding network, the invention constructs a global information perception module which uses the high-level features to screen the low-level features through an attention mechanism, so as to better guide the low-level features and thereby enhance the feature contrast between cell nuclei and background.
On the features extracted by the decoding network, the invention constructs, on the decoding branches, a feature aggregation module based on pixel classification and distance regression which aggregates the pixel classification maps and distance regression maps of the same feature level, so that the semantic and spatial correlation between pixel classification and distance regression is better exploited, capturing the correlation between the two tasks while preserving their differences.
The foregoing is a more detailed description of the invention in connection with specific preferred embodiments and it is not intended that the invention be limited to these specific details. For those skilled in the art to which the invention pertains, several simple deductions or substitutions can be made without departing from the spirit of the invention, and all shall be considered as belonging to the protection scope of the invention.