Multi-exposure fusion image quality evaluation method
1. A multi-exposure fusion image quality evaluation method, characterized by comprising the following steps:
step 1: selecting one multi-exposure fusion image as the multi-exposure fusion image to be evaluated and denoting it as S_mefi; simultaneously denoting the three original images with different exposure degrees corresponding to S_mefi, namely a normal-exposure image, an over-exposure image and an under-exposure image, correspondingly as S_normal, S_over-ex and S_under-ex; wherein S_mefi, S_normal, S_over-ex and S_under-ex each have a width of W and a height of H;
step 2: calculating the gradient maps of S_mefi, S_normal, S_over-ex and S_under-ex, denoted correspondingly as G_mefi, G_normal, G_over-ex and G_under-ex; wherein G_mefi, G_normal, G_over-ex and G_under-ex each have a width of W and a height of H;
step 3: extracting a maximum-value gradient map from G_normal, G_over-ex and G_under-ex, denoted G_max, wherein the pixel value of the pixel point at coordinate position (x, y) in G_max is denoted G_max(x, y), G_max(x, y) = max(G_normal(x, y), G_over-ex(x, y), G_under-ex(x, y)); then calculating the SSIM value between each pixel point in G_max and the corresponding pixel point in G_mefi; then calculating the average of the W×H SSIM values and taking this average as the gradient feature of S_mefi; wherein G_max has a width of W and a height of H, 1 ≤ x ≤ W, 1 ≤ y ≤ H, max() is the maximum-value function, G_normal(x, y) denotes the pixel value of the pixel point at coordinate position (x, y) in G_normal, G_over-ex(x, y) denotes the pixel value of the pixel point at coordinate position (x, y) in G_over-ex, and G_under-ex(x, y) denotes the pixel value of the pixel point at coordinate position (x, y) in G_under-ex;
step 4: forming, from the pixel values of the pixel points at the same coordinate position in G_normal, G_over-ex and G_under-ex, a gradient value matrix of dimension 3×2 jointly corresponding to those pixel points; the gradient value matrix of dimension 3×2 jointly corresponding to the pixel points at coordinate position (x, y) in G_normal, G_over-ex and G_under-ex is denoted J_(x,y), J_(x,y) = [G^h_normal(x,y) G^v_normal(x,y); G^h_over-ex(x,y) G^v_over-ex(x,y); G^h_under-ex(x,y) G^v_under-ex(x,y)] (rows separated by semicolons); likewise, forming, from the pixel value of each pixel point in G_mefi, a gradient value matrix of dimension 1×2 corresponding to that pixel point; the gradient value matrix of dimension 1×2 corresponding to the pixel point at coordinate position (x, y) in G_mefi is denoted J'_(x,y), J'_(x,y) = [G^h_mefi(x,y) G^v_mefi(x,y)]; then calculating the structure tensor of the pixel points at the same coordinate position in G_normal, G_over-ex and G_under-ex; the structure tensor of the pixel points at coordinate position (x, y) in G_normal, G_over-ex and G_under-ex is denoted Z_(x,y), Z_(x,y) = (J_(x,y))^T J_(x,y); likewise, calculating the structure tensor of each pixel point in G_mefi; the structure tensor of the pixel point at coordinate position (x, y) in G_mefi is denoted Z'_(x,y), Z'_(x,y) = (J'_(x,y))^T J'_(x,y); then calculating the cosine distance between the structure tensor of the pixel points at the same coordinate position in G_normal, G_over-ex and G_under-ex and the structure tensor of the corresponding pixel point in G_mefi, the cosine distance between Z_(x,y) and Z'_(x,y) being denoted d_(x,y); then taking the average of all the cosine distances as the structural feature of S_mefi; wherein h denotes the horizontal direction, v denotes the vertical direction, G^h_normal(x,y) and G^v_normal(x,y) denote the horizontal and vertical components of G_normal(x,y), G^h_over-ex(x,y) and G^v_over-ex(x,y) denote the horizontal and vertical components of G_over-ex(x,y), G^h_under-ex(x,y) and G^v_under-ex(x,y) denote the horizontal and vertical components of G_under-ex(x,y), Z_(x,y) has a dimension of 2×2, (J_(x,y))^T denotes the transpose of J_(x,y), G^h_mefi(x,y) and G^v_mefi(x,y) denote the horizontal and vertical components of G_mefi(x,y), Z'_(x,y) has a dimension of 2×2, and (J'_(x,y))^T denotes the transpose of J'_(x,y);
step 5: calculating the exposure, contrast and saturation of each pixel point in each of S_normal, S_over-ex and S_under-ex; the exposure, contrast and saturation of the pixel point at coordinate position (x, y) in S_normal are denoted correspondingly as E_normal(x,y), C_normal(x,y) and Sa_normal(x,y); the exposure, contrast and saturation of the pixel point at coordinate position (x, y) in S_over-ex are denoted correspondingly as E_over-ex(x,y), C_over-ex(x,y) and Sa_over-ex(x,y); the exposure, contrast and saturation of the pixel point at coordinate position (x, y) in S_under-ex are denoted correspondingly as E_under-ex(x,y), C_under-ex(x,y) and Sa_under-ex(x,y); then calculating the weight of each pixel point in each of S_normal, S_over-ex and S_under-ex; the weight of the pixel point at coordinate position (x, y) in S_normal is denoted ω_normal(x,y), ω_normal(x,y) = E_normal(x,y) × C_normal(x,y) × Sa_normal(x,y); the weight of the pixel point at coordinate position (x, y) in S_over-ex is denoted ω_over-ex(x,y), ω_over-ex(x,y) = E_over-ex(x,y) × C_over-ex(x,y) × Sa_over-ex(x,y); the weight of the pixel point at coordinate position (x, y) in S_under-ex is denoted ω_under-ex(x,y), ω_under-ex(x,y) = E_under-ex(x,y) × C_under-ex(x,y) × Sa_under-ex(x,y); then normalizing the weights of the pixel points in S_normal, S_over-ex and S_under-ex to obtain the weight maps corresponding to S_normal, S_over-ex and S_under-ex, denoted correspondingly as weight_normal, weight_over-ex and weight_under-ex; then performing pyramid fusion on S_normal, S_over-ex, S_under-ex and weight_normal, weight_over-ex, weight_under-ex, namely decomposing S_normal, S_over-ex and S_under-ex into Laplacian pyramids and decomposing weight_normal, weight_over-ex and weight_under-ex into Gaussian pyramids, and fusing them to obtain a pseudo-reference fusion image; then calculating the SSIM value between each pixel point in S_mefi and the corresponding pixel point in the pseudo-reference fusion image; finally calculating the average of the W×H SSIM values and taking this average as the global perceptual feature of S_mefi;
step 6: taking the gradient feature of S_mefi, the structural feature of S_mefi and the global perceptual feature of S_mefi together as the feature vector of S_mefi;
step 7: taking the feature vector of S_mefi as input and calculating the objective quality evaluation predicted value of S_mefi in combination with the support vector regression technique; wherein the larger the objective quality evaluation predicted value of S_mefi is, the better the quality of S_mefi; conversely, the worse the quality of S_mefi.
2. The multi-exposure fusion image quality evaluation method according to claim 1, wherein in the step 2, the gradient operator adopted in calculating the gradient maps of S_mefi, S_normal, S_over-ex and S_under-ex is one of the Prewitt operator, the Roberts operator, the Scharr operator and the Sobel operator.
3. The multi-exposure fusion image quality evaluation method according to claim 1 or 2, wherein in the step 4, d_(x,y) = (vec(Z_(x,y)) · vec(Z'_(x,y))) / (||vec(Z_(x,y))|| × ||vec(Z'_(x,y))||); wherein vec(Z_(x,y)) denotes Z_(x,y) converted into a vector of length 4, vec(Z'_(x,y)) denotes Z'_(x,y) converted into a vector of length 4, the two are obtained in the same manner, and the symbol "|| ||" denotes the modulus of a vector.
4. The multi-exposure fusion image quality evaluation method according to claim 3, wherein in the step 5, E_normal(x,y) = e^(−(Ȳ_normal(x,y) − μ)² / (2σ²)), C_normal(x,y) = |L*Y_normal(x,y)|, Sa_normal(x,y) = |U_normal(x,y)| + |V_normal(x,y)| + 1, E_over-ex(x,y) = e^(−(Ȳ_over-ex(x,y) − μ)² / (2σ²)), C_over-ex(x,y) = |L*Y_over-ex(x,y)|, Sa_over-ex(x,y) = |U_over-ex(x,y)| + |V_over-ex(x,y)| + 1, E_under-ex(x,y) = e^(−(Ȳ_under-ex(x,y) − μ)² / (2σ²)), C_under-ex(x,y) = |L*Y_under-ex(x,y)|, Sa_under-ex(x,y) = |U_under-ex(x,y)| + |V_under-ex(x,y)| + 1; wherein e denotes the natural base, Ȳ_normal(x,y) denotes the normalized value of the pixel point at coordinate position (x, y) in the Y channel of S_normal, Ȳ_normal(x,y) = Y_normal(x,y)/255, Y_normal(x,y) denotes the pixel value of the pixel point at coordinate position (x, y) in the Y channel of S_normal, μ and σ are both constants, μ = 0.5, σ = 0.2, the symbol "| |" is the absolute value symbol, L denotes the Laplacian operator, the symbol "*" is the convolution operation symbol, U_normal(x,y) and V_normal(x,y) denote the pixel values of the pixel point at coordinate position (x, y) in the U channel and V channel of S_normal, Ȳ_over-ex(x,y) denotes the normalized value of the pixel point at coordinate position (x, y) in the Y channel of S_over-ex, Ȳ_over-ex(x,y) = Y_over-ex(x,y)/255, Y_over-ex(x,y) denotes the pixel value of the pixel point at coordinate position (x, y) in the Y channel of S_over-ex, U_over-ex(x,y) and V_over-ex(x,y) denote the pixel values of the pixel point at coordinate position (x, y) in the U channel and V channel of S_over-ex, Ȳ_under-ex(x,y) denotes the normalized value of the pixel point at coordinate position (x, y) in the Y channel of S_under-ex, Ȳ_under-ex(x,y) = Y_under-ex(x,y)/255, Y_under-ex(x,y) denotes the pixel value of the pixel point at coordinate position (x, y) in the Y channel of S_under-ex, and U_under-ex(x,y) and V_under-ex(x,y) denote the pixel values of the pixel point at coordinate position (x, y) in the U channel and V channel of S_under-ex.
5. The multi-exposure fusion image quality evaluation method according to claim 4, wherein in the step 5, weight_normal, weight_over-ex and weight_under-ex are obtained as follows: the pixel value of the pixel point at coordinate position (x, y) in weight_normal is denoted weight_normal(x,y), weight_normal(x,y) = ω_normal(x,y) / (ω_normal(x,y) + ω_over-ex(x,y) + ω_under-ex(x,y)), namely the weight obtained after normalizing ω_normal(x,y); the pixel value of the pixel point at coordinate position (x, y) in weight_over-ex is denoted weight_over-ex(x,y), weight_over-ex(x,y) = ω_over-ex(x,y) / (ω_normal(x,y) + ω_over-ex(x,y) + ω_under-ex(x,y)), namely the weight obtained after normalizing ω_over-ex(x,y); the pixel value of the pixel point at coordinate position (x, y) in weight_under-ex is denoted weight_under-ex(x,y), weight_under-ex(x,y) = ω_under-ex(x,y) / (ω_normal(x,y) + ω_over-ex(x,y) + ω_under-ex(x,y)), namely the weight obtained after normalizing ω_under-ex(x,y).
Background
Dynamic range refers to the ratio of the maximum to the minimum of the light intensity in a scene. In a scene with a high dynamic range, however, existing display devices can only handle a very limited dynamic range, so a single digital photo cannot show all the detail information of the natural scene; specifically, the detail information in the bright and dark areas of the digital photo is lost. During shooting with an ordinary digital camera, overexposure and underexposure occur, which leads to a low dynamic range. In special shooting environments, such as scenes including the sky, the particularity of the lighting gives rise to both very bright and very dark regions, so that the detail information of the overly bright and overly dark areas of the digital photo is lost. Therefore, a single digital photo usually loses part of the detail information of the natural scene, sometimes even key information, and thus cannot meet people's demand for high-quality pictures.
In recent years, two technical approaches have emerged to solve the problem that a single digital photo cannot show all the detail information of a natural scene: High Dynamic Range (HDR) technology and Multi-Exposure image Fusion (MEF) technology.
High dynamic range technology first expands the dynamic range of an image through image sequences with different exposure degrees of the same scene, and then displays the result in a single image through a tone mapping method. Because it requires the two steps of high dynamic range reconstruction and tone mapping, it suffers from a large amount of computation and long processing time.
Multi-exposure image fusion technology fuses the multi-exposure image sequence directly, which greatly simplifies the image generation process. Generally, a multi-exposure image fusion technique first takes several low dynamic range images at different exposures with a digital camera, and then performs image fusion in a transform domain or a spatial domain. In recent years, with the continuous development of deep learning and neural network research, many deep-learning-based multi-exposure image fusion methods have also appeared. Multi-exposure image fusion technology is now widely applied in various electronic display devices, and has been shown to effectively enhance the display effect of images on common display devices without requiring expensive high-dynamic-range display devices.
At present, many different multi-exposure image fusion methods have been proposed, but research on multi-exposure fusion image quality evaluation is still lacking. To select the multi-exposure image fusion method with the best performance, quality evaluation of the multi-exposure fusion image is therefore important. In recent decades, researchers in the field of image evaluation have developed objective quality evaluation models for multi-exposure fusion images. Some considered that the quality of the multi-exposure fusion image is related to the degree of information retention, and proposed evaluating quality by calculating the mutual information between the reference images and the multi-exposure fusion image; this method, however, does not target specific features of the multi-exposure fusion image, but only considers the information correlation of the image as a whole. Later, it was found that the edge areas of the multi-exposure fusion image have a great influence on the human visual system, and quality evaluation methods based on image edge information were proposed. For example, the Sobel edge operator is used to extract the edge information of the input images, the degree to which the intensity and direction of the edge information are preserved between each reference image and the multi-exposure fusion image is calculated, and the scores are then combined across the source images to obtain a final quality score; alternatively, the image is decomposed by scale using the wavelet transform, and the edge preservation of the multi-exposure fusion image at each scale is calculated. In addition, an evaluation method has been developed that calculates the degree to which the local saliency information of the reference images is expressed in the multi-exposure fusion image. For all of these multi-exposure fusion image quality evaluation methods, the correlation between objective evaluation results and subjective perception still needs to be improved.
Disclosure of Invention
The invention aims to provide a multi-exposure fusion image quality evaluation method which can effectively improve the correlation between objective evaluation results and subjective perception.
The technical scheme adopted by the invention to solve the above technical problem is as follows: a multi-exposure fusion image quality evaluation method, characterized by comprising the following steps:
step 1: selecting one multi-exposure fusion image as the multi-exposure fusion image to be evaluated and denoting it as S_mefi; simultaneously denoting the three original images with different exposure degrees corresponding to S_mefi, namely a normal-exposure image, an over-exposure image and an under-exposure image, correspondingly as S_normal, S_over-ex and S_under-ex; wherein S_mefi, S_normal, S_over-ex and S_under-ex each have a width of W and a height of H;
step 2: calculating the gradient maps of S_mefi, S_normal, S_over-ex and S_under-ex, denoted correspondingly as G_mefi, G_normal, G_over-ex and G_under-ex; wherein G_mefi, G_normal, G_over-ex and G_under-ex each have a width of W and a height of H;
step 3: extracting a maximum-value gradient map from G_normal, G_over-ex and G_under-ex, denoted G_max, wherein the pixel value of the pixel point at coordinate position (x, y) in G_max is denoted G_max(x, y), G_max(x, y) = max(G_normal(x, y), G_over-ex(x, y), G_under-ex(x, y)); then calculating the SSIM value between each pixel point in G_max and the corresponding pixel point in G_mefi; then calculating the average of the W×H SSIM values and taking this average as the gradient feature of S_mefi; wherein G_max has a width of W and a height of H, 1 ≤ x ≤ W, 1 ≤ y ≤ H, max() is the maximum-value function, G_normal(x, y) denotes the pixel value of the pixel point at coordinate position (x, y) in G_normal, G_over-ex(x, y) denotes the pixel value of the pixel point at coordinate position (x, y) in G_over-ex, and G_under-ex(x, y) denotes the pixel value of the pixel point at coordinate position (x, y) in G_under-ex;
step 4: forming, from the pixel values of the pixel points at the same coordinate position in G_normal, G_over-ex and G_under-ex, a gradient value matrix of dimension 3×2 jointly corresponding to those pixel points; the gradient value matrix of dimension 3×2 jointly corresponding to the pixel points at coordinate position (x, y) in G_normal, G_over-ex and G_under-ex is denoted J_(x,y), J_(x,y) = [G^h_normal(x,y) G^v_normal(x,y); G^h_over-ex(x,y) G^v_over-ex(x,y); G^h_under-ex(x,y) G^v_under-ex(x,y)] (rows separated by semicolons); likewise, forming, from the pixel value of each pixel point in G_mefi, a gradient value matrix of dimension 1×2 corresponding to that pixel point; the gradient value matrix of dimension 1×2 corresponding to the pixel point at coordinate position (x, y) in G_mefi is denoted J'_(x,y), J'_(x,y) = [G^h_mefi(x,y) G^v_mefi(x,y)]; then calculating the structure tensor of the pixel points at the same coordinate position in G_normal, G_over-ex and G_under-ex; the structure tensor of the pixel points at coordinate position (x, y) in G_normal, G_over-ex and G_under-ex is denoted Z_(x,y), Z_(x,y) = (J_(x,y))^T J_(x,y); likewise, calculating the structure tensor of each pixel point in G_mefi; the structure tensor of the pixel point at coordinate position (x, y) in G_mefi is denoted Z'_(x,y), Z'_(x,y) = (J'_(x,y))^T J'_(x,y); then calculating the cosine distance between the structure tensor of the pixel points at the same coordinate position in G_normal, G_over-ex and G_under-ex and the structure tensor of the corresponding pixel point in G_mefi, the cosine distance between Z_(x,y) and Z'_(x,y) being denoted d_(x,y); then taking the average of all the cosine distances as the structural feature of S_mefi; wherein h denotes the horizontal direction, v denotes the vertical direction, G^h_normal(x,y) and G^v_normal(x,y) denote the horizontal and vertical components of G_normal(x,y), G^h_over-ex(x,y) and G^v_over-ex(x,y) denote the horizontal and vertical components of G_over-ex(x,y), G^h_under-ex(x,y) and G^v_under-ex(x,y) denote the horizontal and vertical components of G_under-ex(x,y), Z_(x,y) has a dimension of 2×2, (J_(x,y))^T denotes the transpose of J_(x,y), G^h_mefi(x,y) and G^v_mefi(x,y) denote the horizontal and vertical components of G_mefi(x,y), Z'_(x,y) has a dimension of 2×2, and (J'_(x,y))^T denotes the transpose of J'_(x,y);
step 5: calculating the exposure, contrast and saturation of each pixel point in each of S_normal, S_over-ex and S_under-ex; the exposure, contrast and saturation of the pixel point at coordinate position (x, y) in S_normal are denoted correspondingly as E_normal(x,y), C_normal(x,y) and Sa_normal(x,y); the exposure, contrast and saturation of the pixel point at coordinate position (x, y) in S_over-ex are denoted correspondingly as E_over-ex(x,y), C_over-ex(x,y) and Sa_over-ex(x,y); the exposure, contrast and saturation of the pixel point at coordinate position (x, y) in S_under-ex are denoted correspondingly as E_under-ex(x,y), C_under-ex(x,y) and Sa_under-ex(x,y); then calculating the weight of each pixel point in each of S_normal, S_over-ex and S_under-ex; the weight of the pixel point at coordinate position (x, y) in S_normal is denoted ω_normal(x,y), ω_normal(x,y) = E_normal(x,y) × C_normal(x,y) × Sa_normal(x,y); the weight of the pixel point at coordinate position (x, y) in S_over-ex is denoted ω_over-ex(x,y), ω_over-ex(x,y) = E_over-ex(x,y) × C_over-ex(x,y) × Sa_over-ex(x,y); the weight of the pixel point at coordinate position (x, y) in S_under-ex is denoted ω_under-ex(x,y), ω_under-ex(x,y) = E_under-ex(x,y) × C_under-ex(x,y) × Sa_under-ex(x,y); then normalizing the weights of the pixel points in S_normal, S_over-ex and S_under-ex to obtain the weight maps corresponding to S_normal, S_over-ex and S_under-ex, denoted correspondingly as weight_normal, weight_over-ex and weight_under-ex; then performing pyramid fusion on S_normal, S_over-ex, S_under-ex and weight_normal, weight_over-ex, weight_under-ex, namely decomposing S_normal, S_over-ex and S_under-ex into Laplacian pyramids and decomposing weight_normal, weight_over-ex and weight_under-ex into Gaussian pyramids, and fusing them to obtain a pseudo-reference fusion image; then calculating the SSIM value between each pixel point in S_mefi and the corresponding pixel point in the pseudo-reference fusion image; finally calculating the average of the W×H SSIM values and taking this average as the global perceptual feature of S_mefi;
step 6: taking the gradient feature of S_mefi, the structural feature of S_mefi and the global perceptual feature of S_mefi together as the feature vector of S_mefi;
step 7: taking the feature vector of S_mefi as input and calculating the objective quality evaluation predicted value of S_mefi in combination with the support vector regression technique; wherein the larger the objective quality evaluation predicted value of S_mefi is, the better the quality of S_mefi; conversely, the worse the quality of S_mefi.
In the step 2, the gradient operator adopted in calculating the gradient maps of S_mefi, S_normal, S_over-ex and S_under-ex is one of the Prewitt operator, the Roberts operator, the Scharr operator and the Sobel operator.
In the step 4, d_(x,y) = (vec(Z_(x,y)) · vec(Z'_(x,y))) / (||vec(Z_(x,y))|| × ||vec(Z'_(x,y))||); wherein vec(Z_(x,y)) denotes Z_(x,y) converted into a vector of length 4, vec(Z'_(x,y)) denotes Z'_(x,y) converted into a vector of length 4, the two are obtained in the same manner, and the symbol "|| ||" denotes the modulus of a vector.
In the step 5, E_normal(x,y) = e^(−(Ȳ_normal(x,y) − μ)² / (2σ²)), C_normal(x,y) = |L*Y_normal(x,y)|, Sa_normal(x,y) = |U_normal(x,y)| + |V_normal(x,y)| + 1, E_over-ex(x,y) = e^(−(Ȳ_over-ex(x,y) − μ)² / (2σ²)), C_over-ex(x,y) = |L*Y_over-ex(x,y)|, Sa_over-ex(x,y) = |U_over-ex(x,y)| + |V_over-ex(x,y)| + 1, E_under-ex(x,y) = e^(−(Ȳ_under-ex(x,y) − μ)² / (2σ²)), C_under-ex(x,y) = |L*Y_under-ex(x,y)|, Sa_under-ex(x,y) = |U_under-ex(x,y)| + |V_under-ex(x,y)| + 1; wherein e denotes the natural base, Ȳ_normal(x,y) denotes the normalized value of the pixel point at coordinate position (x, y) in the Y channel of S_normal, Ȳ_normal(x,y) = Y_normal(x,y)/255, Y_normal(x,y) denotes the pixel value of the pixel point at coordinate position (x, y) in the Y channel of S_normal, μ and σ are both constants, μ = 0.5, σ = 0.2, the symbol "| |" is the absolute value symbol, L denotes the Laplacian operator, the symbol "*" is the convolution operation symbol, U_normal(x,y) and V_normal(x,y) denote the pixel values of the pixel point at coordinate position (x, y) in the U channel and V channel of S_normal, Ȳ_over-ex(x,y) denotes the normalized value of the pixel point at coordinate position (x, y) in the Y channel of S_over-ex, Ȳ_over-ex(x,y) = Y_over-ex(x,y)/255, Y_over-ex(x,y) denotes the pixel value of the pixel point at coordinate position (x, y) in the Y channel of S_over-ex, U_over-ex(x,y) and V_over-ex(x,y) denote the pixel values of the pixel point at coordinate position (x, y) in the U channel and V channel of S_over-ex, Ȳ_under-ex(x,y) denotes the normalized value of the pixel point at coordinate position (x, y) in the Y channel of S_under-ex, Ȳ_under-ex(x,y) = Y_under-ex(x,y)/255, Y_under-ex(x,y) denotes the pixel value of the pixel point at coordinate position (x, y) in the Y channel of S_under-ex, and U_under-ex(x,y) and V_under-ex(x,y) denote the pixel values of the pixel point at coordinate position (x, y) in the U channel and V channel of S_under-ex.
In the step 5, weight_normal, weight_over-ex and weight_under-ex are obtained as follows: the pixel value of the pixel point at coordinate position (x, y) in weight_normal is denoted weight_normal(x,y), weight_normal(x,y) = ω_normal(x,y) / (ω_normal(x,y) + ω_over-ex(x,y) + ω_under-ex(x,y)), namely the weight obtained after normalizing ω_normal(x,y); the pixel value of the pixel point at coordinate position (x, y) in weight_over-ex is denoted weight_over-ex(x,y), weight_over-ex(x,y) = ω_over-ex(x,y) / (ω_normal(x,y) + ω_over-ex(x,y) + ω_under-ex(x,y)), namely the weight obtained after normalizing ω_over-ex(x,y); the pixel value of the pixel point at coordinate position (x, y) in weight_under-ex is denoted weight_under-ex(x,y), weight_under-ex(x,y) = ω_under-ex(x,y) / (ω_normal(x,y) + ω_over-ex(x,y) + ω_under-ex(x,y)), namely the weight obtained after normalizing ω_under-ex(x,y).
Compared with the prior art, the invention has the following advantages:
1) The method first considers that the gradient value of a pixel point reflects the degree of change of its pixel value: pixel points at edge positions of an image usually have larger gradient values, while in areas with fewer details and smoother content the gradient values are smaller. In general, the visibility of edge pixel points in an image is closely related to the gradient amplitude, and parts with larger gradient amplitude are generally more visible and clearer. Considering that the three original images with different exposure degrees, namely the normal-exposure image, the over-exposure image and the under-exposure image, are captures of the real natural scene, the maximum gradient value of the corresponding pixel points is used as the optimal gradient value under the real natural scene condition. This reflects the gradient characteristics of the multi-exposure fusion image well and effectively improves the correlation between objective evaluation results and subjective perception.
2) The method uses a Jacobian matrix to combine the structural features of the images with different exposure degrees, and expresses the structural feature of the multi-exposure fusion image through the constructed structure tensor. Considering that luminance and chrominance changes are particularly important for image quality, the RGB image is converted into the YUV color space and a weight map is constructed from the three aspects of exposure, contrast and saturation; the weight map is used to gather the various information in the multi-exposure images and thereby obtain the global perceptual feature of the multi-exposure fusion image.
Drawings
FIG. 1 is a block diagram of an overall implementation of the method of the present invention;
FIG. 2 is a schematic diagram of a pyramid fusion process in the method of the present invention;
FIG. 3a is an overexposed image;
FIG. 3b is a normal-exposure image;
FIG. 3c is an underexposed image;
FIG. 3d is the multi-exposure fusion image obtained from FIGS. 3a, 3b and 3c;
FIG. 3e is the gradient map of FIG. 3a;
FIG. 3f is the gradient map of FIG. 3b;
FIG. 3g is the gradient map of FIG. 3c;
FIG. 3h is the gradient map of FIG. 3d;
FIG. 3i is the maximum-value gradient map extracted from FIGS. 3e, 3f and 3g;
FIG. 4a is a pseudo-reference fusion image;
FIG. 4b is a multi-exposure fusion image to be evaluated;
FIG. 4c is the SSIM map between FIG. 4b and FIG. 4a.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings and embodiments.
The invention provides a multi-exposure fusion image quality evaluation method, the overall implementation block diagram of which is shown in FIG. 1, and the method comprises the following steps:
step 1: selecting one multi-exposure fusion image as a multi-exposure fusion image to be evaluated and recording the multi-exposure fusion image as SmefiSimultaneously adding SmefiCorresponding three original images with different exposure degrees, namely a normal exposure image, an overexposure image and an underexposure image are correspondingly marked as Snormal、Sover-ex、Sunder-ex(ii) a Wherein S ismefi、Snormal、Sover-ex、Sunder-exHas a width W and a height H.
Step 2: calculating the gradient maps of S_mefi, S_normal, S_over-ex and S_under-ex, denoted correspondingly as G_mefi, G_normal, G_over-ex and G_under-ex; wherein G_mefi, G_normal, G_over-ex and G_under-ex each have a width of W and a height of H. The gradient value of a pixel point in an image reflects the degree of change of its pixel value: pixel points at edge positions usually have larger gradient values, while in areas with fewer details and smoother content the pixel value changes are smaller, and the gradient values of the pixel points in such areas are correspondingly reduced.
In this embodiment, in step 2, the gradient operator adopted in calculating the gradient maps of S_mefi, S_normal, S_over-ex and S_under-ex is one of the Prewitt operator, the Roberts operator, the Scharr operator and the Sobel operator. In image processing, the gradient value of a pixel point generally refers to the modulus of its gradient.
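As a concrete illustration, the following is a minimal sketch of step 2 in Python, assuming 8-bit single-channel input and choosing the Sobel operator among the four operators listed above; the function name gradient_map is illustrative only.

```python
import cv2
import numpy as np

def gradient_map(gray):
    """Gradient map of an image: the per-pixel gradient modulus plus its
    horizontal and vertical components (Sobel operator assumed)."""
    g = gray.astype(np.float64)
    gx = cv2.Sobel(g, cv2.CV_64F, 1, 0, ksize=3)  # horizontal component G^h
    gy = cv2.Sobel(g, cv2.CV_64F, 0, 1, ksize=3)  # vertical component G^v
    return np.hypot(gx, gy), gx, gy               # modulus, G^h, G^v
```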
Step 3: generally speaking, the visibility of edge pixel points in an image is closely related to the gradient amplitude: parts with larger gradient amplitude are generally more visible and clearer. At the same time, an excessively large gradient may make the image locally over-sharp, so that the local sharpness differs somewhat from the actual scene. Considering that S_normal, S_over-ex and S_under-ex are captures of the real natural scene, the maximum gradient value of the corresponding pixel points is taken as the optimal gradient value under the real natural scene condition. The invention therefore extracts a maximum-value gradient map from G_normal, G_over-ex and G_under-ex, denoted G_max; the pixel value of the pixel point at coordinate position (x, y) in G_max is denoted G_max(x, y), G_max(x, y) = max(G_normal(x, y), G_over-ex(x, y), G_under-ex(x, y)). Then the SSIM (structural similarity) value between each pixel point in G_max and the corresponding pixel point in G_mefi is calculated (i.e., the SSIM value of the pixel points at the same coordinate position in G_max and G_mefi). Then the average of the W×H SSIM values is calculated and taken as the gradient feature of S_mefi. Here G_max has a width of W and a height of H, 1 ≤ x ≤ W, 1 ≤ y ≤ H, max() is the maximum-value function, G_normal(x, y) denotes the pixel value of the pixel point at coordinate position (x, y) in G_normal, i.e., the gradient value of the pixel point at coordinate position (x, y) in S_normal; G_over-ex(x, y) denotes the pixel value of the pixel point at coordinate position (x, y) in G_over-ex, i.e., the gradient value of the pixel point at coordinate position (x, y) in S_over-ex; and G_under-ex(x, y) denotes the pixel value of the pixel point at coordinate position (x, y) in G_under-ex, i.e., the gradient value of the pixel point at coordinate position (x, y) in S_under-ex.
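A minimal sketch of the gradient feature of step 3, assuming the gradient maps come from the gradient_map sketch above and using scikit-image's structural_similarity for the SSIM computation (the data_range choice is an assumption for floating-point gradient maps):

```python
import numpy as np
from skimage.metrics import structural_similarity

def gradient_feature(g_mefi, g_normal, g_over_ex, g_under_ex):
    """Gradient feature of S_mefi: mean SSIM between the maximum-value
    gradient map G_max and G_mefi."""
    g_max = np.maximum(np.maximum(g_normal, g_over_ex), g_under_ex)
    lo = min(g_max.min(), g_mefi.min())
    hi = max(g_max.max(), g_mefi.max())
    # structural_similarity returns the mean of the W*H local SSIM values.
    return structural_similarity(g_max, g_mefi, data_range=hi - lo)
```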
Step 4: to handle the high-dimensional image formed by combining several images with different exposure degrees, the structural information of the differently exposed images must be combined. Inspired by research on the structure tensor, the invention combines the structural features of the images with different exposure degrees using a Jacobian matrix. That is, from the pixel values of the pixel points at the same coordinate position in G_normal, G_over-ex and G_under-ex, a gradient value matrix of dimension 3×2 jointly corresponding to those pixel points is formed; the gradient value matrix of dimension 3×2 jointly corresponding to the pixel points at coordinate position (x, y) in G_normal, G_over-ex and G_under-ex is denoted J_(x,y), J_(x,y) = [G^h_normal(x,y) G^v_normal(x,y); G^h_over-ex(x,y) G^v_over-ex(x,y); G^h_under-ex(x,y) G^v_under-ex(x,y)] (rows separated by semicolons). Likewise, from the pixel value of each pixel point in G_mefi, a gradient value matrix of dimension 1×2 corresponding to that pixel point is formed; the gradient value matrix of dimension 1×2 corresponding to the pixel point at coordinate position (x, y) in G_mefi is denoted J'_(x,y), J'_(x,y) = [G^h_mefi(x,y) G^v_mefi(x,y)]. Then the structure tensor of the pixel points at the same coordinate position in G_normal, G_over-ex and G_under-ex is calculated; the structure tensor of the pixel points at coordinate position (x, y) is denoted Z_(x,y), Z_(x,y) = (J_(x,y))^T J_(x,y); Z_(x,y) is a real symmetric matrix, so it has two non-negative real eigenvalues, and these eigenvalues represent the rate of change of the image. Likewise, the structure tensor of each pixel point in G_mefi is calculated; the structure tensor of the pixel point at coordinate position (x, y) in G_mefi is denoted Z'_(x,y), Z'_(x,y) = (J'_(x,y))^T J'_(x,y). Then, to measure the difference in structural information between the three original images with different exposure degrees and the multi-exposure fusion image to be evaluated, the invention represents this difference by the cosine distance between the structure tensors, and calculates the cosine distance between the structure tensor of the pixel points at the same coordinate position in G_normal, G_over-ex, G_under-ex and the structure tensor of the corresponding pixel point in G_mefi; the cosine distance between Z_(x,y) and Z'_(x,y) is denoted d_(x,y). The average of all the cosine distances is then taken as the structural feature of S_mefi. Here h denotes the horizontal direction, v denotes the vertical direction; G^h_normal(x,y) and G^v_normal(x,y) denote the horizontal and vertical components of G_normal(x,y), i.e., the horizontal and vertical gradient values of the pixel point at coordinate position (x, y) in S_normal; G^h_over-ex(x,y) and G^v_over-ex(x,y) denote the horizontal and vertical components of G_over-ex(x,y), i.e., the horizontal and vertical gradient values of the pixel point at coordinate position (x, y) in S_over-ex; G^h_under-ex(x,y) and G^v_under-ex(x,y) denote the horizontal and vertical components of G_under-ex(x,y), i.e., the horizontal and vertical gradient values of the pixel point at coordinate position (x, y) in S_under-ex; Z_(x,y) has a dimension of 2×2; (J_(x,y))^T denotes the transpose of J_(x,y); G^h_mefi(x,y) and G^v_mefi(x,y) denote the horizontal and vertical components of G_mefi(x,y), i.e., the horizontal and vertical gradient values of the pixel point at coordinate position (x, y) in S_mefi; Z'_(x,y) has a dimension of 2×2; and (J'_(x,y))^T denotes the transpose of J'_(x,y).
In this embodiment, in step 4, d_(x,y) = (vec(Z_(x,y)) · vec(Z'_(x,y))) / (||vec(Z_(x,y))|| × ||vec(Z'_(x,y))||); wherein vec(Z_(x,y)) denotes Z_(x,y) converted into a vector of length 4, vec(Z'_(x,y)) denotes Z'_(x,y) converted into a vector of length 4, the two are obtained in the same manner, and the symbol "|| ||" denotes the modulus of a vector.
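A sketch of the structural feature of step 4, reusing the per-pixel gradient components (G^h, G^v) from the step-2 sketch; the per-pixel loop favors clarity over speed:

```python
import numpy as np

def structural_feature(mefi_grads, source_grads):
    """Structural feature of S_mefi: mean cosine distance d_(x,y) between
    the structure tensors Z_(x,y) and Z'_(x,y) over all pixel points.

    mefi_grads   -- (gx, gy) pair for G_mefi
    source_grads -- list of (gx, gy) pairs for G_normal, G_over-ex, G_under-ex
    """
    gx_m, gy_m = mefi_grads
    H, W = gx_m.shape
    d = np.zeros((H, W))
    for y in range(H):
        for x in range(W):
            # 3x2 Jacobian J_(x,y) stacked from the three exposures.
            J = np.array([[gx[y, x], gy[y, x]] for gx, gy in source_grads])
            Jp = np.array([[gx_m[y, x], gy_m[y, x]]])       # 1x2 J'_(x,y)
            z = (J.T @ J).ravel()                           # Z_(x,y) as length-4 vector
            zp = (Jp.T @ Jp).ravel()                        # Z'_(x,y) as length-4 vector
            denom = np.linalg.norm(z) * np.linalg.norm(zp)
            d[y, x] = z @ zp / denom if denom > 0 else 0.0  # cosine distance
    return d.mean()                                         # structural feature
```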
Step 5: although the gradient feature can effectively capture the local edges of the multi-exposure fusion image to be evaluated, and the structural feature can effectively capture its structure, neither captures the global perceptual change of the multi-exposure fusion image to be evaluated, and such subtle changes are easily perceived by humans. A global perceptual measure is therefore added so that the objective evaluation results of the method agree better with human perception of the image. Considering that the influence of luminance and chrominance changes on human perception is particularly important, the RGB image is first converted into the YUV color space. In the process of multi-exposure image fusion, the different exposure degrees of the multi-exposure images have the most obvious influence on the final fusion result; next, the contrast and saturation of the multi-exposure images are obtained, and a weight map of the multi-exposure images is then constructed from these three aspects to gather the various information in the multi-exposure images and obtain the global perceptual feature.
Specifically, the exposure, contrast and saturation of each pixel point in each of S_normal, S_over-ex and S_under-ex are calculated; the exposure, contrast and saturation of the pixel point at coordinate position (x, y) in S_normal are denoted correspondingly as E_normal(x,y), C_normal(x,y) and Sa_normal(x,y); the exposure, contrast and saturation of the pixel point at coordinate position (x, y) in S_over-ex are denoted correspondingly as E_over-ex(x,y), C_over-ex(x,y) and Sa_over-ex(x,y); the exposure, contrast and saturation of the pixel point at coordinate position (x, y) in S_under-ex are denoted correspondingly as E_under-ex(x,y), C_under-ex(x,y) and Sa_under-ex(x,y). Then the weight of each pixel point in each of S_normal, S_over-ex and S_under-ex is calculated; the weight of the pixel point at coordinate position (x, y) in S_normal is denoted ω_normal(x,y), ω_normal(x,y) = E_normal(x,y) × C_normal(x,y) × Sa_normal(x,y); the weight of the pixel point at coordinate position (x, y) in S_over-ex is denoted ω_over-ex(x,y), ω_over-ex(x,y) = E_over-ex(x,y) × C_over-ex(x,y) × Sa_over-ex(x,y); the weight of the pixel point at coordinate position (x, y) in S_under-ex is denoted ω_under-ex(x,y), ω_under-ex(x,y) = E_under-ex(x,y) × C_under-ex(x,y) × Sa_under-ex(x,y). The weights of the pixel points in S_normal, S_over-ex and S_under-ex are then normalized to obtain the weight maps corresponding to S_normal, S_over-ex and S_under-ex, denoted correspondingly as weight_normal, weight_over-ex and weight_under-ex. Pyramid fusion is then performed on S_normal, S_over-ex, S_under-ex and weight_normal, weight_over-ex, weight_under-ex: S_normal, S_over-ex and S_under-ex are decomposed into Laplacian pyramids, weight_normal, weight_over-ex and weight_under-ex are decomposed into Gaussian pyramids, and the pyramids are fused to obtain a pseudo-reference fusion image. The SSIM value between each pixel point in S_mefi and the corresponding pixel point in the pseudo-reference fusion image is then calculated (i.e., the SSIM value of the pixel points at the same coordinate position in S_mefi and the pseudo-reference fusion image). Finally, the average of the W×H SSIM values is calculated and taken as the global perceptual feature of S_mefi.
Here, pyramid fusion is prior art. Fig. 2 shows a schematic diagram of the pyramid fusion process. In Fig. 2, I denotes S_normal, S_over-ex and S_under-ex, i.e., I(1) denotes S_normal and I(N) denotes S_under-ex; W denotes weight_normal, weight_over-ex and weight_under-ex, i.e., W(1) denotes weight_normal and W(N) denotes weight_under-ex.
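A sketch of the pyramid fusion of Fig. 2, assuming float RGB images in [0, 1] and their normalized weight maps; the number of pyramid levels is a free parameter not fixed by the text:

```python
import cv2
import numpy as np

def pyramid_fuse(images, weights, n_levels=5):
    """Fuse images (H x W x 3) with weight maps (H x W) by blending each
    Laplacian image level with the matching Gaussian weight level."""
    fused = None
    for img, w in zip(images, weights):
        gp = [w.astype(np.float64)]              # Gaussian pyramid of the weight map
        for _ in range(n_levels - 1):
            gp.append(cv2.pyrDown(gp[-1]))
        lp, cur = [], img.astype(np.float64)     # Laplacian pyramid of the image
        for _ in range(n_levels - 1):
            down = cv2.pyrDown(cur)
            up = cv2.pyrUp(down, dstsize=(cur.shape[1], cur.shape[0]))
            lp.append(cur - up)
            cur = down
        lp.append(cur)
        lvls = [l * g[..., None] for l, g in zip(lp, gp)]
        fused = lvls if fused is None else [f + l for f, l in zip(fused, lvls)]
    out = fused[-1]                              # collapse the fused pyramid
    for lvl in reversed(fused[:-1]):
        out = cv2.pyrUp(out, dstsize=(lvl.shape[1], lvl.shape[0])) + lvl
    return np.clip(out, 0.0, 1.0)                # pseudo-reference fusion image
```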
In this embodiment, in step 5, E_normal(x,y) = e^(−(Ȳ_normal(x,y) − μ)² / (2σ²)), C_normal(x,y) = |L*Y_normal(x,y)|, Sa_normal(x,y) = |U_normal(x,y)| + |V_normal(x,y)| + 1, E_over-ex(x,y) = e^(−(Ȳ_over-ex(x,y) − μ)² / (2σ²)), C_over-ex(x,y) = |L*Y_over-ex(x,y)|, Sa_over-ex(x,y) = |U_over-ex(x,y)| + |V_over-ex(x,y)| + 1, E_under-ex(x,y) = e^(−(Ȳ_under-ex(x,y) − μ)² / (2σ²)), C_under-ex(x,y) = |L*Y_under-ex(x,y)|, Sa_under-ex(x,y) = |U_under-ex(x,y)| + |V_under-ex(x,y)| + 1; wherein e denotes the natural base, e = 2.71828…; Ȳ_normal(x,y) denotes the normalized value of the pixel point at coordinate position (x, y) in the Y channel (luminance channel) of S_normal, Ȳ_normal(x,y) = Y_normal(x,y)/255; Y_normal(x,y) denotes the pixel value of the pixel point at coordinate position (x, y) in the Y channel of S_normal; μ and σ are both constants, μ = 0.5, σ = 0.2; the symbol "| |" is the absolute value symbol; L denotes the Laplacian operator; the symbol "*" is the convolution operation symbol; U_normal(x,y) and V_normal(x,y) denote the pixel values of the pixel point at coordinate position (x, y) in the U channel and V channel (chrominance channels) of S_normal; Ȳ_over-ex(x,y) denotes the normalized value of the pixel point at coordinate position (x, y) in the Y channel of S_over-ex, Ȳ_over-ex(x,y) = Y_over-ex(x,y)/255; Y_over-ex(x,y) denotes the pixel value of the pixel point at coordinate position (x, y) in the Y channel of S_over-ex; U_over-ex(x,y) and V_over-ex(x,y) denote the pixel values of the pixel point at coordinate position (x, y) in the U channel and V channel of S_over-ex; Ȳ_under-ex(x,y) denotes the normalized value of the pixel point at coordinate position (x, y) in the Y channel of S_under-ex, Ȳ_under-ex(x,y) = Y_under-ex(x,y)/255; Y_under-ex(x,y) denotes the pixel value of the pixel point at coordinate position (x, y) in the Y channel of S_under-ex; and U_under-ex(x,y) and V_under-ex(x,y) denote the pixel values of the pixel point at coordinate position (x, y) in the U channel and V channel of S_under-ex.
In this embodiment, in step 5, weight_normal, weight_over-ex and weight_under-ex are obtained as follows: the pixel value of the pixel point at coordinate position (x, y) in weight_normal is denoted weight_normal(x,y), weight_normal(x,y) = ω_normal(x,y) / (ω_normal(x,y) + ω_over-ex(x,y) + ω_under-ex(x,y)), namely the weight obtained after normalizing ω_normal(x,y); the pixel value of the pixel point at coordinate position (x, y) in weight_over-ex is denoted weight_over-ex(x,y), weight_over-ex(x,y) = ω_over-ex(x,y) / (ω_normal(x,y) + ω_over-ex(x,y) + ω_under-ex(x,y)), namely the weight obtained after normalizing ω_over-ex(x,y); the pixel value of the pixel point at coordinate position (x, y) in weight_under-ex is denoted weight_under-ex(x,y), weight_under-ex(x,y) = ω_under-ex(x,y) / (ω_normal(x,y) + ω_over-ex(x,y) + ω_under-ex(x,y)), namely the weight obtained after normalizing ω_under-ex(x,y).
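A sketch of the weight computation of step 5 under the formulas above. It assumes the Y channel is 8-bit and that the U and V channels are already zero-centred chroma values (an assumption; 8-bit YUV commonly stores them with a +128 offset that would first have to be removed):

```python
import cv2
import numpy as np

def weight_map(Y, U, V, mu=0.5, sigma=0.2):
    """Per-pixel weight omega = E x C x Sa for one exposure image."""
    Yn = Y.astype(np.float64) / 255.0                            # normalized Y
    E = np.exp(-((Yn - mu) ** 2) / (2.0 * sigma ** 2))           # exposure
    C = np.abs(cv2.Laplacian(Y.astype(np.float64), cv2.CV_64F))  # contrast
    Sa = np.abs(U) + np.abs(V) + 1.0                             # saturation
    return E * C * Sa

def normalize_weights(w_normal, w_over_ex, w_under_ex):
    """Pixel-wise normalization so the three weight maps sum to 1."""
    total = w_normal + w_over_ex + w_under_ex + 1e-12
    return w_normal / total, w_over_ex / total, w_under_ex / total
```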
Step 6: taking the gradient feature of S_mefi, the structural feature of S_mefi and the global perceptual feature of S_mefi together as the feature vector of S_mefi.
Step 7: taking the feature vector of S_mefi as input and calculating the objective quality evaluation predicted value of S_mefi in combination with the support vector regression technique; wherein the larger the objective quality evaluation predicted value of S_mefi is, the better the quality of S_mefi; conversely, the worse the quality of S_mefi.
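A sketch of step 7 using scikit-learn's SVR as one concrete support vector regression implementation (the kernel choice is an assumption; the text does not fix it):

```python
import numpy as np
from sklearn.svm import SVR

def predict_quality(train_feats, train_mos, feat_mefi):
    """Fit a support vector regression model on training feature vectors
    and MOS values, then predict the objective quality of S_mefi."""
    model = SVR(kernel='rbf')
    model.fit(np.asarray(train_feats), np.asarray(train_mos))
    return float(model.predict(np.asarray(feat_mefi).reshape(1, -1))[0])
```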
To further illustrate the feasibility and effectiveness of the method of the present invention, it was tested as follows.
A ready-made database is selected, which contains natural images (i.e., original images) of 17 different scenes: "Balloons", "Belgium house", "Lamp1", "Candle", "Cave", "Chinese garden", "Farmhouse", "House", "Kluki", "Lamp2", "Landscape", "Lighthouse", "Madison capitol", "Memorial", "Office", "Tower" and "Venice". For each scene, 8 different multi-exposure image fusion methods are used, and subjective scores (i.e., the subjective mean opinion score, MOS) from 25 subjects are available. The Pearson Linear Correlation Coefficient (PLCC) and the Spearman Rank-Order Correlation Coefficient (SROCC) between the objective quality evaluation predicted values and the subjective MOS are calculated as evaluation criteria; larger PLCC or SROCC values indicate better performance.
In the experiment, a leave-one-out strategy is adopted: each time, the 8 multi-exposure fusion images of one scene in the database form the test set, and the 128 (16 × 8 = 128) multi-exposure fusion images of the remaining 16 scenes form the training set. During training, the feature vector of each multi-exposure fusion image in the training set is obtained in the same manner according to steps 1 to 6 of the method of the invention; the feature vectors of all multi-exposure fusion images in the training set are input into a support vector machine for training, so that the error between the regression function values obtained through training and the subjective MOS is minimized; the optimal weight vector and the optimal bias term are obtained by fitting, and a support vector regression model is then constructed using them. During testing, the feature vector of each multi-exposure fusion image in the test set is obtained in the same manner according to steps 1 to 6 of the method of the invention, and the support vector regression model is used to test these feature vectors, thereby obtaining the objective quality evaluation predicted value of each multi-exposure fusion image in the test set.
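A sketch of the leave-one-scene-out protocol and the PLCC/SROCC computation, reusing predict_quality from the sketch above; feats[s] is assumed to be the 8×3 feature matrix and mos[s] the 8 MOS values of scene s:

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

def leave_one_scene_out(feats, mos, scenes):
    """Average PLCC and SROCC over the 17 train/test splits, one per scene."""
    plcc, srocc = [], []
    for test in scenes:
        train = [s for s in scenes if s != test]
        X = np.vstack([feats[s] for s in train])      # 16 x 8 = 128 samples
        y = np.concatenate([mos[s] for s in train])
        pred = [predict_quality(X, y, f) for f in feats[test]]
        plcc.append(pearsonr(pred, mos[test])[0])
        srocc.append(spearmanr(pred, mos[test])[0])
    return float(np.mean(plcc)), float(np.mean(srocc))
```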
According to the leave-one-out strategy, training is performed 17 times, and the 8 multi-exposure fusion images of each scene are tested once. Seven schemes are compared, in which the feature vector of the multi-exposure fusion image to be evaluated consists of: only the gradient feature; only the structural feature; only the global perceptual feature; the gradient and structural features; the gradient and global perceptual features; the structural and global perceptual features; and the gradient, structural and global perceptual features. The average PLCC values of these schemes over the 17 scenes are listed in Table 1, and the average SROCC values are listed in Table 2.
Table 1: Average PLCC values over the 17 scenes of the schemes whose feature vectors are composed of different features. Scheme (1) uses only the gradient feature; (2) only the structural feature; (3) only the global perceptual feature; (4) the gradient and structural features; (5) the gradient and global perceptual features; (6) the structural and global perceptual features; (7) all three features.

Scene              (1)      (2)      (3)      (4)      (5)      (6)      (7)
Balloons           0.8260   0.8116   0.8947   0.8150   0.8599   0.8147   0.8358
Belgium house      0.9517   0.8381   0.9759   0.9608   0.9820   0.9776   0.9931
Lamp1              0.8318   0.8590   0.8578   0.8674   0.8767   0.8639   0.9218
Candle             0.9390   0.8355   0.9770   0.9519   0.9731   0.9590   0.8643
Cave               0.9077   0.3578   0.9293   0.9144   0.9391   0.9293   0.9421
Chinese garden     0.8809   0.6531   0.9596   0.8891   0.9556   0.9596   0.9683
Farmhouse          0.8159   0.9787   0.8616   0.8218   0.8729   0.8617   0.8759
House              0.8947   0.7171   0.9553   0.8993   0.9505   0.9556   0.9601
Kluki              0.5157   0.8402   0.7446   0.7947   0.7233   0.7418   0.7305
Lamp2              0.9279   0.7933   0.8432   0.9127   0.9432   0.8436   0.9748
Landscape          0.8997   0.8110   0.7210   0.9335   0.9525   0.7208   0.8838
Lighthouse         0.9753   0.9417   0.9500   0.9729   0.9812   0.9470   0.9879
Madison capitol    0.8378   0.7791   0.9368   0.9241   0.9509   0.9364   0.9433
Memorial           0.9567   0.9579   0.9689   0.9659   0.9642   0.9700   0.9680
Office             0.8689   0.8507   0.8904   0.9429   0.8913   0.8902   0.9553
Tower              0.9148   0.8701   0.9370   0.9190   0.9428   0.9480   0.9580
Venice             0.8976   0.8925   0.8764   0.8842   0.8793   0.8765   0.9668
Mean               0.8730   0.8110   0.8987   0.9048   0.9199   0.8938   0.9252
Table 2: Average SROCC values over the 17 scenes of the schemes whose feature vectors are composed of different features
As can be seen from Tables 1 and 2, the schemes whose feature vectors include the gradient feature all show high consistency with the subjective scores of the subjects, because observers are very sensitive to the local edge features of an image, and the maximum-value gradient map adopted by the method in the gradient domain captures the best image quality. The global perceptual feature also reflects the objective quality of the multi-exposure fusion image well, because the quality of the synthesized pseudo-reference fusion image agrees closely with what the human eye seeks in high-quality images.
Fig. 3a shows an overexposed image, Fig. 3b a normal-exposure image, Fig. 3c an underexposed image, Fig. 3d the multi-exposure fusion image obtained from Figs. 3a, 3b and 3c, Figs. 3e to 3h the gradient maps of Figs. 3a to 3d, and Fig. 3i the maximum-value gradient map extracted from Figs. 3e, 3f and 3g. As can be seen from Figs. 3a, 3c and 3i, detail information is lost both in highlight regions such as the sky in the overexposed image and in dark regions such as the house in the underexposed image; in the maximum-value gradient map, the edge detail information of the house from the overexposed image and of the sky clouds from the underexposed image is successfully extracted and combined to obtain the best image quality.
Fig. 4a shows a pseudo-reference fusion image, Fig. 4b a multi-exposure fusion image to be evaluated, and Fig. 4c the SSIM map between Fig. 4b and Fig. 4a. As can be seen from Fig. 4b, there are unnatural artifacts in the sky and at the edge of the iron tower, and the details inside the iron tower are blurred; these losses of information are shown in the quality map, Fig. 4c, where dark areas indicate parts with poor image quality and white areas indicate parts with good image quality.
From the above analysis, the method of the invention shows high consistency with human perception of image quality when evaluating the quality of multi-exposure fusion images of natural scenes.
To make the experimental results more convincing, the method of the invention is compared with 4 representative image quality evaluation methods proposed in recent years: [1] C. S. Xydeas and V. S. Petrovic, "Objective image fusion performance measure," Proc. SPIE, Sensor Fusion: Architectures, Algorithms, and Applications IV, vol. 4051, pp. 89-98, Apr. 2000, which extracts the edge information of the input images using the Sobel edge operator, calculates the degree to which the intensity and direction of the edge information are retained between each reference image and the fusion image, and then combines the scores across the source images to obtain a final quality score; [2] P. Wang and B. Liu, "A novel image fusion metric based on multi-scale analysis," in Proc. IEEE 9th Int. Conf. Signal Processing, Oct. 2008, pp. 965-968, which decomposes the image by scale using the wavelet transform and calculates the edge preservation of the fusion image at each scale; [3] K. Ma, K. Zeng, and Z. Wang, "Perceptual quality assessment for multi-exposure image fusion," IEEE Trans. Image Processing, vol. 24, no. 11, pp. 3345-3356, Nov. 2015, which decomposes the image into luminance, contrast and structure information, enhances contrast and structure respectively to obtain pseudo-reference information, and proposes an evaluation criterion accordingly; and [4] D. Kundu, D. Ghadiyaram, A. C. Bovik and B. L. Evans, "No-reference quality assessment of tone-mapped HDR pictures," IEEE Trans. Image Processing, vol. 26, no. 6, pp. 2957-2971, June 2017, which constructs a no-reference quality assessment model based on differential natural scene statistics. The average PLCC values of the method of the invention and the 4 existing image quality evaluation methods over the 17 scenes are listed in Table 3, and the average SROCC values are listed in Table 4.
Table 3: Average PLCC values of the method of the invention and the 4 existing image quality evaluation methods over the 17 scenes
Table 4: Average SROCC values of the method of the invention and the 4 existing image quality evaluation methods over the 17 scenes
Scene              Method [1]   Method [2]   Method [3]   Method [4]   Method of the invention
Balloons           0.6667       0.5000       0.8333       0.9286       0.8095
Belgium house      0.7785       0.7545       0.9701       0.9222       0.9701
Lamp1              0.7857       0.6190       0.9762       0.8095       0.9048
Candle             0.9762       0.7857       0.9286       0.7615       0.9762
Cave               0.7143       0.8095       0.8333       0.6190       0.8333
Chinese garden     0.6905       0.7857       0.9286       0.5714       0.7857
Farmhouse          0.7381       0.8095       0.9286       0.5714       0.9286
House              0.5952       0.4524       0.8571       0.9762       0.8333
Kluki              0.2619       0.2857       0.7857       -0.1667      0.7381
Lamp2              0.7619       0.6190       0.7143       0.7381       0.9524
Landscape          0.0238       0.4048       0.5238       0.5000       0.7619
Lighthouse         0.5000       0.4286       0.8810       0.7857       0.8810
Madison capitol    0.5238       0.3571       0.8810       0.6429       0.8095
Memorial           0.7619       0.5476       0.8571       0.8810       0.8571
Office             0.2771       0.3976       0.7832       0.1687       0.8555
Tower              0.5714       0.5238       0.9524       0.7381       0.8571
Venice             0.9102       0.7306       0.9341       0.5868       0.8623
Mean               0.6198       0.5771       0.8570       0.6491       0.8597
As can be seen from tables 3 and 4, the PLCC and SROCC values of the method of the present invention are higher than those of the 4 prior art methods, demonstrating the superior performance of the method of the present invention.