Image quality detection method, apparatus, device, and medium
1. An image quality detection method, comprising:
acquiring an image to be processed, and positioning characters in the image to be processed by adopting a character positioning model so as to determine a character area in the image to be processed;
carrying out Sobel operator convolution processing on the character area in the image to be processed to obtain a target convolution matrix of the character area;
acquiring the area information of the character area according to the pixel value of the pixel in the target convolution matrix and the distribution interval of the pixel value;
and determining the quality detection result of the image to be processed according to the area information of each character area and a preset quality condition.
2. The image quality detection method according to claim 1, wherein the performing Sobel operator convolution processing on the character area in the image to be processed to obtain the target convolution matrix of the character area comprises:
processing the image to be processed according to the position of each character area to obtain a single-channel image corresponding to a plurality of character areas;
carrying out Sobel operator convolution processing on the single-channel image corresponding to each character area to obtain a Sobel operator convolution matrix of the single-channel image corresponding to the character area in multiple directions;
and carrying out root mean square calculation on the Sobel operator convolution data in the multiple directions to obtain a target convolution matrix of the character area.
3. The image quality detection method according to claim 2, wherein the processing the image to be processed according to the position of each character area to obtain a single-channel image corresponding to a plurality of character areas comprises:
determining whether the image to be processed is a multi-channel image;
if the image to be processed is the multi-channel image, converting the image to be processed into a single-channel image to be processed;
and dividing the image to be processed of the single channel according to the position of each character area to obtain a single channel image corresponding to each character area.
4. The image quality detection method according to claim 1, wherein the determining the quality detection result of the image to be processed according to the area information of each character area and a preset quality condition comprises:
determining the quality of the image to be processed according to the area information of the character areas and the Euclidean distance between each character area and a light source;
and determining the quality detection result of the image to be processed according to the quality of the image to be processed and the preset quality condition.
5. The image quality detection method of claim 4, wherein the determining the quality of the image to be processed according to the area information of the character areas and the Euclidean distance between each character area and the light source comprises:
determining the position of a light source according to the area information of each character area;
determining Euclidean distances between the character areas and the light source according to the positions of the character areas and the position of the light source;
taking the character area with the Euclidean distance from the light source smaller than a preset distance as a target character area;
and determining the quality of the image to be processed according to the area information of the target character areas.
6. The image quality detection method according to claim 4, wherein the determining the quality detection result of the image to be processed according to the quality of the image to be processed and the preset quality condition comprises:
determining a preset quality condition of the image to be processed according to the type of the image to be processed;
determining whether the quality of the image to be processed meets the preset quality condition;
if the quality of the image to be processed meets the preset quality condition, determining that the quality detection result of the image to be processed is quality detection passing;
and if the quality of the image to be processed does not meet the preset quality condition, determining that the quality detection result of the image to be processed is failed quality detection.
7. The image quality detection method according to any one of claims 1 to 6, wherein before the performing Sobel operator convolution processing on the character area in the image to be processed, the method further comprises:
determining whether the number of character areas in the image to be processed is smaller than a preset number;
if the number of the character areas in the image to be processed is larger than or equal to the preset number, carrying out Sobel operator convolution processing on the character areas in the image to be processed;
and if the number of the character areas in the image to be processed is smaller than the preset number, determining that the quality detection result of the image to be processed is failed quality detection.
8. An image quality detection apparatus, characterized by comprising:
the positioning module is used for acquiring an image to be processed and positioning characters in the image to be processed by adopting a character positioning model so as to determine a character area in the image to be processed;
the processing module is used for carrying out Sobel operator convolution processing on the character area in the image to be processed to obtain a target convolution matrix of the character area;
the calculation module is used for acquiring the area information of the character area according to the pixel value of the pixel in the target convolution matrix and the distribution interval of the pixel value;
and the determining module is used for determining the quality detection result of the image to be processed according to the area information of each character area and a preset quality condition.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the image quality detection method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the image quality detection method according to any one of claims 1 to 7.
Background
Optical Character Recognition (OCR) refers to the process of analyzing and recognizing an image file of text data to obtain text information. Text material is usually captured by a scanner, a camera, or a similar device and stored as an image file; OCR software then reads and analyzes the image file and extracts the character strings in it by character recognition. However, OCR performance is easily affected by image quality: if the image to be recognized is not clear enough, or is severely blurred, the subsequent positioning and recognition of the characters in the image are likely to fail, and the character recognition task is likely to fail. Therefore, it is important to control the quality of the image before character recognition.
In existing image quality detection methods, an edge detection operator is generally used to filter the whole image, and the filtering result is then simply summed or its variance is computed to give a quality score; alternatively, the image quality is detected by a deep-learning quality detection model. However, both the edge-detection-operator approach and the deep-learning model are easily interfered with by non-character background information in the image, so the accuracy of the obtained image quality is not high.
Disclosure of Invention
The invention provides an image quality detection method, apparatus, device, and medium, aiming to solve the problem that existing image quality detection is easily interfered with by non-character background information in the image, so that the accuracy of the obtained image quality is low.
Provided is an image quality detection method including:
acquiring an image to be processed, and positioning characters in the image to be processed by adopting a character positioning model so as to determine a character area in the image to be processed;
carrying out Sobel operator convolution processing on the character area in the image to be processed to obtain a target convolution matrix of the character area;
acquiring the area information of the character area according to the pixel value of the pixel in the target convolution matrix and the distribution interval of the pixel value;
and determining the quality detection result of the image to be processed according to the area information of each character area and a preset quality condition.
Provided is an image quality detection apparatus including:
the positioning module is used for acquiring an image to be processed and positioning characters in the image to be processed by adopting a character positioning model so as to determine a character area in the image to be processed;
the processing module is used for carrying out Sobel operator convolution processing on the character area in the image to be processed to obtain a target convolution matrix of the character area;
the calculation module is used for acquiring the area information of the character area according to the pixel value of the pixel in the target convolution matrix and the distribution interval of the pixel value;
and the determining module is used for determining the quality detection result of the image to be processed according to the area information of each character area and a preset quality condition.
There is provided a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the image quality detection method described above when executing the computer program.
There is provided a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the image quality detection method described above.
In the solutions provided by the image quality detection method, apparatus, device, and medium, an image to be processed is acquired and the characters in it are located with a character positioning model to determine the character areas in the image; Sobel operator convolution processing is then performed on each character area to obtain its target convolution matrix; the area information of each character area is obtained according to the pixel values of the pixels in the target convolution matrix and the distribution intervals of the pixel values; and finally the quality detection result of the image to be processed is determined according to the area information of each character area and a preset quality condition. In the invention, Sobel operator convolution processing is performed on each character area and the pixel values of the pixels in the Sobel result are then processed, so that character-area edges with obvious changes are strengthened in the quality score while interference edges in the slowly changing background are weakened; this reduces the interference of image background information, improves the precision of the quality detection algorithm, and further improves the accuracy of the image quality.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments of the present invention will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without inventive labor.
FIG. 1 is a diagram illustrating an application environment of an image quality detection method according to an embodiment of the present invention;
FIG. 2 is a flow chart of an image quality detection method according to an embodiment of the invention;
FIG. 3 is a flowchart illustrating an implementation of step S20 in FIG. 2;
FIG. 4 is a flowchart illustrating an implementation of step S21 in FIG. 3;
FIG. 5 is a schematic flow chart of an image quality detection method according to an embodiment of the present invention;
FIG. 6 is a flowchart illustrating an implementation of step S40 in FIG. 2;
FIG. 7 is a flowchart illustrating an implementation of step S41 in FIG. 6;
FIG. 8 is a flowchart illustrating an implementation of step S42 in FIG. 6;
FIG. 9 is a schematic diagram of a process for obtaining the quality score threshold according to an embodiment of the present invention;
FIG. 10 is a schematic diagram of an exemplary image quality detection apparatus according to the present invention;
FIG. 11 is a block diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The image quality detection method provided by the embodiment of the invention can be applied to the application environment shown in fig. 1, in which the client communicates with the server through a network. The server obtains an image to be processed sent by a user through the client, locates the characters in the image to be processed with a character positioning model to determine the character areas in the image, performs Sobel operator convolution processing on the character areas to obtain a target convolution matrix of each character area, obtains the area information of each character area according to the pixel values of the pixels in the target convolution matrix and the distribution intervals of the pixel values, and finally determines the quality detection result of the image to be processed according to the area information of each character area and a preset quality condition. By performing Sobel operator convolution processing on each character region and then processing the pixel values of the pixels in the Sobel result, character-region edges with obvious changes are strengthened in the quality score and interference edges in the slowly changing background are weakened; this reduces the interference of image background information, improves the precision of the quality detection algorithm, further improves the accuracy of the image quality, and also improves the intelligence and efficiency of image quality detection.
Relevant data such as the image to be processed, the area information of each character area, and the target quality score of the image to be processed are stored in a database of the server. When the image quality detection method is executed, the data that are acquired, generated, and used are stored directly in the database, which facilitates subsequent use.
The database in this embodiment is stored in a blockchain network and is used to store the data used and generated in the image quality detection method, such as the image to be processed, the area information of each character area, and the target quality score of the image to be processed. The blockchain referred to in this application is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks associated by cryptographic methods, where each data block contains information of a batch of network transactions and is used to verify the validity (anti-counterfeiting) of the information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like. Deploying the database in the blockchain improves the security of data storage.
The client, also called the user side, refers to a program that corresponds to the server and provides local services for the user. The client may be, but is not limited to, various personal computers, laptops, smartphones, tablets, and portable wearable devices. The server can be implemented by an independent server or by a server cluster composed of a plurality of servers.
In an embodiment, as shown in fig. 2, an image quality detection method is provided, which is described by taking the example that the method is applied to the server side in fig. 1, and includes the following steps:
s10: and acquiring an image to be processed, and positioning characters in the image to be processed by adopting a character positioning model so as to determine a character area in the image to be processed.
An image to be processed is acquired; the image to be processed is an image on which image quality detection needs to be performed to determine whether its quality is qualified. After the image to be processed is acquired, a character positioning model based on deep learning is first used to locate the character parts in the image to be processed so as to determine the character areas in the image. The character areas are independent, separated character areas; locating the character parts in the image to be processed means locating all the separated character areas in the image to be processed.
The character positioning model is a deep learning model used to locate the characters in the image to be processed. After the image to be processed is input into the character positioning model, the model identifies all characters in the image and determines the intervals between them; if the interval between two adjacent characters is larger than a preset interval (for example, the width of two characters), the facing edges of those two characters are taken as boundaries of the character areas to which they respectively belong, and all the character areas are divided in this way. Each character area in the image to be processed can be displayed with a highlighted text box so that the number of character areas can be determined subsequently.
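The following is a minimal sketch of this region-splitting rule, assuming the positioning model returns one axis-aligned bounding box (x, y, w, h) per character on a single text line; the function name, the box format, and the gap rule of roughly two character widths are illustrative assumptions, not the model's actual interface.

```python
# Minimal sketch: grouping per-character boxes into text regions by horizontal gap.
# Assumes the character positioning model returns axis-aligned boxes (x, y, w, h)
# for characters on the same line; names and the gap rule are illustrative only.

def group_characters_into_regions(char_boxes, gap_factor=2.0):
    """Merge adjacent character boxes into one region until the horizontal
    gap exceeds gap_factor times the average character width."""
    boxes = sorted(char_boxes, key=lambda b: b[0])            # sort left to right
    if not boxes:
        return []
    avg_w = sum(b[2] for b in boxes) / len(boxes)             # average character width
    regions, current = [], [boxes[0]]
    for prev, cur in zip(boxes, boxes[1:]):
        gap = cur[0] - (prev[0] + prev[2])                    # gap between neighbours
        if gap > gap_factor * avg_w:                          # gap wider than ~2 characters
            regions.append(current)
            current = [cur]
        else:
            current.append(cur)
    regions.append(current)
    # Return one enclosing box (x0, y0, x1, y1) per region.
    return [(min(b[0] for b in r), min(b[1] for b in r),
             max(b[0] + b[2] for b in r), max(b[1] + b[3] for b in r))
            for r in regions]
```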
S20: and carrying out Sobel operator convolution processing on the character area in the image to be processed to obtain a target convolution matrix of the character area.
After the characters in the image to be processed are located with the character positioning model to determine the character areas, Sobel operator (sobel) convolution processing is performed on each character area: the horizontal and vertical edges of the character area are detected to obtain Sobel operator convolution matrices of the character area in multiple directions, and the Sobel operator convolution data in the multiple directions are then combined to obtain the target convolution matrix of the character area.
S30: and acquiring the area information of the character area according to the pixel value of the pixel in the target convolution matrix and the distribution interval of the pixel value.
After the target convolution matrix of each character area is obtained, according to the pixel value of each pixel and the distribution interval of the pixel value in the target convolution matrix of the character area, each pixel in the target convolution matrix is subjected to weighted summation to determine the quality score of the character area, and the quality score of the character area is used for representing the area information of the character area. In other embodiments, the area information of the text area may also be represented by other information, which is not described herein again.
Specifically, the distribution of the pixel values of the pixels in the target convolution matrix is counted to determine the distribution interval in which each pixel value falls; the number of pixel values, that is, the number of pixels, in each distribution interval is then determined; finally, the pixel counts of the different distribution intervals are weighted and summed, and the weighted sum is taken as the quality score of the character region. The distribution intervals are preset, continuous, non-overlapping intervals.
The calculation formula of the quality score of the character region is as follows:
score = k1 × m1 + k2 × m2 + k3 × m3 + k4 × m4 + k5 × m5 + ... + kn × mn;
where kn is the coefficient corresponding to the n-th distribution interval, mn is the number of pixels whose pixel values fall in the n-th distribution interval, n = 1, 2, 3, 4, 5, ...; and score is the quality score of the character region.
For example, the distribution intervals may be [0, 0.1), [0.1, 0.2), [0.2, 0.3), [0.3, 0.4), [0.4, 0.5), and [0.5, +∞), with corresponding coefficients 10^-5, 10^-4, 10^-3, 10^-2, 10^-1, and 10^0, respectively. In the target convolution matrix of a text region, if the number of pixels with pixel values in [0, 0.1) is a, the number with pixel values in [0.1, 0.2) is b, the number with pixel values in [0.2, 0.3) is c, the number with pixel values in [0.3, 0.4) is d, the number with pixel values in [0.4, 0.5) is e, and the number with pixel values in [0.5, +∞) is f, then the quality score of the text region is:
score = 10^-5 × a + 10^-4 × b + 10^-3 × c + 10^-2 × d + 10^-1 × e + f.
In this embodiment, the distribution intervals [0, 0.1), [0.1, 0.2), [0.2, 0.3), [0.3, 0.4), [0.4, 0.5), [0.5, +∞) and the corresponding coefficients 10^-5, 10^-4, 10^-3, 10^-2, 10^-1, 10^0 are used for illustrative purposes only; in other embodiments, the distribution intervals may be other continuous intervals, and the coefficients corresponding to the distribution intervals may take other values, which is not described here again.
The quality scores of the character areas can be obtained by performing weighted summation on the pixel values of the pixels and the distribution intervals of the pixel values in the target convolution matrix corresponding to the character areas so as to determine the area information of the character areas.
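A short sketch of this weighted count follows, assuming the target convolution matrix is available as a NumPy array; the interval edges and coefficients below simply mirror the example values above and are configurable, not fixed by the method.

```python
import numpy as np

# Illustrative sketch of the weighted-sum quality score described above.
# The interval edges and coefficients mirror the example values; both are
# configuration choices, not fixed by the method.

INTERVAL_EDGES = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5, np.inf]
COEFFICIENTS   = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1.0]

def region_quality_score(target_conv: np.ndarray) -> float:
    """Count pixels of the target convolution matrix per distribution interval
    and return the weighted sum score = sum_n k_n * m_n."""
    score = 0.0
    for k, lo, hi in zip(COEFFICIENTS, INTERVAL_EDGES[:-1], INTERVAL_EDGES[1:]):
        m = np.count_nonzero((target_conv >= lo) & (target_conv < hi))
        score += k * m
    return score
```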
S40: and determining the quality detection result of the image to be processed according to the area information of each character area and the preset quality condition.
After the quality detection is carried out on each character area to determine the area information of each character area, the quality of the image to be processed is determined according to the area information of each character area, and then the quality detection result of the image to be processed is determined according to the quality of the image to be processed and the preset quality condition, wherein the quality detection result comprises passing quality detection and failing quality detection. The preset quality condition is a preset condition used for evaluating whether the quality of the image to be processed is qualified or not.
The quality of the image to be processed is determined according to the region information of each character region, and various determination modes can be provided, for example, the median of the quality scores of all the character regions is used as the target quality score of the image to be processed to represent the quality of the image to be processed, and the representative median of the quality scores is adopted, so that the quality requirement of the image to be processed can be relaxed, the processing speed is improved, and the time cost is reduced; the minimum value of the quality scores of all the character areas can be used as the target quality score of the image to be processed, the quality requirement of the image to be processed is improved, the image quality auditing strength is increased, and the characters in the image can be accurately identified; the position of the light source can be determined according to the quality scores of the character areas, the target quality score of the image to be processed is determined according to the quality score of the character area far away from the position of the light source, the possibility of error is reduced, and the accuracy of the target quality score is improved.
After the target quality score of the image to be processed is determined according to the region information of each character region, that is, after the quality of the image to be processed is determined, the quality detection result is determined according to the target quality score and the preset quality condition. The higher the target quality score, the clearer the image to be processed. When the target quality score falls within a certain quality score range, or is higher than a certain threshold, the quality of the image to be processed is good and the image is a usable clear image; that is, the target quality score meets the preset quality condition, and the quality detection result of the image to be processed is determined to be passing quality detection. When the target quality score is not within that quality score range, or is lower than the threshold, the quality of the image to be processed is poor and the image is an unusable blurred image; that is, the target quality score does not meet the preset quality condition, and the quality detection result of the image to be processed is determined to be failed quality detection.
In this embodiment, an image to be processed is obtained and the characters in it are located with a character locating model to determine the character areas; Sobel operator convolution processing is then performed on each character area to obtain its target convolution matrix; the area information of each character area is obtained according to the pixel values of the pixels in the target convolution matrix and the distribution intervals of the pixel values; and finally the quality detection result of the image to be processed is determined according to the area information of each character area and a preset quality condition. By performing Sobel operator convolution processing on each character region and then weighting and summing the numbers of pixels whose values fall in the different intervals of the Sobel result, character-region edges with obvious changes are strengthened in the quality score and interference edges in the slowly changing background are weakened, which reduces the interference of image background information, improves the precision of the quality detection algorithm, and further improves the accuracy of the image quality.
In an embodiment, as shown in fig. 3, in step S20, that is, performing sobel operator convolution processing on a text region in the image to be processed to obtain a target convolution matrix of the text region, the method specifically includes the following steps:
s21: and processing the image to be processed according to the position of each character area to obtain a single-channel image corresponding to the plurality of character areas.
If the number of the character areas in the image to be processed is greater than or equal to the preset number, the image to be processed is shown to meet the set requirement of the number of the character areas, and firstly, the image to be processed is required to be processed according to the position of each character area so as to obtain a single-channel image corresponding to a plurality of character areas.
For example, after determining a plurality of text regions in the image to be processed, the image to be processed is divided into images corresponding to the text regions according to the highlighted text box of each text region. Then, the image corresponding to each text area is converted into a single-channel image (such as a gray scale image) so as to perform image quality detection on the image corresponding to each text area in the following.
S22: and carrying out Sobel operator convolution processing on the single-channel image corresponding to each character area to obtain a Sobel operator convolution matrix of the single-channel image corresponding to the character area in multiple directions.
After single-channel images corresponding to a plurality of character areas are obtained, Sobel operator convolution processing is carried out on the single-channel images corresponding to the character areas, and Sobel operator convolution matrixes of the single-channel images corresponding to the character areas in a plurality of directions are obtained.
Specifically, the Sobel operator is convolved with the single-channel image corresponding to each character area in the x direction and the y direction respectively, and the convolution results of the edge pixels in the single-channel image corresponding to each character area are all set to 0, so as to obtain the Sobel operator convolution matrices of the single-channel image corresponding to each character area in the x direction and the y direction.
The convolution kernel in the x direction is the standard Sobel horizontal kernel:
[[-1, 0, +1], [-2, 0, +2], [-1, 0, +1]]
and the convolution kernel in the y direction is the standard Sobel vertical kernel:
[[-1, -2, -1], [0, 0, 0], [+1, +2, +1]].
after the single-channel image corresponding to each character area is subjected to Sobel operator convolution processing, a Sobel operator convolution matrix of the single-channel image corresponding to the character area in the x direction and a Sobel operator convolution matrix of the single-channel image corresponding to the character area in the x direction are obtained.
S23: and carrying out root mean square calculation on the Sobel operator convolution data in multiple directions to obtain a target convolution matrix of the character area.
After the Sobel operator convolution matrixes of the single-channel image corresponding to the character area in multiple directions are obtained, root mean square calculation is carried out on Sobel operator convolution data in the multiple directions, and a target convolution matrix of the character area is obtained.
For example, after the Sobel operator convolution matrices of the single-channel image corresponding to each character area in the x and y directions are obtained, root mean square calculation is performed on the Sobel operator convolution data in the x and y directions to obtain the target convolution matrix of the character area, where the target convolution matrix of each character area is computed element by element as:
res_sobel = sqrt((res_sobel_h^2 + res_sobel_v^2) / 2);
where res_sobel is the target convolution matrix of the character area, res_sobel_h is the Sobel operator convolution matrix of the corresponding single-channel image in the x direction, and res_sobel_v is the Sobel operator convolution matrix of the corresponding single-channel image in the y direction.
The target convolution matrix res_sobel of the character area is a matrix composed of pixels, and each pixel corresponds to a numerical value.
Since the Sobel operator needs to operate on a floating-point matrix, before the Sobel operator (sobel) convolution processing is performed on the single-channel image corresponding to each text region, the pixel value range of the single-channel image corresponding to each text region is compressed from [0, 255] to [0, 1] to simplify the calculation; that is, all pixel values in the single-channel image corresponding to each text region are divided by 255, which converts the integer values in the single-channel image to floating point and makes them easier to process. After all pixels are divided by 255, the distribution of the values of the pixels in res_sobel can be conveniently counted later, which reduces the error of res_sobel and keeps the boundaries of the distribution intervals small.
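The following sketch puts steps S21 to S23 together for a single region, assuming a single-channel uint8 crop as input; SciPy's convolve2d and the standard Sobel kernels are implementation choices here, and the root-mean-square combination follows the formula above.

```python
import numpy as np
from scipy.signal import convolve2d

# Sketch of steps S21-S23 for one region, assuming a single-channel uint8 crop.
# The standard Sobel kernels are used; combining the two directions by a
# root mean square follows the description above.

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T

def target_convolution_matrix(region_gray: np.ndarray) -> np.ndarray:
    img = region_gray.astype(np.float64) / 255.0           # compress [0, 255] to [0, 1]
    sobel_h = convolve2d(img, SOBEL_X, mode="same")         # x-direction convolution
    sobel_v = convolve2d(img, SOBEL_Y, mode="same")         # y-direction convolution
    for m in (sobel_h, sobel_v):                            # zero out edge pixels
        m[0, :] = m[-1, :] = 0.0
        m[:, 0] = m[:, -1] = 0.0
    return np.sqrt((sobel_h ** 2 + sobel_v ** 2) / 2.0)     # element-wise root mean square
```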
In this embodiment, the image to be processed is processed according to the position of each text region to obtain a single-channel image corresponding to each of a plurality of text regions; Sobel operator convolution processing is then performed on the single-channel image corresponding to each text region to obtain Sobel operator convolution matrices of that single-channel image in multiple directions; root mean square calculation is performed on the Sobel operator convolution data in the multiple directions to obtain the target convolution matrix of the text region; finally, the pixels in the target convolution matrix are weighted and summed according to their pixel values and distribution intervals to determine the region information of the text region. This refines the quality detection of each text region, provides the specific steps for determining the region information of each text region, and provides a basis for subsequent calculation.
In an embodiment, as shown in fig. 4, in step S21, processing the image to be processed according to the position of each text region to obtain a single-channel image corresponding to a plurality of text regions, specifically includes the following steps:
s211: determining whether the image to be processed is a multi-channel image;
s212: if the image to be processed is a multi-channel image, converting the image to be processed into a single-channel image to be processed;
s213: and dividing the image to be processed of the single channel according to the position of each character area to obtain a single-channel image corresponding to each character area.
Since most of images are color images, i.e., multi-channel images, a color image is composed of R, G, B and other channels, each channel is a numerical matrix, and for the following algorithm processing, the multi-channel color image needs to be converted into a single-channel image.
Therefore, when the image to be processed is processed according to the position of each character area, it is necessary to determine whether the image to be processed is a multi-channel image, that is, whether the image corresponding to the character area is a multi-channel image, and if the image to be processed is a single-channel image, that is, the image corresponding to the character area is a single-channel image, it indicates that the image can be directly processed without conversion. If the image to be processed is determined to be a multi-channel image, the image to be processed needs to be converted into a single-channel image to be processed, so that the image to be processed can be processed in the following process.
After the image to be processed is converted into a single-channel image to be processed, dividing the single-channel image to be processed according to the position of each character area to obtain a single-channel image corresponding to each character area.
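A minimal sketch of steps S211 to S213 follows, assuming the located text regions are given as (x0, y0, x1, y1) boxes; the use of OpenCV's cvtColor for the single-channel conversion is an implementation choice, not required by the method.

```python
import cv2
import numpy as np

# Sketch of steps S211-S213: convert a multi-channel image to a single channel
# and crop one single-channel patch per located text region. Region boxes are
# assumed to be (x0, y0, x1, y1) tuples; OpenCV is used only for the colour
# conversion and is an implementation choice.

def single_channel_region_images(image: np.ndarray, region_boxes):
    if image.ndim == 3 and image.shape[2] > 1:              # multi-channel image?
        gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)       # convert to single channel
    else:
        gray = image
    return [gray[y0:y1, x0:x1] for (x0, y0, x1, y1) in region_boxes]
```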
In this embodiment, it is determined whether the image to be processed is a multi-channel image; if it is, the image to be processed is converted into a single-channel image, and the single-channel image is divided according to the positions of the character regions to obtain a single-channel image corresponding to each character region. This defines the specific process of processing the image to be processed according to the positions of the character regions to obtain the single-channel images corresponding to the character regions. Converting the multi-channel image into a single-channel image facilitates the subsequent image quality detection, increases its speed, and improves the efficiency of image quality detection.
In an embodiment, as shown in fig. 5, after determining the text region in the image to be processed and before step S20, that is, before performing sobel operator convolution processing on the text region in the image to be processed, the method specifically includes the following steps:
s11: and determining whether the number of the character areas in the image to be processed is less than a preset number.
After the text regions in the image to be processed are determined and before Sobel operator convolution processing is performed on them, it is necessary to determine whether the number of text regions in the image to be processed is less than a preset number. Because the character positioning model cannot locate character areas in a blurred image, if the image to be processed is blurred, few or almost no character areas will be located by the character positioning model. Therefore, to reduce the amount of subsequent calculation and the time cost of character-region processing, it is necessary to determine whether the number of character regions in the image to be processed is smaller than the preset number, so as to decide, according to the determination result, whether further quality detection needs to be performed on the image to be processed.
S12: and if the number of the character areas in the image to be processed is greater than or equal to the preset number, carrying out Sobel operator convolution processing on the character areas in the image to be processed.
After it is determined whether the number of character areas in the image to be processed is smaller than the preset number, if the number of character areas is larger than or equal to the preset number, this indicates that the image to be processed meets the set requirement on the number of character areas, character recognition can be performed on the image, and the image may be a clear image of qualified quality. At this time, Sobel operator convolution processing needs to be performed on the character areas in the image to be processed so as to determine the region information of each character area.
S13: and if the number of the character areas in the image to be processed is less than the preset number, determining that the quality detection result of the image to be processed is failed quality detection.
If the number of the character areas in the image to be processed is smaller than the preset number, the image to be processed does not meet the set requirement of the number of the character areas, character recognition of the image to be processed is difficult, the quality of the image to be processed is poor, the image to be processed is determined to be a fuzzy image, and the quality detection result of the image to be processed is determined to be quality detection failure. By the method, the extremely fuzzy picture can be filtered, the subsequent calculation steps are reduced, and the image quality detection speed is accelerated.
The preset number is determined according to the type of the image to be processed, that is, according to the service scene to which the image to be processed belongs; different service scenes have different required numbers of text regions, so the preset number takes different values in different scenes.
For example, if the image to be processed is an invoice image, all the separated character areas on the invoice image are located in an invoice OCR recognition scene. After each character area in the invoice is determined, if the number of character areas in the invoice image is less than 10 (that is, the preset number for the invoice OCR recognition scene is 10), it is determined that the invoice image does not meet the set requirement on the number of character areas and character recognition cannot be performed on it; the quality of the invoice image is poor, the invoice is determined to be a blurred invoice and therefore unavailable, and a new invoice image needs to be uploaded. If the image to be processed is an identity card image, all the separated character areas on the identity card image are located in an identity card OCR recognition scene; because an identity card contains few characters, the preset numbers for identity card images are small, and the preset number for the front of an identity card may be 4. After each character area in the identity card front image is determined, if the number of character areas in that image is less than 4, it is determined that the image does not meet the set requirement on the number of character areas and character recognition cannot be performed on it; the quality of the identity card front image is poor, the image is determined to be a blurred image and therefore unavailable, and a new identity card front image needs to be uploaded.
In this embodiment, the service scenes to which the to-be-processed image belongs are an invoice OCR recognition scene and an identity card OCR recognition scene, which are only exemplary illustrations, and in other embodiments, the service scenes to which the to-be-processed image belongs may also be other scenes, which are not described herein again.
In this embodiment, the preset number in the invoice OCR recognition scenario is 10, and the preset number in the identity card OCR recognition scenario is 4, which is only an exemplary illustration.
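A small sketch of this pre-filter follows, using the example thresholds from the scenes above; the scene names and the dictionary lookup are illustrative assumptions.

```python
# Sketch of the pre-filter in steps S11-S13. The per-scene minimum counts are
# the example values from the text (10 for invoices, 4 for the front of an
# identity card); the scene names are illustrative.

PRESET_REGION_COUNT = {"invoice": 10, "id_card_front": 4}

def passes_region_count_check(region_boxes, scene: str) -> bool:
    """Return False (quality detection fails) when too few text regions were
    located, which usually means the image is severely blurred."""
    return len(region_boxes) >= PRESET_REGION_COUNT.get(scene, 1)
```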
In this embodiment, before performing convolution processing on a text region in an image to be processed by using a sobel operator, determining whether the number of the text regions in the image to be processed is less than a preset number, and performing convolution processing on the text region in the image to be processed by using the sobel operator if the number of the text regions in the image to be processed is greater than or equal to the preset number; if the number of the character areas in the image to be processed is smaller than the preset number, the quality detection result of the image to be processed is determined to be that the quality detection is not passed, and the image quality of the image to be processed can be preliminarily judged by judging the number of the character areas, so that the subsequent calculation steps are reduced, and the image quality detection speed is increased.
In an embodiment, as shown in fig. 6, in step S40, determining a quality detection result of the image to be processed according to the area information of each text area and a preset quality condition specifically includes the following steps:
s41: and determining the quality of the image to be processed according to the area information of the character areas and the Euclidean distance between each character area and the light source.
The position of the light source can be determined according to the region information of each character region, then the Euclidean distance between each character region and the light source is determined, and then the target quality score of the image to be processed is determined according to the region information of the character region and the Euclidean distance between each character region and the light source, so that the quality of the image to be processed is represented by the target quality score.
For example, a text region with the maximum euclidean distance to the light source may be determined as a target region, that is, the text region farthest from the light source is the target region, and the quality score of the target region is used as the target quality score of the image to be processed; or selecting a plurality of character areas with the maximum Euclidean distance from the light source as a plurality of target character areas, and then taking the median of the quality scores of the plurality of target character areas or the average of the quality scores of the plurality of target character areas as the target quality score of the image to be processed.
It can be understood that the farther the Euclidean distance between a text region and the light source, the lower the quality score of that text region. Determining the target quality score of the image to be processed according to the Euclidean distance between each text region and the light source reduces the possibility that the quality scores of some text regions are calculated incorrectly, which reduces the error and improves the accuracy of the target quality score, that is, the accuracy of the quality of the image to be processed.
In other embodiments, the median in the region information of all the text regions is also used as a target quality score of the image to be processed to represent the quality of the image to be processed. After the quality scores of all the character areas are determined, the median of the quality scores of all the character areas can be used as the target quality score of the image to be processed, the representative median of the quality scores is adopted, the quality requirement of the image to be processed can be relaxed, the processing speed is improved, and the time cost is reduced.
S42: and determining the quality detection result of the image to be processed according to the quality of the image to be processed and a preset quality condition.
After the quality of the image to be processed is determined according to the area information of the character areas and the Euclidean distance between each character area and the light source, a preset quality condition needs to be obtained, whether the quality of the image to be processed meets the preset quality condition or not is determined, if yes, the image to be processed is determined to pass quality detection, namely the quality detection result is quality detection passing; if not, determining that the image to be processed does not pass the quality detection, namely determining that the quality detection result is the quality detection failure.
In the embodiment, the median in the region information of all the character regions is used as the target quality score of the image to be processed; or, the quality of the image to be processed is determined according to the area information of the character areas and the Euclidean distance between each character area and the light source, and then the quality detection result of the image to be processed is determined according to the quality of the image to be processed and the preset quality condition, so that the step of determining the quality detection result of the image to be processed according to the area information of each character area and the preset quality condition is defined, the accuracy of the quality of the image to be processed is improved, and the accuracy of the quality detection result is improved.
In an embodiment, as shown in fig. 7, in step S41, that is, determining the quality of the image to be processed according to the area information of the text areas and the euclidean distances between the text areas and the light sources, the method specifically includes the following steps:
s411: and determining the position of the light source according to the area information of each character area.
In the present embodiment, the description takes the quality score as the example of region information. After the quality score of each character region is obtained, a light source matrix formed by all the character regions is constructed based on the quality score of each character region so as to determine the position of the light source. This step can be carried out with relevant software.
S412: and determining the Euclidean distance between each character area and the light source according to the position of each character area and the position of the light source.
After the position of the light source is determined according to the area information of each character area, the Euclidean distance between each character area and the light source is determined according to the position of each character area and the position of the light source.
S413: and taking the character area with the Euclidean distance from the light source smaller than the preset distance as a target character area.
After the Euclidean distance between each character area and the light source is determined, the character areas whose Euclidean distance from the light source is smaller than the preset distance are taken as target character areas, so that a plurality of target character areas are obtained.
S414: and determining the quality of the image to be processed according to the area information of the target character areas.
After the plurality of target character areas are obtained, the median of the quality scores of the plurality of target character areas may be used as the target quality score of the image to be processed, which lowers the target quality score of the image to be processed, and the quality detection result of the image to be processed is then determined. Alternatively, the maximum value of the quality scores of the target character areas may be used as the target quality score of the image to be processed, thereby raising the quality detection threshold.
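The sketch below strings steps S411 to S414 together. The description above does not spell out how the light-source position is computed from the per-region scores, so the sketch simply takes the centre of the highest-scoring region as the light source and the median of the target regions' scores as the image-level quality; both are stated assumptions, not the patented construction.

```python
import numpy as np

# Sketch of steps S411-S414. The light-source position is approximated here by
# the centre of the highest-scoring region -- an assumption, not the patented
# construction. Regions are (x0, y0, x1, y1) boxes, one quality score each.

def image_quality_from_regions(region_boxes, region_scores, preset_distance):
    centres = np.array([((x0 + x1) / 2.0, (y0 + y1) / 2.0)
                        for (x0, y0, x1, y1) in region_boxes])
    light_source = centres[int(np.argmax(region_scores))]       # assumed light-source position
    distances = np.linalg.norm(centres - light_source, axis=1)  # Euclidean distances
    target = [s for s, d in zip(region_scores, distances) if d < preset_distance]
    # Median of the target regions' scores as the image-level quality score.
    return float(np.median(target)) if target else float(np.median(region_scores))
```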
In this embodiment, the position of the light source is determined by determining the region information of each text region, then the euclidean distance between each text region and the light source is determined according to the position of each text region and the position of the light source, then the text region with the euclidean distance from the light source smaller than the preset distance is used as the target text region, and finally the quality of the image to be processed is determined according to the region information of a plurality of target text regions.
In an embodiment, as shown in fig. 8, in step S42, that is, determining the quality detection result of the image to be processed according to the quality of the image to be processed and a preset quality condition, the method specifically includes the following steps:
s421: and determining a preset quality condition according to the type of the image to be processed.
After the image to be processed is acquired, a preset quality condition for evaluating the quality of the image to be processed needs to be determined according to the type of the image to be processed. In the embodiment, the preset quality condition is represented by the quality scoring threshold, so that the method is visual and convenient, the information processing complexity in the quality detection process is reduced, and the quality detection speed is further improved.
Different types of images have different quality score thresholds. For example, if the image to be processed is an invoice image, the quality score threshold of the image to be processed is a quality score threshold corresponding to the invoice image; and if the image to be processed is the identity card image, the quality scoring threshold value of the image to be processed is the quality scoring threshold value corresponding to the identity card image.
The quality score threshold is a predetermined score threshold, and may be an empirical value set manually or determined according to the quality score of the same type of image.
S422: and determining whether the quality of the image to be processed meets a preset quality condition.
After determining the quality of the image to be processed, it is necessary to determine whether the quality of the image to be processed satisfies a preset quality condition. And when the target quality score represents the quality of the processed image and represents the preset quality condition by using the quality score threshold, determining whether the target quality score of the image to be processed is greater than the quality score threshold, and determining whether the image to be processed meets the preset quality condition according to the comparison condition of the target quality score and the quality score threshold.
S423: and if the quality of the processed image meets the preset quality condition, determining that the quality detection result of the image to be processed is quality detection passing.
After it is determined whether the target quality score of the image to be processed is greater than the quality score threshold, if the target quality score is greater than the quality score threshold, the target quality score of the image to be processed is high and the quality of the image to be processed is good; the image to be processed is determined to be a clear, usable image that meets the preset quality condition, and the quality detection result of the image to be processed is determined to be passing quality detection.
S424: And if the quality of the image to be processed does not meet the preset quality condition, determining that the quality detection result of the image to be processed is failed quality detection.
After it is determined whether the target quality score of the image to be processed is greater than the quality score threshold, if the target quality score is less than or equal to the quality score threshold, the target quality score of the image to be processed is low and the quality of the image to be processed is poor; the image to be processed is determined to be a blurred, unusable image that does not meet the preset quality condition, the image does not pass quality detection, and a new image needs to be uploaded again.
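A compact sketch of steps S421 to S424 follows, assuming the per-type quality-score thresholds have already been determined (for example by the procedure described further below); the dictionary keys and placeholder values are illustrative.

```python
# Sketch of steps S421-S424: look up a quality-score threshold by image type
# and compare it with the image-level quality score. The threshold values are
# placeholders; in practice they are empirical or derived from labelled data.

QUALITY_SCORE_THRESHOLDS = {"invoice": 0.0, "id_card": 0.0}  # placeholder values

def quality_detection_result(target_score: float, image_type: str) -> str:
    threshold = QUALITY_SCORE_THRESHOLDS[image_type]
    return "pass" if target_score > threshold else "fail"
```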
In this embodiment, a preset quality condition is determined according to the type of the image to be processed, and it is determined whether the quality of the image to be processed meets the preset quality condition; if it does, the quality detection result of the image to be processed is determined to be passing quality detection, and if it does not, the quality detection result is determined to be failed quality detection. This defines the specific process of determining the quality detection result of the image to be processed according to the quality of the image to be processed and the preset quality condition. Different preset quality conditions are set for different types of images, and whether the image quality meets the requirement is determined according to the corresponding preset quality condition, which further improves the accuracy of the quality detection result of the image.
In an embodiment, as shown in fig. 9, before determining the preset quality condition of the image to be processed according to the type of the image to be processed, quality score thresholds of different types of images to be processed need to be determined to characterize the preset quality condition by the quality score thresholds, where the quality score thresholds of the images to be processed are determined by:
S01: acquiring a plurality of historical images with different quality labels.
A plurality of historical images of the same type are acquired and manually labeled with quality labels, so that a plurality of historical images with different quality labels are obtained. The type of the historical images is the same as that of the image to be processed, and the quality labels of the historical images include clear image and blurred image.
S02: and determining the target quality score of each historical image according to the quality score of the character region in each historical image.
After the plurality of historical images with different quality labels are acquired, the quality score of the character region in each historical image is determined according to the processing procedure described above, and the target quality score of each historical image is then determined.
S03: and determining the image quality of each historical image under a plurality of different preset values according to the target quality score of each historical image and the preset values.
After the target quality scores of the historical images are obtained, a plurality of preset values are set, and the target quality score of each historical image is compared with the preset values one by one to determine the image quality of each historical image under the different preset values. The image quality includes clear and blurred.
For example, a preset value is selected from the plurality of preset values, and the target quality score of each historical image is compared with that preset value. If the target quality score of a historical image is smaller than or equal to the preset value, the quality of that historical image is determined to be blurred; if the target quality score of a historical image is larger than the preset value, the quality of that historical image is determined to be clear. The preset values are then selected in turn and the process is repeated, so that the image quality of each historical image under each preset value is determined.
Wherein, the plurality of different preset values may be a series of consecutive values determined according to the number of historical images. For example, if the number of historical images is 500, the preset value is increased from 0 to 500 in steps of 0.1, that is, 0, 0.1, 0.2, 0.3, …, 499.8, 499.9, 500, so that 5001 values in total are taken as the preset values.
In this embodiment, the value of the preset value is only an exemplary illustration, and in other embodiments, the preset value may also be other continuous values.
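As a minimal sketch of step S03 under the example above, the following Python snippet enumerates the preset values and labels each historical image as clear or blurred under one preset value; the variable and function names are assumptions made for illustration:

```python
import numpy as np

# Preset values 0, 0.1, ..., 500 (5001 values in total), as in the example above.
preset_values = np.linspace(0.0, 500.0, 5001)

def image_quality_under_preset(target_scores, preset_value):
    """Label each historical image as "clear" or "blurred" under one preset value."""
    return ["clear" if score > preset_value else "blurred"
            for score in target_scores]
```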
S04: and determining a quality score threshold from among the plurality of different preset values according to the matching between the image quality of each historical image under the different preset values and its quality label.
After the image quality of each historical image under each preset value is determined, the quality score threshold is determined from among the plurality of different preset values according to how well the image quality of each historical image under each preset value matches its quality label.
Specifically, for each preset value, the image quality of each historical image is compared with its quality label, and the number of historical images whose image quality matches the quality label is counted. After the matching number under each preset value is obtained, the preset value with the largest matching number is selected as the quality score threshold.
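Step S04 can then be sketched as an exhaustive search over the preset values. The function below counts label matches per preset value and returns the value with the largest count; the names are illustrative only:

```python
def select_quality_score_threshold(target_scores, quality_labels, preset_values):
    """Return the preset value whose clear/blurred split best matches the labels."""
    best_value, best_matches = None, -1
    for preset in preset_values:
        predicted = ["clear" if score > preset else "blurred"
                     for score in target_scores]
        matches = sum(p == label for p, label in zip(predicted, quality_labels))
        if matches > best_matches:
            best_value, best_matches = preset, matches
    return best_value
```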
In this embodiment, a plurality of historical images with different quality labels are acquired, where the quality labels include clear image and blurred image and the type of the historical images is the same as that of the image to be processed. The target quality score of each historical image is then determined according to the quality score of the character region in that historical image, and the image quality of each historical image under a plurality of different preset values is determined according to the target quality score of each historical image and the preset values, where the image quality includes clear and blurred. Finally, the quality score threshold is determined from among the plurality of different preset values according to how well the image quality of each historical image under the different preset values matches its quality label. This specifies the process of determining the quality score threshold of the image to be processed: the quality score of each historical image is determined by the image quality detection method itself, and the quality score threshold is then selected from the plurality of preset values according to the matching between the image quality and the quality labels, which improves the accuracy of the quality score threshold and of the image quality subsequently determined from it.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
In one embodiment, an image quality detection apparatus is provided, which corresponds to the image quality detection method in the above embodiments one to one. As shown in fig. 10, the image quality detection apparatus includes a positioning module 101, a processing module 102, a calculation module 103, and a determination module 104. The functional modules are explained in detail as follows:
the positioning module 101 is configured to acquire an image to be processed, and position characters in the image to be processed by using a character positioning model to determine a character area in the image to be processed;
the processing module 102 is configured to perform sobel operator convolution processing on a text region in the image to be processed to obtain a target convolution matrix of the text region;
a calculating module 103, configured to obtain area information of the text area according to a pixel value of a pixel in the target convolution matrix and a distribution interval of the pixel value;
a determining module 104, configured to determine a quality detection result of the image to be processed according to the area information of each text area and a preset quality condition.
Further, the processing module 102 is specifically configured to:
processing the image to be processed according to the position of each character area to obtain a single-channel image corresponding to a plurality of character areas;
carrying out Sobel operator convolution processing on the single-channel image corresponding to each character area to obtain a Sobel operator convolution matrix of the single-channel image corresponding to the character area in multiple directions;
and carrying out root mean square calculation on the Sobel operator convolution data in the multiple directions to obtain a target convolution matrix of the character area.
Further, the processing module 102 is further specifically configured to:
determining whether the image to be processed is a multi-channel image;
if the image to be processed is the multi-channel image, converting the image to be processed into a single-channel image to be processed;
and dividing the image to be processed of the single channel according to the position of each character area to obtain a single channel image corresponding to each character area.
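As a rough illustration of the processing attributed to module 102, the following Python sketch uses OpenCV and NumPy, assumes horizontal and vertical Sobel directions and a bounding box (x, y, w, h) per character region, and combines the two directional convolutions by root mean square; it is a sketch under these assumptions, not the claimed implementation:

```python
import cv2
import numpy as np

def target_convolution_matrix(image: np.ndarray, box: tuple) -> np.ndarray:
    """Crop one character region, convert it to single channel if needed,
    apply Sobel convolution in two directions, and combine them by root mean square."""
    x, y, w, h = box
    region = image[y:y + h, x:x + w]
    if region.ndim == 3:                                   # multi-channel image
        region = cv2.cvtColor(region, cv2.COLOR_BGR2GRAY)  # single-channel image
    gx = cv2.Sobel(region, cv2.CV_64F, 1, 0, ksize=3)      # horizontal direction
    gy = cv2.Sobel(region, cv2.CV_64F, 0, 1, ksize=3)      # vertical direction
    return np.sqrt((gx ** 2 + gy ** 2) / 2.0)              # root mean square of the two
```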
Further, the determining module 104 is specifically configured to:
determining the quality of the image to be processed according to the area information of the character areas and the Euclidean distance between each character area and a light source;
and determining the quality detection result of the image to be processed according to the quality of the image to be processed and the preset quality condition.
Further, the determining module 104 is specifically further configured to:
determining the position of a light source according to the area information of each character area;
determining Euclidean distances between the character areas and the light source according to the positions of the character areas and the position of the light source;
taking the character area with the Euclidean distance from the light source smaller than a preset distance as a target character area;
and determining the quality of the image to be processed according to the area information of the target character areas.
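A minimal sketch of the light-source-distance filtering attributed to module 104 follows. It assumes that each character region is represented by its centre coordinates and that the light source is located at the brightest region; the brightness criterion and all names are assumptions, since the description only states that the light source position is derived from the region information:

```python
import numpy as np

def target_character_regions(region_centers, region_brightness, preset_distance):
    """Keep character regions whose Euclidean distance to the light source
    is smaller than the preset distance.

    region_centers: list of (x, y) centre coordinates of the character regions.
    region_brightness: per-region brightness used here to locate the light source.
    """
    light_source = region_centers[int(np.argmax(region_brightness))]
    targets = []
    for center in region_centers:
        distance = np.hypot(center[0] - light_source[0],
                            center[1] - light_source[1])
        if distance < preset_distance:
            targets.append(center)
    return targets
```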
Further, the determining module 104 is specifically further configured to:
determining the preset quality condition according to the type of the image to be processed;
determining whether the quality of the image to be processed meets the preset quality condition;
if the quality of the image to be processed meets the preset quality condition, determining that the quality detection result of the image to be processed is quality detection passing;
and if the quality of the image to be processed does not meet the preset quality condition, determining that the quality detection result of the image to be processed is failed quality detection.
Further, before the text region in the image to be processed is processed by using the sobel operator, the determining module 104 is specifically configured to:
determining whether the number of character areas in the image to be processed is smaller than a preset number;
if the number of the character areas in the image to be processed is larger than or equal to the preset number, processing the character areas in the image to be processed by using a Sobel operator;
and if the number of the character areas in the image to be processed is smaller than the preset number, determining that the quality detection result of the image to be processed is failed quality detection.
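The preliminary region-count check can be sketched as follows; the return values are illustrative only:

```python
def precheck_region_count(num_character_regions: int, preset_number: int):
    """Check the number of character regions before Sobel operator processing."""
    if num_character_regions < preset_number:
        return "fail"   # too few character regions: quality detection fails directly
    return None         # enough regions: continue with Sobel operator processing
```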
For specific limitations of the image quality detection apparatus, reference may be made to the above limitations of the image quality detection method, which are not described herein again. The modules in the image quality detection device can be wholly or partially realized by software, hardware and a combination thereof. The modules can be embedded in a hardware form or independent from a processor in the computer device, and can also be stored in a memory in the computer device in a software form, so that the processor can call and execute operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 11. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer equipment is used for storing data such as images to be processed, character areas, quality scores, quality score thresholds and the like. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement an image quality detection method.
In one embodiment, a computer device is provided, comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
acquiring an image to be processed, and positioning characters in the image to be processed by adopting a character positioning model so as to determine a character area in the image to be processed;
carrying out Sobel operator convolution processing on the character area in the image to be processed to obtain a target convolution matrix of the character area;
acquiring the area information of the character area according to the pixel value of the pixel in the target convolution matrix and the distribution interval of the pixel value;
and determining the quality detection result of the image to be processed according to the area information of each character area and a preset quality condition.
In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:
acquiring an image to be processed, and positioning characters in the image to be processed by adopting a character positioning model so as to determine a character area in the image to be processed;
carrying out Sobel operator convolution processing on the character area in the image to be processed to obtain a target convolution matrix of the character area;
acquiring the area information of the character area according to the pixel value of the pixel in the target convolution matrix and the distribution interval of the pixel value;
and determining the quality detection result of the image to be processed according to the area information of each character area and a preset quality condition.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing related hardware; the computer program can be stored in a non-volatile computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), direct bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM).
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.