Method and device for optimizing an image to be processed, storage medium, and electronic device
1. A method for optimizing an image to be processed, the image to be processed including at least one human face, the method comprising:
extracting a reference face image corresponding to each face from the image to be processed, and determining the size of each reference face image;
obtaining a plurality of image optimization models, and determining the size of an input image corresponding to each image optimization model;
matching a corresponding image optimization model for each reference face image according to the size of the reference face image and the sizes of the input images;
and optimizing each reference face image by using the image optimization model to obtain a target face image, and obtaining an optimized image corresponding to the image to be processed based on the target face image.
2. The method of claim 1, wherein matching an image optimization model for each of the reference face images based on the size of the reference face image and the size of the input image comprises:
performing the following operations for each of the reference face images:
acquiring the mapping relation between the size of the reference face image and the size of each input image;
and matching an image optimization model for the reference face image according to the mapping relation.
The mapping relation comprises that a difference between the size of the reference face image and the size of the input image is less than or equal to a preset threshold; or
The mapping relation comprises that a ratio of the size of the reference face image to the size of the input image falls within a preset range.
3. The method according to claim 1, wherein the optimizing each of the reference face images by using the image optimization model to obtain a target face image comprises:
performing the following operations for each of the reference face images:
comparing the size of the reference face image with the size of the input image of the corresponding image optimization model;
if the size of the reference face image is smaller than that of the input image, cropping, from the image to be processed, an initial face image with the same size as the input image, wherein the initial face image comprises the reference face image;
and inputting the initial face image into the image optimization model to obtain a target face image.
4. The method according to claim 3, wherein the optimizing each of the reference face images by using the image optimization model to obtain a target face image comprises:
if the size of the reference face image is larger than that of the input image, downsampling the reference face image to obtain an initial face image with the same size as the input image;
and inputting the initial face image into the image optimization model for optimization, and performing up-sampling on the optimized initial face image to obtain a target face image.
5. The method of claim 4, wherein the upsampling the optimized initial face image to obtain a target face image comprises:
obtaining the loss amount incurred between the initial face image and the reference face image during downsampling;
and performing up-sampling on the optimized initial face image based on the loss amount to obtain a target face image.
6. The method according to claim 1, wherein the obtaining an optimized image corresponding to the image to be processed based on the target face image comprises:
replacing the area corresponding to the reference face image in the image to be processed with the target face image to obtain a replaced image;
and smoothing the replaced image to obtain the optimized image.
7. The method of claim 1, further comprising:
determining a minimum allowable size according to the minimum value of the input image sizes;
and if the size of a reference face image is smaller than the minimum allowable size, discarding that reference face image.
8. The method of claim 1, further comprising:
determining a maximum allowable size according to the maximum value of the sizes of the input images;
if the size of a reference face image is larger than the maximum allowable size, dividing that reference face image into a plurality of new reference face images;
and determining the optimization model corresponding to each new reference face image according to the size of the new reference face image and the input image sizes of the plurality of image optimization models.
9. The method of claim 1, further comprising:
for each of the reference face images, performing the following operations:
determining the maximum limit size of an image optimization model corresponding to the reference face image;
if the size of the reference face image is larger than the maximum limit size, dividing that reference face image into a plurality of new reference face images;
and determining the image optimization model corresponding to each new reference face image according to the size of the new reference face image and the input image sizes of the plurality of image optimization models.
10. An apparatus for optimizing an image to be processed, the image to be processed including at least one human face, the apparatus comprising:
the image acquisition module is used for extracting a reference face image corresponding to each face in the image to be processed and determining the size of each reference face image;
the model acquisition module is used for acquiring a plurality of image optimization models and determining the input image size of each image optimization model;
the model matching module is used for matching a corresponding image optimization model for each reference face image according to the size of the reference face image and the sizes of the input images;
and the image optimization module is used for optimizing each reference face image by using the matched image optimization model to obtain a target face image, and obtaining an optimized image corresponding to the image to be processed based on the target face image.
11. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the method for optimizing an image to be processed according to any one of claims 1 to 9.
12. An electronic device, comprising:
a processor; and
a memory for storing one or more programs that, when executed by the processor, cause the processor to implement the method for optimizing an image to be processed according to any one of claims 1 to 9.
Background
When a picture is taken, if a face appears in the picture, the face draws the most attention in the whole shot. Flaws in the face region are magnified relative to the background region, and a blurred face in particular is hard for users to accept.
In the related art, deep-learning-based deblurring methods have completely surpassed traditional deblurring algorithms in effect. However, they involve a large amount of computation, long algorithm latency, and high power consumption.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
An object of the present disclosure is to provide a method for optimizing an image to be processed, an apparatus for optimizing an image to be processed, a computer-readable medium, and an electronic device, which overcome, at least to some extent, the problems of long algorithm latency and high power consumption caused by the large computational load of deep-learning-based deblurring methods.
According to a first aspect of the present disclosure, there is provided a method for optimizing an image to be processed, the image to be processed including at least one human face, the method including:
extracting a reference face image corresponding to each face from the image to be processed, and determining the size of each reference face image;
obtaining a plurality of image optimization models, and determining the size of an input image corresponding to each image optimization model;
matching a corresponding image optimization model for each reference face image according to the size of the reference face image and the sizes of the input images;
and optimizing each reference face image by using the image optimization model to obtain a target face image, and obtaining an optimized image corresponding to the image to be processed based on the target face image.
According to a second aspect of the present disclosure, there is provided an apparatus for optimizing an image to be processed, comprising:
the image acquisition module is used for extracting a reference face image corresponding to each face in the image to be processed and determining the size of each reference face image;
the model acquisition module is used for acquiring a plurality of image optimization models and determining the input image size of each image optimization model;
the model matching module is used for matching a corresponding image optimization model for each reference face image according to the size of the reference face image and the sizes of the input images;
and the image optimization module is used for optimizing each reference face image by using the matched image optimization model to obtain a target face image, and obtaining an optimized image corresponding to the image to be processed based on the target face image.
According to a third aspect of the present disclosure, a computer-readable medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, is adapted to carry out the above-mentioned method.
According to a fourth aspect of the present disclosure, there is provided an electronic apparatus, comprising:
a processor; and
a memory for storing one or more programs that, when executed by the processor, cause the processor to implement the above-described method.
The method for optimizing an image to be processed provided by the embodiments of the present disclosure extracts a reference face image corresponding to each face from the image to be processed and determines the size of each reference face image; acquires a plurality of image optimization models and determines the input image size corresponding to each image optimization model; matches a corresponding image optimization model for each reference face image according to the size of the reference face image and the sizes of the input images; and optimizes each reference face image by using the matched image optimization model to obtain a target face image, and obtains an optimized image corresponding to the image to be processed based on the target face image. Compared with the prior art, on the one hand, extracted reference face images of different sizes are processed with different image optimization models, which can improve the accuracy of image optimization; on the other hand, the plurality of image optimization models can optimize a plurality of reference face images simultaneously, which increases the speed of image optimization.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. It is to be understood that the drawings in the following description are merely exemplary of the disclosure, and that other drawings may be derived from those drawings by one of ordinary skill in the art without the exercise of inventive faculty. In the drawings:
FIG. 1 illustrates a schematic diagram of an exemplary system architecture to which embodiments of the present disclosure may be applied;
FIG. 2 shows a schematic diagram of an electronic device to which embodiments of the present disclosure may be applied;
FIG. 3 schematically illustrates a method of optimizing an image to be processed in the related art;
FIG. 4 schematically illustrates a flow chart of a method for optimizing an image to be processed in an exemplary embodiment of the present disclosure;
FIG. 5 is a schematic diagram illustrating a method for obtaining a target face image using an initial face image and a loss amount according to an exemplary embodiment of the present disclosure;
FIG. 6 schematically illustrates a flow chart for deblurring an image to be processed in an exemplary embodiment of the present disclosure;
FIG. 7 schematically shows a composition diagram of an apparatus for optimizing an image to be processed in an exemplary embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus their repetitive description will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
Fig. 1 is a schematic diagram illustrating a system architecture of an exemplary application environment to which the method and apparatus for optimizing a to-be-processed image according to the embodiment of the present disclosure may be applied.
As shown in fig. 1, the system architecture 100 may include one or more of terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few. The terminal devices 101, 102, 103 may be various electronic devices having an image processing function, including but not limited to desktop computers, portable computers, smart phones, tablet computers, and the like. It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. For example, server 105 may be a server cluster comprised of multiple servers, or the like.
The optimization method for the to-be-processed image provided by the embodiment of the present disclosure is generally executed by the terminal devices 101, 102, and 103, and accordingly, the optimization device for the to-be-processed image is generally disposed in the terminal devices 101, 102, and 103. However, it is easily understood by those skilled in the art that the optimization method for the to-be-processed image provided in the embodiment of the present disclosure may also be executed by the server 105, and accordingly, the optimization device for the to-be-processed image may also be disposed in the server 105, which is not particularly limited in the exemplary embodiment. For example, in an exemplary embodiment, a user may extract a reference face image corresponding to each face from an image to be processed through the terminal devices 101, 102, and 103, and then upload the reference face image to the server 105, and after the server optimizes the image to be processed through the method for optimizing the image to be processed provided by the embodiment of the present disclosure, the server transmits the optimized image to the terminal devices 101, 102, and 103, and so on.
The exemplary embodiment of the present disclosure provides an electronic device for implementing an optimization method of an image to be processed, which may be the terminal device 101, 102, 103 or the server 105 in fig. 1. The electronic device comprises at least a processor and a memory for storing executable instructions of the processor, the processor being configured to perform the method of optimization of the image to be processed via execution of the executable instructions.
The following takes the mobile terminal 200 in FIG. 2 as an example to describe the configuration of the electronic device. It will be appreciated by those skilled in the art that the configuration in FIG. 2 can also be applied to stationary devices, except for components specifically intended for mobile use. In other embodiments, the mobile terminal 200 may include more or fewer components than shown, combine some components, split some components, or arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware. The interfacing relationship between the components is only schematically illustrated and does not constitute a structural limitation of the mobile terminal 200. In other embodiments, the mobile terminal 200 may also adopt an interfacing manner different from that shown in FIG. 2, or a combination of multiple interfacing manners.
As shown in fig. 2, the mobile terminal 200 may specifically include: a processor 210, an internal memory 221, an external memory interface 222, a Universal Serial Bus (USB) interface 230, a charging management module 240, a power management module 241, a battery 242, an antenna 1, an antenna 2, a mobile communication module 250, a wireless communication module 260, an audio module 270, a speaker 271, a receiver 272, a microphone 273, an earphone interface 274, a sensor module 280, a display screen 290, a camera module 291, an indicator 292, a motor 293, a key 294, and a Subscriber Identity Module (SIM) card interface 295. The sensor module 280 may include a depth sensor 2801, a pressure sensor 2802, a gyroscope sensor 2803, and the like.
Processor 210 may include one or more processing units, such as: the Processor 210 may include an Application Processor (AP), a modem Processor, a Graphics Processing Unit (GPU), an Image Signal Processor (ISP), a controller, a video codec, a Digital Signal Processor (DSP), a baseband Processor, and/or a Neural-Network Processing Unit (NPU), and the like. The different processing units may be separate devices or may be integrated into one or more processors.
The NPU is a Neural-Network (NN) computing processor, which processes input information quickly by using a biological Neural Network structure, for example, by using a transfer mode between neurons of a human brain, and can also learn by itself continuously. The NPU can implement applications such as intelligent recognition of the mobile terminal 200, for example: image recognition, face recognition, speech recognition, text understanding, and the like.
A memory is provided in the processor 210. The memory may store instructions for implementing six modular functions: detection instructions, connection instructions, information management instructions, analysis instructions, data transmission instructions, and notification instructions, and execution is controlled by processor 210.
The charge management module 240 is configured to receive a charging input from a charger. The power management module 241 is used for connecting the battery 242, the charging management module 240 and the processor 210. The power management module 241 receives the input of the battery 242 and/or the charging management module 240, and supplies power to the processor 210, the internal memory 221, the display screen 290, the camera module 291, the wireless communication module 260, and the like.
The wireless communication function of the mobile terminal 200 may be implemented by the antenna 1, the antenna 2, the mobile communication module 250, the wireless communication module 260, a modem processor, a baseband processor, and the like. Wherein, the antenna 1 and the antenna 2 are used for transmitting and receiving electromagnetic wave signals; the mobile communication module 250 may provide a solution including wireless communication of 2G/3G/4G/5G, etc. applied to the mobile terminal 200; the modem processor may include a modulator and a demodulator; the Wireless communication module 260 may provide a solution for Wireless communication including a Wireless Local Area Network (WLAN) (e.g., a Wireless Fidelity (Wi-Fi) network), Bluetooth (BT), and the like, applied to the mobile terminal 200. In some embodiments, antenna 1 of the mobile terminal 200 is coupled to the mobile communication module 250 and antenna 2 is coupled to the wireless communication module 260, such that the mobile terminal 200 may communicate with networks and other devices via wireless communication techniques.
The mobile terminal 200 implements a display function through the GPU, the display screen 290, the application processor, and the like. The GPU is a microprocessor for image processing, and is connected to the display screen 290 and an application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. Processor 210 may include one or more GPUs that execute program instructions to generate or alter display information.
The mobile terminal 200 may implement a photographing function through the ISP, the camera module 291, the video codec, the GPU, the display screen 290, the application processor, and the like. The ISP is used for processing data fed back by the camera module 291; the camera module 291 is used for capturing still images or videos; the digital signal processor is used for processing digital signals, and can process other digital signals besides digital image signals; the video codec is used to compress or decompress digital video, and the mobile terminal 200 may also support one or more video codecs.
The external memory interface 222 may be used to connect an external memory card, such as a Micro SD card, to extend the memory capability of the mobile terminal 200. The external memory card communicates with the processor 210 through the external memory interface 222 to implement a data storage function. For example, files such as music, video, etc. are saved in an external memory card.
Internal memory 221 may be used to store computer-executable program code, which includes instructions. The internal memory 221 may include a program storage area and a data storage area. The storage program area may store an operating system, an application program (such as a sound playing function, an image playing function, etc.) required by at least one function, and the like. The storage data area may store data (e.g., audio data, a phonebook, etc.) created during use of the mobile terminal 200, and the like. In addition, the internal memory 221 may include a high-speed random access memory, and may further include a nonvolatile memory, such as at least one magnetic disk Storage device, a Flash memory device, a Universal Flash Storage (UFS), and the like. The processor 210 executes various functional applications of the mobile terminal 200 and data processing by executing instructions stored in the internal memory 221 and/or instructions stored in a memory provided in the processor.
The mobile terminal 200 may implement an audio function through the audio module 270, the speaker 271, the receiver 272, the microphone 273, the earphone interface 274, the application processor, and the like. Such as music playing, recording, etc.
The depth sensor 2801 is used to acquire depth information of a scene. In some embodiments, a depth sensor may be provided to the camera module 291.
The pressure sensor 2802 is used to sense a pressure signal and convert the pressure signal into an electrical signal. In some embodiments, the pressure sensor 2802 may be disposed on the display screen 290. Pressure sensor 2802 can be of a wide variety, such as a resistive pressure sensor, an inductive pressure sensor, a capacitive pressure sensor, and the like.
The gyro sensor 2803 may be used to determine a motion gesture of the mobile terminal 200. In some embodiments, the angular velocity of the mobile terminal 200 about three axes (i.e., x, y, and z axes) may be determined by the gyroscope sensor 2803. The gyro sensor 2803 can be used to photograph anti-shake, navigation, body-feel game scenes, and the like.
In addition, other functional sensors, such as an air pressure sensor, a magnetic sensor, an acceleration sensor, a distance sensor, a proximity light sensor, a fingerprint sensor, a temperature sensor, a touch sensor, an ambient light sensor, a bone conduction sensor, etc., may be provided in the sensor module 280 according to actual needs.
Other devices for providing auxiliary functions may also be included in mobile terminal 200. For example, the keys 294 include a power-on key, a volume key, and the like, and a user can generate key signal inputs related to user settings and function control of the mobile terminal 200 through key inputs. Further examples include indicator 292, motor 293, SIM card interface 295, etc.
In the related art, face blur caused by motion is the most common, followed by face defocus blur caused by focusing problems, or defocus blur that occurs in multi-person scenes when front and back faces exceed the depth-of-field limit of the camera. To address the high power consumption and large computational load of optimizing blurred images, the related art usually miniaturizes the model so that it runs faster; or, when processing pictures of 2K, 4K, or even higher resolution, cuts the picture into blocks whose size matches the input size of the AI model, in view of memory and bandwidth limitations.
Referring to fig. 3, when performing image optimization, a face detection frame 301 may be determined first, a face expansion frame 302 may then be determined according to the face detection frame 301, a face image 303 to be processed is then determined in the image to be processed and the image to be processed is partitioned into blocks, and finally the face image 303 is processed by the model. The face blocking scheme in the related art improves the execution efficiency of the AI deblurring algorithm to a certain extent; however, when a picture contains multiple faces whose positions are scattered, the blocking method becomes inefficient and its processing time may even approach that of blocking the full picture, so that blocking based on face regions loses its benefit.
The following describes a method and an apparatus for optimizing an image to be processed according to exemplary embodiments of the present disclosure.
Fig. 4 illustrates a flow of a method for optimizing an image to be processed in the present exemplary embodiment, and the present disclosure provides a method for optimizing an image to be processed, where the image to be processed includes at least one human face, and the method for optimizing an image to be processed may include the following steps:
step S410, extracting reference face images corresponding to the faces from the image to be processed, and determining the size of each reference face image;
step S420, obtaining a plurality of image optimization models, and determining the size of an input image corresponding to each image optimization model;
step S430, matching a corresponding image optimization model for each reference face image according to the size of the reference face image and the sizes of the input images;
step S440, performing optimization processing on each reference face image by using the image optimization model to obtain a target face image, and obtaining an optimized image corresponding to the image to be processed based on the target face image.
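The flow of steps S410 to S440 can be sketched in miniature. The sketch below is illustrative only: `FaceImage`, `OptimizationModel`, and the closest-size matching rule are hypothetical stand-ins, since the disclosure leaves the concrete face extractor, the model internals, and the exact matching rule open (claim 2 describes matching only via a mapping relation).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FaceImage:
    """A reference face image; only its side length matters for matching."""
    size: int
    optimized: bool = False


@dataclass
class OptimizationModel:
    """Hypothetical stand-in for a trained image optimization model."""
    input_size: int  # side length of the square input the model accepts

    def optimize(self, face: FaceImage) -> FaceImage:
        # A real model would run inference here; the stub only marks the face.
        return FaceImage(size=face.size, optimized=True)


def match_model(face: FaceImage, models: List[OptimizationModel]) -> OptimizationModel:
    # Step S430 (one possible rule): pick the model whose input size is
    # closest to the reference face size.
    return min(models, key=lambda m: abs(m.input_size - face.size))


def optimize_faces(faces: List[FaceImage],
                   models: List[OptimizationModel]) -> List[FaceImage]:
    # Steps S410-S440 end to end: each extracted reference face image is
    # routed to its matched model and optimized into a target face image.
    return [match_model(face, models).optimize(face) for face in faces]
```

With models of input sizes 128, 256, and 512, a 100-pixel face would be routed to the 128 model and a 300-pixel face to the 256 model under this closest-size rule.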
Compared with the prior art, on the one hand, extracted reference face images of different sizes are processed with different models, which can improve the accuracy of image optimization; on the other hand, multiple models can optimize multiple reference face images simultaneously, which increases the speed of image optimization.
The above steps will be described in detail below.
In step S410, a reference face image corresponding to each face is extracted from the image to be processed, and the size of each reference face image is determined.
In an example embodiment of the present disclosure, an image to be processed may be first obtained, where the image to be processed may include at least one face, a reference face image corresponding to the face may be extracted from the image to be processed, and a size of each reference face image may be determined.
In the present exemplary embodiment, the image to be processed may be obtained in ways including, but not limited to: acquiring a surveillance video of a region to be monitored and processing the surveillance video frame by frame to obtain a plurality of images; or having a worker directly upload an image containing a face as the image to be processed.
In this example embodiment, a face image extraction model may be used to extract a reference face image, and specifically, the to-be-processed image may be input to a pre-trained face image extraction model to obtain reference face images corresponding to all faces in the to-be-processed image.
To obtain the face image extraction model, a first initial model is first configured whose input is an image containing a plurality of faces and whose output is the face image corresponding to each face in the image. A plurality of training images containing multiple faces are then extracted from a database, the face image corresponding to each face is manually screened from the training images, and the first initial model is trained with the training images and the corresponding face images to obtain the face image extraction model.
The face image extraction model may be a RetinaFace detection network (a single-stage face detection network). Specifically, after the RetinaFace detection network extracts image features of the image to be processed, it further extracts finer face features through a Feature Pyramid Network (FPN) and an SSH (Single Stage Headless) network, and then predicts a face frame and face feature point coordinates, so as to extract the face image from the image to be processed according to the face feature point coordinates and the face frame.
It should be noted that various schemes may be used to extract the reference face images from the image to be processed, as long as the extraction can be completed; this is not particularly limited in the present exemplary embodiment.
In the present exemplary embodiment, after each reference face image is extracted, the size of each reference face image may be determined.
In step S420, a plurality of image optimization models are obtained, and an input image size corresponding to each of the image optimization models is determined.
In an example embodiment of the present disclosure, a plurality of image optimization models may be obtained from a database, and an input image size of each image optimization model is obtained, where the image optimization models may be models for deblurring an image to be processed, or models for beautifying and thinning a face in the image to be processed, and are not specifically limited in this example embodiment.
In the present exemplary embodiment, the sizes of the input images of the plurality of image optimization models are different from each other, so that the plurality of models can simultaneously perform optimization processing on images to be processed of different sizes.
The image optimization models in the database are obtained through training. During training, a second initial model may be configured whose input is an image to be optimized and whose output is an optimized image; then a plurality of images to be optimized, together with the optimized images corresponding to them, are obtained from the database as training data, and the second initial model is trained with the training data to obtain the image optimization models.
In step S430, a corresponding image optimization model is matched for each reference face image according to the size of the reference face image and the size of the input image.
In an example embodiment of the present disclosure, after the sizes of the reference face images and the sizes of the input images are acquired, a corresponding image optimization model may be matched for each reference face image according to the size of the reference face image and the size of the input image of each image optimization model.
Specifically, the following operation may be performed for each of the reference face images, and specifically, a mapping relationship between the size of the reference face image and the size of the input image of each image optimization model may be first obtained, and then the image optimization model may be matched for the reference face image according to the mapping relationship.
In this exemplary embodiment, the mapping relationship may include the difference between the size of the reference face image and the size of each input image. The absolute value of each difference may first be obtained, and the image optimization model whose input image size yields an absolute difference smaller than or equal to a preset threshold (for example, the smallest absolute difference) may be assigned to the reference face image.
In another exemplary embodiment of the present disclosure, the mapping relationship may further include a ratio between a size of the reference face image and a size of the input image, a preset ratio range may be first determined, and the image optimization model of the input image size corresponding to the ratio falling within the preset ratio range is allocated to the reference face image.
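The two mapping relationships above can be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes sizes are single side lengths in pixels, and the function name, the smallest-absolute-difference tie-break, and the optional threshold are hypothetical.

```python
def match_model(face_size, model_input_sizes, max_diff=None):
    """Pick the model input size closest to the face size.

    face_size / model_input_sizes: side lengths in pixels (a
    hypothetical convention; the embodiment leaves the exact
    size measure open).  If max_diff is given, enforce the
    "difference <= preset threshold" mapping relationship.
    """
    best = min(model_input_sizes, key=lambda s: abs(s - face_size))
    if max_diff is not None and abs(best - face_size) > max_diff:
        return None  # no model satisfies the threshold
    return best
```

A ratio-based variant would instead keep only sizes whose ratio `face_size / s` falls within the preset range.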
In step S440, each reference face image is optimized by using the image optimization model to obtain a target face image, and an optimized image corresponding to the image to be processed is obtained based on the target face image.
In this exemplary embodiment, the optimization process may include deblurring the reference face image, or beautifying and face-thinning the reference face image, and correspondingly, the image optimization model may be a deblurring model, a beautifying model, a face-thinning model, and the like, which is not specifically limited in this exemplary embodiment.
In an example embodiment of the present disclosure, the following operations may be performed for each reference face image. First, the size of the reference face image and the size of the input image of the corresponding image optimization model are determined. If the size of the reference face image is smaller than the size of the input image, an initial face image with the same size as the input image is intercepted from the image to be processed, where the initial face image contains the reference face image; specifically, the initial face image may be intercepted with the reference face image as its center. The initial face image is then input into the image optimization model to obtain the target face image.
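The centered interception described above can be sketched as below. The function is hypothetical; in particular, the embodiment does not specify boundary handling, so this sketch simply shifts the window to stay inside the image.

```python
import numpy as np

def crop_centered(image, cy, cx, size):
    """Cut a size x size window centered on (cy, cx), shifted so
    it stays fully inside the image (boundary handling is an
    assumption, not specified by the embodiment)."""
    h, w = image.shape[:2]
    top = min(max(cy - size // 2, 0), h - size)
    left = min(max(cx - size // 2, 0), w - size)
    return image[top:top + size, left:left + size]
```

Here `(cy, cx)` would be the center of the reference face image inside the image to be processed, and `size` the input image size of the matched model.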
In an example embodiment of the present disclosure, if the size of a reference face image is larger than the size of the input image, the reference face image is downsampled to obtain an initial face image with the same size as the input image; the initial face image is input into the image optimization model for optimization, and the optimized initial face image is upsampled to obtain the target face image. When the optimized initial face image is upsampled to obtain the target face image, the loss amount between the initial face image and the reference face image in the downsampling process may first be obtained; then the optimized initial face image may be upsampled based on the loss amount to obtain the target face image.
Specifically, as shown in fig. 5, a Laplacian pyramid is used to record the loss amount 502 of the picture during downsampling, and the target face image may then be obtained by Laplacian pyramid compensation according to the loss amount 502 recorded during downsampling. That is, after the initial face image 501 is upsampled, the loss amount is added to the upsampled image to obtain the target face image 503. This can improve the optimization precision of the image to be processed.
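A one-level version of this compensation can be sketched as follows. This is a simplified stand-in, not the claimed method: 2x2 average pooling and nearest-neighbour expansion replace the pyramid's blur-and-decimate steps, dimensions are assumed even, and `optimize` stands in for model inference.

```python
import numpy as np

def downsample(img):
    # 2x2 average pooling (stand-in for blur-and-decimate)
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(img):
    # nearest-neighbour 2x expansion
    return img.repeat(2, axis=0).repeat(2, axis=1)

def compensate(reference, optimize):
    """Record the downsampling loss and add it back after
    upsampling, in the spirit of Laplacian-pyramid compensation
    (one pyramid level only)."""
    initial = downsample(reference)            # initial face image 501
    loss = reference - upsample(initial)       # loss amount 502
    optimized = optimize(initial)              # model inference stand-in
    return upsample(optimized) + loss          # target face image 503
```

With an identity `optimize`, the reference image is recovered exactly, which is what makes the compensation lossless with respect to the downsampling step.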
In the present exemplary embodiment, the method for obtaining the downsampled loss amount is not limited to the above laplacian pyramid compensation method, and may also be customized according to the user requirement, and is not specifically limited in the present exemplary embodiment.
In this exemplary embodiment, after obtaining the target face image, an optimized image corresponding to the image to be processed may be obtained based on the target face image, specifically, a region corresponding to a reference face image in the image to be processed is replaced with the target face image to obtain a replaced image, and then the replaced image may be subjected to smoothing processing to complete optimization processing on the image to be processed, so as to obtain an optimized image.
It should be noted that the smoothing process may be implemented in various manners, such as a mean filtering smoothing process, a median filtering smoothing process, a gaussian filtering smoothing process, etc., and is not limited in this exemplary embodiment.
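As one of the smoothing options named above, mean filtering can be sketched as follows; the kernel size and edge padding are illustrative assumptions, not specified by the embodiment.

```python
import numpy as np

def mean_filter(img, k=3):
    """k x k mean-filter smoothing with edge padding (one of the
    smoothing manners mentioned above; median or Gaussian
    filtering could be substituted)."""
    pad = k // 2
    padded = np.pad(img, pad, mode='edge')
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)
```

In the replaced image, this would typically be applied around the pasted boundary so the target face image blends with its surroundings.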
In an example embodiment of the present disclosure, the method for optimizing an image to be processed may further include determining a minimum allowable size according to a minimum value of sizes of the input images; and if the size of the reference face image is smaller than the minimum allowable size, deleting the reference face image corresponding to the size of the reference face image.
For example, if the minimum value of the input image sizes is A, the minimum allowable size may be set to A/2, A/3, A/4, or the like, or may be customized according to the user's requirements, which is not specifically limited in the present exemplary embodiment. Deleting images smaller than the minimum allowable size filters out non-important small faces, thereby improving the optimization speed of the image to be processed.
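The filtering step can be sketched as below; the function name and the configurable divisor (A/2 in the example above) are hypothetical.

```python
def filter_small_faces(face_sizes, input_sizes, divisor=2):
    """Drop reference face images below the minimum allowable
    size, taken here as the smallest model input size divided
    by a configurable divisor (an assumption matching the A/2
    example)."""
    min_allowed = min(input_sizes) / divisor
    return [s for s in face_sizes if s >= min_allowed]
```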
In an example embodiment of the present disclosure, the method for optimizing an image to be processed may further include determining a maximum allowable size according to a maximum value of the sizes of the input images; if the size of the reference face image is larger than the maximum allowable size, dividing the reference face image into a plurality of new reference face images; and determining an image optimization model corresponding to each new reference face image according to the size of the new reference face image and the input image sizes of the plurality of image optimization models.
For example, if the maximum value of the input image sizes is B, the maximum allowable size may be set to 2B, 3B, 4B, or the like, or may be customized according to the user's requirements, which is not specifically limited in the present exemplary embodiment. Blocking an oversized image can improve the accuracy of processing the image to be processed.
In an example embodiment of the present disclosure, the method for optimizing the to-be-processed image may further include performing the following operation for each reference face image. Specifically, determining the maximum limit size of the image optimization model corresponding to the reference face image; if the size of the reference face image is larger than the maximum limit size, dividing the reference face image into a plurality of new reference face images; and determining an image optimization model corresponding to each new reference face image according to the size of the new reference face image and the input image sizes of the plurality of image optimization models.
For example, the maximum limit size of each reference face image can be determined according to the input image size of each image optimization model; specifically, if the input image size of the image optimization model corresponding to the reference face image is C, the maximum limit size may be set to 2C, 3C, 4C, or the like, or may be customized according to the user's requirement, which is not specifically limited in this exemplary embodiment. Blocking an oversized image enables the accuracy of the processing of the image to be processed to be improved.
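The blocking described in both variants above can be sketched as a simple tiling; this is an illustration only, and it assumes the image dimensions are divisible by the tile size (real code would pad or overlap tiles).

```python
import numpy as np

def split_into_tiles(image, tile):
    """Divide an oversized reference face image into tile x tile
    blocks, each of which becomes a new reference face image to
    be matched to a model on its own."""
    h, w = image.shape[:2]
    return [image[y:y + tile, x:x + tile]
            for y in range(0, h, tile)
            for x in range(0, w, tile)]
```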
In the present exemplary embodiment, the optimization method of the image to be processed is described with reference to fig. 6, taking the image optimization model as an image deblurring model as an example, and taking three models as an example.
In the present exemplary embodiment, referring to fig. 6, step S601 may be performed first, in which reference face images are extracted, that is, a plurality of reference face images are extracted from the image to be processed. Since face sizes vary greatly, a minimum allowable size may be set, and step S602 is then performed to filter out images with sizes smaller than the minimum allowable size, that is, to delete the reference face images smaller than the minimum allowable size and filter out non-important small faces. Step S603 may then be performed to match a deblurring model for each reference face image.
Then, steps S604, S605 and S606 may be performed. This example embodiment includes three deblurring models: the reference face images matching the first deblurring model are placed in the first deblurring model's input queue, the reference face images matching the second deblurring model are placed in the second deblurring model's input queue, and the reference face images matching the third deblurring model are placed in the third deblurring model's input queue. That is, according to the size of each face, the deblurring model whose input size is most similar is determined, and the face is put into the processing queue corresponding to that deblurring model.
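The queue assignment of steps S604 to S606 can be sketched as follows; the function and its nearest-input-size grouping are a minimal illustration, with sizes again assumed to be single side lengths.

```python
from collections import defaultdict

def build_queues(face_sizes, model_sizes):
    """Group faces into per-model processing queues by nearest
    input size (steps S604-S606 sketched with hypothetical
    size lists)."""
    queues = defaultdict(list)
    for f in face_sizes:
        best = min(model_sizes, key=lambda s: abs(s - f))
        queues[best].append(f)
    return queues
```

Because the queues are independent, the three deblurring models can then consume them in parallel, which is what lets the models work simultaneously as summarized below.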
In this exemplary embodiment, after the assignment of models is completed, step S608 may be executed to determine whether the size of the reference face image is larger than the model input size. When the size of the reference face image is smaller than or equal to the model input size, step S611 may be executed to intercept an initial face image containing the reference face image; specifically, an initial face image with the same size as the model input may be cut out directly with the reference face image as the center, and step S612 is then executed to send the initial face image to the corresponding deblurring model for deblurring. Finally, steps S613 and S614 are executed: the target face image is acquired from the initial face image, and the region corresponding to the reference face image in the image to be processed is replaced with the target face image. When the size of the reference face image is larger than the model input size, steps S609 and S610 may be executed: the loss amount is retained using the Laplacian pyramid, and the reference face image is downsampled to obtain an initial face image; step S612 is then executed to perform deblurring with the deblurring model, and steps S613 and S614 are executed to obtain the target face image from the initial face image, replace the region corresponding to the reference face image in the image to be processed with the target face image, and perform smoothing. The manner of smoothing has already been described in detail above and is therefore not repeated here.
In the present exemplary embodiment, step S607 may further be executed to block the larger reference face images, with step S603 executed again after the blocking. Specifically, the maximum limit size of the deblurring model corresponding to the reference face image is determined; if the size of the reference face image is larger than the maximum limit size, the reference face image is divided into blocks and then processed again according to the flow.
In summary, in the exemplary embodiment, the image optimization models with different input sizes are used to perform optimization processing on the reference face images with different sizes extracted from the image to be processed, so that the human reference face images are optimized in a targeted manner, and the optimization precision of the image to be processed can be improved; and a plurality of image optimization models can work simultaneously, so that the speed of optimizing the image to be processed is improved.
It is noted that the above-mentioned figures are merely schematic illustrations of processes involved in methods according to exemplary embodiments of the present disclosure, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
Further, referring to fig. 7, an apparatus 700 for optimizing an image to be processed is further provided in the present exemplary embodiment, and includes an image obtaining module 710, a model obtaining module 720, a model matching module 730, and an image optimizing module 740. Wherein:
the image obtaining module 710 may be configured to extract a reference face image corresponding to each face from the image to be processed, and determine a size of each reference face image.
The model obtaining module 720 may be configured to obtain a plurality of pre-trained image optimization models and determine an input image size corresponding to each image optimization model.
The model matching module 730 may be configured to match a corresponding image optimization model for each reference face image according to the size of the reference face image and the size of the input image.
In the present exemplary embodiment, the model matching module 730 may specifically perform the following operations for each reference face image. Specifically, a mapping relationship between the size of the reference face image and the size of each input image is obtained, and an image optimization model is matched for the reference face image according to the mapping relationship. The mapping relationship includes the difference between the size of the reference face image and the size of each input image; or the mapping relationship includes the ratio of the size of the reference face image to the size of each input image.
The image optimization module 740 may be configured to perform optimization processing on each reference face image by using the image optimization model to obtain a target face image, and obtain an optimized image corresponding to the image to be processed based on the target face image.
In an example embodiment of the present disclosure, the image optimization module 740 may specifically perform, for each reference face image, the following operations: determining the size of the reference face image and the size of the input image of the corresponding image optimization model; if the size of the reference face image is smaller than that of the input image, intercepting an initial face image with the same size as the input image, where the initial face image contains the reference face image, and inputting the initial face image into the matched image optimization model to obtain the target face image. If the size of the reference face image is larger than that of the input image, the reference face image is downsampled to obtain an initial face image with the same size as the input image; the initial face image is input into the matched image optimization model for optimization, and the optimized initial face image is upsampled to obtain the target face image. When the optimized initial face image is upsampled to obtain the target face image, the loss amount between the initial face image and the reference face image in the downsampling process may first be obtained; the optimized initial face image may then be upsampled based on the loss amount to obtain the target face image.
In an exemplary embodiment of the present disclosure, the apparatus 700 for optimizing an image to be processed may further include an image filtering module and an image partitioning module, where the image filtering module may be configured to determine a minimum allowable size according to a minimum value of the sizes of the input images; and if the size of a reference face image is smaller than the minimum allowable size, delete that reference face image.
The image partitioning module may be configured to determine a maximum allowable size according to a maximum value among the sizes of the input images; if the size of the reference face image is larger than the maximum allowable size, divide the reference face image into a plurality of new reference face images; and determine an image optimization model corresponding to each new reference face image according to the size of the new reference face image and the input image sizes of the plurality of image optimization models.
In another exemplary embodiment of the present disclosure, the image partitioning module may be further configured to perform the following operation for each reference face image: determine the maximum limit size of the image optimization model corresponding to the reference face image; if the size of the reference face image is larger than the maximum limit size, divide the reference face image into a plurality of new reference face images; and determine an image optimization model corresponding to each new reference face image according to the size of the new reference face image and the input image sizes of the plurality of image optimization models.
The specific details of each module in the above apparatus have been described in detail in the method section, and details that are not disclosed may refer to the method section, and thus are not described again.
As will be appreciated by one skilled in the art, aspects of the present disclosure may be embodied as a system, method or program product. Accordingly, various aspects of the present disclosure may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.), or an embodiment combining hardware and software aspects, which may all generally be referred to herein as a "circuit," "module," or "system."
Exemplary embodiments of the present disclosure also provide a computer-readable storage medium having stored thereon a program product capable of implementing the above-described method of the present specification. In some possible embodiments, various aspects of the disclosure may also be implemented in the form of a program product comprising program code for causing a terminal device to perform the steps according to various exemplary embodiments of the disclosure described in the above-mentioned "exemplary methods" section of this specification, when the program product is run on the terminal device.
It should be noted that the computer readable media shown in the present disclosure may be computer readable signal media or computer readable storage media or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In the present disclosure, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In contrast, in the present disclosure, a computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
Furthermore, program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is to be limited only by the terms of the appended claims.