Data processing method, apparatus, device, and computer-readable storage medium
1. A data processing method, comprising:
acquiring data to be processed of a first data type;
performing segmented quantization on the data to be processed based on a first segmentation mode to obtain target data of a second data type, wherein the second data type is a 1-bit integer type whose data precision is lower than that of the first data type; and
processing the target data by using a deep learning network to obtain a processing result of at least one network layer of the deep learning network, wherein a target network parameter in the at least one network layer is of a 1-bit integer type obtained by performing segmented quantization on a network parameter based on a second segmentation mode.
2. The method of claim 1, wherein the data to be processed comprises at least one of: image information, voice information, intermediate information obtained based on the image information, and intermediate information obtained based on the voice information.
3. The method of claim 1 or 2, wherein the first segmentation mode comprises: segmenting into a first preset value and a second preset value based on a first preset segmentation threshold, the first preset value and the second preset value being of the 1-bit integer type; and
performing the segmented quantization on the data to be processed based on the first segmentation mode to obtain the target data of the second data type comprises:
in a case that the data to be processed is greater than the first preset segmentation threshold, quantizing the data to be processed into the target data of the first preset value; and
in a case that the data to be processed is less than or equal to the first preset segmentation threshold, quantizing the data to be processed into the target data of the second preset value.
4. The method of claim 1 or 2, wherein processing the target data by using the deep learning network to obtain the processing result of the at least one network layer of the deep learning network comprises:
multiplying the target network parameter of a current network layer of the deep learning network by the target data to obtain a 2-bit output result;
performing segmented quantization on the output result based on the first segmentation mode to obtain a sub-processing result of the second data type; and
continuing to process the sub-processing result with the target network parameter of a next network layer of the deep learning network, until the processing result is obtained when processing of the at least one network layer is completed.
5. The method of claim 1 or 2, further comprising:
acquiring training sample data of the first data type;
performing segmented quantization on the training sample data based on the first segmentation mode to obtain target training data of the second data type, and performing segmented quantization on network parameters of at least one network layer of an initial deep learning network based on the second segmentation mode to obtain training network parameters of the second data type;
processing the target training data by using the training network parameters of the initial deep learning network to obtain a current training result; and
continuing to train the training network parameters of the initial deep learning network based on the current training result and a real result corresponding to the training sample data until an obtained loss is less than or equal to a preset loss threshold, so as to determine the deep learning network containing the target network parameters.
6. The method of claim 5, wherein continuing to train the training network parameters of the initial deep learning network based on the current training result and the real result corresponding to the training sample data until the obtained loss is less than or equal to the preset loss threshold, so as to determine the deep learning network containing the target network parameters, comprises:
performing loss calculation based on the current training result and the real result corresponding to the training sample data to obtain a current loss of the first data type;
in a case that the current loss is greater than the preset loss threshold, determining a gradient value of the training network parameter of each network layer based on the current loss, the gradient value being of the first data type;
performing segmented quantization on the gradient value based on the second segmentation mode to obtain a current gradient value of the second data type; and
updating the training network parameter of each corresponding network layer by using the current gradient value, until training of the initial deep learning network is completed when the target network parameters are obtained, so as to obtain the deep learning network, wherein a loss obtained by processing the target training data with the target network parameters is less than or equal to the preset loss threshold.
7. The method of claim 5, wherein the second segmentation mode comprises: segmenting into a third preset value and a fourth preset value based on a second preset segmentation threshold, the third preset value and the fourth preset value being of the 1-bit integer type; and
performing the segmented quantization on the network parameters of the at least one network layer of the initial deep learning network based on the second segmentation mode to obtain the training network parameters of the second data type comprises:
in a case that a network parameter of the at least one network layer of the initial deep learning network is greater than the second preset segmentation threshold, quantizing the network parameter into a training network parameter of the third preset value; and
in a case that a network parameter of the at least one network layer of the initial deep learning network is less than the second preset segmentation threshold, quantizing the network parameter into a training network parameter of the fourth preset value.
8. The method of any one of claims 1 to 7, wherein the first data type is a multi-bit floating-point type.
9. A data processing apparatus, comprising:
an acquisition unit, configured to acquire data to be processed of a first data type;
a quantization unit, configured to perform segmented quantization on the data to be processed based on a first segmentation mode to obtain target data of a second data type, wherein the second data type is a 1-bit integer type whose data precision is lower than that of the first data type; and
a processing unit, configured to process the target data by using a deep learning network to obtain a processing result of at least one network layer of the deep learning network, wherein a target network parameter in the at least one network layer is of a 1-bit integer type obtained by performing segmented quantization on a network parameter based on a second segmentation mode.
10. An electronic device, comprising:
a memory, configured to store an executable computer program; and
a processor, configured to implement the method of any one of claims 1 to 8 when executing the executable computer program stored in the memory.
11. A computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to implement the method of any one of claims 1 to 8.
Background
Deep learning models are widely applied in fields such as computer vision and natural language processing, and have greatly advanced industries such as security, speech recognition, and machine translation. Through training and learning on large amounts of labeled data, a deep learning model can recognize and understand information such as images and speech.
Although deep learning models are powerful, they typically consume large amounts of computing resources and computing time when processing tasks.
Disclosure of Invention
The embodiments of the present disclosure provide a data processing method, apparatus, device, and computer-readable storage medium, which can improve the data processing efficiency of a deep learning network.
The technical solutions of the embodiments of the present disclosure are implemented as follows.
An embodiment of the present disclosure provides a data processing method, including: acquiring data to be processed of a first data type; performing segmented quantization on the data to be processed based on a first segmentation mode to obtain target data of a second data type, wherein the second data type is a 1-bit integer type whose data precision is lower than that of the first data type; and processing the target data by using a deep learning network to obtain a processing result of at least one network layer of the deep learning network, wherein a target network parameter in the at least one network layer is of a 1-bit integer type obtained by performing segmented quantization on a network parameter based on a second segmentation mode.
In the above method, the data to be processed includes at least one of: image information, voice information, intermediate information obtained based on the image information, and intermediate information obtained based on the voice information.
In the above method, the first segmentation mode includes: segmenting into a first preset value and a second preset value based on a first preset segmentation threshold, the first preset value and the second preset value being of the 1-bit integer type; and performing the segmented quantization on the data to be processed based on the first segmentation mode to obtain the target data of the second data type includes: in a case that the data to be processed is greater than the first preset segmentation threshold, quantizing the data to be processed into the target data of the first preset value; and in a case that the data to be processed is less than or equal to the first preset segmentation threshold, quantizing the data to be processed into the target data of the second preset value.
In the above method, processing the target data by using the deep learning network to obtain the processing result of the at least one network layer of the deep learning network includes: multiplying the target network parameter of a current network layer of the deep learning network by the target data to obtain a 2-bit output result; performing segmented quantization on the output result based on the first segmentation mode to obtain a sub-processing result of the second data type; and continuing to process the sub-processing result with the target network parameter of a next network layer of the deep learning network, until the processing result is obtained when processing of the at least one network layer is completed.
The above method further includes: acquiring training sample data of the first data type; performing segmented quantization on the training sample data based on the first segmentation mode to obtain target training data of the second data type; performing segmented quantization on network parameters of at least one network layer of an initial deep learning network based on the second segmentation mode to obtain training network parameters of the second data type; processing the target training data by using the training network parameters of the initial deep learning network to obtain a current training result; and continuing to train the training network parameters of the initial deep learning network based on the current training result and a real result corresponding to the training sample data until an obtained loss is less than or equal to a preset loss threshold, so as to determine the deep learning network containing the target network parameters.
In the above method, continuing to train the training network parameters of the initial deep learning network based on the current training result and the real result corresponding to the training sample data until the obtained loss is less than or equal to the preset loss threshold, so as to determine the deep learning network containing the target network parameters, includes: performing loss calculation based on the current training result and the real result corresponding to the training sample data to obtain a current loss of the first data type; in a case that the current loss is greater than the preset loss threshold, determining a gradient value of the training network parameter of each network layer based on the current loss, the gradient value being of the first data type; performing segmented quantization on the gradient value based on the second segmentation mode to obtain a current gradient value of the second data type; and updating the training network parameter of each corresponding network layer by using the current gradient value, until training of the initial deep learning network is completed when the target network parameters are obtained, so as to obtain the deep learning network, wherein a loss obtained by processing the target training data with the target network parameters is less than or equal to the preset loss threshold.
In the above method, the second segmentation mode includes: segmenting into a third preset value and a fourth preset value based on a second preset segmentation threshold, the third preset value and the fourth preset value being of the 1-bit integer type; and performing the segmented quantization on the network parameters of the at least one network layer of the initial deep learning network based on the second segmentation mode to obtain the training network parameters of the second data type includes: in a case that a network parameter of the at least one network layer of the initial deep learning network is greater than the second preset segmentation threshold, quantizing the network parameter into a training network parameter of the third preset value; and in a case that a network parameter of the at least one network layer of the initial deep learning network is less than the second preset segmentation threshold, quantizing the network parameter into a training network parameter of the fourth preset value.
In the above method, the first data type is a multi-bit floating-point type.
An embodiment of the present disclosure provides a data processing apparatus, including: an acquisition unit, configured to acquire data to be processed of a first data type; a quantization unit, configured to perform segmented quantization on the data to be processed based on a first segmentation mode to obtain target data of a second data type, wherein the second data type is a 1-bit integer type whose data precision is lower than that of the first data type; and a processing unit, configured to process the target data by using a deep learning network to obtain a processing result of at least one network layer of the deep learning network, wherein a target network parameter in the at least one network layer is of a 1-bit integer type obtained by performing segmented quantization on a network parameter based on a second segmentation mode.
In the above apparatus, the data to be processed includes at least one of: image information, voice information, intermediate information obtained based on the image information, and intermediate information obtained based on the voice information.
In the above apparatus, the first segmentation mode includes: segmenting into a first preset value and a second preset value based on a first preset segmentation threshold, the first preset value and the second preset value being of the 1-bit integer type; and the quantization unit is further configured to: in a case that the data to be processed is greater than the first preset segmentation threshold, quantize the data to be processed into the target data of the first preset value; and in a case that the data to be processed is less than or equal to the first preset segmentation threshold, quantize the data to be processed into the target data of the second preset value.
In the above apparatus, the processing unit is further configured to: multiply the target network parameter of a current network layer of the deep learning network by the target data to obtain a 2-bit output result; perform segmented quantization on the output result based on the first segmentation mode to obtain a sub-processing result of the second data type; and continue to process the sub-processing result with the target network parameter of a next network layer of the deep learning network, until the processing result is obtained when processing of the at least one network layer is completed.
In the above apparatus, the acquisition unit is further configured to acquire training sample data of the first data type; the quantization unit is further configured to perform segmented quantization on the training sample data based on the first segmentation mode to obtain target training data of the second data type, and to perform segmented quantization on network parameters of at least one network layer of an initial deep learning network based on the second segmentation mode to obtain training network parameters of the second data type; and the apparatus further includes a training unit, configured to process the target training data by using the training network parameters of the initial deep learning network to obtain a current training result, and to continue to train the training network parameters of the initial deep learning network based on the current training result and a real result corresponding to the training sample data until an obtained loss is less than or equal to a preset loss threshold, so as to determine the deep learning network containing the target network parameters.
In the above apparatus, the training unit is further configured to: perform loss calculation based on the current training result and the real result corresponding to the training sample data to obtain a current loss of the first data type; in a case that the current loss is greater than the preset loss threshold, determine a gradient value of the training network parameter of each network layer based on the current loss, the gradient value being of the first data type; perform segmented quantization on the gradient value based on the second segmentation mode to obtain a current gradient value of the second data type; and update the training network parameter of each corresponding network layer by using the current gradient value, until training of the initial deep learning network is completed when the target network parameters are obtained, so as to obtain the deep learning network, wherein a loss obtained by processing the target training data with the target network parameters is less than or equal to the preset loss threshold.
In the above apparatus, the second segmentation mode includes: segmenting into a third preset value and a fourth preset value based on a second preset segmentation threshold, the third preset value and the fourth preset value being of the 1-bit integer type; and the quantization unit is further configured to: in a case that a network parameter of the at least one network layer of the initial deep learning network is greater than the second preset segmentation threshold, quantize the network parameter into a training network parameter of the third preset value; and in a case that a network parameter of the at least one network layer of the initial deep learning network is less than the second preset segmentation threshold, quantize the network parameter into a training network parameter of the fourth preset value.
In the above apparatus, the first data type is a multi-bit floating-point type.
An embodiment of the present disclosure provides an electronic device, including: a memory, configured to store an executable computer program; and a processor, configured to implement the above data processing method when executing the executable computer program stored in the memory.
An embodiment of the present disclosure provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above data processing method.
According to the data processing method, apparatus, device, and computer-readable storage medium provided by the embodiments of the present disclosure, data to be processed of a first data type is acquired; the data to be processed is quantized, in a segmented manner using a first segmentation mode, into target data of a 1-bit integer type whose data precision is lower than that of the first data type; and the target data is processed by a deep learning network whose target network parameters in the network layers are 1-bit integers obtained by segmented quantization using a second segmentation mode, so as to obtain a corresponding processing result. In this way, the data amount of the data to be processed is reduced, and the data processing efficiency of the deep learning network is improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure.
FIG. 1 is an alternative schematic flowchart of a data processing method provided by an embodiment of the present disclosure;
FIG. 2 is a schematic structural diagram of an exemplary deep learning network, implemented as a neural network, provided by an embodiment of the present disclosure;
FIG. 3 is a schematic flowchart of an exemplary process of obtaining a processing result of at least one network layer of a deep learning network according to an embodiment of the present disclosure;
FIG. 4 is an alternative schematic flowchart of a data processing method provided by an embodiment of the present disclosure;
FIG. 5 is an alternative schematic flowchart of a data processing method provided by an embodiment of the present disclosure;
FIG. 6 is an alternative schematic flowchart of a method of training an initial deep learning network provided by an embodiment of the present disclosure;
FIG. 7 is an alternative schematic flowchart of a data processing method provided by an embodiment of the present disclosure;
FIG. 8 is an alternative schematic flowchart of a data processing method provided by an embodiment of the present disclosure;
FIG. 9 is a schematic flowchart of exemplary training of network parameters of each network layer in a process of training an initial deep learning network according to an embodiment of the present disclosure;
FIG. 10 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present disclosure;
FIG. 11 is a schematic structural diagram of an electronic device according to an embodiment of the present disclosure.
Detailed Description
To make the objectives, technical solutions, and advantages of the present disclosure clearer, the present disclosure is described in further detail below with reference to the accompanying drawings. The described embodiments should not be construed as limiting the present disclosure, and all other embodiments obtained by a person of ordinary skill in the art without creative effort shall fall within the protection scope of the present disclosure.
Deep learning models are highly capable, but their complexity entails a huge amount of computation, and training a deep learning model for practical use consumes a great deal of time. Training involves a large number of matrix convolution computations, which typically account for more than 60% of total model training time. Most commonly used deep models currently perform convolution with 32-bit floating-point numbers. In the related art, to reduce the computation of a deep learning network, data may be converted from high precision to low precision, for example from 32-bit floating-point data to 8-bit fixed-point (INT8) data. Although this reduces the amount of computation, it causes serious precision loss, so the deep learning network cannot perform tasks with higher precision requirements. The deep learning network training approach provided by the embodiments of the present disclosure can reduce the amount of computation, improve processing efficiency, and reduce precision loss, so that the deep learning network can be applied to tasks with higher precision requirements, such as image recognition, image classification, and target detection.
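As a concrete illustration of the precision loss mentioned above (this sketch is not part of the disclosure itself), the following example quantizes 32-bit floating-point data to 8-bit fixed-point (INT8) values with a simple symmetric scale and measures the rounding error introduced; the scaling scheme and sample values are assumptions chosen purely for illustration:

```python
import numpy as np

# Hypothetical symmetric INT8 quantization, for illustration only:
# map the float32 value range onto [-127, 127], round, and measure the error.
def quantize_int8(x):
    scale = np.max(np.abs(x)) / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

x = np.array([0.013, -1.72, 0.5004, 3.1], dtype=np.float32)
q, scale = quantize_int8(x)
error = np.abs(dequantize(q, scale) - x)  # per-element precision loss,
                                          # bounded by about scale / 2
```

Each quantized value deviates from the original by up to roughly half a scale step, and errors of this kind accumulate across the many convolutions of a deep network; this is the precision loss that the embodiments of the present disclosure aim to reduce.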
The data processing scheme provided by the embodiments of the present disclosure can be applied to the training process of a deep learning network, for example the training of tasks with higher precision requirements such as image classification, image recognition, and target detection. It can also be applied to processing such tasks with a trained deep learning network, for example in image classification, image recognition, target detection, voice recognition, and semantic recognition scenarios. The embodiments of the present disclosure do not limit the specific application scenario; any application of the data processing scheme provided by the embodiments of the present disclosure falls within the protection scope of the present disclosure.
The embodiments of the present disclosure provide a data processing method that can improve the data processing efficiency of a deep learning network. The data processing method provided by the embodiments of the present disclosure is applied to an electronic device.
Exemplary applications of the electronic device provided by the embodiments of the present disclosure are described below. The electronic device may be implemented as various types of user terminals (hereinafter referred to as terminals), such as AR glasses, a notebook computer, a tablet computer, a desktop computer, a set-top box, or a mobile device (for example, a mobile phone, a portable music player, a personal digital assistant, a dedicated messaging device, or a portable game device), and may also be implemented as a server.
Next, an exemplary application in which the electronic device is implemented as a terminal is described. FIG. 1 is an alternative schematic flowchart of a data processing method provided by an embodiment of the present disclosure, described with reference to the steps shown in FIG. 1.
S101, acquiring to-be-processed data of a first data type.
In the embodiments of the present disclosure, before the terminal performs a recognition or classification task by using the deep learning network, the terminal acquires the data to be processed of the first data type.
Here, the first data type may be a high-precision data type, for example a multi-bit floating-point type such as a 32-bit or 64-bit floating-point type. The data to be processed may be data for the deep learning network to process, and may be one or more of various types of information such as image information, text information, and voice information, or intermediate information obtained based on such information. For example, the data to be processed may include input information during application or training of the deep learning network, intermediate information obtained by the deep learning network processing the input information, output information of the deep learning network, and the like. The intermediate information may be an intermediate result produced by an intermediate network layer of the deep learning network, or activation data obtained by processing the input information with an activation function.
For example, in an image classification scenario, the data to be processed may be a target image captured in a certain scene.
In the embodiments of the present disclosure, the data to be processed may be pre-stored in a storage unit (for example, a memory) of the terminal, and the terminal may obtain the data to be processed of the first data type from its own storage unit; alternatively, the terminal may obtain the data to be processed of the first data type from another device, for example from another electronic device, such as a camera, that captured the data in a certain scene.
S102, performing segmented quantization on the data to be processed based on a first segmentation mode to obtain target data of a second data type, the second data type being a 1-bit integer type whose data precision is lower than that of the first data type.
In the embodiments of the present disclosure, after acquiring the data to be processed, the terminal may perform segmented quantization on it in the first segmentation mode to obtain 1-bit integer target data whose data precision is lower than that of the first data type. The first segmentation mode confines values that fall outside a certain range to within that range; thus, during quantization, the obtained target data of the second data type is confined within a limited range, which reduces the transformation error introduced by the data type conversion and thereby reduces the precision loss of the data to be processed.
Here, since the data precision of the second data type is lower than that of the first data type, the amount of information in the 1-bit integer target data of the second data type is smaller than that in the data to be processed of the first data type; therefore, when the deep learning network processes the target data, the amount of data computation is reduced, and the processing efficiency of the deep learning network is improved.
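The first segmentation mode can be sketched as a simple piecewise function. The disclosure leaves the threshold and the two preset values abstract, so the choices below (threshold 0, preset values +1 and -1) are assumptions made purely for illustration:

```python
import numpy as np

# Hedged sketch of the first segmentation mode: values above the (assumed)
# threshold become the first preset value, all others the second preset
# value, yielding 1-bit integer target data.
FIRST_SEGMENTATION_THRESHOLD = 0.0  # assumed; the disclosure only calls it "preset"
FIRST_PRESET_VALUE = 1
SECOND_PRESET_VALUE = -1

def segment_quantize(data):
    # Piecewise-quantize first-data-type (float) values to 1-bit integers.
    data = np.asarray(data, dtype=np.float32)
    return np.where(data > FIRST_SEGMENTATION_THRESHOLD,
                    FIRST_PRESET_VALUE,
                    SECOND_PRESET_VALUE).astype(np.int8)

target = segment_quantize([0.37, -2.1, 0.0, 5.6])  # -> [1, -1, -1, 1]
```

Because every output is one of two 1-bit values, any out-of-range input is clamped into the same limited range, which is the confining effect described above.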
S103, processing the target data by using the deep learning network to obtain a processing result of at least one network layer of the deep learning network, where a target network parameter in the at least one network layer is of a 1-bit integer type obtained by performing segmented quantization on a network parameter based on a second segmentation mode.
In the embodiments of the present disclosure, the deep learning network may include one or more network layers, and different network layers may correspond to different network parameters and processing procedures. When the deep learning network processes the input information, the quantized target data of the second data type may be used, finally yielding the processing result of a network layer of the deep learning network.
Exemplarily, fig. 2 is a schematic structural diagram of an exemplary deep learning network provided by an embodiment of the present disclosure when the deep learning network is a neural network. As shown in fig. 2, the deep learning network includes 3 network layers, in which the output values h_1, …, h_p of the hidden layer (middle layer) are calculated by formula (1):

h_B = f( Σ_{A=1}^{m+1} W_{AB} · x_A ), B = 1, …, p    (1)

wherein x_{m+1} = 1 (the offset term), f(x) = 1/(1 + e^{-x}), and W_{AB} is the connection weight between the A-th node of the input layer and the B-th node of the hidden layer.

The output y of the output layer is calculated by formula (2):

y = Σ_{B=1}^{p} β_B · h_B    (2)

wherein β_B is the connection weight between the B-th node of the hidden layer and the node of the output layer.
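The forward pass given by formulas (1) and (2) can be sketched as follows; this is an illustrative example only, and the sizes m and p, the random data, and the function names are assumptions of the sketch rather than part of the disclosure.

```python
import numpy as np

def sigmoid(x):
    # f(x) = 1 / (1 + e^(-x)), the activation used in formula (1)
    return 1.0 / (1.0 + np.exp(-x))

def elm_forward(x, W, beta):
    """x: (m,) input; W: (m+1, p) input-to-hidden weights, whose last row
    multiplies the offset term x_{m+1} = 1; beta: (p,) hidden-to-output weights."""
    x_aug = np.append(x, 1.0)   # append the offset term x_{m+1} = 1
    h = sigmoid(x_aug @ W)      # formula (1): h_B = f(sum_A W_AB * x_A)
    y = h @ beta                # formula (2): y = sum_B beta_B * h_B
    return h, y

rng = np.random.default_rng(0)
m, p = 4, 3                                    # illustrative sizes
W = rng.uniform(-1.0, 1.0, size=(m + 1, p))    # random fixed weights in [-1, 1]
beta = rng.uniform(-1.0, 1.0, size=p)
h, y = elm_forward(rng.normal(size=m), W, beta)
```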
It should be noted that the solving process of the network parameters W (matrix) and β (vector) is as follows:
W (the connection weight between the input layer and the hidden layer of the deep learning network): according to the principle of the extreme learning machine, W may take any random value, for example a random value in [-1, 1], and once assigned it does not change during subsequent network optimization. Therefore, the total number of adjustable parameters of the network is not affected no matter how many input variables the network has.
The hyper-parameter p: the number of hidden nodes of the deep learning network and the only hyper-parameter in the whole algorithm; a value that is too small causes under-fitting and a value that is too large causes over-fitting, and the optimal value can only be determined through experiments.
Training the model to obtain β (the connection weight between the hidden layer and the output layer of the deep learning network): according to the principle of the extreme learning machine, this can be summarized as solving a Moore-Penrose generalized inverse; in short, once β is obtained, the deep learning network is obtained.
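The β-solving step described above can be sketched as a least-squares solve via the Moore-Penrose generalized inverse; the sample count, data, and variable names below are assumptions of this sketch, not details fixed by the disclosure.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, p = 20, 4, 6                             # samples, inputs, hidden nodes
X = rng.normal(size=(n, m))                    # training inputs
T = rng.normal(size=n)                         # target outputs

W = rng.uniform(-1.0, 1.0, size=(m + 1, p))    # fixed once assigned
X_aug = np.hstack([X, np.ones((n, 1))])        # offset term x_{m+1} = 1
H = 1.0 / (1.0 + np.exp(-(X_aug @ W)))         # hidden-layer outputs, formula (1)

beta = np.linalg.pinv(H) @ T                   # Moore-Penrose least-squares solve
Y = H @ beta                                   # network outputs after training
```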
In the embodiment of the present disclosure, the target data of the second data type may be used as the input information of the deep learning network, and may also be used as the activation data of the deep learning network. The processing result may be the final output information of the deep learning network, or may be the output information of one or more intermediate network layers; for example, in the back-propagation process it may be the gradient value corresponding to the weight parameter and the gradient value corresponding to the activation, both obtained from the gradient value of the output parameter. Illustratively, in the disclosed embodiments, the single-precision floating-point (FP32) type is used for back propagation.
In the embodiment of the disclosure, the input information to be processed by the deep learning network is 1-bit integer data; before the deep learning network performs network processing, the network parameters of the deep learning network are quantized into 1-bit integer target network parameters in the second segmentation mode, so that the data types remain unified and the target data can be correctly processed by the deep learning network.
For example, in a scenario of performing target detection on an input image by using a deep learning network, a weight value (W) and an activation value (a) of a 32-bit floating point type of one or more network layers of the deep learning network may be converted into a numerical value of a 1-bit integer type, and then under the action of the weight value and the activation value of the 1-bit integer type, an object feature of the input image is extracted by using the one or more network layers, so as to obtain a feature extraction result of the one or more network layers.
In the disclosed embodiment, the processing result obtained based on the target data of the second data type is of the second data type. To meet some practical requirements, after the processing result of the second data type is obtained, it may be converted from the second data type to the first data type. For example, in the case where the network layer is a convolutional layer, since convolution is a linear calculation, the obtained processing result may be converted from the second data type to the first data type in order to ensure the accuracy of the processing result. For another example, the obtained processing result may be converted from the second data type to the first data type in order to meet a storage requirement or a data precision requirement, or to meet the computation requirements of network layers other than the at least one network layer.
It should be noted that the data processing scheme of the embodiment of the present disclosure may be applied to the information processing process of one or more network layers of the deep learning network, where the multiple network layers may be continuous network layers or discontinuous network layers, and the network layer may be an input layer, an output layer, or an intermediate layer. The data processing scheme of the embodiment of the present disclosure is described below by taking an information processing procedure of a convolutional layer as an example.
Fig. 3 is a schematic flowchart illustrating how an exemplary processing result is obtained by at least one network layer of a deep learning network according to an embodiment of the present disclosure. As shown in fig. 3, the deep learning network is composed of a convolutional layer (network layer), and the network parameters are an activation value of the 32-bit floating-point type (denoted as A(FP32) in fig. 3) and a weight value of the 32-bit floating-point type (denoted as W(FP32) in fig. 3). A(FP32) can be quantized into an activation value of the 1-bit integer type (denoted as A(INT1) in fig. 3) in the second segmentation mode, and W(FP32) can be quantized into a weight value of the 1-bit integer type (denoted as W(INT1) in fig. 3) in the second segmentation mode, so as to obtain the target network parameters; meanwhile, the to-be-processed data of the convolutional layer is quantized from 32-bit floating-point data into 1-bit integer target data in the first segmentation mode (the target data is not shown in fig. 3). Then, under the action of A(INT1) and W(INT1), the 1-bit integer (second data type) processing result of the convolutional layer (denoted as Z(INT1)) can be obtained from the target data of the convolutional layer; and, in order to ensure the accuracy of the processing result, Z(INT1) can be converted from the 1-bit integer type into a 32-bit floating-point (first data type) processing result (denoted as Z(FP32) in fig. 3).
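The Fig. 3 flow can be sketched end to end: quantize W(FP32) and A(FP32) to 1-bit integers, run the layer in integer arithmetic, and convert the result back to FP32. A 1-D "valid" convolution stands in for the convolutional layer, and the concrete values and segmentation rules for below-threshold activations are assumptions of this sketch.

```python
import numpy as np

def quantize_activation(a):
    # first segmentation mode: values above the threshold 0 -> 1, otherwise -> 0
    return (a > 0).astype(np.int8)

def quantize_weight(w):
    # second segmentation mode: values above 0 -> +1, below 0 -> -1
    return np.where(w > 0, 1, -1).astype(np.int8)

a_fp32 = np.array([0.7, -0.2, 0.0, 1.3, 0.5], dtype=np.float32)  # A(FP32)
w_fp32 = np.array([0.9, -0.4, 0.1], dtype=np.float32)            # W(FP32)

a_int1 = quantize_activation(a_fp32)   # A(INT1)
w_int1 = quantize_weight(w_fp32)       # W(INT1)

z_int = np.convolve(a_int1, w_int1, mode="valid")  # integer-only convolution
z_fp32 = z_int.astype(np.float32)      # Z(FP32), converted back for later layers
```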
In the embodiment of the disclosure, to-be-processed data of a first data type is acquired and quantized, in a segmented manner, into target data of a 1-bit integer type whose data precision is smaller than that of the first data type; the target data is then processed by a deep learning network whose target network parameters are 1-bit integers obtained by quantizing the network parameters in the second segmentation mode, so that the corresponding processing results are finally obtained. In this way, the data volume of the data to be processed is reduced, and the data processing efficiency of the deep learning network is improved.
In some embodiments of the present disclosure, the above S102 may be implemented by S1021-S1022, which will be described in conjunction with the steps shown in fig. 4.
S1021, under the condition that the data to be processed is larger than a first preset segmentation threshold value, quantizing the data to be processed into target data of a first preset value in a segmentation mode; the first segmentation mode comprises the following steps: segmenting a first preset value and a second preset value based on a first preset segmentation threshold value; the first preset value and the second preset value are 1bit integer type.
And S1022, under the condition that the data to be processed is equal to the first preset segmentation threshold, quantizing the data to be processed into target data of a second preset value in a segmentation mode.
In the embodiment of the disclosure, the terminal can quantize the data to be processed through the first preset segmentation threshold value and the first preset value and the second preset value of the 1-bit integer type, so as to obtain the target data of the 1-bit integer type.
In some embodiments, for example, the first preset segmentation threshold may be 0, the first preset value may be 1, and the second preset value may be 0; thus the terminal quantizes the data to be processed into target data of 1 if the data to be processed is greater than 0, and into target data of 0 if the data to be processed is equal to 0. That is, the first segmentation mode can be expressed by the following formula (3):

q(a) = 1, if a > 0;  q(a) = 0, if a = 0    (3)
wherein, a represents the data to be processed, and round represents the random rounding function.
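A minimal sketch of the first segmentation mode of S1021/S1022, assuming the example values from the text (threshold 0, first preset value 1, second preset value 0). The text only specifies the greater-than and equal-to cases; mapping below-threshold values to 0 as well is an assumption of this sketch.

```python
import numpy as np

def segment_quantize_activation(a, threshold=0.0):
    """First segmentation mode: a > threshold -> first preset value 1;
    otherwise (including a == threshold) -> second preset value 0."""
    return np.where(np.asarray(a) > threshold, 1, 0).astype(np.int8)

target = segment_quantize_activation([2.5, 0.0, 0.01])
```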
In some embodiments of the present disclosure, S103 may be implemented by S1031 to S1033; taking S103 in fig. 1 as an example, this will be described with reference to the steps shown in fig. 5.
And S1031, multiplying the target network parameters of the current network layer of the deep learning network by the target data to obtain a 2-bit output result.
In the embodiment of the disclosure, after obtaining the 1-bit integer type target network parameter and the 1-bit integer type target data, the terminal may multiply the target network parameter and the target data in the process of performing data processing on each network layer of the deep learning network, so as to obtain the output result of each network layer. Here, since the target network parameter and the target data are both 1-bit integer types, the output result obtained by multiplying is a 2-bit output result.
S1032, based on the first segmentation mode, carrying out segmentation quantization on the output result to obtain a sub-processing result of the second data type.
After obtaining the output result of each network layer, and before inputting it to the next network layer, the terminal can quantize the output result in the first segmentation mode: if the output result is greater than the first preset segmentation threshold, it is quantized into a sub-processing result of the first preset value; if the output result is equal to the first preset segmentation threshold, it is quantized into a sub-processing result of the second preset value. A 1-bit integer sub-processing result is thereby obtained, and the consistency of data types between the output result and the target network parameters of the network layer it is to be input into is maintained.
S1033, continuing to process the sub-processing result and the target network parameter of the next network layer of the deep learning network until the processing of at least one network layer is completed, and obtaining the processing result.
In this embodiment of the disclosure, when the terminal obtains the sub-processing result corresponding to the output result of the previous network layer and processes it with the next network layer, the sub-processing result is, as in S1031, multiplied by the target network parameter of the next network layer to obtain the 2-bit output result of that layer; this output result is then quantized by the method described in S1032 to obtain the sub-processing result corresponding to that layer, which is in turn input into the layer after it for continued processing. This is repeated until the output result of the last network layer of the deep learning network is obtained, at which point the processing of the target data by the deep learning network is completed.
Exemplarily, in the process of processing target data with the 3-layer deep learning network of fig. 2, the terminal multiplies the target network parameter corresponding to the 1st network layer by the target data to obtain the 2-bit output result of the 1st network layer, and then quantizes that output result in the first segmentation mode to obtain the 1-bit integer sub-processing result of the 1st network layer; the sub-processing result of the 1st network layer is input into the 2nd network layer, and the target network parameter corresponding to the 2nd network layer is multiplied by it to obtain the 2-bit output result of the 2nd network layer; that output result is then quantized in the first segmentation mode to obtain the 1-bit integer sub-processing result of the 2nd network layer; the sub-processing result of the 2nd network layer is input into the 3rd network layer and multiplied by the target network parameter corresponding to the 3rd network layer to obtain the 2-bit output result of the 3rd network layer, which is finally quantized in the first segmentation mode to obtain the 1-bit integer final processing result.
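The layer-by-layer loop above can be sketched as follows. Each elementwise product of a 1-bit weight and a 1-bit activation fits in 2 bits as the text describes; the accumulation here uses a wider integer, and the layer shapes and random weights are assumptions of this sketch.

```python
import numpy as np

def requantize(z):
    # first segmentation mode applied to a layer output: > 0 -> 1, else -> 0
    return (z > 0).astype(np.int8)

rng = np.random.default_rng(2)
x = (rng.normal(size=6) > 0).astype(np.int8)   # 1-bit target data
layers = [np.where(rng.normal(size=(6, 5)) > 0, 1, -1).astype(np.int8),
          np.where(rng.normal(size=(5, 4)) > 0, 1, -1).astype(np.int8),
          np.where(rng.normal(size=(4, 3)) > 0, 1, -1).astype(np.int8)]

h = x
for W in layers:
    z = h.astype(np.int32) @ W.astype(np.int32)  # integer multiply-accumulate
    h = requantize(z)                            # 1-bit sub-processing result
result = h                                       # final processing result
```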
The data processing scheme provided by the embodiment of the present disclosure may be applied to a practical application using a deep learning network, for example, classifying the input image information using a trained deep learning network, or recognizing the input image information using the deep learning network, or performing target detection on the input image information using the deep learning network. In addition, the data processing scheme provided by the embodiment of the disclosure can also be applied to the training process of the deep learning network.
Illustratively, fig. 6 is an alternative flowchart of a method for training an initial deep learning network provided by the disclosed embodiment, and will be described with reference to the steps shown in fig. 6.
S201, training sample data of a first data type is obtained.
In the embodiment of the present disclosure, before training the deep learning network, the terminal may first acquire sample data for training. In some embodiments, obtaining training sample data of the first data type may be implemented by S2011-S2012:
and S2011, acquiring a positive sample and a negative sample according to a preset configuration proportion.
In some actual detection or recognition scenarios, a certain proportion exists among various results, and therefore, in order to improve the detection or recognition accuracy of the deep learning network obtained through final training, the terminal may use the proportion as a configuration proportion, and perform quantity configuration of positive samples and negative samples through the configuration proportion.
And S2012, extracting the characteristics of the positive sample and the characteristics of the negative sample, and taking the extracted sample characteristics as the acquired training sample data.
After the number of the positive samples and the number of the negative samples are configured, the terminal can respectively extract the features of the positive samples and the negative samples, so that training sample data for training are obtained.
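A hypothetical sketch of S2011/S2012: configure the numbers of positive and negative samples according to a preset configuration proportion before feature extraction. The 3:1 proportion, the stand-in sample lists, and the truncation strategy are all assumptions of this example rather than details from the disclosure.

```python
def configure_samples(positives, negatives, pos_per_neg=3):
    """Keep pos_per_neg positive samples for every negative sample,
    truncating whichever side exceeds the configuration proportion."""
    n_neg = min(len(negatives), len(positives) // pos_per_neg)
    return positives[:n_neg * pos_per_neg], negatives[:n_neg]

pos = list(range(10))        # stand-ins for positive samples
neg = list(range(100, 104))  # stand-ins for negative samples
sel_pos, sel_neg = configure_samples(pos, neg)
```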
S202, carrying out segmented quantization on training sample data based on a first segmentation mode to obtain target training data of a second data type; and carrying out segmented quantization on the network parameters of at least one network layer of the initial deep learning network based on a second segmentation mode to obtain training network parameters of a second data type.
In the embodiment of the present disclosure, after the terminal acquires the training sample data, since the training sample data is usually multi-bit floating point type data, the terminal may quantize the training sample data in the first segmentation mode to obtain 1-bit integer type target training data. Meanwhile, in order to keep the data type of the deep learning network unified with the target training data, the terminal can quantize the network parameters of each network layer of the deep learning network according to the second segmentation mode, so that the 1-bit integer type training network parameters of each network layer are obtained.
In some embodiments of the present disclosure, the step of performing the segmentation quantization on the network parameters of at least one network layer of the initial deep learning network based on the second segmentation manner in S202 to obtain the training network parameters of the second data type may be implemented by S2021-S2022, and will be described with reference to the steps shown in fig. 7.
S2021, under the condition that the network parameter of at least one network layer of the initial deep learning network is larger than a second preset segmentation threshold value, quantizing the network parameter into a training network parameter of a third preset value in a segmentation mode; the second segmentation mode comprises the following steps: segmenting a third preset value and a fourth preset value based on a second preset segmentation threshold value; the third preset value and the fourth preset value are 1bit integer type.
S2022, under the condition that the network parameter of at least one network layer of the initial deep learning network is smaller than a second preset segmentation threshold value, quantizing the network parameter into a training network parameter of a fourth preset value in a segmentation mode.
In the embodiment of the disclosure, the terminal may quantize the network parameter through the second preset segmentation threshold, and the third preset value and the fourth preset value of the 1-bit integer, so as to obtain the training network parameter of the 1-bit integer.
In some embodiments, for example, the second preset segmentation threshold may be 0, the third preset value may be 1, and the fourth preset value may be -1; thus the terminal quantizes the network parameter into a training network parameter of 1 if the network parameter is greater than 0, and into a training network parameter of -1 if the network parameter is less than 0. That is, the second segmentation mode can be expressed by the following formula (4):

q(W) = 1, if W > 0;  q(W) = -1, if W < 0    (4)

wherein W represents the network parameter.
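A minimal sketch of the second segmentation mode of S2021/S2022 with the example values above (threshold 0, third preset value 1, fourth preset value -1). The text only specifies the greater-than and less-than cases; mapping values exactly equal to 0 to -1 is an assumption of this sketch.

```python
import numpy as np

def segment_quantize_weight(w, threshold=0.0):
    """Second segmentation mode: w > threshold -> third preset value 1;
    otherwise (including w == threshold) -> fourth preset value -1."""
    return np.where(np.asarray(w) > threshold, 1, -1).astype(np.int8)

w_int1 = segment_quantize_weight([0.3, -0.7, 1.2])
```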
In some embodiments of the present disclosure, in the process of applying the deep learning network, the first preset segmentation threshold is used to quantize the data to be processed, and the second preset segmentation threshold is used to quantize the network parameters of the deep learning network, such as the weight values and activation values; both thresholds may also be adjusted. For example, during application of the deep learning network, the first preset segmentation threshold may be recalculated each time data to be processed is acquired, and the data to be processed is then quantized with the recalculated threshold. For another example, during training of the deep learning network, the gradient values corresponding to the network parameters are quantized with the second preset segmentation threshold, which may be recalculated each time the network parameters are adjusted with the gradient values, so that the second preset segmentation threshold is adjusted dynamically.
S203, processing the target training data by using the training network parameters of the initial deep learning network to obtain a current training result.
Under the condition that the terminal obtains 1-bit integer type target training data and 1-bit integer type training network parameters, the 1-bit integer type target training data can be input into the initial deep learning network, and the target training data is processed through the training network parameters of the initial deep learning network to obtain a training result of one-time training. Here, in each training process, the processing procedure of the network layer of the initial deep learning network on the target training data is the same as the procedure described in S1031 to S1033 described above.
S204, continuously training the training network parameters of the initial deep learning network based on the current training result and the real result corresponding to the training sample data until the obtained loss is less than or equal to a preset loss threshold value, and determining the deep learning network containing the target network parameters.
In the embodiment of the disclosure, the terminal can train the deep learning network with the 1-bit integer target training data and training network parameters, thereby realizing the back-propagation process of training. After each training pass, the terminal may compare the training result with the real result corresponding to the training sample data (usually the real labeling result); when the comparison result does not satisfy a preset condition, the training network parameters of the initial deep learning network continue to be trained with the training sample data. This is repeated until, after some training pass, the loss calculated for that pass's training result is less than or equal to the preset loss threshold; the network parameters of each network layer obtained in that pass are then used as the target network parameters, yielding the deep learning network containing the target network parameters.
In some embodiments, the above-mentioned S204 can be implemented by S2041-S2044, which will be described below with reference to the steps shown in fig. 8.
S2041, loss calculation is carried out based on the current training result and a real result corresponding to the training sample data, and current loss of the first data type is obtained.
In the embodiment of the disclosure, after each training pass, the terminal may compare the training result with the real result corresponding to the training sample data, where the training result corresponds to a predicted value and the real result corresponds to a real value; from this comparison of the output of the initial deep learning network with the real result, the loss of the predicted value relative to the real value can be calculated, giving the current loss for that pass.
S2042, under the condition that the current loss is larger than a preset loss threshold value, determining a gradient value of a training network parameter of each network layer based on the current loss; the gradient values are of a first data type.
The terminal can compare the current loss of each training pass with the preset loss threshold; when the current loss is greater than the preset loss threshold, the gradient value of the training network parameter of each network layer is calculated from the current loss, and these gradient values are subsequently used to update the training network parameters of each network layer of the initial deep learning network to obtain new training network parameters.
S2043, carrying out segmented quantization on the gradient values based on the second segmentation mode to obtain the current gradient value of the second data type.
In the embodiment of the present disclosure, since the obtained gradient value is data of the first data type with higher data accuracy, the terminal may quantize the gradient value from the first data type to the second data type in the second segmentation manner to reduce the data amount.
S2044, updating the training network parameters of each corresponding network layer with the current gradient values, and continuing to train the training network parameters of the initial deep learning network until the target network parameters are obtained, so as to obtain the deep learning network; the target network parameters are such that, after they process the target training data, the obtained loss is less than or equal to the preset loss threshold.
In the embodiment of the present disclosure, upon obtaining the gradient values of the second data type, the terminal may use them to update the training network parameters of each network layer to obtain new training network parameters, continue processing the target training data with the new parameters to obtain the current training result of that pass, and return to S2041-S2044. This is repeated until the current loss is less than or equal to the preset loss threshold, at which point the training network parameters corresponding to that loss are used as the target network parameters, yielding the deep learning network and completing the back-propagation training of the initial deep learning network.
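A heavily simplified, hypothetical sketch of the S2041-S2044 loop: compute a loss, derive FP32 gradients, quantize them with the second segmentation mode (sign), and update the training network parameters until the loss drops below a preset threshold. A one-layer linear model stands in for the initial deep learning network; the data, learning rate, and threshold are assumptions of this sketch.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(32, 4)).astype(np.float32)      # training sample data
true_w = np.array([1.0, -2.0, 0.5, 3.0], dtype=np.float32)
T = X @ true_w                                       # "real results"

w = np.zeros(4, dtype=np.float32)                    # training network parameters
lr, loss_threshold = 0.05, 0.5
for step in range(200):
    pred = X @ w                                     # current training result
    loss = float(np.mean((pred - T) ** 2))           # S2041: current loss (FP32)
    if loss <= loss_threshold:                       # S204 stop condition
        break
    g_fp32 = 2.0 * X.T @ (pred - T) / len(X)         # S2042: FP32 gradient values
    g_int1 = np.where(g_fp32 > 0, 1, -1)             # S2043: second segmentation
    w -= lr * g_int1                                 # S2044: update parameters
```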
In the training process, the target training data of the second data type has less information quantity relative to the training sample data of the first data type, and the network parameters of the second data type have less information quantity relative to the target network parameters of the first data type, so that the training efficiency of the initial deep learning network can be improved. It should be noted that the above steps shown in fig. 6 to 8 may be executed before S103, or may also be executed before S101.
Fig. 9 is a schematic flowchart of an exemplary training process of the network parameters of each network layer during the training of the initial deep learning network according to the embodiment of the present disclosure. As shown in fig. 9, the deep learning network is composed of a convolutional layer; the target training data may be training sample data of the 32-bit floating-point type together with its quantized 1-bit integer data (not shown in fig. 9), and the calculated gradient value of the first data type may be a 32-bit floating-point gradient value, denoted g_z(FP32). g_z(FP32) can first be quantized in the second segmentation mode into a 1-bit integer gradient value (the current gradient value, denoted g_z(INT1) in fig. 9); then g_z(INT1) can be used to adjust and update the network parameters of the convolutional layer, such as the activation parameter and the weight parameter, where g_a(INT1) may represent the gradient value corresponding to the updated activation parameter and g_W(INT1) the gradient value corresponding to the updated weight parameter. g_a(INT1) and g_W(INT1) are both of the 1-bit integer type; to satisfy some storage or precision requirements, g_a(INT1) and g_W(INT1) may be converted into 32-bit floating-point g_a(FP32) and g_W(FP32).
In the training process of the initial deep learning network, the back-propagation process of the convolutional network can be realized by calling instructions in an integer-type calculation library, so that the calculation cost of the data type transformation (quantization) is optimized; moreover, the integer-type calculation library can be maintained periodically, which reduces the time cost of updating it and further accelerates the processing speed of the deep learning network. Substantial acceleration can be achieved in the back propagation of the convolutional layer, significantly improving the training speed of the deep learning network; in tasks with high precision requirements, such as image classification and target detection, accuracy close to full precision can be achieved.
It is understood that the various method embodiments mentioned above in this disclosure may be combined with each other to form combined embodiments without departing from the principle logic.
The present disclosure further provides a data processing apparatus, and fig. 10 is a schematic structural diagram of the data processing apparatus provided in the embodiment of the present disclosure; as shown in fig. 10, the data processing apparatus 1 includes: an acquiring unit 10, configured to acquire to-be-processed data of a first data type; a quantization unit 20, configured to perform segmented quantization on the to-be-processed data based on a first segmentation mode to obtain target data of a second data type; the second data type is a 1bit integer type with data precision smaller than that of the first data type; the processing unit 30 is configured to process the target data by using a deep learning network to obtain a processing result of at least one network layer of the deep learning network; the target network parameter in the at least one network layer is a 1bit integer obtained by carrying out segmentation quantization on the network parameter based on a second segmentation mode.
In some embodiments, the data to be processed comprises at least one of: the image information, the voice information, the intermediate information obtained based on the image information and the intermediate information obtained based on the voice information.
In some embodiments, the first segmentation means comprises: segmenting a first preset value and a second preset value based on a first preset segmentation threshold value; the first preset value and the second preset value are 1bit integer types; the quantizing unit 20 is further configured to quantize the to-be-processed data into the target data of the first preset value in a segmented manner if the to-be-processed data is greater than the first preset segmentation threshold; and under the condition that the data to be processed is equal to the first preset segmentation threshold value, quantizing the data to be processed into the target data of the second preset value in a segmentation mode.
In some embodiments, the processing unit 30 is further configured to multiply the target network parameter of the current network layer of the deep learning network by the target data to obtain an output result of 2 bits; based on the first segmentation mode, carrying out segmentation quantization on the output result to obtain a sub-processing result of a second data type; and continuously processing the sub-processing result and the target network parameter of the next network layer of the deep learning network until the processing result is obtained under the condition that the processing of the at least one network layer is completed.
In some embodiments, the obtaining unit 10 is further configured to obtain training sample data of the first data type; the quantization unit 20 is further configured to perform segmented quantization on the training sample data based on the first segmentation mode to obtain target training data of a second data type; carrying out segmentation quantization on the network parameters of at least one network layer of the initial deep learning network based on the second segmentation mode to obtain training network parameters of a second data type; the apparatus further includes a training unit 40 (not shown in the figure) configured to process the target training data by using the training network parameters of the initial deep learning network to obtain a current training result; continuously training the training network parameters of the initial deep learning network based on the current training result and the real result corresponding to the training sample data until the obtained loss is less than or equal to a preset loss threshold value, and determining the deep learning network containing target network parameters.
In some embodiments, the training unit 40 is further configured to perform a loss calculation based on the current training result and the ground-truth result corresponding to the training sample data to obtain a current loss of the first data type; to determine, when the current loss is greater than the preset loss threshold, a gradient value of the training network parameter of each network layer based on the current loss, the gradient value being of the first data type; to perform segmented quantization on the gradient value based on the second segmentation mode to obtain a current gradient value of the second data type; and to update the training network parameters of each corresponding network layer with the current gradient value, continuing the training of the training network parameters of the initial deep learning network until the target network parameters are obtained, so as to obtain the deep learning network. Processing the target training data with the target network parameters yields a loss less than or equal to the preset loss threshold.
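The training flow above can be sketched as a loop: forward pass with 1-bit weights, floating-point loss and gradients, and 1-bit (sign) quantization of each gradient before the update. The linear model, squared loss, learning rate, and ±1 encodings are illustrative assumptions, not part of the disclosure.

```python
def train_until_converged(weights, samples, targets,
                          lr=0.1, loss_limit=1e-2, max_steps=200):
    """Sketch of training with segment-quantized gradients."""
    loss = float("inf")
    for _ in range(max_steps):
        # Forward pass with weights quantized by the second segmentation
        # mode (assumed here: > 0 -> +1, otherwise -1).
        qw = [1 if w > 0 else -1 for w in weights]
        preds = [sum(q * x for q, x in zip(qw, s)) for s in samples]
        # Current loss of the first (floating-point) data type.
        loss = sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(samples)
        if loss <= loss_limit:
            break
        # Floating-point gradient of the squared loss per weight.
        grads = [sum(2 * (p - t) * s[i]
                     for p, t, s in zip(preds, targets, samples)) / len(samples)
                 for i in range(len(weights))]
        # Quantize each gradient to 1 bit with the second segmentation
        # mode before updating the latent weights.
        qgrads = [1 if g > 0 else -1 for g in grads]
        weights = [w - lr * g for w, g in zip(weights, qgrads)]
    return weights, loss
```

With `weights=[-0.5]`, `samples=[[1.0]]`, `targets=[1.0]`, the quantized gradient nudges the latent weight upward each step until its sign flips and the loss falls below the preset threshold.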
In some embodiments, the second segmentation mode comprises: segmenting into a third preset value and a fourth preset value based on a second preset segmentation threshold, where the third preset value and the fourth preset value are 1-bit integer types. The quantization unit 20 is further configured to quantize the network parameter into the training network parameter of the third preset value when the network parameter of at least one network layer of the initial deep learning network is greater than the second preset segmentation threshold, and to quantize the network parameter into the training network parameter of the fourth preset value when the network parameter is smaller than the second preset segmentation threshold.
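One practical consequence of the 1-bit parameter type can be sketched by packing the quantized parameters into bits; the zero threshold and the bit encoding (bit 1 for values above the threshold) are assumptions for illustration.

```python
def pack_weights(weights, threshold=0.0):
    """Quantize float weights with the second segmentation mode and pack
    the resulting 1-bit values into a single integer, one bit per
    weight, illustrating the storage reduction over a float type."""
    bits = 0
    for i, w in enumerate(weights):
        if w > threshold:       # third preset value -> bit set to 1
            bits |= 1 << i      # fourth preset value -> bit left at 0
    return bits
```

For example, `pack_weights([0.7, -0.2, 1.3, -0.9])` sets bits 0 and 2, producing `0b0101`, i.e. 5: four parameters stored in four bits rather than four multi-bit floats.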
In some embodiments, the first data type is a multi-bit floating point type.
Fig. 11 is a schematic structural diagram of the electronic device provided in the embodiment of the present disclosure. As shown in Fig. 11, the electronic device 2 includes a memory 22 and a processor 23, where the memory 22 and the processor 23 are connected by a bus 21. The memory 22 is configured to store an executable computer program; the processor 23 is configured to implement the method provided by the embodiment of the present disclosure, for example the data processing method provided by the embodiment of the present disclosure, when executing the executable computer program stored in the memory 22.
The present disclosure provides a computer-readable storage medium, which stores a computer program for causing the processor 23 to execute the method provided by the present disclosure, for example, the data processing method provided by the present disclosure.
In some embodiments of the present disclosure, the computer-readable storage medium may be memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash memory, magnetic surface memory, optical disk, or CD-ROM; or may be various devices including one or any combination of the above memories.
The computer-readable storage medium may also be a tangible device that retains and stores instructions for use by an instruction execution device, and may be a volatile or non-volatile storage medium. The computer-readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include the following: a USB flash drive, a magnetic disk, an optical disk, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punch card or a raised structure in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer-readable storage medium, as used herein, is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., light pulses passing through fiber-optic cables), or electrical signals transmitted through electrical wires.
In some embodiments of the disclosure, executable instructions may be written in any form of programming language (including compiled or interpreted languages), in the form of programs, software modules, scripts, or code, and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
By way of example, executable instructions may correspond, but do not necessarily have to correspond, to files in a file system, and may be stored in a portion of a file that holds other programs or data, for example in one or more scripts in a Hypertext Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
By way of example, executable instructions may be deployed to be executed on one computing device or on multiple computing devices at one site or distributed across multiple sites and interconnected by a communication network.
In summary, according to the above technical scheme, to-be-processed data of a first data type is obtained and quantized, based on a first segmentation mode, into target data of a 1-bit integer type whose data precision is smaller than that of the first data type; the target data is then processed by a deep learning network whose target network parameters are 1-bit integers obtained by performing segmented quantization on the network parameters based on a second segmentation mode, and the corresponding processing result is obtained. In this way, the data volume of the data to be processed is reduced, and the data processing efficiency of the deep learning network is improved.
The above description is only an example of the present disclosure, and is not intended to limit the scope of the present disclosure. Any modification, equivalent replacement, and improvement made within the spirit and scope of the present disclosure are included in the protection scope of the present disclosure.