Model training method, device, equipment and computer storage medium
1. A model training method, comprising:
acquiring a first model obtained by training;
initializing parameters of a second model by using the parameters of the first model;
training the second model by using a preset training target to iteratively update parameters of the second model;
wherein the second model is larger in scale than the first model, and the first and second models are of the same type.
2. The method of claim 1, wherein initializing parameters of a second model using parameters of the first model comprises:
taking the parameter values of the first model as initial values of the first part of parameters of the second model;
and filling the remaining second part of parameters of the second model according to a preset strategy.
3. The method of claim 2, wherein taking the parameter values of the first model as initial values of the first part of parameters of the second model comprises:
sequentially and correspondingly filling parameter values of the first model from the lowest network layer and the lowest dimension of the second model; or,
and filling the parameters in the first model to the corresponding parameter positions in the second model according to the corresponding relation of the same network layer type.
4. The method of claim 2, wherein filling the remaining second portion of parameters of the second model according to a preset strategy comprises at least one of:
filling the second part of parameters by adopting random numbers;
copying the initial values of the first part of parameters to the positions of other dimensions of the same level in the second part of parameters, and filling the remaining positions in the second part of parameters with zero;
copying the initial values of the first part of parameters to the positions of other levels with the same dimensionality in the second part of parameters, and filling the remaining positions in the second part of parameters with random numbers;
after copying the initial values of the first part of parameters to the positions of other levels of the same dimension in the second part of parameters, correspondingly copying the parameters having initial values in the second model to the remaining positions of the same level in the second part of parameters, and adding random noise to the parameter values at those remaining positions.
5. The method of claim 4, wherein if the parameter dimension of the second model is not an integer multiple of that of the first model, the first, third or fourth mode is adopted;
and if the expected training time of the second model is less than or equal to a preset time threshold, the second mode is adopted.
6. The method of any of claims 1-5, wherein the first and second models are both pre-trained language models.
7. A model training apparatus comprising:
the model obtaining unit is used for obtaining a trained first model;
the initialization unit is used for initializing the parameters of the second model by using the parameters of the first model;
the model training unit is used for training the second model by utilizing a preset training target so as to iteratively update the parameters of the second model;
wherein the second model is larger in scale than the first model, and the first and second models are of the same type.
8. The apparatus according to claim 7, wherein the initialization unit is specifically configured to use the parameter values of the first model as initial values of the first part of parameters of the second model, and to fill the remaining second part of parameters of the second model according to a preset strategy.
9. The apparatus according to claim 8, wherein the initialization unit, when taking the parameter values of the first model as initial values of the first partial parameters of the second model, is specifically configured to:
sequentially and correspondingly filling parameter values of the first model from the lowest network layer and the lowest dimension of the second model; or,
and filling the parameters in the first model to the corresponding parameter positions in the second model according to the corresponding relation of the same network layer type.
10. The apparatus according to claim 8, wherein the initialization unit, when filling the remaining second part of parameters of the second model according to a preset policy, specifically adopts at least one of the following manners:
filling the second part of parameters by adopting random numbers;
copying the initial values of the first part of parameters to the positions of other dimensions of the same level in the second part of parameters, and filling the remaining positions in the second part of parameters with zero;
copying the initial values of the first part of parameters to the positions of other levels with the same dimensionality in the second part of parameters, and filling the remaining positions in the second part of parameters with random numbers;
after copying the initial values of the first part of parameters to the positions of other levels of the same dimension in the second part of parameters, correspondingly copying the parameters having initial values in the second model to the remaining positions of the same level in the second part of parameters, and adding random noise to the parameter values at those remaining positions.
11. The apparatus of claim 10, wherein the initialization unit adopts the first, third or fourth mode if the parameter dimension of the second model is not an integer multiple of that of the first model;
and the initialization unit adopts the second mode if the expected training time of the second model is less than or equal to a preset time threshold.
12. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.
13. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-6.
14. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-6.
Background
With the continuous development of deep learning and natural language processing technologies in recent years, the paradigm of pre-training a model on large-scale corpora and then fine-tuning it on downstream tasks has gradually become a classic framework. Improvements in the effect of pre-training models are usually accompanied by rapid growth in data volume and model parameter scale, which has expanded from the initial hundred-million level to the billion level and continues to grow.
The cost of training a large-scale pre-training model from scratch is enormous, which presents a significant challenge in terms of both time and labor costs. Therefore, how to perform model training efficiently and at low cost has become an urgent problem to be solved.
Disclosure of Invention
In view of the above, the present disclosure provides a model training method, apparatus, device and computer storage medium, so as to improve the efficiency of model training and reduce the cost.
According to a first aspect of the present disclosure, there is provided a model training method, comprising:
acquiring a first model obtained by training;
initializing parameters of a second model by using the parameters of the first model;
training the second model by using a preset training target to iteratively update parameters of the second model;
wherein the second model is larger in scale than the first model, and the first and second models are of the same type.
According to a second aspect of the present disclosure, there is provided a model training apparatus comprising:
the model obtaining unit is used for obtaining a trained first model;
the initialization unit is used for initializing the parameters of the second model by using the parameters of the first model;
the model training unit is used for training the second model by utilizing a preset training target so as to iteratively update the parameters of the second model;
wherein the second model is larger in scale than the first model, and the first and second models are of the same type.
According to a third aspect of the present disclosure, there is provided an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as described above.
According to a fourth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method as described above.
According to a fifth aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method as described above.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a flow chart of a model training method provided by an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a model parameter matrix provided by an embodiment of the present disclosure;
FIG. 3a and FIG. 3b are schematic diagrams of two manners of initializing the first part of parameters provided by an embodiment of the present disclosure;
FIG. 4a to FIG. 4d are schematic diagrams of four manners of initializing the second part of parameters provided by an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a model training apparatus provided in an embodiment of the present disclosure;
FIG. 6 is a block diagram of an electronic device used to implement embodiments of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Since a well-trained small model is relatively easy to obtain, if the knowledge already learned by the small model can be utilized in the process of training a large model, the convergence of the large model can be accelerated.
If a traditional knowledge distillation method is adopted, the small model serves as the Teacher model and the large model as the Student model. During training, after the parameters of the large model are randomly initialized, the large model fits the output distribution of the small model while learning the training task, thereby obtaining the knowledge of the small model to help the training of the large model and accelerate convergence. However, this conventional knowledge distillation method has the following drawbacks:
1) Only the training target provided by the small model is fused into the training process; the knowledge of the small model is not learned directly, so the improvement to the effect of the large model is limited.
2) As training progresses, the capability of the large model gradually exceeds the upper limit of the small model's capability, so the small model gradually becomes a constraint on training the large model. To prevent this, the weight of the small model in training usually needs to be adjusted dynamically, which further increases training complexity.
3) Both models need to be used simultaneously during training, which increases the pressure on GPU memory.
In view of the above drawbacks, the present disclosure does not adopt the conventional knowledge distillation method, but takes a new approach to accelerate the training of a large model based on an already trained small model. FIG. 1 is a flowchart of a model training method provided by an embodiment of the present disclosure. As shown in FIG. 1, the method may include the following steps:
in 101, a trained first model is obtained.
In 102, parameters of a second model are initialized with the parameters of the first model, wherein the second model is larger in scale than the first model and the first and second models are of the same type.
In 103, the second model is trained using a preset training target to iteratively update the parameters of the second model.
As can be seen from the embodiment shown in FIG. 1, the parameters of the small model are fully utilized, in a manner of model parameter expansion, for the parameter initialization of the large model. Training of the large model is thus accelerated on the basis of the trained small model, the training efficiency of the large model is improved, and time and labor costs are reduced.
The above step 102 is described in detail with reference to the following embodiments.
The first and second models referred to in this disclosure are of the same type and may be various deep learning models, including but not limited to: a Transformer-based pre-training language model, a classification model based on CNN (Convolutional Neural Networks), a ranking model based on DNN (Deep Neural Networks), and the like.
In the present disclosure, the first model is a model that has been trained in advance and is smaller in scale than the second model. A deep learning model is composed of a plurality of network layers; the parameters in each network layer have certain feature dimensions, which are determined by the input and output dimensions. Therefore, the scale of a model is reflected in its number of network layers and its feature dimensions. Generally, a larger-scale model has more network layers and/or larger feature dimensions.
For example, if the parameters of a model are expressed in matrix form, the network levels to which the parameters belong are divided along the vertical direction, and the feature dimensions of the parameters are divided along the horizontal direction. As shown in FIG. 2, a model with 4 network layers can be represented as a matrix illustrated by 4 long rectangles, each of which contains the parameters of one network layer, with the dimensions determined by the dimensions of the model's input and output.
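For illustration only, the layer-by-layer, dimension-by-dimension view of FIG. 2 can be sketched in code. The following is a minimal toy representation and is not part of the claimed method; the names `make_model`, `small`, and `large` are hypothetical, and each network layer is modeled as a one-dimensional NumPy vector whose length is the feature dimension.

```python
import numpy as np

def make_model(num_layers, dim, rng=None):
    """Toy stand-in for a model's parameters: one vector per network layer,
    laid out along the feature dimension as in FIG. 2."""
    rng = rng if rng is not None else np.random.default_rng(0)
    return [rng.standard_normal(dim) for _ in range(num_layers)]

# A smaller "first model" (assumed already trained) and a larger "second model".
small = make_model(num_layers=2, dim=4)
large = make_model(num_layers=4, dim=8)
```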
In the present disclosure, the parameters of the second model are initialized differently from the conventional way: instead of randomizing all the parameters, the parameters of the second model are initialized with the parameters of the already trained first model, so that the knowledge of the first model is inherited and "digested".
When applying the parameters of the first model to the parameter initialization of the second model, the feature dimensions and the number of network layers of the first model (the small model) are usually smaller than those of the second model (the large model). Therefore, in the present disclosure, step S1 may be performed first: taking the parameter values of the first model as the initial values of the first part of parameters of the second model. Then, step S2 is performed: filling the remaining second part of parameters of the second model according to a preset strategy.
It should be noted that, in the expressions such as "first" and "second" in the embodiments of the present disclosure, for example, "first" and "second" in "first model", "second model", "first partial parameter", "second partial parameter", and the like are not limited in terms of number, order, and the like, and are merely used to distinguish the models or partial parameters in terms of names.
Step S1 may include, but is not limited to, the following two modes:
The first mode: sequentially and correspondingly filling the parameter values of the first model starting from the lowest network layer and the lowest dimension of the second model.
This mode is applicable to the case where the network layer types of the first model and the second model are the same. For example, in a Transformer-based pre-training language model, each network layer is a Transformer layer. Starting from the lowest network layer and the lowest dimension of the second model, the first model may be mapped one by one to the corresponding positions of the second model, and the parameter values of the first model are used to fill the mapped parameter positions in the second model. As shown in FIG. 3a, this can be regarded as filling the parameter values of the first model into the lower-left corner of the second model in a one-to-one correspondence. The left-slash shading in FIG. 3a indicates the parameter values from the first model.
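As a non-limiting illustration of this first mode, the sketch below copies the toy first-model vectors from the earlier example into the lowest layers and lowest dimensions of the second model (the lower-left corner of FIG. 3a); `fill_lower_left` is a hypothetical helper, not an interface defined by the disclosure.

```python
import numpy as np

def fill_lower_left(small, large):
    """Step S1, mode one: fill the first model's values into the second model,
    starting from the lowest network layer and the lowest feature dimension."""
    for layer_idx, small_layer in enumerate(small):
        d = len(small_layer)
        large[layer_idx][:d] = small_layer  # one-to-one copy into the lower-left corner
    return large
```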
The second mode: filling the parameters of the first model into the corresponding parameter positions in the second model according to the correspondence of identical network layer types and identical feature dimensions.
This mode applies to the case where the network layer types of the first model and the second model are not all the same. For example, for a classification model, the network layers may include a CNN layer, a Softmax layer, and so on. The first model then needs to be mapped according to identical network layer types. As shown in FIG. 3b, assuming that level 1 of the first model and level 1 of the second model are of the same network layer type, and that level 2 of the first model and level 3 of the second model are of the same network layer type, the parameter values of level 1 of the first model are filled into level 1 of the second model according to the correspondence of feature dimensions, and the parameter values of level 2 of the first model are filled into level 3 of the second model according to the correspondence of feature dimensions.
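A sketch of this second mode under the same toy representation is given below; the layer-type lists mirror the example of FIG. 3b and are purely hypothetical, as is the helper name.

```python
import numpy as np

def fill_by_layer_type(small, small_types, large, large_types):
    """Step S1, mode two: map each layer of the first model to the first unused
    layer of the second model with the same network layer type, then fill the
    corresponding feature dimensions."""
    used = set()
    for s_idx, s_type in enumerate(small_types):
        for l_idx, l_type in enumerate(large_types):
            if l_type == s_type and l_idx not in used:
                d = len(small[s_idx])
                large[l_idx][:d] = small[s_idx]  # fill by feature-dimension correspondence
                used.add(l_idx)
                break
    return large

# In the FIG. 3b example, level 1 of both models share a type, and level 2 of the
# first model matches level 3 of the second model.
small_types = ["cnn", "softmax"]
large_types = ["cnn", "cnn", "softmax", "softmax"]
```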
It should be noted that, in both of the above modes, the second model may also be expanded relative to the first model in the feature dimension, that is, some feature dimensions that the first model does not have are added. Therefore, when the parameters of the first model are filled in step S1, they need to be filled according to the correspondence of feature dimensions. In FIG. 3a and FIG. 3b, it is assumed that the feature dimensions on the right side of the second model are those added relative to the first model, while the feature dimensions on the left side are consistent with the first model.
In the above step S2, the remaining second part of parameters of the second model may be filled according to a preset strategy in, but not limited to, the following four modes:
The first mode: filling the second part of parameters with random numbers.
As shown in FIG. 4a, the remaining second part of parameters, other than the already initialized first part of parameters, may be filled with random numbers. This mode is suitable for any scenario and preserves the degree of freedom of the large model's parameters while making use of the small model's parameters. That is, it converges faster than a full cold start (all parameters randomized) while retaining a sufficiently high capacity ceiling.
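Continuing the toy sketch, and assuming the lower-left layout of FIG. 3a in which the first `filled_layers` levels have their first `filled_dim` dimensions already initialized, the first mode can be illustrated as follows; the helper name and its arguments are hypothetical.

```python
import numpy as np

def fill_rest_random(large, filled_layers, filled_dim, rng=None):
    """Step S2, mode one: every position not initialized from the first model
    is filled with a random number."""
    rng = rng if rng is not None else np.random.default_rng(1)
    for l_idx, layer in enumerate(large):
        # Already-initialized levels keep their first `filled_dim` entries;
        # everything else is randomized.
        start = filled_dim if l_idx < filled_layers else 0
        layer[start:] = rng.standard_normal(len(layer) - start)
    return large
```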
The second mode: copying the initial values of the first part of parameters to the positions of other dimensions of the same level in the second part of parameters, and filling the remaining positions in the second part of parameters with zero.
For the second part of parameters, a horizontal copy may be made within the same level, that is, the initial values of the first part of parameters are horizontally copied to the other feature dimensions within the same level. As shown in FIG. 4b, in the parameters of the second model, the initial values of the first part of parameters in level 1 and level 2, respectively, are copied to the other dimensions in the horizontal direction. FIG. 4b takes as an example the case where the feature dimension of the second model is 2 times that of the first model, so the initial values of the first part of parameters are horizontally copied once; if the multiple is greater than 2, they are horizontally copied multiple times. After the horizontal copying is completed, the remaining positions of the other levels are filled with zero.
Because the output of the small model (i.e., the first model) is simulated as closely as possible in the feature dimension, the large model (i.e., the second model) can perform well in the early stage of training and converges quickly. This mode may therefore be adopted when there is a strict requirement on the expected training time of the second model, for example, when the expected time is less than or equal to a preset time threshold.
However, the degree of freedom of the model parameters is reduced to some extent, because a large number of parameters are set to zero and these zeroed parameters are updated uniformly during gradient updates. Moreover, this mode has the limitation that the feature dimension of the large model must be an integer multiple of that of the small model; otherwise, only the other modes can be adopted.
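A sketch of this second mode under the same toy assumptions (and assuming the feature dimension of the second model is an integer multiple of the first model's, as required above) is given below; `np.tile` simply repeats the already-initialized block sideways.

```python
import numpy as np

def fill_rest_copy_horizontal(large, filled_layers, filled_dim):
    """Step S2, mode two: within each already-initialized level, tile the initial
    values across the remaining feature dimensions; zero-fill all other levels."""
    for l_idx, layer in enumerate(large):
        if l_idx < filled_layers:
            reps = len(layer) // filled_dim      # requires an integer dimension multiple
            layer[:] = np.tile(layer[:filled_dim], reps)
        else:
            layer[:] = 0.0                       # remaining levels are set to zero
    return large
```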
The third mode: copying the initial values of the first part of parameters to the positions of other levels with the same dimensions in the second part of parameters, and filling the remaining positions in the second part of parameters with random numbers.
This is, in effect, to copy the initial values of the first part of parameters vertically between levels, to the same feature dimensions of other levels. As shown in FIG. 4c, in the parameters of the second model, the initial values of the first part of parameters in level 1 and level 2 are vertically copied to level 3 and level 4, respectively, with the dimensions kept in correspondence during the copy. In FIG. 4c, the number of levels of the second model is 2 times that of the first model; if the multiple is greater than 2, the values are vertically copied multiple times. In addition, this mode imposes no strict limitation on the level multiple between the second model and the first model, and is applicable even when the multiple is not an integer. For example, if the second model has 5 levels and the first model has 2 levels, the first 4 levels of the second model may be copied as shown in FIG. 4c, and the 5th level may then be copied using the parameters of any one level of the first model.
After the above vertical copying, the remaining positions may be filled with random numbers, that is, the parameters of the remaining dimensions in each level are filled with random numbers.
This mode preserves the parameter freedom and convergence behavior of the large model while fully utilizing the parameters of the small model.
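The third mode can be sketched in the same toy setting; note that the vertical-copy source wraps around with a modulo, mirroring the remark above that the level count need not be an integer multiple. The helper name is hypothetical.

```python
import numpy as np

def fill_rest_copy_vertical(large, filled_layers, filled_dim, rng=None):
    """Step S2, mode three: copy the initialized levels to the other levels
    (same feature dimensions), then random-fill the leftover dimensions."""
    rng = rng if rng is not None else np.random.default_rng(2)
    for l_idx, layer in enumerate(large):
        if l_idx >= filled_layers:
            src = large[l_idx % filled_layers]   # any initialized level may serve as source
            layer[:filled_dim] = src[:filled_dim]
        layer[filled_dim:] = rng.standard_normal(len(layer) - filled_dim)
    return large
```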
The fourth mode: after copying the initial values of the first part of parameters to the positions of other levels with the same dimensions in the second part of parameters, correspondingly copying the parameters that already have initial values in the second model to the remaining positions of the same level in the second part of parameters, and adding random noise to the parameter values at those remaining positions.
This mode is somewhat similar to the third mode in that the initial values of the first part of parameters are first copied vertically between levels to the same feature dimensions of other levels; see the description of the third mode. The difference is that, after the vertical copying is completed, a horizontal copy is further performed within each level, copying the initialized parameters to the positions of the remaining dimensions. As shown in FIG. 4d, taking as an example the case where the feature dimension of the second model is 2 times that of the first model, after the vertical copying between levels is completed, the parameter values of the first-half dimensions of the second model are all horizontally copied to the second half, and random noise is then added to the parameter values of the second half; the random noise is schematically represented by black dots.
This mode further improves the convergence speed of the second model in the early stage and has an advantage in convergence effect. In addition, the added random noise compensates for the loss of parameter freedom caused by parameter copying.
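A sketch of the fourth mode follows, again under the toy representation. For simplicity it assumes an integer feature-dimension multiple, although the claims do not require this for the fourth mode; `noise_scale` is a hypothetical hyperparameter.

```python
import numpy as np

def fill_rest_copy_with_noise(large, filled_layers, filled_dim,
                              noise_scale=0.01, rng=None):
    """Step S2, mode four: vertical copy between levels as in mode three, then
    copy the initialized dimensions sideways within each level and perturb the
    copied values with small random noise."""
    rng = rng if rng is not None else np.random.default_rng(3)
    for l_idx, layer in enumerate(large):
        if l_idx >= filled_layers:               # vertical copy to the deeper levels
            layer[:filled_dim] = large[l_idx % filled_layers][:filled_dim]
        reps = len(layer) // filled_dim          # horizontal copy within the level
        layer[:] = np.tile(layer[:filled_dim], reps)
        # Random noise is added only to the copied (second-half) positions.
        layer[filled_dim:] += noise_scale * rng.standard_normal(len(layer) - filled_dim)
    return large
```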
The above step 103 is described in detail with reference to the following embodiments.
The first model and the second model may employ the same training objective, that is, the same loss is set to iteratively update the parameters. There may be some differences and improvements in how the loss is set, but in principle the overall idea of the training objectives adopted by the two models is consistent.
The first and second models may use the same training data set or may use separate training data sets.
In the present disclosure, no knowledge distillation from the first model is performed during the training of the second model; that is, the training of the second model is an independent process carried out without the participation of the first model. The first model provides the "knowledge" for training the second model only through parameter initialization. This avoids the limitation in conventional knowledge distillation that the capability of the large model is bounded by the small model, avoids the memory pressure of keeping both models in memory during training, and reduces the overhead in hardware and time.
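The absence of any teacher or distillation term can be made concrete with a minimal training sketch, assuming PyTorch-style interfaces. All names here are hypothetical; `training_objective` stands for whatever loss the first model was trained with (for a pre-training language model, typically a masked-language-modeling loss).

```python
import torch

def train_second_model(second_model, data_loader, training_objective, steps, lr=1e-4):
    """Step 103: the second model is trained on its own after initialization;
    the loss contains no term that depends on the first model."""
    optimizer = torch.optim.AdamW(second_model.parameters(), lr=lr)
    second_model.train()
    data_iter = iter(data_loader)                    # assumes at least `steps` batches
    for _ in range(steps):
        batch = next(data_iter)
        loss = training_objective(second_model, batch)  # same kind of objective as the first model
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return second_model
```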
As mentioned in the above embodiments, the methods provided by the embodiments of the present disclosure can be applied to the training of any deep learning model. One application scenario is given here:
Suppose the training of a pre-training language model with a parameter scale at the hundred-million level has been completed, and the model needs to be expanded to the billion level as the requirement on its effect increases. The hundred-million-level pre-training language model can then be used as the first model, and its parameters are used to initialize the billion-level pre-training language model to be trained, namely the second model. The second model is then trained on its own based on its initialized model parameters. The model types of the hundred-million-level and billion-level pre-training language models are the same, for example, both are Transformer-based pre-training language models, but the number of network levels and the feature dimensions of the billion-level language model are larger than those of the hundred-million-level model. Both use the same training objective, for example, minimizing the difference between the predicted result and the true value of masked text. In addition, the billion-level language model may employ larger-scale training data.
The above is a detailed description of the method provided by the present disclosure, and the following is a detailed description of the apparatus provided by the present disclosure with reference to the embodiments.
FIG. 5 is a schematic diagram of a model training apparatus provided by an embodiment of the present disclosure. The apparatus may be an application located at a server end, or a functional unit such as a Software Development Kit (SDK) or a plug-in within an application at the server end, or it may be located in a computer terminal with relatively high computing power; this is not particularly limited in the embodiments of the present disclosure. As shown in FIG. 5, the apparatus 500 may include: a model obtaining unit 501, an initialization unit 502, and a model training unit 503. The main functions of each component unit are as follows:
a model obtaining unit 501, configured to obtain a trained first model.
An initialization unit 502, configured to initialize parameters of the second model with parameters of the first model. The scale of the second model is larger than that of the first model, and the types of the first model and the second model are the same.
The model training unit 503 is configured to train the second model to iteratively update parameters of the second model by using a preset training target.
As one implementation, the initialization unit 502 may take the parameter values of the first model as initial values of the first part of parameters of the second model, and fill the remaining second part of parameters of the second model according to a preset strategy.
When the parameter value of the first model is used as the initial value of the first partial parameter of the second model, the initialization unit 502 may be specifically configured to: sequentially and correspondingly filling parameter values of the first model from the lowest network layer and the lowest dimension of the second model; or filling the parameters in the first model to the corresponding parameter positions in the second model according to the corresponding relation of the same network layer type.
When filling the remaining second part of parameters of the second model according to the preset strategy, the initialization unit 502 may specifically adopt at least one of the following modes:
Mode one: filling the second part of parameters with random numbers.
Mode two: copying the initial values of the first part of parameters to the positions of other dimensions of the same level in the second part of parameters, and filling the remaining positions in the second part of parameters with zero.
Mode three: copying the initial values of the first part of parameters to the positions of other levels with the same dimensions in the second part of parameters, and filling the remaining positions in the second part of parameters with random numbers.
Mode four: after copying the initial values of the first part of parameters to the positions of other levels with the same dimensions in the second part of parameters, correspondingly copying the parameters that already have initial values in the second model to the remaining positions of the same level in the second part of parameters, and adding random noise to the parameter values at those remaining positions.
As a preferred embodiment, if the parameter dimension of the second model is not an integer multiple of that of the first model, the initialization unit 502 may adopt the above-mentioned first, third or fourth mode.
As another preferred embodiment, if the expected training time of the second model is less than or equal to the preset time threshold, the initialization unit 502 may adopt the second mode.
The first model and the second model may employ the same training objective, that is, the same loss is set to iteratively update the parameters; there may be some differences and improvements in how the loss is set. The first and second models may use the same training data set or separate training data sets.
The model training unit 503 performs no knowledge distillation from the first model during the training of the second model; that is, the training of the second model is an independent process carried out without the participation of the first model.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the apparatus embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and for the relevant points, reference may be made to the partial description of the method embodiment.
In the technical solution of the present disclosure, the acquisition, storage, and application of the personal information of related users all comply with the provisions of relevant laws and regulations and do not violate public order and good customs.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 6 shows a block diagram of an electronic device that may be used to implement the model training method according to an embodiment of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in FIG. 6, the device 600 includes a computing unit 601, which can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 602 or a computer program loaded from a storage unit 608 into a Random Access Memory (RAM) 603. Various programs and data required for the operation of the device 600 can also be stored in the RAM 603. The computing unit 601, the ROM 602, and the RAM 603 are connected to each other via a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
A number of components in the device 600 are connected to the I/O interface 605, including: an input unit 606 such as a keyboard, a mouse, or the like; an output unit 607 such as various types of displays, speakers, and the like; a storage unit 608, such as a magnetic disk, optical disk, or the like; and a communication unit 609 such as a network card, modem, wireless communication transceiver, etc. The communication unit 609 allows the device 600 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 601 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 601 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 601 performs the various methods and processes described above, such as the model training method. For example, in some embodiments, the model training method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 608.
In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 600 via the ROM 602 and/or the communication unit 609. When the computer program is loaded into the RAM 603 and executed by the computing unit 601, one or more steps of the model training method described above may be performed. Alternatively, in other embodiments, the computing unit 601 may be configured to perform the model training method in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowcharts and/or block diagrams to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or a cloud host, which is a host product in a cloud computing service system that overcomes the defects of high management difficulty and weak service expansibility of traditional physical hosts and Virtual Private Server (VPS) services. The server may also be a server of a distributed system, or a server combined with a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, and are not limited herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.