Robot path navigation method and system based on improved DDPG algorithm
1. A robot path navigation method based on an improved DDPG algorithm, characterized by comprising the following steps:
acquiring current state information and a target position of the robot;
inputting the current state information and the target position of the robot into the trained improved DDPG network to obtain optimal executable action data;
the robot completes collision-free path navigation according to the optimal executable action data;
wherein the improved DDPG network is based on a DDPG network in which the calculation of the reward value is completed by a curiosity reward mechanism model; the curiosity reward mechanism model comprises a plurality of LSTM models connected in series in sequence; among the LSTM models connected in series, the input ends of all the LSTM models are connected with the output end of the Actor current network, the output end of the last LSTM model is connected with the input end of the CNN model, and the output end of the CNN model is connected with the input end of the Actor current network.
2. The robot path navigation method based on the improved DDPG algorithm as claimed in claim 1, wherein the current state information and the target position of the robot are inputted into the trained improved DDPG network to obtain the optimal executable action data; the method specifically comprises the following steps:
inputting the current state information and the target position of the robot into the trained improved DDPG network, and generating the optimal executable action data by an Actor module of the improved DDPG network.
3. The robot path navigation method based on the improved DDPG algorithm as claimed in claim 1, wherein the current state information and the target position of the robot are inputted into the trained improved DDPG network to obtain the optimal executable action data; wherein, the improved DDPG network comprises:
an Actor module, an experience playback pool and a Critic module which are connected in sequence;
the Actor module comprises an Actor current network and an Actor target network which are sequentially connected;
the Critic module comprises a Critic current network and a Critic target network which are sequentially connected;
wherein, the Actor current network is connected with all LSTM models of the curiosity reward mechanism model; the Actor current network is also connected to the output of the CNN model of the curiosity reward mechanism model.
4. The robot path navigation method based on the improved DDPG algorithm as claimed in claim 1, wherein the current state information and the target position of the robot are inputted into the trained improved DDPG network to obtain the optimal executable action data; wherein the training step of the trained improved DDPG network comprises the following steps:
(1) constructing a training set; the training set comprises the state of the robot at each moment along known robot navigation paths;
(2) inputting the training set into the improved DDPG network to complete the training of the Actor module, the Critic module and the curiosity reward mechanism model of the improved DDPG network.
5. The robot path navigation method based on the improved DDPG algorithm of claim 4, wherein the training of the curiosity reward mechanism model comprises the following steps:
(a) the robot selects, in state S_t, the corresponding action A_t, and generates the next state S_{t+1} and a reward value R by interacting with the environment;
(b) the experience data (S_t, A_t, R, S_{t+1}, done) generated by the interaction of the robot with the environment is stored in the experience playback pool, a stack structure being newly added in the experience playback pool so that the experience data can be accessed in time order, where done indicates whether the robot navigation is finished;
(c) the experience data with a time order in the stack structure is input into the LSTM network, wherein, as shown in FIG. 2, the first LSTM model inputs only the robot state information at the corresponding moment; the input of each non-first LSTM model consists of two parts, one part being the robot state information at the corresponding moment and the other part being the output value of the LSTM model at the previous moment; the last LSTM model outputs the predicted value S_{t+1}' of the robot state at the next moment;
(d) the difference between the actual next state S_{t+1} and the predicted next state S_{t+1}' is used as the internal reward r_i, and the sum of the internal reward r_i and the original external reward r_e is used as the total reward R for the robot exploring the environment; the difference between the actual next state S_{t+1} and the predicted next state S_{t+1}' is also used as the first constraint condition in the training process;
(e) the state S_t of the robot at the current moment and the predicted value S_{t+1}' of the robot state at the next moment are input into the convolutional neural network CNN, which outputs the reverse-predicted action A_t';
(f) the difference between the reverse-predicted action A_t' and the actual action A_t is used as the second constraint condition in the training process, and the curiosity reward mechanism model is trained by gradient back-propagation to complete the training of the curiosity reward mechanism model.
6. The robot path navigation method based on the improved DDPG algorithm of claim 1, wherein the improved DDPG network is based on a DDPG network, and a stack structure is added to the experience playback pool of the DDPG network; the experience playback pool stores two batches of data, one batch being samples obtained by the original random sampling and the other batch being time-ordered samples obtained through the stack structure; the time-ordered samples obtained through the stack structure are used for training the curiosity reward mechanism model, and the randomly sampled samples are used in the training of the Actor module and the Critic module of the DDPG network.
7. The robot path navigation method based on the improved DDPG algorithm of claim 1, wherein the current state information comprises: the current position of the robot, the current angular velocity of the robot, the current linear velocity of the robot and the current environment information of the robot.
8. A robot path navigation system based on an improved DDPG algorithm, characterized by comprising:
an acquisition module configured to: acquiring current state information and a target position of the robot;
an output module configured to: inputting the current state information and the target position of the robot into the trained improved DDPG network to obtain optimal executable action data;
a navigation module configured to: the robot completes collision-free path navigation according to the optimal executable action data;
wherein the improved DDPG network is based on a DDPG network in which the calculation of the reward value is completed by a curiosity reward mechanism model; the curiosity reward mechanism model comprises a plurality of LSTM models connected in series in sequence; among the LSTM models connected in series, the input ends of all the LSTM models are connected with the output end of the Actor current network, the output end of the last LSTM model is connected with the input end of the CNN model, and the output end of the CNN model is connected with the input end of the Actor current network.
9. An electronic device, comprising: one or more processors, one or more memories, and one or more computer programs; wherein a processor is connected to the memory, the one or more computer programs being stored in the memory, the processor executing the one or more computer programs stored in the memory when the electronic device is running, to cause the electronic device to perform the method of any of the preceding claims 1-7.
10. A computer-readable storage medium storing computer instructions which, when executed by a processor, perform the method of any one of claims 1 to 7.
Background
The statements in this section merely provide background information related to the present disclosure and may not constitute prior art.
With the development of artificial intelligence technology, robots have gradually moved from the original industrial production field into our daily lives, and in recent years the service industry in particular has seen vigorous growth. The demand of human society for mobile robots is also becoming stronger. Path planning is a key problem to be solved in the robotics field: it is a complex problem in which an autonomous mobile robot is required to find an obstacle-free path from an initial position to a target position subject to constraint conditions. As the environments faced by robots become more complex, robots are required to have a higher-level ability to anticipate obstacles and avoid collisions with them.
Traditional navigation solutions such as the genetic algorithm and the simulated annealing algorithm achieve good results in navigation. However, these methods all design a universal solution under the assumption that the environment is known. As robots are used in more and more industries, the environments in which they operate become increasingly complex, and the previous solutions cannot handle these problems well. In recent years, deep reinforcement learning, which combines reinforcement learning and deep learning, has been widely applied to the field of robot path navigation. Deep learning has unique advantages in feature extraction, object perception and the like, and is widely applied in fields such as computer vision. Reinforcement learning has better decision-making ability and can achieve the maximum return or a specific goal by learning strategies during interaction with the environment. Deep reinforcement learning, which combines the two, has successfully solved the robot navigation problem in complex environments. The Deep Deterministic Policy Gradient (DDPG) algorithm is one of the earliest proposed deep reinforcement learning networks. As a classic algorithm in deep reinforcement learning, DDPG is a policy learning method directed at continuous, high-dimensional action spaces. Compared with earlier reinforcement learning methods, the DDPG algorithm has great advantages in continuous control problems and is applied in numerous fields such as robot path navigation, automatic driving, and robotic arm control.
However, sensitivity to hyper-parameters and a tendency toward divergent reward values have long been problems that DDPG has difficulty solving well. In reinforcement learning, the feedback of the reward value R is usually hand-crafted, and since the reward of each step cannot be simply predicted, the reward function is usually designed to be sparse, so that the robot cannot obtain immediate feedback and its learning ability is low.
In the process of implementing the invention, the inventor finds that the following technical problems exist in the prior art:
the robot path navigation realized based on the prior art has the problem of inaccurate navigation.
Disclosure of Invention
In order to overcome the deficiencies of the prior art, the present invention provides a robot path navigation method and system based on an improved DDPG algorithm.
In a first aspect, the invention provides a robot path navigation method based on an improved DDPG algorithm;
the robot path navigation method based on the improved DDPG algorithm comprises the following steps:
acquiring current state information and a target position of the robot;
inputting the current state information and the target position of the robot into the trained improved DDPG network to obtain optimal executable action data;
the robot completes collision-free path navigation according to the optimal executable action data;
wherein the improved DDPG network is based on a DDPG network in which the calculation of the reward value is completed by a curiosity reward mechanism model; the curiosity reward mechanism model comprises a plurality of LSTM models connected in series in sequence; among the LSTM models connected in series, the input ends of all the LSTM models are connected with the output end of the Actor current network, the output end of the last LSTM model is connected with the input end of the CNN model, and the output end of the CNN model is connected with the input end of the Actor current network.
In a second aspect, the present invention provides a robot path navigation system based on an improved DDPG algorithm;
a robot path navigation system based on an improved DDPG algorithm comprises:
an acquisition module configured to: acquiring current state information and a target position of the robot;
an output module configured to: inputting the current state information and the target position of the robot into the trained improved DDPG network to obtain optimal executable action data;
a navigation module configured to: the robot completes collision-free path navigation according to the optimal executable action data;
wherein the improved DDPG network is based on a DDPG network in which the calculation of the reward value is completed by a curiosity reward mechanism model; the curiosity reward mechanism model comprises a plurality of LSTM models connected in series in sequence; among the LSTM models connected in series, the input ends of all the LSTM models are connected with the output end of the Actor current network, the output end of the last LSTM model is connected with the input end of the CNN model, and the output end of the CNN model is connected with the input end of the Actor current network.
In a third aspect, the present invention further provides an electronic device, including: one or more processors, one or more memories, and one or more computer programs; wherein a processor is connected to the memory, the one or more computer programs are stored in the memory, and when the electronic device is running, the processor executes the one or more computer programs stored in the memory, so as to make the electronic device execute the method according to the first aspect.
In a fourth aspect, the present invention also provides a computer-readable storage medium for storing computer instructions which, when executed by a processor, perform the method of the first aspect.
Compared with the prior art, the invention has the beneficial effects that:
the invention utilizes the total sum of the internal reward generated by curiosity and the external reward of the algorithm as the total reward generated by the interaction of the robot and the environment. The reward function module is embedded with a long-short term memory artificial neural network (LSTM) and a Convolutional Neural Network (CNN). A plurality of past states are input into the LSTM network, the prediction of the next state is output, and the difference value between the predicted value and the actual state of the next state is used as an internal reward. In human society, people often have past experience in predicting what happens next, and embedding LSTM networks into curiosity mechanisms is just for reference to this mental feature. While using the CNN network to perform a reverse prediction of the action for the next state generated by the previous network. Curiosity has been considered by some scientists as one of the basic attributes of intelligence, and robot path navigation based on curiosity can make a robot more intelligent, and even under the condition that reward is sparse and even no external reward exists, the robot can feel like a human.
The invention draws on the thinking characteristics of human beings and embeds a curiosity mechanism in the reward function module. The latest batch of states is input into the curiosity mechanism of the robot as experience data, and an LSTM network with long- and short-term memory capability is used to predict the next state, so that the curiosity-based prediction preserves the time order. The difference between the predicted next state and the actual next state is used as the internal reward value, which alleviates the sparse-reward problem of the original DDPG algorithm.
The invention uses a CNN network with feature-extraction capability, takes the next state S_{t+1}' predicted by the LSTM network and the actual state S_t as input, and outputs the predicted value A_t' of the action A_t; the difference between the actually executed action A_t and the action A_t' predicted by the CNN network serves as a constraint. The LSTM network and the CNN network are trained simultaneously using back-propagation of the gradient. After the CNN module is added, the key state features that influence the action can be extracted.
Advantages of additional aspects of the invention will be set forth in part in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, are included to provide a further understanding of the invention; they illustrate exemplary embodiments of the invention and, together with the description, serve to explain the invention without limiting it.
FIG. 1 is a block diagram of an algorithm of a robot path navigation method based on improved curiosity and DDPG algorithm according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of an LSTM model algorithm embedded in a curiosity mechanism according to an embodiment of the invention;
fig. 3 is a schematic diagram of CNN module algorithm embedded in the curiosity mechanism according to the embodiment of the present invention.
Detailed Description
It is to be understood that the following detailed description is exemplary and is intended to provide further explanation of the invention as claimed. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the invention. As used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should further be understood that the terms "comprises" and "comprising", and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The embodiments and features of the embodiments of the present invention may be combined with each other without conflict.
Example one
The embodiment provides a robot path navigation method based on an improved DDPG algorithm;
as shown in fig. 1, the robot path navigation method based on the improved DDPG algorithm includes:
S101: acquiring current state information and a target position of the robot;
S102: inputting the current state information and the target position of the robot into the trained improved DDPG network to obtain optimal executable action data;
S103: the robot completes collision-free path navigation according to the optimal executable action data;
wherein the improved DDPG network is based on a DDPG network in which the calculation of the reward value is completed by a curiosity reward mechanism model; the curiosity reward mechanism model comprises a plurality of LSTM models connected in series in sequence; among the LSTM models connected in series, the input ends of all the LSTM models are connected with the output end of the Actor current network, the output end of the last LSTM model is connected with the input end of the CNN model, and the output end of the CNN model is connected with the input end of the Actor current network.
As shown in fig. 3, the CNN model includes three convolutional layers connected in sequence.
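For illustration, the following is a minimal PyTorch sketch of such a curiosity reward mechanism model. It is not the patented implementation: realizing the series of LSTM models as a single LSTMCell unrolled over the time-ordered states, the linear prediction head, the 1-D convolutions over vector states, and all layer and kernel sizes are assumptions.

```python
# Hypothetical sketch of the curiosity reward mechanism model (sizes are assumptions).
import torch
import torch.nn as nn

class CuriosityModel(nn.Module):
    def __init__(self, state_dim, action_dim, hidden_dim=128):
        super().__init__()
        # "Series of LSTM models": one LSTMCell unrolled over the time-ordered states;
        # each step receives the state at that moment together with the previous
        # step's output (carried in the hidden state).
        self.lstm = nn.LSTMCell(state_dim, hidden_dim)
        self.state_head = nn.Linear(hidden_dim, state_dim)  # predicted next state S_{t+1}'
        # Inverse model: three convolutional layers over (S_t, S_{t+1}') stacked as
        # two channels, followed by a linear layer that outputs A_t'.
        self.cnn = nn.Sequential(
            nn.Conv1d(2, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv1d(32, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.action_head = nn.Linear(32 * state_dim, action_dim)

    def forward(self, state_seq, s_t):
        # state_seq: (batch, seq_len, state_dim) time-ordered states from the stack
        # s_t:       (batch, state_dim) current state
        batch = state_seq.size(0)
        h = state_seq.new_zeros(batch, self.lstm.hidden_size)
        c = state_seq.new_zeros(batch, self.lstm.hidden_size)
        for i in range(state_seq.size(1)):
            h, c = self.lstm(state_seq[:, i, :], (h, c))
        s_next_pred = self.state_head(h)                    # S_{t+1}'
        x = torch.stack([s_t, s_next_pred], dim=1)          # (batch, 2, state_dim)
        a_pred = self.action_head(self.cnn(x).flatten(1))   # reverse-predicted A_t'
        return s_next_pred, a_pred
```

The forward pass returns both the predicted next state S_{t+1}' and the reverse-predicted action A_t', which the training steps described below consume.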
The improved DDPG network is based on the DDPG network, with a stack structure additionally arranged in the experience playback pool of the DDPG network; the experience playback pool stores two batches of data, one batch being samples obtained by the original random sampling and the other being time-ordered samples obtained through the stack structure. The time-ordered samples obtained through the stack structure are used for training the curiosity reward mechanism model, and the randomly sampled samples are used in the training of the Actor module and the Critic module of the DDPG network.
Further, the current state information includes: the current position of the robot, the current angular velocity of the robot, the current linear velocity of the robot and the current environment information of the robot.
Further, S102: inputting the current state information and the target position of the robot into the trained improved DDPG network to obtain optimal executable action data; the method specifically comprises the following steps:
inputting the current state information and the target position of the robot into the trained improved DDPG network, and generating the optimal executable action data by the Actor module of the improved DDPG network.
Further, S102: inputting the current state information and the target position of the robot into the trained improved DDPG network to obtain optimal executable action data; wherein, the improved DDPG network comprises:
an Actor module, an experience playback pool and a Critic module which are connected in sequence;
the Actor module comprises an Actor current network and an Actor target network which are sequentially connected;
the Critic module comprises a Critic current network and a Critic target network which are sequentially connected;
wherein, the Actor current network is connected with all LSTM models of the curiosity reward mechanism model; the Actor current network is also connected to the output of the CNN model of the curiosity reward mechanism model.
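For orientation, a minimal sketch of what the Actor and Critic networks might look like is given below; the hidden sizes, activation functions and the concatenation of state and target position are assumptions, not taken from the disclosure.

```python
import torch
import torch.nn as nn

class Actor(nn.Module):
    # maps (state, target position) to an executable action (e.g. linear/angular velocity)
    def __init__(self, state_dim, goal_dim, action_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + goal_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, action_dim), nn.Tanh(),  # actions scaled to [-1, 1]
        )

    def forward(self, state, goal):
        return self.net(torch.cat([state, goal], dim=-1))

class Critic(nn.Module):
    # maps (state, target position, action) to a scalar Q-value
    def __init__(self, state_dim, goal_dim, action_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + goal_dim + action_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state, goal, action):
        return self.net(torch.cat([state, goal, action], dim=-1))
```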
Further, S102: inputting the current state information and the target position of the robot into the trained improved DDPG network to obtain optimal executable action data; wherein the training step of the trained improved DDPG network comprises the following steps:
S1021: constructing a training set; the training set comprises the state of the robot at each moment along known robot navigation paths;
S1022: inputting the training set into the improved DDPG network to complete the training of the Actor module, the Critic module and the curiosity reward mechanism model of the improved DDPG network.
Further, the training of the curiosity reward mechanism model comprises the following steps:
S10221: the robot selects, in state S_t, the corresponding action A_t, and generates the next state S_{t+1} and a reward value R by interacting with the environment;
S10222: the experience data (S_t, A_t, R, S_{t+1}, done) generated by the interaction of the robot with the environment is stored in the experience playback pool, a stack structure being newly added in the experience playback pool so that the experience data can be accessed in time order, where done indicates whether the robot navigation is finished;
S10223: the experience data with a time order in the stack structure is input into the LSTM network; as shown in FIG. 2, the first LSTM model inputs only the robot state information at the corresponding moment; the input of each non-first LSTM model consists of two parts, one part being the robot state information at the corresponding moment and the other part being the output value of the LSTM model at the previous moment; the last LSTM model outputs the predicted value S_{t+1}' of the robot state at the next moment;
S10224: the difference between the actual next state S_{t+1} and the predicted next state S_{t+1}' is used as the internal reward r_i, and the sum of the internal reward r_i and the original external reward r_e is used as the total reward R for the robot exploring the environment; the difference between the actual next state S_{t+1} and the predicted next state S_{t+1}' is also used as the first constraint condition in the training process;
S10225: the state S_t of the robot at the current moment and the predicted value S_{t+1}' of the robot state at the next moment are input into the convolutional neural network CNN, which outputs the reverse-predicted action A_t';
S10226: the difference between the reverse-predicted action A_t' and the actual action A_t is used as the second constraint condition in the training process, and the curiosity reward mechanism model is trained by gradient back-propagation to complete the training of the curiosity reward mechanism model.
It should be understood that preserving the time order among the samples described in S10222 is independent of the original random sampling mechanism. In order to avoid over-fitting during training, the correlation between samples needs to be broken, so DDPG usually selects a set of data for network training by random sampling. In order to obtain time-ordered samples for training the LSTM network, the invention provides a stack structure independent of the random sampling module: the time order of the samples is maintained by exploiting the last-in, first-out property of a stack, data is stored at the top of the stack, and a batch-size data sample is taken from the top of the stack when data is fetched. When experience data is stored, it is written into both the stack mechanism and the original queue mechanism. When data is fetched, the original random sampling mode is kept for the queue mechanism and used for training the Actor module and the Critic module of the network; for the stack mechanism, the data-fetching behaviour of a stack is kept, which guarantees that the latest time-ordered experience data is fetched.
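As an illustration of this dual storage, the sketch below keeps a queue for random sampling alongside a bounded stack that preserves the time order of the newest transitions; the class name, capacities and batch handling are assumptions.

```python
import random
from collections import deque

class ReplayPool:
    """Experience playback pool with a queue (random sampling) and a stack (time order)."""
    def __init__(self, capacity=100000, stack_depth=64):
        self.queue = deque(maxlen=capacity)     # original mechanism, sampled at random
        self.stack = deque(maxlen=stack_depth)  # added mechanism, newest data on top

    def store(self, s, a, r, s_next, done):
        transition = (s, a, r, s_next, done)
        self.queue.append(transition)           # experience data is stored in both mechanisms
        self.stack.append(transition)

    def sample_random(self, batch_size):
        # breaks correlation between samples; used to train the Actor and Critic modules
        return random.sample(list(self.queue), batch_size)

    def sample_ordered(self, batch_size):
        # the newest batch-size transitions, still in time order; used to train the curiosity model
        return list(self.stack)[-batch_size:]
```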
It should be understood that, in S10223, just as a person usually predicts what will happen next according to previous experience, the invention draws on this human thinking characteristic: the time-ordered state sequence is input into the LSTM network, and the memory function of the LSTM is used to predict the next state S_{t+1}'. The specific calculation is as follows:
S_{t+1}' = L(S_{t-n}, S_{t-(n-1)}, ..., S_{t-2}, S_{t-1}, S_t; θ)
where S represents the state at a given moment and θ represents the parameters of the LSTM network. After passing through the LSTM, the time-ordered state sequence yields the predicted value S_{t+1}' of the next state.
It should be understood that, in S10224, the difference between the predicted next state S_{t+1}' and the next state S_{t+1} actually resulting from interaction with the environment is used as the internal reward value; meanwhile, in order to prevent the LSTM network from predicting absurd solutions, the difference between the actual next state S_{t+1} and the predicted next state S_{t+1}' also serves as a constraint for LSTM network training. The specific calculation is as follows:
r_i = ||S_{t+1}' - S_{t+1}||
R = r_e + r_i
Min(||S_{t+1}' - S_{t+1}||)
where r_i is the internal reward produced by the improved curiosity mechanism, i.e. the difference between the actual next state S_{t+1} and the predicted next state S_{t+1}'; r_e is the external reward of the DDPG algorithm; and R, the sum of the internal reward and the external reward based on the improved curiosity mechanism, is used as the total reward value. Min(||S_{t+1}' - S_{t+1}||) is the constraint of the LSTM network.
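Expressed in code, the internal reward and the first constraint might look like the sketch below, which assumes the states are tensors and reads || · || as the L2 norm.

```python
import torch
import torch.nn.functional as F

def curiosity_reward(s_next_pred, s_next_actual, r_external):
    # internal reward r_i = ||S_{t+1}' - S_{t+1}||
    r_internal = torch.norm(s_next_pred - s_next_actual, dim=-1)
    total_reward = r_external + r_internal                # R = r_e + r_i
    # first constraint: the same prediction error is minimised during LSTM training
    forward_loss = F.mse_loss(s_next_pred, s_next_actual)
    return total_reward, forward_loss
```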
It should be understood that, in S10225 and S10226, the feature-extraction capability of the CNN network is exploited: the state S_t and the state S_{t+1}' predicted by curiosity are input into the CNN network to reverse-predict the action A_t', and the difference between the actual action A_t and the predicted action A_t' is used as another constraint. With the CNN network, the key state features that influence the action can be extracted. The specific calculation is as follows:
A_t' = H(S_t, S_{t+1}'; w)
Min(A_t, A_t')
where w is the parameter of the CNN network, and A_t' is the predicted value of the action A_t generated by the CNN network.
Through the first constraint on the predicted state and the second constraint on the predicted action, the LSTM network and the CNN network can be trained simultaneously by back-propagation of the gradient.
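A possible single optimization step that sums the two constraints and back-propagates through both the LSTM and the CNN is sketched below; the equal weighting of the two losses and the choice of optimizer are assumptions.

```python
import torch.nn.functional as F

def train_curiosity_step(model, optimizer, state_seq, s_t, a_t, s_next_actual):
    # model is a CuriosityModel as sketched above
    s_next_pred, a_pred = model(state_seq, s_t)
    forward_loss = F.mse_loss(s_next_pred, s_next_actual)  # first constraint
    inverse_loss = F.mse_loss(a_pred, a_t)                  # second constraint
    loss = forward_loss + inverse_loss
    optimizer.zero_grad()
    loss.backward()      # gradients flow into both the LSTM and the CNN parts
    optimizer.step()
    return loss.item()
```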
Meanwhile, a CNN network with feature-extraction capability is embedded: the next state predicted by the LSTM network and the actual current state are used as the input of the CNN network, and the CNN network outputs the predicted value of the action. The difference between the actual action and the action predicted by the CNN network is taken as a constraint. The LSTM network and the CNN network are trained simultaneously using back-propagation of the gradient. After the CNN module is added, the key state features that influence the action can be extracted.
A batch of experience data is selected from the experience playback pool by random sampling to train the Critic network and the Actor network, and the parameters are updated through gradient back-propagation.
The network parameters are copied from the current network to the target network at regular intervals using a soft-update mode between the current network and the target network.
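For completeness, a common form of such a soft update from a current network to its target network is sketched below; the rate tau is an assumed hyper-parameter.

```python
def soft_update(current_net, target_net, tau=0.005):
    # target <- tau * current + (1 - tau) * target
    for p, p_target in zip(current_net.parameters(), target_net.parameters()):
        p_target.data.copy_(tau * p.data + (1.0 - tau) * p_target.data)
```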
The LSTM network, as shown in fig. 2, draws on the thinking characteristics of human beings: in human society, people often rely on past experience to predict what will happen next, and the invention accordingly embeds a curiosity mechanism in the reward function module. The latest batch of states is input into the curiosity mechanism of the robot as experience data, and an LSTM network with long- and short-term memory capability is used to predict the next state; the time-ordered state sequence is input into the LSTM so that the curiosity-based prediction preserves the time order. The difference between the predicted next state and the actual next state is used as the internal reward value, and at the same time, in order to prevent the LSTM from predicting an absurd solution, the difference between the actual next state and the predicted next state is used as a constraint condition.
The CNN network module, as shown in fig. 3, embeds a CNN network with feature-extraction capability in the curiosity mechanism; it takes the next state predicted by the LSTM network and the actual current state as inputs, outputs a predicted value of the action, and takes the difference between the actual action and the action predicted by the CNN network as a constraint condition. The LSTM network and the CNN network are trained simultaneously using back-propagation of the gradient. After the CNN module is added, the key state features that influence the action can be extracted.
Example two
The embodiment provides a robot path navigation system based on an improved DDPG algorithm;
a robot path navigation system based on an improved DDPG algorithm comprises:
an acquisition module configured to: acquiring current state information and a target position of the robot;
an output module configured to: inputting the current state information and the target position of the robot into the trained improved DDPG network to obtain optimal executable action data;
a navigation module configured to: the robot completes collision-free path navigation according to the optimal executable action data;
wherein the improved DDPG network is based on a DDPG network in which the calculation of the reward value is completed by a curiosity reward mechanism model; the curiosity reward mechanism model comprises a plurality of LSTM models connected in series in sequence; among the LSTM models connected in series, the input ends of all the LSTM models are connected with the output end of the Actor current network, the output end of the last LSTM model is connected with the input end of the CNN model, and the output end of the CNN model is connected with the input end of the Actor current network.
It should be noted here that the above acquisition module, output module and navigation module correspond to steps S101 to S103 of the first embodiment, and the examples and application scenarios implemented by these modules are the same as those of the corresponding steps, but are not limited to the disclosure of the first embodiment. It should be noted that the above modules, as part of a system, may be implemented in a computer system such as a set of computer-executable instructions.
In the foregoing embodiments, the descriptions of the embodiments have different emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
The proposed system can be implemented in other ways. The above-described system embodiment is merely illustrative; for example, the division into the above modules is merely a logical division, and in actual implementation there may be other divisions: multiple modules may be combined or integrated into another system, or some features may be omitted or not executed.
EXAMPLE III
The present embodiment also provides an electronic device, including: one or more processors, one or more memories, and one or more computer programs; wherein, a processor is connected with the memory, the one or more computer programs are stored in the memory, and when the electronic device runs, the processor executes the one or more computer programs stored in the memory, so as to make the electronic device execute the method according to the first embodiment.
It should be understood that in this embodiment, the processor may be a central processing unit (CPU), and the processor may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory may include both read-only memory and random access memory, and may provide instructions and data to the processor, and a portion of the memory may also include non-volatile random access memory. For example, the memory may also store device type information.
In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in a processor or instructions in the form of software.
The method in the first embodiment may be directly implemented by a hardware processor, or may be implemented by a combination of hardware and software modules in the processor. The software modules may be located in RAM, flash memory, ROM, PROM or EPROM, registers, or other storage media well known in the art. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the method in combination with its hardware. To avoid repetition, this is not described in detail here.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.
Example four
The present embodiments also provide a computer-readable storage medium for storing computer instructions, which when executed by a processor, perform the method of the first embodiment.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.