Chip power supply network fast Bump current estimation method and system based on deep learning
1. A chip power supply network fast Bump current estimation method based on deep learning, characterized by comprising the following steps:
(1) generating power grids with different bottom layer current excitations and different decoupling capacitor densities and positions as neural network training samples;
(2) expanding the training samples generated in step (1) by exploiting the linearity (homogeneity) and superposition of the power grid, thereby achieving data augmentation;
(3) constructing a convolutional neural network whose input is the distribution of current excitation represented as a two-dimensional matrix and whose output is the Bump current response represented as a one-dimensional vector, and training the constructed neural network with the final neural network training set obtained in step (2);
(4) calculating the time-domain current response of the Bumps in a static scenario through the trained convolutional neural network.
2. The deep learning-based chip power supply network Bump current estimation method according to claim 1, wherein in step (1), the positions of the Bumps and the current sources are extracted and grouped, and ports exceeding a threshold number in each group are removed, the threshold being selected according to the required calculation speed and hardware processing capacity; the magnitudes of the bottom-layer current excitations and the densities and positions of the decoupling capacitors are randomized, and corresponding power supply network simulation files are generated.
3. The deep learning-based chip power supply network Bump current estimation method according to claim 1, wherein step (2) is specifically: acquiring the training data of the neural network, randomly assigning different weights to different training data, and linearly combining them to generate new training data, thereby achieving data augmentation. Specifically, since the power grid is a linear time-invariant system, it satisfies the superposition principle. Under linear superposition of different bottom-layer current excitations, the current at each Bump terminal satisfies the corresponding linear superposition relation:

f(I_source) = I_bump

f(k*I_source,a + I_source,b) = k*I_bump,a + I_bump,b

where I_source represents the distribution of the current excitation, I_bump represents the Bump current response, f represents the mapping between the two, k represents a constant coefficient, I_source,a and I_source,b represent the current excitation distributions in cases a and b, respectively, and I_bump,a and I_bump,b represent the Bump current responses in cases a and b, respectively.
4. The deep learning-based chip power supply network Bump current estimation method according to claim 1, wherein step (3) is specifically: the constructed convolutional neural network has a six-layer structure, in which layers 1 to 4 each comprise a convolutional layer, a batch normalization unit, a LeakyReLU activation layer, and a pooling layer connected in sequence; the fifth layer is a fully connected layer performing a nonlinear transformation; and finally an output layer is connected.
5. The deep learning-based chip power supply network Bump current estimation method according to claim 1, wherein the loss function is a linear combination of L1Loss, PeakLoss and ZeroConstraint, calculated as follows:

Loss = L1Loss + α*PeakLoss + β*ZeroConstraint

L1Loss = (1/n) * Σ_{m=1..n} |target_m - prediction_m|

PeakLoss = (T_i - P_k)^2 + (T_j - P_l)^2 + |(k - i)(T_i - P_i)(T_k - P_k)| + |(l - j)(T_j - P_j)(T_l - P_l)|

T_i = max(target), T_j = min(target), P_k = max(prediction), P_l = min(prediction)

ZeroConstraint = sum(prediction)

where Loss denotes the total loss function used in training; L1Loss denotes the mean absolute error between the predicted and actual values; n denotes the length of the prediction vector, i.e., the number of Bumps; PeakLoss denotes the error between the extrema of the predictions and those of the label; α denotes the weight of PeakLoss in the total loss function; ZeroConstraint denotes the deviation of the sum of the predicted values from 0; β denotes the weight of ZeroConstraint in the total loss function; T_i and T_j denote the maximum and minimum values of the label, with i and j their indices; P_k and P_l denote the maximum and minimum predicted values, with k and l their indices; target denotes the label and prediction denotes the predicted value.
6. A chip power supply network fast Bump current estimation system based on deep learning, characterized by comprising a power supply network generation module, a training data acquisition module, a lightweight convolutional neural network module and a Bump current calculation module.
The power supply network generation module is used for generating power supply networks with different bottom layer current excitations and different decoupling capacitor densities and positions to serve as neural network training samples.
The training data acquisition module is used for expanding the training samples generated by the power supply network generation module by exploiting the linearity (homogeneity) and superposition of the power grid, thereby achieving data augmentation.
The convolutional neural network module is used for constructing a convolutional neural network whose input is the distribution of current excitation represented as a two-dimensional matrix and whose output is the Bump current response represented as a one-dimensional vector, and for training the constructed neural network with the final neural network training set obtained by the training data acquisition module.
The Bump current calculation module is used for calculating the time-domain current response of the Bumps in a static scenario through the trained convolutional neural network.
7. The deep learning-based chip power supply network fast Bump current estimation system according to claim 6, wherein the power supply network generation module extracts the positions of the Bumps and the current sources and then groups them, removing ports exceeding a threshold number in each group, the threshold being selected according to the required calculation speed and hardware processing capacity; the magnitudes of the bottom-layer current excitations and the densities and positions of the decoupling capacitors are randomized, and corresponding power supply network simulation files are generated.
8. The deep learning-based chip power supply network fast Bump current estimation system according to claim 6, wherein the training data acquisition module acquires the training data of the neural network, randomly assigns different weights to different training data, and linearly combines them to generate new training data, thereby achieving data augmentation. Specifically, since the power grid is a linear time-invariant system, it satisfies the superposition principle. Under linear superposition of different bottom-layer current excitations, the current at each Bump terminal satisfies the corresponding linear superposition relation:

f(I_source) = I_bump

f(k*I_source,a + I_source,b) = k*I_bump,a + I_bump,b

where I_source represents the distribution of the current excitation, I_bump represents the Bump current response, f represents the mapping between the two, k represents a constant coefficient, I_source,a and I_source,b represent the current excitation distributions in cases a and b, respectively, and I_bump,a and I_bump,b represent the Bump current responses in cases a and b, respectively.
9. The deep learning-based chip power supply network fast Bump current estimation system according to claim 6, wherein the neural network constructed by the convolutional neural network module has a six-layer structure, in which layers 1 to 4 each comprise a convolutional layer, a batch normalization unit, a LeakyReLU activation layer, and a pooling layer connected in sequence; the fifth layer is a fully connected layer performing a nonlinear transformation, and finally an output layer is connected.
10. The deep learning-based chip power supply network fast Bump current estimation system according to claim 6, wherein the neural network loss function constructed by the convolutional neural network module is a linear combination of L1Loss, PeakLoss and ZeroConstraint, calculated as follows:

Loss = L1Loss + α*PeakLoss + β*ZeroConstraint

L1Loss = (1/n) * Σ_{m=1..n} |target_m - prediction_m|

PeakLoss = (T_i - P_k)^2 + (T_j - P_l)^2 + |(k - i)(T_i - P_i)(T_k - P_k)| + |(l - j)(T_j - P_j)(T_l - P_l)|

T_i = max(target), T_j = min(target), P_k = max(prediction), P_l = min(prediction)

ZeroConstraint = sum(prediction)

where Loss denotes the total loss function used in training; L1Loss denotes the mean absolute error between the predicted and actual values; n denotes the length of the prediction vector, i.e., the number of Bumps; PeakLoss denotes the error between the extrema of the predictions and those of the label; α denotes the weight of PeakLoss in the total loss function; ZeroConstraint denotes the deviation of the sum of the predicted values from 0; β denotes the weight of ZeroConstraint in the total loss function; T_i and T_j denote the maximum and minimum values of the label, with i and j their indices; P_k and P_l denote the maximum and minimum predicted values, with k and l their indices; target denotes the label and prediction denotes the predicted value.
Background
With the development of very-large-scale integrated circuit technology, chip operating voltages keep decreasing while current densities keep increasing. This leaves systems with ever lower noise margins and places higher demands on the stability of the power supply system over the entire operating frequency band. Power integrity has an increasingly important influence on key circuit design metrics such as system reliability, signal-to-noise ratio, bit error rate, and EMI/EMC. Advanced integrated circuit designs therefore typically impose very stringent power integrity specifications to ensure circuit stability. In power integrity design, designers often need to iteratively fix design defects to ensure that the final design meets all design criteria, which has prompted the emergence of the ECO design flow.
ECO (Engineering Change Order) is a very-large-scale integrated circuit design process that has emerged in recent years. Designers make only local modifications (5%-10%) to the product to correct functional errors or to meet non-functional design requirements (e.g., timing and power). This principle of minimizing changes to the current design saves design cost and design time as much as possible, and is therefore widely used to reduce design complexity and shorten the design cycle.
When the bottom-layer IP design changes, designers want to quickly determine how the change affects the top layer during product iteration, so the circuit must be simulated many times in the physical design phase, making that phase the most time-consuming part of the overall circuit design flow. Increasing the simulation speed for each modified version at the ECO stage is therefore crucial to shortening the entire design cycle. Since each modified version generated in the ECO design flow is highly similar to the original version, introducing simulation information from the original version can reduce the simulation time of each modified version and improve design efficiency.
However, currently mainstream EDA tools are not optimized to exploit the similarity between ECO versions. Mainstream EDA software still treats each modified version as a new design and starts a new simulation from scratch. Although this approach guarantees simulation accuracy, it ignores the time cost of simulation. During iterative design, product designers would prefer that, for local changes in the product, EDA software determine the changes caused by design modifications at the expense of slightly reduced simulation accuracy, gaining a large reduction in simulation time by reusing the information that remains unchanged across ECO versions.
In summary, a deep learning-based method and system for estimating the Bump current of the chip power supply network, which fully exploit the similarity between the ECO original version and its modified versions, are the key to improving simulation speed and shortening the design cycle.
Disclosure of Invention
The invention aims to provide a chip power supply network fast Bump current estimation method and system based on deep learning, addressing the shortcomings of prior-art EDA software in simulation speed at the ECO stage. During power integrity verification, the method can quickly and accurately simulate each modified version at the ECO stage to obtain the current response at the Bump ports to the input current sources.
The purpose of the invention is realized by the following technical scheme: a chip power supply network fast Bump current estimation method based on deep learning, comprising the following steps:
(1) generating power grids with different bottom layer current excitations and different decoupling capacitor densities and positions as neural network training samples;
(2) expanding the training samples generated in step (1) by exploiting the linearity (homogeneity) and superposition of the power grid, thereby achieving data augmentation;
(3) constructing a convolutional neural network whose input is the distribution of current excitation represented as a two-dimensional matrix and whose output is the Bump current response represented as a one-dimensional vector, and training the constructed neural network with the final neural network training set obtained in step (2);
(4) calculating the time-domain current response of the Bumps in a static scenario through the trained convolutional neural network.
Further, in step (1), the positions of the Bumps and the current sources are extracted and then grouped, and ports exceeding a threshold number in each group are removed, the threshold being selected according to the required calculation speed and hardware processing capacity; the magnitudes of the bottom-layer current excitations and the densities and positions of the decoupling capacitors are randomized, and corresponding power supply network simulation files are generated.
Further, step (2) is specifically: acquiring the training data of the neural network, randomly assigning different weights to different training data, and linearly combining them to generate new training data, thereby achieving data augmentation. Specifically, since the power grid is a linear time-invariant system, it satisfies the superposition principle. Under linear superposition of different bottom-layer current excitations, the current at each Bump terminal satisfies the corresponding linear superposition relation:

f(I_source) = I_bump

f(k*I_source,a + I_source,b) = k*I_bump,a + I_bump,b

where I_source represents the distribution of the current excitation, I_bump represents the Bump current response, f represents the mapping between the two, k represents a constant coefficient, I_source,a and I_source,b represent the current excitation distributions in cases a and b, respectively, and I_bump,a and I_bump,b represent the Bump current responses in cases a and b, respectively.
Further, step (3) is specifically: the constructed convolutional neural network has a six-layer structure, in which layers 1 to 4 each comprise a convolutional layer, a batch normalization unit, a LeakyReLU activation layer, and a pooling layer connected in sequence; the fifth layer is a fully connected layer performing a nonlinear transformation; and finally an output layer is connected.
Further, the loss function is a linear combination of L1Loss, PeakLoss and ZeroConstraint, calculated as follows:

Loss = L1Loss + α*PeakLoss + β*ZeroConstraint

L1Loss = (1/n) * Σ_{m=1..n} |target_m - prediction_m|

PeakLoss = (T_i - P_k)^2 + (T_j - P_l)^2 + |(k - i)(T_i - P_i)(T_k - P_k)| + |(l - j)(T_j - P_j)(T_l - P_l)|

T_i = max(target), T_j = min(target), P_k = max(prediction), P_l = min(prediction)

ZeroConstraint = sum(prediction)

where Loss denotes the total loss function used in training; L1Loss denotes the mean absolute error between the predicted and actual values; n denotes the length of the prediction vector, i.e., the number of Bumps; PeakLoss denotes the error between the extrema of the predictions and those of the label; α denotes the weight of PeakLoss in the total loss function; ZeroConstraint denotes the deviation of the sum of the predicted values from 0; β denotes the weight of ZeroConstraint in the total loss function; T_i and T_j denote the maximum and minimum values of the label, with i and j their indices; P_k and P_l denote the maximum and minimum predicted values, with k and l their indices; target denotes the label and prediction denotes the predicted value.
The invention also provides a chip power supply network fast Bump current estimation system based on deep learning, comprising a power supply network generation module, a training data acquisition module, a lightweight convolutional neural network module and a Bump current calculation module.
The power supply network generation module is used for generating power supply networks with different bottom layer current excitations and different decoupling capacitor densities and positions to serve as neural network training samples.
The training data acquisition module is used for expanding the training samples generated by the power supply network generation module by exploiting the linearity (homogeneity) and superposition of the power grid, thereby achieving data augmentation.
The convolutional neural network module is used for constructing a convolutional neural network whose input is the distribution of current excitation represented as a two-dimensional matrix and whose output is the Bump current response represented as a one-dimensional vector, and for training the constructed neural network with the final neural network training set obtained by the training data acquisition module.
The Bump current calculation module is used for calculating the time-domain current response of the Bumps in a static scenario through the trained convolutional neural network.
Further, the power supply network generation module extracts the positions of the Bumps and the current sources and then groups them, removing ports exceeding a threshold number in each group, the threshold being selected according to the required calculation speed and hardware processing capacity; the magnitudes of the bottom-layer current excitations and the densities and positions of the decoupling capacitors are randomized, and corresponding power supply network simulation files are generated.
Further, the training data acquisition module acquires the training data of the neural network, randomly assigns different weights to different training data, and linearly combines them to generate new training data, thereby achieving data augmentation. Specifically, since the power grid is a linear time-invariant system, it satisfies the superposition principle. Under linear superposition of different bottom-layer current excitations, the current at each Bump terminal satisfies the corresponding linear superposition relation:

f(I_source) = I_bump

f(k*I_source,a + I_source,b) = k*I_bump,a + I_bump,b

where I_source represents the distribution of the current excitation, I_bump represents the Bump current response, f represents the mapping between the two, k represents a constant coefficient, I_source,a and I_source,b represent the current excitation distributions in cases a and b, respectively, and I_bump,a and I_bump,b represent the Bump current responses in cases a and b, respectively.
Further, the neural network constructed by the convolutional neural network module has a six-layer structure, in which layers 1 to 4 each comprise a convolutional layer, a batch normalization unit, a LeakyReLU activation layer, and a pooling layer connected in sequence; the fifth layer is a fully connected layer performing a nonlinear transformation, and finally an output layer is connected.
Further, the neural network loss function constructed by the convolutional neural network module is a linear combination of L1Loss, PeakLoss and ZeroConstraint, calculated as follows:

Loss = L1Loss + α*PeakLoss + β*ZeroConstraint

L1Loss = (1/n) * Σ_{m=1..n} |target_m - prediction_m|

PeakLoss = (T_i - P_k)^2 + (T_j - P_l)^2 + |(k - i)(T_i - P_i)(T_k - P_k)| + |(l - j)(T_j - P_j)(T_l - P_l)|

T_i = max(target), T_j = min(target), P_k = max(prediction), P_l = min(prediction)

ZeroConstraint = sum(prediction)

where Loss denotes the total loss function used in training; L1Loss denotes the mean absolute error between the predicted and actual values; n denotes the length of the prediction vector, i.e., the number of Bumps; PeakLoss denotes the error between the extrema of the predictions and those of the label; α denotes the weight of PeakLoss in the total loss function; ZeroConstraint denotes the deviation of the sum of the predicted values from 0; β denotes the weight of ZeroConstraint in the total loss function; T_i and T_j denote the maximum and minimum values of the label, with i and j their indices; P_k and P_l denote the maximum and minimum predicted values, with k and l their indices; target denotes the label and prediction denotes the predicted value.
The invention has the following beneficial effects: the method can rapidly extract the characteristics of the power supply network and accurately predict the Bump current response from the distribution of current excitation, with an error of less than 1% relative to the results of common commercial software. By using data augmentation judiciously, the invention greatly reduces the cost of data acquisition while maintaining comparable accuracy. Meanwhile, the calculation speed of the invention can reach tens of times that of common commercial software.
Drawings
FIG. 1 is a block diagram of a deep learning-based chip power supply network fast Bump current estimation method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a lightweight convolutional neural network construction according to an embodiment of the present invention;
FIG. 3 is a distribution diagram of the error between the predicted result and the calculated result of the commercial simulation software in a specific scenario;
fig. 4 is a comparison graph of error results with and without data augmentation in a specific scenario.
Detailed description of the invention
The invention is described in further detail below with reference to the figures and specific examples.
The invention provides a chip power supply network fast Bump current estimation method based on deep learning, comprising the following steps:
(1) generating power grids with different bottom-layer current excitations and different decoupling capacitor densities and positions as neural network training samples: the positions of the Bumps and the current sources are extracted and then grouped, and ports exceeding a threshold number in each group are removed, the threshold being selected according to the required calculation speed and hardware processing capacity; the magnitudes of the bottom-layer current excitations and the densities and positions of the decoupling capacitors are randomized, and corresponding power supply network simulation files are generated.
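As a concrete illustration of this sample-generation step, the following sketch randomly distributes a fixed total power over a bottom-layer grid. The grid size, total power, and uniform-random assignment are assumptions for illustration (matching the 32×32 tiling described in the training data acquisition module below); the real flow would additionally randomize decoupling-capacitor density and positions and emit a simulation file for the EDA tool.

```python
import random

def random_current_map(size=32, total_power=1.0, seed=None):
    """Randomly assign power to each grid tile while keeping the total constant.

    size, total_power, and the uniform-random distribution are illustrative
    assumptions, not values specified by the method itself.
    """
    rng = random.Random(seed)
    # Draw a raw random value per tile, then rescale so the sum is total_power.
    raw = [[rng.random() for _ in range(size)] for _ in range(size)]
    scale = total_power / sum(sum(row) for row in raw)
    return [[v * scale for v in row] for row in raw]

grid = random_current_map(size=32, total_power=2.5, seed=42)
print(abs(sum(sum(row) for row in grid) - 2.5) < 1e-9)  # total power preserved
```

Each such map, paired with the simulated Bump current response, forms one training sample.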
(2) expanding the training samples generated in step (1) by exploiting the linearity (homogeneity) and superposition of the power grid, thereby achieving data augmentation: acquiring the training data of the neural network, randomly assigning different weights to different training data, and linearly combining them to generate new training data. Specifically, since the power grid is a linear time-invariant system, it satisfies the superposition principle. Under linear superposition of different bottom-layer current excitations, the current at each Bump terminal satisfies the corresponding linear superposition relation:

f(I_source) = I_bump

f(k*I_source,a + I_source,b) = k*I_bump,a + I_bump,b

where I_source represents the distribution of the current excitation, I_bump represents the Bump current response, f represents the mapping between the two, k represents a constant coefficient, I_source,a and I_source,b represent the current excitation distributions in cases a and b, respectively, and I_bump,a and I_bump,b represent the Bump current responses in cases a and b, respectively.
(3) constructing a convolutional neural network whose input is the distribution of current excitation represented as a two-dimensional matrix and whose output is the Bump current response represented as a one-dimensional vector, and training the constructed neural network with the final neural network training set obtained in step (2). The constructed convolutional neural network has a six-layer structure: layers 1 to 4 each comprise a convolutional layer, a batch normalization unit, a LeakyReLU activation layer, and a pooling layer connected in sequence; the fifth layer is a fully connected layer performing a nonlinear transformation; and finally an output layer is connected.
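The dimensional flow through this six-layer structure can be sketched as follows. Kernel sizes, padding, and channel counts are assumptions not given in the text; only the two-dimensional input, the four Conv/BatchNorm/LeakyReLU/Pool stages, and the fully connected stage come from the description (the 32×32 input follows the bottom-layer tiling used in the embodiment).

```python
def conv_pool_shape(h, w, pool=2):
    """A 'same'-padded convolution keeps h x w; a pool x pool pooling halves it."""
    return h // pool, w // pool

# Assumed 32x32 current-excitation map as input (one pixel per bottom-layer tile).
h, w = 32, 32
for layer in range(4):          # layers 1-4: Conv -> BatchNorm -> LeakyReLU -> Pool
    h, w = conv_pool_shape(h, w)

channels = 64                   # assumed channel count after the last conv stage
flattened = h * w * channels    # width fed into the fully connected (5th) layer
print((h, w), flattened)
```

With these assumptions the feature map shrinks 32→16→8→4→2 across the four stages before being flattened for the fully connected layer.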
The loss function is a linear combination of L1Loss, PeakLoss and ZeroConstraint, calculated as follows:

Loss = L1Loss + α*PeakLoss + β*ZeroConstraint

L1Loss = (1/n) * Σ_{m=1..n} |target_m - prediction_m|

PeakLoss = (T_i - P_k)^2 + (T_j - P_l)^2 + |(k - i)(T_i - P_i)(T_k - P_k)| + |(l - j)(T_j - P_j)(T_l - P_l)|

T_i = max(target), T_j = min(target), P_k = max(prediction), P_l = min(prediction)

ZeroConstraint = sum(prediction)

where Loss denotes the total loss function used in training; L1Loss denotes the mean absolute error between the predicted and actual values; n denotes the length of the prediction vector, i.e., the number of Bumps; PeakLoss denotes the error between the extrema of the predictions and those of the label; α denotes the weight of PeakLoss in the total loss function; ZeroConstraint denotes the deviation of the sum of the predicted values from 0; β denotes the weight of ZeroConstraint in the total loss function; T_i and T_j denote the maximum and minimum values of the label, with i and j their indices; P_k and P_l denote the maximum and minimum predicted values, with k and l their indices; target denotes the label and prediction denotes the predicted value.
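The loss described above can be sketched in plain Python as below. This is a minimal, unoptimized illustration (training code would use tensor operations); treating ZeroConstraint as the absolute value of the prediction sum is an assumption based on its description as an error between the sum of the predicted values and 0, and the α and β values are placeholder hyperparameters.

```python
def l1_loss(target, prediction):
    n = len(target)  # number of Bumps
    return sum(abs(t - p) for t, p in zip(target, prediction)) / n

def peak_loss(target, prediction):
    # i, j: indices of the label's max/min; k, l: indices of the prediction's max/min.
    i = max(range(len(target)), key=lambda m: target[m])
    j = min(range(len(target)), key=lambda m: target[m])
    k = max(range(len(prediction)), key=lambda m: prediction[m])
    l = min(range(len(prediction)), key=lambda m: prediction[m])
    T, P = target, prediction
    return ((T[i] - P[k]) ** 2 + (T[j] - P[l]) ** 2
            + abs((k - i) * (T[i] - P[i]) * (T[k] - P[k]))
            + abs((l - j) * (T[j] - P[j]) * (T[l] - P[l])))

def zero_constraint(prediction):
    # Deviation of the predicted-current sum from 0 (absolute value is an assumption).
    return abs(sum(prediction))

def total_loss(target, prediction, alpha=0.1, beta=0.01):
    # alpha and beta are placeholder weights, not values from the method.
    return (l1_loss(target, prediction)
            + alpha * peak_loss(target, prediction)
            + beta * zero_constraint(prediction))

# A perfect, zero-sum prediction incurs zero loss from all three terms.
target = [1.0, -1.0, 0.5, -0.5]
print(total_loss(target, list(target)))
```

The PeakLoss terms penalize both the value gap between the extrema and, via the (k − i) and (l − j) factors, mismatches in where those extrema occur.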
(4) calculating the time-domain current response of the Bumps in a static scenario through the trained convolutional neural network.
As shown in fig. 1, the invention further provides a chip power supply network fast Bump current estimation system based on deep learning, comprising a power supply network generation module, a training data acquisition module, a lightweight convolutional neural network module and a Bump current calculation module.
The power supply network generation module is used for extracting the positions of the current sources and the top-layer Bumps from a design exchange file, generating power supply networks with different bottom-layer excitations and different decoupling capacitor densities and positions, and automatically generating a chip power consumption model (CPM) and the files required in the process on demand, including an automatically generated top-level constraint file (GSR), a command-line execution file (TCL), configuration files for the quiescent current and capacitance of each module, position files (PLOC) for the current sources and Bumps, and a regional power consumption distribution file (BPA).
The configuration files for the quiescent current and capacitance of each module specify information such as each module's load capacitance, characteristic impedance, and current response in each state, so that the EDA software can generate the corresponding current modules.
The current source and Bump location files are extracted from the design exchange file. Due to computing power limitations, the CPM cannot have too many ports. During extraction, the current sources and Bumps are first grouped, and each group is merged into one pair of CPM ports. Even so, the number of current sources remains too high, and direct computation would make CPM generation too expensive. By specifying a maximum port count per group and selecting representative ports within each group, CPM computation time can be effectively reduced.
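A minimal sketch of this grouping-and-capping step is below. The tiling scheme, tile size, and per-group cap are illustrative assumptions; the text only specifies that ports are grouped and that each group's port count is limited by a chosen maximum.

```python
from collections import defaultdict

def group_ports(ports, tile_size=4, max_per_group=8):
    """Group (x, y) port coordinates into square tiles, then keep at most
    max_per_group ports per tile to bound the CPM port count.

    tile_size and max_per_group are tunable assumptions; the selection of
    'representative' ports is simplified here to simple truncation.
    """
    groups = defaultdict(list)
    for x, y in ports:
        groups[(x // tile_size, y // tile_size)].append((x, y))
    return {key: members[:max_per_group] for key, members in groups.items()}

ports = [(x, y) for x in range(8) for y in range(8)]  # 64 hypothetical ports
grouped = group_ports(ports, tile_size=4, max_per_group=8)
print(len(grouped), max(len(m) for m in grouped.values()))
```

Capping each group bounds the CPM size regardless of how many raw current sources the design contains.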
The regional power consumption distribution file assigns each user-defined current module and decoupling capacitor module to the corresponding position in the circuit.
The training data acquisition module generates the training data for the compact convolutional neural network. The bottom layer is viewed as a 32 × 32 image, power consumption is randomly assigned to each pixel, and the total power consumption is kept constant. A script is generated automatically, and the static simulation of a commercial tool is used to obtain each top-layer Bump current output vector as a label. Commercial tools are highly accurate, but for each different form of excitation current the whole circuit must be re-simulated, and as the circuit scale grows the computational cost rises steeply. Since the power supply network is a linear time-invariant system, its input and output satisfy the superposition principle, so the original training data can generate new training data through different linear combinations. A designer therefore needs to extract only a small amount of original training data to obtain a large number of realistic training sample-label pairs:
Data enhancement is performed in two ways:
1) Each time a sample is drawn during training, it is, with probability 0.5, multiplied by a random value between 1 and 2 (the label is scaled by the same factor).
2) The data set is expanded directly: each sample-label pair is multiplied by values between 0 and 2 to generate three new sample-label pairs, and two adjacent sets of data are added to generate a further new sample-label pair.
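Because the power grid is linear and time-invariant, a weighted combination of excitation images is matched by the same weighted combination of their Bump-current labels. A minimal NumPy sketch of the two augmentation modes above (array shapes and random ranges follow the text; the function names and the 70-Bump example are illustrative):

```python
import numpy as np

def augment_scale(sample, label, rng):
    """Mode 1: with probability 0.5, scale a sample-label pair by a
    random factor in [1, 2]; linearity keeps the pair consistent."""
    if rng.random() < 0.5:
        w = rng.uniform(1.0, 2.0)
        return w * sample, w * label
    return sample, label

def augment_expand(samples, labels, rng):
    """Mode 2: expand the data set by scaling each pair with three random
    factors in [0, 2] and by adding adjacent pairs (superposition)."""
    new_s, new_l = [], []
    for s, l in zip(samples, labels):
        for w in rng.uniform(0.0, 2.0, size=3):
            new_s.append(w * s)
            new_l.append(w * l)
    for (s1, l1), (s2, l2) in zip(zip(samples, labels),
                                  zip(samples[1:], labels[1:])):
        new_s.append(s1 + s2)
        new_l.append(l1 + l2)
    return np.array(new_s), np.array(new_l)

rng = np.random.default_rng(0)
samples = rng.random((5, 32, 32))   # 5 excitation images
labels = rng.random((5, 70))        # 5 Bump-current vectors (70 Bumps assumed)
xs, ys = augment_expand(samples, labels, rng)
```

Five original pairs expand to nineteen (three scaled copies per pair plus four adjacent sums), with no additional circuit simulation.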
The lightweight convolutional neural network module constructs a compact convolutional neural network comprising convolutional layers, batch normalization layers, pooling layers, activation layers, fully connected layers and a final output layer. The network has a six-layer structure: layers 1 to 4 each comprise a convolutional layer, a batch normalization unit, a LeakyReLU activation layer and a pooling layer connected in sequence; the fifth layer consists of fully connected layers performing nonlinear transformations, whose vector dimensions are 4096 and 768 in turn; finally an output layer is attached. In the compact convolutional neural network model, the output size of each convolutional layer is computed as follows:
output_conv = (input_conv + 2 × pad − kernel) / stride + 1

where output_conv is the three-dimensional size (length, width and depth of the image) of the output image data of each convolutional layer, input_conv is the three-dimensional size of the input image, pad denotes the number of pixels padded around the image, kernel is the three-dimensional size of the convolution kernel, and stride is the step size of the convolution kernel.
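The output-size relation can be checked numerically; a small helper (name illustrative) reproduces the feature-map sizes quoted later for the 32 × 32 input:

```python
def conv_out(size, kernel, pad, stride):
    """Spatial output size of a convolution (or pooling) layer."""
    return (size + 2 * pad - kernel) // stride + 1

# A 3x3 convolution with pad 1, stride 1 keeps a 32 x 32 map at 32 x 32,
# and a 2x2 pooling with stride 2 halves it to 16 x 16.
print(conv_out(32, kernel=3, pad=1, stride=1))  # 32
print(conv_out(32, kernel=2, pad=0, stride=2))  # 16
```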
For each convolutional layer, a batch normalization operation is used to accelerate the convergence and improve the stability of the network; the batch normalization operation is:
Output = γ × (Input − μ) / √(σ + ε) + β

where Input is each batch of input data, (Input − μ)/√(σ + ε) is the normalized data, Output is the batch data output by the batch normalization operation, μ and σ are respectively the mean and variance of each batch of data, γ and β are respectively the scaling and shift variables, and ε is a small constant added to improve training stability;
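A NumPy sketch of the batch normalization step described above, with σ taken as the per-batch variance as in the text (the γ, β and ε values here are illustrative):

```python
import numpy as np

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a batch to zero mean and unit variance, then scale and shift."""
    mu = batch.mean()
    sigma = batch.var()               # sigma is the variance, as in the text
    normalized = (batch - mu) / np.sqrt(sigma + eps)
    return gamma * normalized + beta

batch = np.array([1.0, 2.0, 3.0, 4.0])
out = batch_norm(batch, gamma=2.0, beta=0.5)
```

After the operation the batch mean equals β and the batch standard deviation is (up to ε) equal to γ, which keeps activations in a stable range from layer to layer.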
the activating function connected with each convolutional layer selects a LeakyRelu function, the training period can be shortened, and the calculation mode of the LeakyRelu function is as follows:
Output_activation = Input_activation,      if Input_activation ≥ 0
Output_activation = Input_activation / α,  if Input_activation < 0

where Input_activation is the input data of the LeakyReLU function, Output_activation is the output data of the LeakyReLU function, and α is a fixed parameter.
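The activation as written above (negative inputs divided by α, rather than multiplied by a small slope) can be sketched as follows; the value α = 10 is an illustrative choice, not stated in the text:

```python
import numpy as np

def leaky_relu(x, alpha=10.0):
    """LeakyReLU as defined in the text: x if x >= 0, else x / alpha."""
    return np.where(x >= 0, x, x / alpha)

x = np.array([-5.0, 0.0, 3.0])
y = leaky_relu(x)
```

Unlike a plain ReLU, negative inputs keep a small nonzero gradient (1/α), which helps avoid dead units and shortens training.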
As shown in fig. 2, the size of the input image block is 1 × 32 × 32, where 32 × 32 is the length and width of the image block and 1 is its number of channels. After the 3 × 3 convolution, batch normalization and LeakyReLU of layer 1, the feature size is 64 × 32 × 32; a 2 × 2 pooling operation then yields a feature map of size 64 × 16 × 16. With the same operations, the output sizes of layers 2 to 4 are 128 × 16 × 16, 256 × 8 × 8 and 512 × 4 × 4. A high-dimensional feature of size 1024 × 2 × 2 is obtained through the fifth pooling stage and passed into the multi-layer perceptron (MLP) formed by the fully connected layers for regression; the vector dimensions of the fully connected layers are 1 × 1 × 4096, 1 × 1 × 768 and 1 × 1 × (output dimension) in turn. A dropout layer with p = 0.5 is used between the fully connected layers to reduce the number of active network parameters and prevent overfitting.
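The layer-by-layer feature sizes quoted above can be traced with a simple shape calculation. This is pure bookkeeping, not a trained model; it assumes 3 × 3 convolutions with padding 1 (size-preserving) and 2 × 2 pooling with stride 2 between stages, and it models the 1024 × 2 × 2 feature as a fifth channel-doubling stage so that the flattened dimension 1024 · 2 · 2 = 4096 matches the first fully connected layer:

```python
def trace_shapes(size=32, stage_channels=(64, 128, 256, 512, 1024)):
    """Trace conv-output shapes: a 3x3 conv with pad 1 keeps the spatial
    size, and a 2x2 pooling with stride 2 halves it before the next stage."""
    shapes = []
    for c in stage_channels:
        shapes.append((c, size, size))  # conv output at this stage
        size //= 2                      # pooling before the next stage
    return shapes

shapes = trace_shapes()
flat = shapes[-1][0] * shapes[-1][1] * shapes[-1][2]
print(shapes)  # ends at (1024, 2, 2)
print(flat)    # 4096, the first fully connected dimension
```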
In model training, the loss function is a linear combination of L1Loss, PeakLoss and ZeroConstraint, computed as follows:
Loss=L1Loss+α×PeakLoss+β×ZeroConstraint
PeakLoss = (T_i − P_k)² + (T_j − P_l)² + |(k − i)(T_i − P_i)(T_k − P_k)| + |(l − j)(T_j − P_j)(T_l − P_l)|
T_i = max(target), T_j = min(target), P_k = max(prediction), P_l = min(prediction)
ZeroConstraint = sum(prediction)

where i, j, k and l are the indices at which these extrema occur.
where L1Loss represents the mean absolute error between each predicted value f(x_i) and the actual value y_i, PeakLoss represents the error between the maximum and minimum predicted values and the corresponding actual values, and ZeroConstraint represents the deviation of the sum of the predicted values from 0. The weight parameter θ is updated using standard stochastic gradient descent (SGD), whose formula is:

θ_{k+1} = θ_k − η × ∂Loss/∂θ_k
where η is the learning rate and θ_k is the weight parameter at the k-th iteration.
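A NumPy sketch of the combined loss defined above. The α and β weights are illustrative values; the indices i, j, k, l are taken as the argmax/argmin positions of target and prediction, as the subscripts imply, and the ZeroConstraint term is applied through its absolute value so it penalizes deviation in either direction:

```python
import numpy as np

def total_loss(prediction, target, alpha=1.0, beta=0.1):
    """Loss = L1Loss + alpha * PeakLoss + beta * ZeroConstraint."""
    l1 = np.mean(np.abs(prediction - target))
    i, j = np.argmax(target), np.argmin(target)          # T_i = max, T_j = min
    k, l = np.argmax(prediction), np.argmin(prediction)  # P_k = max, P_l = min
    t, p = target, prediction
    peak = ((t[i] - p[k]) ** 2 + (t[j] - p[l]) ** 2
            + abs((k - i) * (t[i] - p[i]) * (t[k] - p[k]))
            + abs((l - j) * (t[j] - p[j]) * (t[l] - p[l])))
    zero = abs(np.sum(prediction))    # sum of Bump currents should be ~0
    return l1 + alpha * peak + beta * zero

target = np.array([1.0, -1.0, 0.5, -0.5])  # Bump currents summing to 0
loss = total_loss(target.copy(), target)   # perfect prediction
```

The PeakLoss terms push the predicted extrema to match the true extrema both in value and in position, which matters because the peak Bump currents dominate power-integrity analysis.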
The Bump current calculation module computes the time-domain current response of the Bumps in a static scene in three main steps: (1) determine the magnitude of each bottom-layer current excitation; (2) input the current excitation into the trained convolutional neural network; (3) obtain each Bump current output value.
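The three inference steps can be sketched end-to-end. The trained network is stood in for by a hypothetical fixed linear map (`dummy_model`), since only the data flow, not the learned weights, is being illustrated; the 70-Bump output size follows the example given later:

```python
import numpy as np

N_BUMPS = 70
rng = np.random.default_rng(0)
W = rng.standard_normal((N_BUMPS, 32 * 32)) * 0.01  # stand-in for the trained CNN

def dummy_model(excitation_image):
    """Hypothetical stand-in: maps a 32x32 excitation image to N_BUMPS currents."""
    return W @ excitation_image.ravel()

# Step 1: determine the magnitude of each bottom-layer current excitation.
excitation = rng.random((32, 32))
# Step 2: feed the excitation into the (trained) network.
bump_currents = dummy_model(excitation)
# Step 3: read out each Bump current value.
print(bump_currents.shape)  # (70,)
```

A linear stand-in also makes the superposition property of the real system explicit: doubling the excitation doubles every Bump current.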
The method can rapidly extract the characteristics of the power supply network and accurately predict the Bump current response from the distribution of current excitations, with an error of less than 1% relative to the results of common commercial software. As shown in fig. 3, for a power grid example with 70 Bumps, 500 current excitations with different distributions were applied and results were computed with commercial software and with the present invention respectively; the statistics show that the average error is less than 1% and the maximum error is less than 5%.
The invention uses the data enhancement method judiciously, greatly reducing the cost of data acquisition while maintaining comparable accuracy. As shown in fig. 4, for the same power grid example, the control group used 4500 collected sample-label pairs without data enhancement, while the experimental group used only 50 collected sample-label pairs but exploited the linearity and superposition of the power grid: different training data were given random weights and linearly combined to generate new training data. The two approaches achieved similar accuracy.
The calculation speed of the invention is far higher than that of commercial software; according to actual tests, it can reach tens of times the speed of commercial software.
The above-described embodiments are intended to illustrate rather than to limit the invention, and any modifications and variations of the present invention are within the spirit of the invention and the scope of the appended claims.