FPGA layout method for realizing layout legalization by utilizing netlist local re-synthesis
1. An FPGA layout method for realizing layout legalization by using netlist local re-synthesis, characterized by comprising the following steps:
obtaining a user design and carrying out logic synthesis processing on the user design to obtain a corresponding global user netlist, wherein the global user netlist comprises a plurality of functional modules forming a hierarchical structure, and each functional module comprises a plurality of instances;
laying out the FPGA chip based on the global user netlist, and determining a target area on the FPGA chip, wherein the target area is an area whose layout result does not meet a layout target;
determining a functional module to be optimized in the current global user netlist according to the instances located in the target area in the current layout result;
carrying out logic synthesis processing on the functional module to be optimized again according to an optimization target corresponding to the layout target to obtain an optimized local netlist;
and updating the current global layout netlist by using the optimized local netlist, and re-executing, with the updated global layout netlist, the step of laying out the FPGA chip based on the global user netlist, until all areas of the FPGA chip meet the layout target.
2. The method according to claim 1, wherein the determining the functional module to be optimized in the current global user netlist according to the instance in the current layout result, which is located in the target region, comprises:
determining, among all functional modules of the current global user netlist, a functional module whose number of contained to-be-optimized instances meets a preset condition as the functional module to be optimized, wherein the to-be-optimized instances are the instances located in the target area.
3. The method according to claim 2, wherein the preset condition includes: the number of contained to-be-optimized instances reaches a first predetermined threshold, and/or the proportion of the contained to-be-optimized instances among all instances in the functional module reaches a second predetermined threshold.
4. The method of claim 2,
when at least two functional modules each contain a number of to-be-optimized instances that meets the preset condition, the functional module that meets the preset condition and has the highest proportion of contained to-be-optimized instances among all of its instances is selected as the functional module to be optimized, and/or the functional module that meets the preset condition and is located at a preset level of the hierarchical structure is selected as the functional module to be optimized.
5. The method of claim 2, further comprising:
if the number of the to-be-optimized instances contained in all the functional modules of the current global user netlist does not meet the preset condition, expanding the area range contained in the target area, and executing the step of determining the to-be-optimized functional module in the current global user netlist again according to the instances located in the target area in the current layout result by using the new target area.
6. The method according to any one of claims 1 to 5, wherein the layout target includes that the layout congestion degree of all areas on the FPGA chip is less than a congestion degree threshold, and the optimization target in the corresponding logic synthesis processing is to reduce the layout area of the functional module to be optimized;
the determining the target area on the FPGA chip includes: and determining an area which comprises a preset area range and has the layout congestion degree reaching a congestion degree threshold value on the FPGA chip as the target area.
7. The method according to any one of claims 1 to 5, wherein the layout objective includes that a predetermined path in the FPGA chip meets a corresponding timing requirement, and the optimization objective in the corresponding logic synthesis processing is to reduce the hierarchical structure of the functional module to be optimized and/or reduce the signal fan-out of the functional module to be optimized;
the determining the target area on the FPGA chip includes: and determining an area which comprises a predetermined area range and comprises a predetermined path which does not meet the corresponding timing requirement on the FPGA chip as the target area.
8. The method of claim 7, wherein when the FPGA chip is a multi-die FPGA structure comprising a plurality of interconnected FPGA dies, the predetermined paths within the FPGA chip further include paths between different FPGA dies along which cross-die signals are located.
9. The method according to any one of claims 1 to 5, wherein, under one layout result obtained by laying out the FPGA chip based on the global user netlist, one or more target areas are determined on the FPGA chip, and the layout targets corresponding to the plurality of target areas are all the same, or the layout targets corresponding to at least two target areas are different.
10. The method according to any one of claims 1 to 5, wherein the layout targets corresponding to the target regions determined under a plurality of layout results obtained by laying out the FPGA chip based on different global user netlists are all the same, or the layout targets corresponding to the target regions determined under at least two layout results are different.
11. The method according to any one of claims 1 to 5, characterized in that, when the functional module to be optimized is subjected to logic synthesis processing again, the registers and lookup tables in the functional module to be optimized are subjected to logic synthesis processing again.
12. The method according to any of claims 1-5, wherein updating the current global placement netlist with the optimized local netlist comprises:
replacing the pre-optimization instances in the current global layout netlist with the post-optimization instances and updating the connections among all instances, wherein the post-optimization instances are the instances contained in the optimized local netlist, and the pre-optimization instances are the instances contained in the functional module to be optimized.
13. The method according to any one of claims 1 to 5,
if the layout position of a pre-optimization instance under the previous layout result is a legal position, the post-optimization instance is designated to be placed at that layout position when the FPGA chip is laid out using the updated global layout netlist; otherwise, the post-optimization instance is designated to be placed at another legal position on the FPGA chip, wherein the post-optimization instances are the instances contained in the optimized local netlist, and the pre-optimization instances are the instances contained in the functional module to be optimized.
Background
A Field-Programmable Gate Array (FPGA) is a chip widely used in applications ranging from household appliances and large machinery to aerospace. Using an FPGA chip requires Electronic Design Automation (EDA) tools. Layout is an important stage in the EDA flow: it strongly affects both the running speed of the EDA tool itself and the final quality of the implemented circuit. In recent years, the circuit scale of FPGA chips has grown rapidly, making them more powerful but also posing challenges to the corresponding EDA tools.
The main function of layout is to map the instances in the user netlist, one by one and under an optimization target, to layout positions with actual physical coordinates on the FPGA chip. Analytic layout algorithms have become one of the mainstream directions for layout because they can quickly obtain a globally optimal solution by mathematical methods. However, the solution of an analytic layout algorithm often contains overlapping instances, whereas a legal layout requires that instances do not overlap. In general, the layout results of the first several solves contain a large amount of overlap, and the solution must be spread out over multiple iterations until the overlap is acceptably low. Even the last iteration often cannot eliminate overlap completely, so an analytic layout algorithm usually has to be combined with a legalization process to obtain a usable layout result. Current user netlists are often large, and many iterations are needed to spread out the instances in overcrowded areas. This leads to two common problems: if the spreading is too slow, the congestion degree decreases slowly and the iterative spreading takes a long time, reducing layout efficiency; if the spreading is excessive, the routing delay increases and the timing target cannot be met.
Disclosure of Invention
In view of the above problems and technical requirements, the invention provides an FPGA layout method for realizing layout legalization by using netlist local re-synthesis. The technical scheme of the invention is as follows:
an FPGA layout method for realizing layout legalization by using netlist local re-synthesis comprises the following steps:
acquiring a user design and carrying out logic synthesis processing on the user design to obtain a corresponding global user netlist, wherein the global user netlist comprises a plurality of functional modules forming a hierarchical structure, and each functional module comprises a plurality of instances;
laying out the FPGA chip based on the global user netlist, and determining a target area on the FPGA chip, wherein the target area is an area whose layout result does not meet a layout target;
determining a functional module to be optimized in the current global user netlist according to the instances located in the target area in the current layout result;
carrying out logic synthesis processing on the functional module to be optimized again according to the optimization target corresponding to the layout target to obtain an optimized local netlist;
and updating the current global layout netlist by using the optimized local netlist, and re-executing the step of laying out the FPGA chip based on the global user netlist by using the updated global layout netlist until all regions of the FPGA chip meet the layout target.
According to a further technical scheme, determining the functional module to be optimized in the current global user netlist according to the instances located in the target area in the current layout result comprises:
determining, among all functional modules of the current global user netlist, a functional module whose number of contained to-be-optimized instances meets a preset condition as the functional module to be optimized, wherein the to-be-optimized instances are the instances located in the target area.
According to a further technical scheme, the preset condition includes: the number of contained to-be-optimized instances reaches a first predetermined threshold, and/or the proportion of the contained to-be-optimized instances among all instances in the functional module reaches a second predetermined threshold.
According to a further technical scheme, when at least two functional modules each contain a number of to-be-optimized instances that meets the preset condition, the functional module that meets the preset condition and has the highest proportion of contained to-be-optimized instances among all of its instances is selected as the functional module to be optimized, and/or the functional module that meets the preset condition and is located at a preset level of the hierarchical structure is selected as the functional module to be optimized.
According to a further technical scheme, the method further comprises:
if none of the functional modules of the current global user netlist contains a number of to-be-optimized instances that meets the preset condition, expanding the area range of the target area, and re-executing, with the new target area, the step of determining the functional module to be optimized in the current global user netlist according to the instances located in the target area in the current layout result.
According to a further technical scheme, the layout target includes that the layout congestion degree of all areas on the FPGA chip is less than a congestion degree threshold, and the optimization target in the corresponding logic synthesis processing is to reduce the layout area of the functional module to be optimized;
determining the target area on the FPGA chip includes: determining an area on the FPGA chip that covers a predetermined area range and whose layout congestion degree reaches the congestion degree threshold as the target area.
According to a further technical scheme, the layout target includes that a predetermined path in the FPGA chip meets a corresponding timing requirement, and the optimization target in the corresponding logic synthesis processing is to reduce the hierarchical structure of the functional module to be optimized and/or reduce the signal fan-out of the functional module to be optimized;
determining the target area on the FPGA chip includes: determining an area on the FPGA chip that covers a predetermined area range and contains a predetermined path that does not meet the corresponding timing requirement as the target area.
According to a further technical scheme, when the FPGA chip is a multi-die FPGA structure comprising a plurality of interconnected FPGA dies, the predetermined paths in the FPGA chip further include paths carrying cross-die signals between different FPGA dies.
According to a further technical scheme, under one layout result obtained by laying out the FPGA chip based on the global user netlist, one or more target areas are determined on the FPGA chip, and the layout targets corresponding to the plurality of target areas are all the same, or the layout targets corresponding to at least two target areas are different.
According to a further technical scheme, the layout targets corresponding to the target areas determined under a plurality of layout results obtained by laying out the FPGA chip based on different global user netlists are all the same, or the layout targets corresponding to the target areas determined under at least two layout results are different.
According to a further technical scheme, when the functional module to be optimized is subjected to logic synthesis processing again, the registers and lookup tables in the functional module to be optimized are subjected to logic synthesis processing again.
According to a further technical scheme, updating the current global layout netlist by using the optimized local netlist comprises:
replacing the pre-optimization instances in the current global layout netlist with the post-optimization instances and updating the connections among all instances, wherein the post-optimization instances are the instances contained in the optimized local netlist, and the pre-optimization instances are the instances contained in the functional module to be optimized.
According to a further technical scheme, if the layout position of a pre-optimization instance under the previous layout result is a legal position, the post-optimization instance is designated to be placed at that layout position when the FPGA chip is laid out using the updated global layout netlist; otherwise, the post-optimization instance is designated to be placed at another legal position on the FPGA chip, wherein the post-optimization instances are the instances contained in the optimized local netlist, and the pre-optimization instances are the instances contained in the functional module to be optimized.
The beneficial technical effects of the invention are as follows:
the method selects a target area according to the layout target, uses the instances in the target area to select a functional module to be optimized at a suitable level of the hierarchy, and performs logic synthesis on that functional module alone, that is, it applies an incremental optimization update to the global user netlist. The target area can therefore be spread out quickly to reach the layout target while the amount of processing per iteration is reduced, which improves the efficiency of spreading out local areas during layout.
Drawings
FIG. 1 is a flowchart of the FPGA layout method of the present application.
FIG. 2 is a schematic diagram of a hierarchical structure of a global user netlist in an example of the present application.
FIG. 3 is a flow diagram that illustrates the determination of a functional module to be optimized, according to an embodiment.
Detailed Description
The following further describes the embodiments of the present invention with reference to the drawings.
The application discloses an FPGA layout method for realizing layout legalization by using netlist local re-synthesis. With reference to FIG. 1, the method comprises the following steps:
Step 1, obtaining a user design and carrying out logic synthesis processing on the user design to obtain a corresponding global user netlist. Logic synthesis processing mainly comprises four processes, namely reading, translating, optimizing and mapping. It is a conventional part of developing an FPGA chip with EDA tools, so its specific operation is not elaborated here.
The global user netlist comprises a plurality of functional modules forming a hierarchical structure, and this hierarchical structure is defined in the user design that is read in (an RTL-level description file). The hierarchical structure is usually a tree formed by the calling relations among the functional modules, and its lowest layer consists of instances. Going from the highest layer to the lowest layer, each functional module may or may not call one or more functional modules in the layer below. Each functional module therefore comprises a plurality of instances: it may directly contain lowest-layer instances and/or contain one or more functional modules in the layer below. A functional module that directly contains only lowest-layer instances and calls no other functional module is defined as a lowest-level functional module; any other functional module is one level above the highest level among the functional modules it calls.
For example, in the hierarchical structure of the global user netlist shown in FIG. 2, the lowest layer consists of instances A to K, and the hierarchical structure further includes functional modules V1 to V6. Functional modules V4, V5 and V6 each directly contain only lowest-layer instances and call no other functional module, so these three are lowest-level functional modules. Functional modules V4 and V5, which are called by functional module V2, are both lowest-level functional modules, so functional module V2 is at the second level from the bottom; the same holds for functional module V3. Functional module V1 is at the highest level of the hierarchical structure.
As shown in FIG. 2, functional module V3 calls the lower-level functional module V6 and directly contains the lowest-layer instance J, so the instances contained in functional module V3 are instance J and instance K in functional module V6. Functional module V5 calls no lower-level functional module and directly contains four lowest-layer instances, so the instances contained in functional module V5 are instances E, F, G and H. The other functional modules follow by analogy.
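The tree structure and the level rule described above can be illustrated with a minimal Python sketch. The class name `Module` and its methods are hypothetical. The placement of instances B, C, D (in V4) and I (directly in V1) is an assumption, since FIG. 2 is not reproduced here; the remaining memberships follow the text.

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    """A functional module: instances it contains directly plus modules it calls."""
    name: str
    instances: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def all_instances(self):
        # Instances contained directly, followed by those of every called module.
        result = list(self.instances)
        for child in self.children:
            result.extend(child.all_instances())
        return result

    def level(self):
        # A module that calls no other module is at level 1 (lowest level);
        # otherwise it is one level above the highest module it calls.
        if not self.children:
            return 1
        return 1 + max(child.level() for child in self.children)


# Hierarchy loosely following FIG. 2 (B, C, D in V4 and I in V1 are assumed).
v4 = Module("V4", ["B", "C", "D"])
v5 = Module("V5", ["E", "F", "G", "H"])
v6 = Module("V6", ["K"])
v2 = Module("V2", ["A"], [v4, v5])
v3 = Module("V3", ["J"], [v6])
v1 = Module("V1", ["I"], [v2, v3])

print(v3.all_instances(), v2.level(), v1.level())
```

Under these assumptions V3 resolves to instances J and K, V2 sits at the second level, and V1 at the third, which matches the description.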
In practical applications, the global user netlist needs to be flattened (Flatten) before it is suitable for subsequent layout, but the flattened global user netlist can still reflect the hierarchical structure: hierarchy information can be embedded in the instance names during flattening, so that the hierarchy of an instance can be recovered from its name. For example, instance A in FIG. 2 can be named V1/V2/InstA, where V1 and V2 indicate the hierarchy to which instance A (InstA) belongs.
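As an illustration of hierarchy-preserving flattening, the following sketch flattens a nested description into path-style names and recovers the enclosing modules from a name. The nested-dict representation and the function names `flatten` and `split_hierarchy` are assumptions made for this example; only the V1/V2/InstA naming form comes from the text.

```python
def flatten(tree, prefix=""):
    """Flatten a nested dict into path-style instance names.

    Keys mapping to a dict are functional modules; keys mapping to None
    are instances (leaves).
    """
    names = []
    for name, subtree in tree.items():
        path = f"{prefix}/{name}" if prefix else name
        if subtree is None:
            names.append(path)                    # leaf: an instance
        else:
            names.extend(flatten(subtree, path))  # inner node: a module
    return names


def split_hierarchy(flat_name):
    """Return (enclosing modules from top to bottom, instance name)."""
    *modules, instance = flat_name.split("/")
    return modules, instance


# Small excerpt of a hierarchy; only part of FIG. 2 is modelled here.
tree = {"V1": {"V2": {"InstA": None, "V5": {"InstE": None, "InstF": None}}}}
flat = flatten(tree)
print(flat)
print(split_hierarchy("V1/V2/InstA"))
```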
Step 2, laying out the FPGA chip based on the global user netlist to obtain a layout result for the whole chip, that is, mapping the instances in the global user netlist one by one to layout positions with actual physical coordinates on the FPGA chip. Any conventional layout method can be used, such as a conventional analytic layout algorithm.
After the layout is completed and the current layout result is obtained, a target area on the FPGA chip can be determined. The target area is an area whose layout result does not meet the layout target; the layout target is preset, and the size of the area range covered by the target area is also preset.
In the present application, the area range of the target area is delimited in units of the basic modules inside the FPGA, which mainly include CLBs (basic logic units), BRAMs, IOBs, DSPs, PCs, and the like. For example, the area range of the target area is 6 × 8 CLBs.
Step 3, determining a functional module to be optimized in the current global user netlist according to the instances located in the target area in the current layout result; please refer to the flowchart shown in FIG. 3.
In one embodiment, this step is implemented as follows: among all functional modules of the current global user netlist, a functional module whose number of contained to-be-optimized instances meets a preset condition is determined as the functional module to be optimized, where the to-be-optimized instances are the instances located in the target area. For example, in FIG. 2, the shaded instances A, B, D, E, F and G are the to-be-optimized instances located in the target area, and functional modules V4, V5, V2 and V1 all contain to-be-optimized instances. The to-be-optimized instances contained in each functional module are as shown in the table below, and the functional module to be optimized is then selected according to the number of to-be-optimized instances contained in functional modules V4, V5, V2 and V1.
Optionally, the number of to-be-optimized instances contained in a functional module is determined to meet the preset condition when that number reaches a first predetermined threshold, and/or when the proportion of to-be-optimized instances among all instances in the functional module reaches a second predetermined threshold. For example, if the first predetermined threshold is set to 4 and the second predetermined threshold is set to 0.7, then by the data in the table above, the number of to-be-optimized instances contained in functional module V2 meets the preset condition. If the first predetermined threshold is set to 2 and the second predetermined threshold is set to 0.7, then by the data in the table above, the numbers of to-be-optimized instances contained in functional module V5 and functional module V2 both meet the preset condition.
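The threshold check can be sketched as follows, working directly on flattened instance names. This is an illustration under stated assumptions: the function names are hypothetical, modules are identified by their path prefix, the membership of B, C, D (in V4) and I (directly in V1) is assumed because FIG. 2 and the table are not reproduced here, and both thresholds are required by default (the "and" reading of "and/or").

```python
from collections import Counter

# Flattened instance names (memberships of B, C, D and I are assumed).
ALL_INSTANCES = [
    "V1/InstI", "V1/V2/InstA",
    "V1/V2/V4/InstB", "V1/V2/V4/InstC", "V1/V2/V4/InstD",
    "V1/V2/V5/InstE", "V1/V2/V5/InstF", "V1/V2/V5/InstG", "V1/V2/V5/InstH",
    "V1/V3/InstJ", "V1/V3/V6/InstK",
]
# Shaded instances A, B, D, E, F, G: those placed inside the target area.
IN_TARGET = {
    "V1/V2/InstA", "V1/V2/V4/InstB", "V1/V2/V4/InstD",
    "V1/V2/V5/InstE", "V1/V2/V5/InstF", "V1/V2/V5/InstG",
}


def module_stats(all_instances, in_target):
    """Map each module path to (to-be-optimized count, proportion of its instances)."""
    total, hit = Counter(), Counter()
    for name in all_instances:
        modules = name.split("/")[:-1]
        for depth in range(1, len(modules) + 1):
            module = "/".join(modules[:depth])   # every enclosing module counts it
            total[module] += 1
            if name in in_target:
                hit[module] += 1
    return {m: (hit[m], hit[m] / total[m]) for m in total}


def candidates(stats, min_count, min_ratio, require_both=True):
    """Modules meeting the preset condition; require_both selects 'and' versus 'or'."""
    combine = all if require_both else any
    return sorted(
        m for m, (count, ratio) in stats.items()
        if combine((count >= min_count, ratio >= min_ratio))
    )


stats = module_stats(ALL_INSTANCES, IN_TARGET)
print(candidates(stats, 4, 0.7))
print(candidates(stats, 2, 0.7))
```

With these assumed memberships, thresholds (4, 0.7) select only V2 and thresholds (2, 0.7) select V2 and V5, consistent with the two examples in the text.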
When at least two functional modules each contain a number of to-be-optimized instances that meets the preset condition, the functional module that meets the preset condition and has the highest proportion of to-be-optimized instances among all of its instances is selected as the functional module to be optimized, and/or the functional module that meets the preset condition and is located at a preset level of the hierarchical structure is selected. The preset level is predefined: for example, it is usually chosen as the highest level among all functional modules that meet the preset condition, or as the lowest level among them. In the example above, with a first predetermined threshold of 2 and a second predetermined threshold of 0.7, functional module V5 and functional module V2 both meet the preset condition and have equal proportions of to-be-optimized instances, so functional module V5, which is at the lower level, may be selected as the functional module to be optimized.
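One possible way to combine the two selection rules is to take the highest proportion first and use the preset level only to break ties. The sketch below assumes this ordering, which the text does not mandate; the function name `pick_module` and the tuple layout are hypothetical.

```python
import math


def pick_module(cands, prefer="lowest"):
    """Choose one module from (name, proportion, level) tuples.

    The highest proportion wins; ties are broken by hierarchy level,
    either the lowest or the highest level among the tied modules.
    """
    best = max(ratio for _, ratio, _ in cands)
    tied = [c for c in cands if math.isclose(c[1], best)]
    choose = min if prefer == "lowest" else max
    return choose(tied, key=lambda c: c[2])[0]


# V2 and V5 both have proportion 0.75; V5 is at the lower level.
cands = [("V2", 0.75, 2), ("V5", 0.75, 1)]
print(pick_module(cands), pick_module(cands, prefer="highest"))
```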
If none of the functional modules of the current global user netlist contains a number of to-be-optimized instances that meets the preset condition, then optionally the area range of the target area is expanded and, with the new target area, the step of determining the functional module to be optimized in the current global user netlist according to the instances located in the target area in the current layout result is executed again; that is, the method returns to step 2 to determine a new target area, until a functional module to be optimized is found. For example, if the second predetermined threshold is set to 0.8, then by the data in the table above no functional module meets the preset condition. In that case the area range of the target area can be expanded, for example from 6 × 8 CLBs to 8 × 8 CLBs, and the functional module to be optimized is determined again. Additionally or alternatively, the first predetermined threshold and/or the second predetermined threshold are adjusted until a functional module to be optimized is found.
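The expand-and-retry behaviour can be sketched as a small loop. Everything named here is hypothetical: `select` stands for whatever routine returns a functional module for a given area (or `None` when no module meets the preset condition), the area is modelled as `(x, y, width, height)` in CLB units, and growing the shorter side by two CLBs is just one growth policy that reproduces the 6 × 8 to 8 × 8 example.

```python
def grow_shorter_side(region, step=2):
    """Enlarge the shorter side of an (x, y, width, height) area by `step` CLBs."""
    x, y, w, h = region
    return (x, y, w + step, h) if w <= h else (x, y, w, h + step)


def find_module_with_expansion(region, select, grow=grow_shorter_side, max_rounds=8):
    """Retry module selection on a growing target area.

    Returns (module, area used); module is None if nothing was found
    within max_rounds expansions.
    """
    for _ in range(max_rounds):
        module = select(region)
        if module is not None:
            return module, region
        region = grow(region)
    return None, region


# Toy selector: pretend a module only qualifies once the area is 8 CLBs wide.
toy_select = lambda region: "V2" if region[2] >= 8 else None
print(find_module_with_expansion((0, 0, 6, 8), toy_select))
```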
Step 4, carrying out logic synthesis processing on the functional module to be optimized again according to the optimization target corresponding to the layout target, to obtain an optimized local netlist.
After the functional module to be optimized has been re-synthesized, the instances it contains may change. The instances contained in the functional module before re-synthesis are called pre-optimization instances, and those contained after re-synthesis are called post-optimization instances. The optimized local netlist is the local netlist corresponding to the functional module to be optimized, so it contains the post-optimization instances.
Optionally, when the functional module to be optimized is re-synthesized, only the registers and lookup tables inside the basic logic units (CLBs) of the functional module are re-synthesized, and the other basic modules are not processed.
Step 5, updating the current global layout netlist by using the optimized local netlist, that is, combining the optimized local netlist with the remaining parts of the current global layout netlist that were not re-synthesized, to form the updated global layout netlist. Specifically, the pre-optimization instances in the current global layout netlist are replaced with the post-optimization instances, and the connections among all instances are updated according to the circuit structure. The step of laying out the FPGA chip based on the global user netlist is then re-executed with the updated global layout netlist, that is, steps 2 to 5 are repeated until all areas of the FPGA chip meet the layout target, meaning that no target area failing the layout target remains on the FPGA chip; subsequent detailed layout is then carried out to obtain the final layout result. When the updated global layout netlist is used for re-layout, it also needs to be flattened.
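The iteration over steps 2 to 5 can be summarized by a driver loop such as the one below. It is only a sketch: the placer, the target-area detector, the module selector, the re-synthesis call and the netlist splice are all passed in as hypothetical callables, because the text does not fix any particular tool interface, and the toy netlist (module name mapped to instance count) exists only to make the example runnable.

```python
def legalize_by_resynthesis(netlist, place, find_targets, pick_module,
                            resynthesize, splice, max_iters=20):
    """Repeat layout and local re-synthesis until no target area remains."""
    for _ in range(max_iters):
        placement = place(netlist)                    # step 2: layout
        targets = find_targets(placement)             # areas failing the layout target
        if not targets:
            return netlist, placement                 # all areas meet the target
        for region, goal in targets:
            module = pick_module(netlist, placement, region)   # step 3
            if module is None:
                continue
            local = resynthesize(netlist, module, goal)        # step 4
            netlist = splice(netlist, module, local)           # step 5
    raise RuntimeError("layout target not met within max_iters")


# Toy model: a netlist maps module name -> instance count, and the chip
# is "congested" while the total count exceeds its capacity.
CAPACITY = 5
start = {"V4": 3, "V5": 4}
final, _ = legalize_by_resynthesis(
    start,
    place=lambda n: dict(n),
    find_targets=lambda p: [("r0", "area")] if sum(p.values()) > CAPACITY else [],
    pick_module=lambda n, p, region: max(n, key=n.get),
    resynthesize=lambda n, m, goal: n[m] - 1,   # pretend re-synthesis saves one instance
    splice=lambda n, m, local: {**n, m: local},
)
print(final)
```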
When the updated global layout netlist is used for re-layout, if the layout position of a pre-optimization instance under the previous layout result is a legal position, the post-optimization instance is designated to be placed at that layout position; otherwise, the post-optimization instance is designated to be placed at another legal position on the FPGA chip.
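The position-reuse rule can be sketched as below. The text does not say how post-optimization instances are paired with pre-optimization positions, so this example simply hands out the legal previous positions in order and falls back to a list of spare legal sites; the function name, the `is_legal` predicate and the spare-site list are all assumptions made for illustration.

```python
def assign_seed_positions(post_instances, previous_positions, is_legal, spare_sites):
    """Give each post-optimization instance a designated legal position.

    Legal positions from the previous layout result are reused first;
    remaining instances receive other legal sites.
    """
    reusable = [pos for pos in previous_positions if is_legal(pos)]
    spare = [site for site in spare_sites if site not in reusable]
    seeds = {}
    for inst in post_instances:
        if reusable:
            seeds[inst] = reusable.pop(0)
        elif spare:
            seeds[inst] = spare.pop(0)
        else:
            raise ValueError("no legal site left for " + inst)
    return seeds


# Toy legality rule: a position is legal if it lies exactly on an integer site.
on_grid = lambda pos: all(float(c).is_integer() for c in pos)
previous = [(3, 4), (3.5, 4.2), (5, 1)]
seeds = assign_seed_positions(["n1", "n2", "n3"], previous, on_grid, [(7, 7)])
print(seeds)
```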
In the layout method, the layout targets are determined according to actual conditions, when the layout targets are different, the optimization targets used when the functional module to be optimized is subjected to logic synthesis processing again are also different, and the optimization targets correspond to the layout targets.
Optionally, in one embodiment, the layout target includes that the layout congestion degree of all areas on the FPGA chip is less than a congestion degree threshold. When the target area on the FPGA chip is determined, an area that covers the predetermined area range and whose layout congestion degree reaches the congestion degree threshold is determined as the target area. The predetermined area range is a preset area range, such as the 6 × 8 CLBs in the example above. The optimization target in the corresponding logic synthesis processing is to reduce the layout area of the functional module to be optimized. In this embodiment, after the functional module to be optimized is re-synthesized, the number of instances inside it is reduced, that is, there are fewer post-optimization instances than pre-optimization instances.
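A congestion-driven search for target areas might look like the following sliding-window sketch. The text does not define how the layout congestion degree is computed, so the ratio of placed instances to site capacity inside the window is an assumption, as are the function name and the grid representation.

```python
def congested_windows(demand, site_capacity, win_rows, win_cols, threshold):
    """Return top-left (row, col) of every window whose congestion reaches threshold.

    demand[r][c] is the number of instances the current layout result puts on
    site (r, c); congestion is taken as demand / capacity over the window
    (an assumed definition).
    """
    rows, cols = len(demand), len(demand[0])
    window_capacity = site_capacity * win_rows * win_cols
    hits = []
    for r in range(rows - win_rows + 1):
        for c in range(cols - win_cols + 1):
            used = sum(
                demand[i][j]
                for i in range(r, r + win_rows)
                for j in range(c, c + win_cols)
            )
            if used / window_capacity >= threshold:
                hits.append((r, c))
    return hits


# Toy 3 x 3 grid with overlap concentrated in the top-left corner.
demand = [
    [2, 2, 0],
    [2, 2, 0],
    [0, 0, 0],
]
print(congested_windows(demand, site_capacity=1, win_rows=2, win_cols=2, threshold=1.5))
```

For a real device the window would be the predetermined area range, for example 6 × 8 CLBs.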
Optionally, in another embodiment, the layout target includes that a predetermined path in the FPGA chip meets a corresponding timing requirement. When the target area on the FPGA chip is determined, an area that covers the predetermined area range and contains a predetermined path that does not meet the corresponding timing requirement is determined as the target area. The optimization target in the corresponding logic synthesis processing is to reduce the hierarchical structure of the functional module to be optimized and/or reduce the signal fan-out of the functional module to be optimized.
In the above embodiments, the predetermined path in the FPGA chip includes all paths or part of paths in the FPGA chip, and typically, the predetermined path in the FPGA chip includes a path with a special timing characteristic in the FPGA chip and/or a path with a special signal characteristic in the FPGA chip, and may be generally configured in advance. The timing requirements corresponding to different predetermined paths may be the same or different, and may also be generally preconfigured. For example, when a path with special timing characteristics is selected as the predetermined path, a critical path in the FPGA chip may be selected as the predetermined path. For example, when a path with special signal characteristics is selected as a predetermined path, a path for transmitting a clock signal in the FPGA chip may be selected as the predetermined path. The FPGA layout method of this embodiment may be applicable to a single-die FPGA structure or to a multi-die FPGA structure, where the multi-die FPGA structure includes a plurality of connected FPGA dies, and for the multi-die FPGA structure, when a path of a special signal feature is selected as a predetermined path, the predetermined path in the FPGA chip further includes a path where a cross-die signal between different FPGA dies is located.
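For the timing-driven case, one way to locate candidate target areas is to mark the fixed-size tiles touched by paths that miss their requirement. This is a sketch under assumptions: the chip is partitioned into tiles of the predetermined area range, each path record carries its own `delay` and `required` values, and the field and function names are hypothetical.

```python
def timing_target_tiles(paths, positions, tile_rows, tile_cols):
    """Return the tiles containing an instance of any path that misses timing.

    paths: list of dicts with keys "instances", "delay", "required".
    positions: instance name -> (row, col) from the current layout result.
    A tile is identified by (row // tile_rows, col // tile_cols).
    """
    tiles = set()
    for path in paths:
        if path["delay"] > path["required"]:      # timing requirement not met
            for inst in path["instances"]:
                row, col = positions[inst]
                tiles.add((row // tile_rows, col // tile_cols))
    return sorted(tiles)


paths = [
    {"instances": ["a", "b"], "delay": 5.2, "required": 4.0},  # e.g. a critical path
    {"instances": ["c"], "delay": 1.0, "required": 4.0},       # meets timing
]
positions = {"a": (2, 3), "b": (7, 9), "c": (20, 20)}
print(timing_target_tiles(paths, positions, tile_rows=6, tile_cols=8))
```

Cross-die paths in a multi-die structure could be handled the same way by including them in `paths` with their own timing requirement.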
As described above, the layout target in the FPGA layout method provided by the present application can take several different meanings. During the loop iteration of the FPGA layout method, either the same layout target is used throughout, or several different layout targets are used in combination. The different cases are described below:
In one embodiment, under one layout result obtained by laying out the FPGA chip based on the global user netlist, one or more target areas are determined on the FPGA chip, and either the layout targets corresponding to all target areas are the same or the layout targets corresponding to at least two target areas are different. That is, in this embodiment, one layout target or several different layout targets may be used within a single iteration. For example, after one layout result is obtained, it is determined that the layout congestion degree of target area A reaches the congestion degree threshold and that target area B contains a predetermined path that does not meet its corresponding timing requirement. Then, when logic synthesis processing is performed again on the functional module to be optimized determined from target area A, the layout area of that functional module is reduced; when logic synthesis processing is performed again on the functional module to be optimized determined from target area B, the hierarchical structure of that functional module is reduced and/or its signal fan-out is reduced.
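The use of different layout targets within a single iteration amounts to a lookup from each target area's violated layout target to the optimization goals handed to re-synthesis. The following Python sketch shows one possible form of that lookup. The enum `LayoutTarget`, the goal strings, and the function `plan_resynthesis` are hypothetical names, and merging the goals of a module that appears in several target areas is an assumption not stated in the text.

```python
from enum import Enum
from typing import Dict, List


class LayoutTarget(Enum):
    CONGESTION = "congestion"
    TIMING = "timing"


# Optimization goals per layout target, as described in the text.
# For TIMING the text allows either or both goals; both are listed here.
OPTIMIZATION_GOALS = {
    LayoutTarget.CONGESTION: ("reduce_area",),
    LayoutTarget.TIMING: ("reduce_hierarchy", "reduce_fanout"),
}


def plan_resynthesis(
    violations: Dict[str, LayoutTarget],
    modules_in_area: Dict[str, List[str]],
) -> Dict[str, List[str]]:
    """Map each functional module to be optimized to its re-synthesis goals.

    violations maps a target area name to the layout target it fails.
    modules_in_area maps a target area name to its modules to be optimized.
    ASSUMPTION: a module found in several target areas gets the union of goals.
    """
    plan: Dict[str, List[str]] = {}
    for area, target in violations.items():
        for module in modules_in_area[area]:
            goals = plan.setdefault(module, [])
            for goal in OPTIMIZATION_GOALS[target]:
                if goal not in goals:
                    goals.append(goal)
    return plan
```

For the example above, a module found from target area A would be planned with `reduce_area`, and a module found from target area B with `reduce_hierarchy` and `reduce_fanout`.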
In another embodiment, the layout targets corresponding to the target areas determined under a plurality of layout results, obtained by laying out the FPGA chip based on different global user netlists, are all the same, or the layout targets corresponding to the target areas determined under at least two layout results are different. That is, in this embodiment, different iterations use the same layout target or different layout targets. For example, every iteration may optimize target areas whose layout congestion degree reaches the congestion degree threshold. Alternatively, several iterations may first optimize target areas whose layout congestion degree reaches the congestion degree threshold, subsequent iterations may then optimize target areas containing a critical path that does not meet its timing requirement, and further iterations may then optimize target areas containing a path that carries a cross-die signal and does not meet its timing requirement.
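The staged use of different layout targets across iterations can be sketched as an outer loop over stages with an inner iteration loop per stage. This is only an illustrative sketch: the function `run_staged`, its two callbacks, and the `max_iters` safety bound are hypothetical, and the callbacks stand in for the layout, target-area determination, and re-synthesis steps of the method.

```python
from typing import Callable, List, Sequence, Tuple


def run_staged(
    stages: Sequence[str],
    find_violations: Callable[[str], List[object]],
    resynthesize_and_relayout: Callable[[str, List[object]], None],
    max_iters: int = 20,
) -> List[Tuple[str, int]]:
    """Iterate on one layout target at a time, in the given stage order.

    find_violations(stage) returns the target areas that currently fail the
    layout target of that stage; resynthesize_and_relayout(stage, areas)
    re-synthesizes the affected modules and lays out the chip again.
    Returns a log of (stage, number of target areas handled) per iteration.
    """
    log: List[Tuple[str, int]] = []
    for stage in stages:
        for _ in range(max_iters):  # hypothetical bound to guarantee exit
            areas = find_violations(stage)
            if not areas:
                break  # this layout target is met; move to the next stage
            resynthesize_and_relayout(stage, areas)
            log.append((stage, len(areas)))
    return log
```

A call such as `run_staged(["congestion", "critical_path", "cross_die"], ...)` would follow the order given in the example above. Because a later stage can disturb an earlier one, a complete flow would re-check all layout targets before terminating, consistent with iterating until all areas of the FPGA chip meet the layout target.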
What has been described above is only a preferred embodiment of the present application, and the present application is not limited to the above embodiment. It is to be understood that other improvements and variations that those skilled in the art can directly derive or conceive of without departing from the spirit and concept of the present application are to be considered as included within the scope of protection of the present application.