Table reconstruction method and device for table picture and related equipment
1. A table reconstruction method of a table picture is characterized by comprising the following steps:
acquiring a table picture, and identifying the character positions in the table picture to obtain a recognition result;
generating a text box according to the recognition result, and determining the central point of the text box;
connecting the central points of all the text boxes according to a preset triangular network extraction mode to obtain a triangular network corresponding to the table picture;
performing frequency statistics on the sides of each triangle appearing in the triangle network based on a preset edge outline extraction mode to obtain a statistical result, and obtaining an edge outline corresponding to the triangle network according to the statistical result;
performing main direction extraction on the edge outer frame based on a preset main direction extraction mode to obtain a main direction corresponding to the edge outer frame;
and carrying out topology analysis and electronization on the table picture based on the main direction to obtain a reconstructed table.
2. The method of claim 1, wherein generating a text box according to the recognition result and determining the center point of the text box comprises:
analyzing the recognition result by adopting a differentiable binarization processing algorithm to obtain a text box;
and based on a preset central point acquisition mode, carrying out central point extraction on each text box to obtain a central point corresponding to each text box.
3. The method according to claim 1, wherein after generating a text box according to the recognition result and determining the center point of the text box, the method further comprises:
and identifying the content of all the text boxes to obtain the character content corresponding to each text box.
4. The method according to claim 1, wherein the connecting the center points of all the text boxes according to a preset triangle network extraction manner to obtain the triangle network corresponding to the table picture comprises:
constructing an initial triangle by adopting a point-by-point insertion algorithm based on the central points corresponding to all the text boxes, wherein the initial triangle surrounds all the central points;
selecting one of the internal center points of the initial triangle as a new vertex;
if the circumscribed circle of the triangle formed by connecting the new vertex and any two vertexes of the initial triangle contains the initial triangle, deleting the common side of the circumscribed circle and the initial triangle to form a convex polygon;
and connecting the new vertex with each vertex of the convex polygon to obtain a middle triangle, taking the middle triangle as an initial triangle, returning to the step of selecting one of the remaining central points as the new vertex of the initial triangle, and continuing to execute until all the remaining central points are executed, so as to obtain a triangle network corresponding to the table picture.
5. The method according to claim 1, wherein the performing of the main direction extraction on the edge outline based on a preset main direction extraction manner includes:
performing gradient calculation on all edges of the edge outer frame to obtain a gradient value corresponding to each edge;
establishing a histogram based on the gradient values, wherein the histogram has two peaks in a horizontal direction and a vertical direction;
acquiring the character direction of the text box based on a preset character direction acquisition mode;
and determining two main directions of the edge outer frame based on the two peak values of the histogram in the horizontal direction and the vertical direction, and taking the main direction which is the same as the character direction as the horizontal main direction.
6. The method according to any one of claims 1 to 5, wherein the performing topology analysis and electronization on the table picture based on the main direction to obtain a reconstructed table comprises:
calculating the inclination degree of the table picture, and if the inclination degree is greater than a preset inclination value, performing angle adjustment operation on the table picture until the inclination degree of the table picture is less than or equal to the preset inclination value;
randomly selecting one side of the edge outer frame as a scanning side, respectively scanning based on the vertical main direction and the horizontal main direction, and if a straight line is scanned, generating a cell until all sides are scanned;
acquiring the center points of all the cells;
checking the cells based on a preset checking direction, if the distance between the center point of the cell and the center point of the text box corresponding to the cell is smaller than a preset distance value, combining two adjacent cells in the same direction as the preset checking direction until the checking in the vertical main direction and the horizontal main direction is finished, and obtaining a topological graph of the table picture;
and filling character content corresponding to the text box into the cells corresponding to the topological graph to obtain a reconstructed table.
7. The method of claim 1, wherein after the topology analyzing and the electronizing the table picture based on the main direction to obtain a reconstructed table, the method further comprises:
and performing confidence evaluation on the reconstruction table.
8. A table reconstruction device for a table picture, comprising:
the table picture acquisition module is used for acquiring a table picture and identifying the character position in the table picture to obtain an identification result;
the central point acquisition module is used for generating a text box according to the identification result and determining the central point of the text box;
the triangular network acquisition module is used for connecting the central points of all the text boxes according to a preset triangular network extraction mode to obtain a triangular network corresponding to the table picture;
the edge outer frame acquisition module is used for carrying out frequency statistics on the sides of each triangle appearing in the triangle network based on a preset edge outer frame extraction mode to obtain a statistical result, and obtaining an edge outer frame corresponding to the triangle network according to the statistical result;
the main direction obtaining module is used for extracting the main direction of the edge outer frame based on a preset main direction extracting mode to obtain a main direction corresponding to the edge outer frame;
and the reconstruction module is used for carrying out topology analysis and electronization on the table picture based on the main direction to obtain a reconstructed table.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements a table reconstruction method for a table picture according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, in which a computer program is stored, which, when being executed by a processor, implements a table reconstruction method for a table picture according to any one of claims 1 to 7.
Background
The table is an important carrier for expressing information and makes information easier to acquire and search. At present, table information is entered into computers mainly by manual entry or by scanning, where scanning is used when the table is presented as an unstructured digital file (such as a picture).
Both approaches have problems. Manual entry of table information is inefficient. Scanning improves entry efficiency, but when the table in the scanned picture contains cross-cell structures or other complex conditions, the prior art generally extracts the content of the table picture directly by rows or columns. Content belonging to different rows or columns is then easily assigned to the same row or column, which produces tables with blank cells, fails to reflect the table structure accurately and completely, and results in low recognition accuracy.
Therefore, conventional methods for electronically reconstructing tables suffer from low accuracy.
Disclosure of Invention
The embodiment of the invention provides a table reconstruction method and device of a table picture, computer equipment and a storage medium, and improves the accuracy of table reconstruction of the table picture.
A table reconstruction method of a table picture comprises the following steps:
acquiring a table picture, and identifying the character positions in the table picture to obtain a recognition result;
generating a text box according to the recognition result, and determining the central point of the text box;
connecting the central points of all the text boxes according to a preset triangular network extraction mode to obtain a triangular network corresponding to the table picture;
performing frequency statistics on the sides of each triangle appearing in the triangle network based on a preset edge outline extraction mode to obtain a statistical result, and obtaining an edge outline corresponding to the triangle network according to the statistical result;
performing main direction extraction on the edge outer frame based on a preset main direction extraction mode to obtain a main direction corresponding to the edge outer frame;
and carrying out topology analysis and electronization on the table picture based on the main direction to obtain a reconstructed table.
A table reconstruction device for a table picture, comprising:
the table picture acquisition module is used for acquiring a table picture and identifying the character position in the table picture to obtain an identification result;
the central point acquisition module is used for generating a text box according to the identification result and determining the central point of the text box;
the triangular network acquisition module is used for connecting the central points of all the text boxes according to a preset triangular network extraction mode to obtain a triangular network corresponding to the table picture;
the edge outer frame acquisition module is used for carrying out frequency statistics on the sides of each triangle appearing in the triangle network based on a preset edge outer frame extraction mode to obtain a statistical result, and obtaining an edge outer frame corresponding to the triangle network according to the statistical result;
the main direction obtaining module is used for extracting the main direction of the edge outer frame based on a preset main direction extracting mode to obtain a main direction corresponding to the edge outer frame;
and the reconstruction module is used for carrying out topology analysis and electronization on the table picture based on the main direction to obtain a reconstructed table.
A computer device includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements a table reconstruction method of the above table picture when executing the computer program.
A computer-readable storage medium, in which a computer program is stored, which, when executed by a processor, implements the above table reconstruction method for a table picture.
The table reconstruction method, the table reconstruction device, the computer equipment and the storage medium of the table picture, provided by the embodiment of the invention, are used for acquiring the table picture and identifying the character position in the table picture to obtain an identification result; generating a text box according to the recognition result, and determining the central point of the text box; connecting the central points of all the text boxes according to a preset triangular network extraction mode to obtain a triangular network corresponding to the table picture; performing frequency statistics on the sides of each triangle appearing in the triangle network based on a preset edge outline extraction mode to obtain a statistical result, and obtaining an edge outline corresponding to the triangle network according to the statistical result; performing main direction extraction on the edge outer frame based on a preset main direction extraction mode to obtain a main direction corresponding to the edge outer frame; and carrying out topology analysis and electronization on the table picture based on the main direction to obtain a reconstructed table. The method comprises the steps of identifying character positions in a table picture, determining all text boxes corresponding to the character positions and central points of the text boxes, extracting the central points of the text boxes on the basis of a triangular network, carrying out a series of topology analysis, and carrying out table reconstruction by using topology result information of the text boxes in the table picture, so that a topology base is provided for the table reconstruction, and the accuracy of the table reconstruction is effectively improved.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed to be used in the description of the embodiments of the present invention will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art that other drawings can be obtained according to these drawings without inventive labor.
FIG. 1 is a schematic diagram of an application environment of a table reconstruction method for a table picture according to an embodiment of the present invention;
FIG. 2 is a flowchart of a table reconstruction method for a table picture according to an embodiment of the present invention;
FIG. 3 is an exemplary diagram of a text box and a center point of the text box in the table reconstruction method for the table picture according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating an example of a triangle network of a table reconstruction method for a table picture according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating an example of an edge outer frame of a table reconstruction method for a table picture according to an embodiment of the present invention;
FIG. 6 is a diagram illustrating a table reconstructing apparatus for a table picture according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a computer device according to an embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The table reconstruction method for the table picture provided by the application can be applied to an application environment as shown in fig. 1, wherein a computer device communicates with a server through a network. The computer device may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices, among others. The server may be implemented as a stand-alone server or as a server cluster consisting of a plurality of servers.
In an embodiment, as shown in fig. 2, a table reconstructing method for a table picture is provided, which is described by taking the method applied to the server in fig. 1 as an example, and includes the following steps S101 to S106:
S101, acquiring a table picture, and identifying the character positions in the table picture to obtain a recognition result.
In step S101, the source of the table picture includes, but is not limited to, taking or scanning a paper document to obtain the table picture, crawling the table picture on the network, and converting the document into a picture to obtain the table picture. It is easy to understand that the table pictures acquired by these approaches often have an oblique angle, and it is not easy to reconstruct the table through the intersection points between lines. The table frames in the table picture include, but are not limited to, a simple table frame with rows and columns in order, and a table frame with merge cell operations on rows or columns.
The recognition results include, but are not limited to, character positions.
Methods for identifying character positions in the table picture include, but are not limited to, the CTPN text detection algorithm (Connectionist Text Proposal Network), morphological operations, and the maximally stable extremal regions (MSER) text detection algorithm.
Through the above step, the character positions in the table picture are detected to facilitate the subsequent text box generation. The character content can be recognized only after the region where the text is located, namely the text box, has been found, so that the character content corresponding to each character position can be filled in during table reconstruction.
And S102, generating a text box according to the recognition result, and determining the central point of the text box.
In step S102, it should be noted that the methods of generating the text box according to the recognition result include, but are not limited to, the DB algorithm (Differentiable Binarization) and the CTPN text detection algorithm (Connectionist Text Proposal Network).
Taking fig. 3 as an example, the text box is a rectangular box formed around the character content in each cell, the center point of the text box is a black point at the center in the text box in fig. 3, and the number of the text boxes is consistent with the number of the center points.
For step S102, it is specifically:
and analyzing the recognition result by adopting a differentiable binarization processing algorithm to obtain a text box.
And based on a preset central point acquisition mode, extracting the central point of each text box to obtain the central point corresponding to each text box.
The preset center point acquisition mode establishes a rectangular coordinate system for each text box, preferably with the upper-left corner of the text box as the origin. The mean X of the x coordinates of the upper-left and lower-right corners is taken as the x coordinate of the center point, and the mean Y of the y coordinates of the upper-left and lower-right corners is taken as the y coordinate of the center point, so that (X, Y) is the center point of the text box.
The coordinate information of each text box is obtained from the recognition result, the text boxes are generated, and the center point of each text box is located accurately, which improves the recognition precision of the text boxes and facilitates their topological analysis.
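The center point computation described above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the box layout `(x_left, y_top, x_right, y_bottom)` and the function name `box_center` are assumptions made for the example.

```python
from typing import List, Tuple

# Hypothetical box layout: (x_left, y_top, x_right, y_bottom)
Box = Tuple[float, float, float, float]


def box_center(box: Box) -> Tuple[float, float]:
    """Return the midpoint between the upper-left and lower-right corners."""
    x_left, y_top, x_right, y_bottom = box
    return ((x_left + x_right) / 2.0, (y_top + y_bottom) / 2.0)


def all_centers(boxes: List[Box]) -> List[Tuple[float, float]]:
    """One center point per text box, in the same order as the boxes."""
    return [box_center(b) for b in boxes]


if __name__ == "__main__":
    boxes = [(10, 20, 110, 60), (130, 20, 230, 60)]
    print(all_centers(boxes))
```

Because one center is produced per box, the number of center points always equals the number of text boxes, as stated for FIG. 3.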
After step S102, the method further comprises:
and identifying the content of all the text boxes to obtain the character content corresponding to each text box.
The character content refers to all characters appearing in any text box in the form picture, and the character content includes, but is not limited to, numbers, language words and symbols of various countries.
The above methods for recognizing Character content include, but are not limited to, OCR (Optical Character Recognition), STR (Scene Text Recognition).
The character content in each text box is obtained by identifying the content of the text box in the table picture, so that the character content is filled into the corresponding text box in the table reconstruction process at the later stage, and the reconstructed table is obtained.
S103, according to a preset triangular network extraction mode, connecting the center points of all the text boxes to obtain a triangular network corresponding to the table picture.
In step S103, it should be noted that the triangular network is a Delaunay triangulation, that is, a set of connected but non-overlapping triangles in which the circumscribed circle of each triangle contains no other point of the point set.
The triangle network extraction method is a method for extracting a triangle network formed by all central points.
The connection operation may use methods including, but not limited to, a point-by-point insertion algorithm, an edge-flipping algorithm, a divide-and-conquer algorithm, and the Bowyer-Watson algorithm.
Taking FIG. 4 as an example, FIG. 4 is a triangular network extracted on the basis of FIG. 3, and the triangular network connects all the center points into a series of connected but non-overlapping triangles.
By connecting the center points of all the text boxes in the preset triangular network extraction mode, a topological graph is obtained in which all center points are connected by non-overlapping triangles. This facilitates subsequent operations on the triangular network and provides theoretical support for analyzing the topological structure of the cells of the table picture, thereby improving the accuracy of table reconstruction.
As for step S103, it specifically includes the following steps a to d:
a. Constructing an initial triangle by adopting a point-by-point insertion algorithm based on the center points corresponding to all the text boxes, wherein the initial triangle surrounds all the center points.
b. Selecting one of the center points inside the initial triangle as a new vertex.
c. If the circumscribed circle of a triangle formed by connecting the new vertex with any two vertices of the initial triangle contains the initial triangle, deleting the side shared with the initial triangle to form a convex polygon.
d. Connecting the new vertex with each vertex of the convex polygon to obtain a middle triangle, taking the middle triangle as the initial triangle, and returning to the step of selecting one of the remaining center points as the new vertex until all the remaining center points have been processed, so as to obtain the triangular network corresponding to the table picture.
For the step a, the point-by-point insertion algorithm is an algorithm that selects central points one by one and adds the central points into an initial triangle. An initial triangle refers to a triangle that can be constructed to encompass all center points.
During processing, the center points may be stored in a linked list or a sequence, from which they are selected to construct triangles.
For step c above, the circumscribed circle refers to the circle passing through every vertex of the triangle.
A new vertex is added to the initial triangle to form two triangles with a common edge, and the two triangles are combined into a convex polygon. The convex polygon is checked against the maximum empty circle criterion to determine whether the new vertex lies within the circumscribed circle of the initial triangle. If so, local optimization is performed, that is, the common edge shared with the initial triangle is deleted.
The center points of all the text boxes are added into the initial triangle one by one through the point-by-point insertion algorithm, and local optimization is performed, so that a topological graph is obtained in which all center points are connected by non-overlapping triangles. This provides theoretical support for analyzing the topological structure of the cells of the table picture and improves the accuracy of table reconstruction.
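Steps a to d can be illustrated with the Bowyer-Watson form of point-by-point insertion, which is one of the algorithms named above. The sketch below is an assumption-laden illustration rather than the embodiment's code: the function names, the size of the enclosing initial triangle (20 times the extent of the points), and the representation of triangles as sorted index triples are all choices made for the example.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

Point = Tuple[float, float]
Tri = Tuple[int, int, int]  # sorted indices into the point list


def in_circumcircle(a: Point, b: Point, c: Point, p: Point) -> bool:
    """True if p lies strictly inside the circle through a, b and c."""
    ax, ay = a[0] - p[0], a[1] - p[1]
    bx, by = b[0] - p[0], b[1] - p[1]
    cx, cy = c[0] - p[0], c[1] - p[1]
    det = ((ax * ax + ay * ay) * (bx * cy - cx * by)
           - (bx * bx + by * by) * (ax * cy - cx * ay)
           + (cx * cx + cy * cy) * (ax * by - bx * ay))
    # The sign of det depends on the orientation of (a, b, c).
    orient = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return det * orient > 0


def delaunay(points: List[Point]) -> List[Tri]:
    """Triangulate the points; returns sorted index triples."""
    n = len(points)
    xs, ys = [p[0] for p in points], [p[1] for p in points]
    span = max(max(xs) - min(xs), max(ys) - min(ys), 1.0)
    mx, my = (max(xs) + min(xs)) / 2, (max(ys) + min(ys)) / 2
    # Step a: an initial triangle that surrounds every center point.
    pts = list(points) + [(mx - 20 * span, my - span),
                          (mx + 20 * span, my - span),
                          (mx, my + 20 * span)]
    tris: Set[Tri] = {(n, n + 1, n + 2)}
    for i in range(n):  # step b: take the next center point as new vertex
        p = pts[i]
        # Step c: triangles whose circumscribed circle contains the vertex.
        bad = {t for t in tris
               if in_circumcircle(pts[t[0]], pts[t[1]], pts[t[2]], p)}
        # Edges shared by two such triangles are deleted; edges used once
        # form the boundary of the resulting polygon.
        count: Dict[Tuple[int, int], int] = {}
        for t in bad:
            for e in combinations(t, 2):
                count[e] = count.get(e, 0) + 1
        tris -= bad
        # Step d: connect the new vertex to each boundary edge.
        for (u, v), c in count.items():
            if c == 1:
                tris.add(tuple(sorted((u, v, i))))
    # Drop triangles that still use a vertex of the initial triangle.
    return sorted(t for t in tris if max(t) < n)


if __name__ == "__main__":
    centers = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
    print(delaunay(centers))
```

For the four corners of a square plus its middle point, the result is the four triangles that share the middle point. A finite enclosing triangle can miss some outer triangles for unusual point layouts, so a library routine may be preferable in practice.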
And S104, performing frequency statistics on each triangle side appearing in the triangle network based on a preset edge outline extraction mode to obtain a statistical result, and obtaining an edge outline corresponding to the triangle network according to the statistical result.
In step S104, taking fig. 5 as an example, fig. 5 is an edge outline corresponding to the triangular network of fig. 4, and the edge outline is a light-colored edge.
Here, the preset edge outline extraction method is a method of extracting an edge outline corresponding to a triangular network.
Specifically, based on a preset edge outline extraction mode, frequency statistics is carried out on the edges of each triangle appearing in the triangle network, and a statistical result is obtained.
And reserving all the edges with the statistical result of 1 to obtain the edge outer frame corresponding to the triangular network.
In a triangular network all triangles are connected but do not overlap, so the edge shared by two adjacent triangles appears 2 times, whereas each edge on the outer contour appears only 1 time. Frequency statistics on the edges of all triangles in the triangular network therefore give the count of each edge. All edges whose count is 1 are retained according to the statistical result, and together they form the edge outer frame corresponding to the triangular network, which provides further theoretical support for analyzing the topological structure of the cells of the table picture and improves the accuracy of table reconstruction.
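The edge counting described above can be sketched in a few lines. The function name `boundary_edges` and the representation of triangles as triples of point indices are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations
from typing import Iterable, List, Tuple

Edge = Tuple[int, int]


def boundary_edges(triangles: Iterable[Tuple[int, int, int]]) -> List[Edge]:
    """Return the edges that occur in exactly one triangle."""
    counts: Counter = Counter()
    for tri in triangles:
        # Sort the indices so (u, v) and (v, u) count as the same edge.
        for edge in combinations(sorted(tri), 2):
            counts[edge] += 1
    return sorted(e for e, c in counts.items() if c == 1)


if __name__ == "__main__":
    # Four triangles around a middle point 4 inside the square 0-1-2-3.
    triangles = [(0, 1, 4), (0, 3, 4), (1, 2, 4), (2, 3, 4)]
    print(boundary_edges(triangles))  # [(0, 1), (0, 3), (1, 2), (2, 3)]
```

The four spokes to the middle point each occur twice and are dropped, leaving only the sides of the square as the edge outer frame.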
And S105, performing main direction extraction on the edge outer frame based on a preset main direction extraction mode to obtain a main direction corresponding to the edge outer frame.
In step S105, the preset main direction extraction method is a method for extracting the main direction of the edge outline.
The main direction refers to the horizontal direction and the vertical direction of the edge outer frame.
The extraction method includes, but is not limited to, a gradient calculation algorithm, a local optimization algorithm, and a segmentation and combination algorithm.
By extracting the main direction of the edge outer frame, the table can be conveniently reconstructed from different main directions, so that the reconstruction accuracy of the table in the horizontal direction and the vertical direction is ensured.
As for the above step S105, it specifically includes the following steps e to h:
e. Performing gradient calculation on all edges of the edge outer frame to obtain a gradient value corresponding to each edge.
f. Establishing a histogram based on the gradient values, wherein the histogram has two peaks, in the horizontal direction and the vertical direction.
g. Acquiring the character direction of the text box based on a preset character direction acquisition mode.
h. Determining two main directions of the edge outer frame based on the two peaks of the histogram in the horizontal and vertical directions, and taking the main direction that is the same as the character direction as the horizontal main direction.
For step e, preferably, a rectangular coordinate system is established with the upper-left corner of the edge outer frame as the origin, and gradient calculation is performed on all edges of the edge outer frame to obtain the gradient value corresponding to each edge.
For step f, a histogram is established based on the gradient values, wherein the preset gradient frequency is the width of each range on the abscissa, and the abscissa of the histogram is the gradient direction. From the histogram, the two peaks of the table picture in the horizontal and vertical directions can be determined.
In a specific embodiment of step f, the preset gradient frequency is 5 degrees, so the occurrences of the gradient values are counted in units of 5 degrees to create the histogram. The abscissa of the histogram has 72 entries, each representing a range of 5 degrees, and the ordinate is the number of edges whose gradient falls within that range.
In the step g, the preset character direction extracting method is a method of extracting a direction of the character content. The extraction includes, but is not limited to, reading the character orientation of the form input.
For step h, the two main directions of the table can be determined from the two peaks of the histogram in the horizontal and vertical directions. Using the character direction, the main direction consistent with the character direction is determined as the horizontal direction of the edge outer frame, and the main direction inconsistent with the character direction is determined as the vertical direction of the edge outer frame.
The method comprises the steps of establishing a histogram through gradient calculation, determining two peak values of an edge outer frame in the horizontal direction and the vertical direction through the histogram, and determining the horizontal direction and the vertical direction of the edge outer frame according to the character direction of a table picture, so that two main directions are established for table reconstruction, subsequent topological analysis of the table reconstruction from different directions is facilitated, and the accuracy of the table reconstruction in different main directions is improved.
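Steps e to h can be illustrated with the following sketch, which uses the 5-degree ranges and 72 entries of the specific embodiment. Several points are assumptions made for the example and are not stated in the text: edges are treated as undirected, so their angles are folded into [0, 180) and only the first 36 entries are filled; the second peak is required to lie at least 45 degrees from the first; the character direction is supplied as an angle `text_angle`; and the returned directions are the midpoints of the peak ranges.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
BIN_WIDTH = 5  # degrees per histogram entry, as in the specific embodiment


def direction_histogram(edges: Sequence[Tuple[Point, Point]]) -> List[int]:
    """Count edges per 5-degree direction range (72 entries)."""
    hist = [0] * (360 // BIN_WIDTH)
    folded_bins = 180 // BIN_WIDTH
    for (x1, y1), (x2, y2) in edges:
        # Assumption: undirected edges, so fold the angle into [0, 180).
        angle = math.degrees(math.atan2(y2 - y1, x2 - x1)) % 180.0
        hist[int(angle // BIN_WIDTH) % folded_bins] += 1
    return hist


def _gap(a: float, b: float) -> float:
    """Angular distance between two undirected directions, in degrees."""
    d = abs(a - b) % 180.0
    return min(d, 180.0 - d)


def main_directions(hist: List[int],
                    text_angle: float = 0.0) -> Tuple[float, float]:
    """Return (horizontal, vertical) main directions in degrees."""
    centers = [b * BIN_WIDTH + BIN_WIDTH / 2 for b in range(len(hist))]
    first = max(range(len(hist)), key=lambda b: hist[b])
    # Assumption: the second peak lies at least 45 degrees from the first.
    candidates = [b for b in range(len(hist))
                  if hist[b] > 0 and _gap(centers[b], centers[first]) >= 45.0]
    if not candidates:
        raise ValueError("only one dominant direction found")
    second = max(candidates, key=lambda b: hist[b])
    a, b = centers[first], centers[second]
    # The peak closest to the character direction is the horizontal one.
    return (a, b) if _gap(a, text_angle) <= _gap(b, text_angle) else (b, a)


if __name__ == "__main__":
    frame = [((0, 0), (10, 0)), ((10, 0), (10, 5)),
             ((10, 5), (0, 5)), ((0, 5), (0, 0))]
    hist = direction_histogram(frame)
    print(main_directions(hist))  # (2.5, 92.5)
```

For an upright rectangular frame the two horizontal edges fall into the 0 to 5 degree range and the two vertical edges into the 90 to 95 degree range, so the sketch reports the midpoints 2.5 and 92.5 degrees.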
And S106, carrying out topology analysis and electronization on the table picture based on the main direction to obtain a reconstructed table.
In step S106, specifically, the table picture is subjected to topology analysis and electronization from the horizontal direction and the vertical direction, respectively, to obtain a cell frame of the reconstructed table, and the character contents of different cells are filled into corresponding cells, so as to obtain the reconstructed table.
The table picture is subjected to topology analysis and electronization in the horizontal direction and the vertical direction, so that errors caused by analysis in a single direction are avoided, and the accuracy of the reconstructed table in different main directions is improved.
Step S106 specifically includes the following steps A to E:
A. Calculating the inclination degree of the table picture, and if the inclination degree is greater than a preset inclination value, performing an angle adjustment operation on the table picture until the inclination degree of the table picture is less than or equal to the preset inclination value.
B. Randomly selecting one side of the edge outer frame as a scanning side, scanning based on the vertical main direction and the horizontal main direction respectively, and generating a cell whenever a straight line is scanned, until all sides are scanned.
C. Acquiring the center points of all the cells.
D. Checking the cells based on a preset checking direction, and if the distance between the center point of a cell and the center point of the text box corresponding to the cell is smaller than a preset distance value, combining two adjacent cells in the same direction as the preset checking direction, until the checking in the vertical main direction and the horizontal main direction is finished, to obtain the topological graph of the table picture.
E. Filling the character content corresponding to the text box into the cell corresponding to the topological graph to obtain a reconstructed table.
For step A, since the table picture may be tilted at an angle when it is scanned, the table picture needs to be adjusted accordingly.
Preferably, the inclination degree of the table picture may be obtained from the histogram of step f: the degree corresponding to the degree range in which the abscissa of the histogram of step f appears for the first time is used as the inclination degree of the table picture.
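A minimal sketch of the angle adjustment in step A is given below. It assumes the inclination degree is the signed deviation of a dominant direction from the nearest coordinate axis, and that the adjustment is a rotation of point coordinates about a chosen center. The names `tilt_degree`, `rotate_points`, `deskew` and the parameter `max_tilt` (standing in for the preset inclination value) are hypothetical.

```python
import math


def tilt_degree(dominant_angle):
    """Signed deviation, in degrees, of a direction from the nearest axis."""
    return (dominant_angle + 45.0) % 90.0 - 45.0


def rotate_points(points, degrees, center=(0.0, 0.0)):
    """Rotate points counter-clockwise by `degrees` about `center`."""
    cx, cy = center
    rad = math.radians(degrees)
    cos_a, sin_a = math.cos(rad), math.sin(rad)
    return [
        (cx + (x - cx) * cos_a - (y - cy) * sin_a,
         cy + (x - cx) * sin_a + (y - cy) * cos_a)
        for x, y in points
    ]


def deskew(points, dominant_angle, max_tilt=1.0):
    """Rotate the points back only if the tilt exceeds the preset value."""
    tilt = tilt_degree(dominant_angle)
    if abs(tilt) <= max_tilt:
        return points, tilt
    return rotate_points(points, -tilt), tilt
```

The same rotation would be applied to the picture itself in a full pipeline; only the coordinate part is shown here.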
For step B, the scanning includes scanning from left to right in the horizontal direction and scanning from top to bottom in the vertical direction.
The scanning line starting from the randomly selected scanning side is provided with a buffer area. Because the table picture may still have a small-angle inclination after being adjusted, the scanning line is designed with a buffer area of a certain width, so that straight line extraction can still be completed.
As the scanning line moves, if no straight line is encountered, the scanning continues along the main direction; if a straight line is encountered, the coordinates of two points of the straight line are recorded and used for generating a cell, until all the straight lines are scanned.
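The buffered scan of step B might look like the sketch below. It assumes the straight lines have already been detected as segments with two end points, and that a segment counts as "scanned" when both end points fall inside the buffer area around the current scan position. The function name and the parameters `step`, `buffer` and `axis` are hypothetical.

```python
def scan_for_lines(segments, start, stop, step=1.0, buffer=2.0, axis=0):
    """Sweep a scan line and collect the segments it passes over.

    axis=0 sweeps along x (left to right) and finds near-vertical lines;
    axis=1 sweeps along y (top to bottom) and finds near-horizontal lines.
    A segment is reported when both end points lie within `buffer` of the
    current scan position, which tolerates a small residual tilt.
    """
    found = []
    seen = set()
    pos = start
    while pos <= stop:
        for index, (p, q) in enumerate(segments):
            if index in seen:
                continue
            if abs(p[axis] - pos) <= buffer and abs(q[axis] - pos) <= buffer:
                found.append((p, q))  # record the two end-point coordinates
                seen.add(index)
        pos += step
    return found
```

For example, with a slightly tilted vertical segment from (10, 0) to (11, 50), a scan along x with `buffer=2.0` still reports it, whereas a zero-width scan line would miss it.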
For step D, the preset checking direction includes a horizontal direction and a vertical direction.
Taking a specific embodiment as an example, when checking is performed in the vertical direction, the preset distance value is 1/3 of the cell height. The distance between the center point of the cell and the center point of the text box corresponding to the cell is calculated, and if this distance is greater than 1/3 of the cell height, the two adjacent cells are merged and the straight line between the two center points is deleted.
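The vertical check of this specific embodiment can be illustrated as follows. The sketch follows the embodiment's condition (merge when the distance exceeds 1/3 of the cell height) and assumes each cell is an axis-aligned rectangle `(x_min, y_min, x_max, y_max)`; the function names are hypothetical.

```python
def needs_merge(cell, text_center, ratio=1.0 / 3.0):
    """Vertical check: is the text center too far from the cell center?

    cell: (x_min, y_min, x_max, y_max); text_center: (x, y).
    Returns True when the vertical distance exceeds ratio * cell height.
    """
    _, y_min, _, y_max = cell
    cell_height = y_max - y_min
    cell_center_y = (y_min + y_max) / 2.0
    return abs(text_center[1] - cell_center_y) > ratio * cell_height


def merge_cells(first, second):
    """Merge two adjacent cells into their common bounding rectangle."""
    return (
        min(first[0], second[0]),
        min(first[1], second[1]),
        max(first[2], second[2]),
        max(first[3], second[3]),
    )
```

A text box whose center sits near the shared border of two cells probably spans both of them, which is why a large offset is treated as evidence that the separating line does not exist.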
Adjusting the angle of the table picture reduces the errors caused by its inclination. Topology analysis and electronization are then carried out on the adjusted table picture: a scanning line with a buffer area analyzes the topology of the processed table picture in the different directions and reconstructs the cell frame of the table, and the reconstructed cell frame is then checked, which further improves the accuracy of the reconstructed table. Finally, the character content corresponding to each text box is filled into the corresponding cell of the topological graph to obtain the reconstructed table.
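The final filling of step E can be sketched as below, assuming that a text box belongs to the cell that contains its center point and that several text boxes in one cell are joined with a space. The data layout and the name `fill_cells` are hypothetical.

```python
def fill_cells(cells, text_boxes):
    """Assign each text box's content to the cell containing its center.

    cells: {cell_id: (x_min, y_min, x_max, y_max)}
    text_boxes: list of ((center_x, center_y), content)
    Returns {cell_id: content string}; empty cells map to "".
    """
    parts = {cell_id: [] for cell_id in cells}
    for (cx, cy), content in text_boxes:
        for cell_id, (x0, y0, x1, y1) in cells.items():
            if x0 <= cx <= x1 and y0 <= cy <= y1:
                parts[cell_id].append(content)
                break  # a center on a shared border goes to the first match
    return {cell_id: " ".join(items) for cell_id, items in parts.items()}
```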
Further, after step S106, the table reconstruction method of the table picture includes:
and performing confidence evaluation on the reconstruction table.
Specifically, the confidence evaluation refers to calculating confidence by comparing the reconstructed table with the table picture.
A part of the reconstructed table whose topology is similar to that of the table picture is given a high confidence;
a part of the reconstructed table whose topology deviates from that of the table picture by more than a preset deviation value is given a low confidence;
all the confidences are summed and averaged to obtain the confidence of the reconstructed table.
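As a small worked sketch of this averaging, assume that each part of the table has already been assigned a deviation value, and that "high" and "low" confidence are the numbers 1.0 and 0.0. Both the scoring values and the name `table_confidence` are assumptions made for illustration.

```python
def table_confidence(deviations, max_deviation, high=1.0, low=0.0):
    """Average per-part confidences into one table-level confidence.

    deviations: one topology deviation per part of the reconstructed table.
    A part scores `high` if its deviation is within max_deviation, else `low`.
    """
    scores = [high if d <= max_deviation else low for d in deviations]
    return sum(scores) / len(scores) if scores else 0.0
```

With deviations `[0.1, 0.2, 0.9, 0.05]` and a preset deviation value of 0.5, three of the four parts score high and the table confidence is 0.75.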
In the process of reconstructing a training table picture, confidence evaluation is performed on the reconstructed table. If the confidence is low, the low-confidence picture can be modified and adjusted in a targeted manner through manual intervention, so that the reconstruction of the table is optimized and its accuracy is improved.
The table reconstruction method of the table picture provided by the embodiment of the invention acquires a table picture, and identifies the character position in the table picture to obtain a recognition result; generates a text box according to the recognition result, and determines the central point of the text box; connects the central points of all the text boxes according to a preset triangular network extraction mode to obtain a triangular network corresponding to the table picture; performs frequency statistics on each triangle side appearing in the triangular network based on a preset edge outer frame extraction mode to obtain a statistical result, and obtains an edge outer frame corresponding to the triangular network according to the statistical result; performs main direction extraction on the edge outer frame based on a preset main direction extraction mode to obtain a main direction corresponding to the edge outer frame; and carries out topology analysis and electronization on the table picture based on the main direction to obtain a reconstructed table. By identifying the character positions in the table picture, all the text boxes corresponding to the character positions and the central points of the text boxes are determined first; the topology information of all the text boxes of the table picture is then accurately extracted through triangular network extraction based on the central points of the text boxes; and the table is reconstructed using this topology information. Because the reconstruction is derived from the topology information of the table picture itself, the accuracy of table reconstruction is effectively improved.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present invention.
In an embodiment, a table reconstructing apparatus of a table picture is provided, and the table reconstructing apparatus of the table picture corresponds one to one to the table reconstruction method of the table picture in the above embodiment. As shown in fig. 6, the table reconstructing apparatus for the table picture includes a table picture obtaining module 11, a central point obtaining module 12, a triangle network obtaining module 13, an edge outer frame obtaining module 14, a main direction obtaining module 15, and a reconstruction module 16. The functional modules are explained in detail as follows:
The table picture obtaining module 11 is configured to acquire a table picture, and identify a character position in the table picture to obtain a recognition result.
The central point obtaining module 12 is configured to generate a text box according to the recognition result, and determine a central point of the text box.
The triangle network obtaining module 13 is configured to perform a connection operation on the central points of all the text boxes according to a preset triangular network extraction mode to obtain a triangular network corresponding to the table picture.
The edge outer frame obtaining module 14 is configured to perform frequency statistics on the sides of each triangle appearing in the triangular network based on a preset edge outer frame extraction mode to obtain a statistical result, and obtain an edge outer frame corresponding to the triangular network according to the statistical result.
The main direction obtaining module 15 is configured to perform main direction extraction on the edge outer frame based on a preset main direction extraction mode to obtain a main direction corresponding to the edge outer frame.
The reconstruction module 16 is configured to carry out topology analysis and electronization on the table picture based on the main direction to obtain a reconstructed table.
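The frequency statistics performed by the edge outer frame obtaining module 14 can be illustrated with the sketch below. It rests on an assumption that the text does not spell out: a side that appears in exactly one triangle of the triangular network lies on the outer boundary, while a side shared by two triangles is interior. Triangles are given as triples of point indices, and the name `boundary_edges` is hypothetical.

```python
from collections import Counter


def boundary_edges(triangles):
    """Return the sides that occur in exactly one triangle.

    triangles: iterable of (i, j, k) index triples into the center points.
    Assumption: sides counted once form the edge outer frame.
    """
    counts = Counter()
    for a, b, c in triangles:
        for u, v in ((a, b), (b, c), (c, a)):
            counts[tuple(sorted((u, v)))] += 1  # undirected side
    return sorted(edge for edge, n in counts.items() if n == 1)
```

For two triangles `(0, 1, 2)` and `(0, 2, 3)` sharing the side `(0, 2)`, the function returns the four outer sides `(0, 1)`, `(0, 3)`, `(1, 2)` and `(2, 3)`.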
In one embodiment, the central point obtaining module 12 further includes:
The text box acquisition unit is used for analyzing the recognition result by adopting a differentiable binarization processing algorithm to obtain the text box.
The central point acquisition unit is used for extracting the central point of each text box based on a preset central point acquisition mode to obtain the central point corresponding to each text box.
In one embodiment, the central point obtaining module 12 further includes:
The character content acquisition module is used for identifying the content of all the text boxes to obtain the character content corresponding to each text box.
In one embodiment, the triangle network obtaining module 13 further includes:
The initial triangle constructing unit is used for constructing an initial triangle by adopting a point-by-point insertion algorithm based on the central points corresponding to all the text boxes, wherein the initial triangle surrounds all the central points.
The new vertex selecting unit is used for selecting one of the central points inside the initial triangle as a new vertex.
The convex polygon construction unit is used for, if the circumscribed circle of a triangle formed by connecting the new vertex with any two vertexes of the initial triangle contains the initial triangle, deleting the common side of the circumscribed circle and the initial triangle to form a convex polygon.
The triangle network acquisition unit is used for connecting the new vertex with each vertex of the convex polygon to obtain an intermediate triangle, taking the intermediate triangle as the initial triangle, and returning to the step of selecting one of the remaining central points as the new vertex, until all the remaining central points have been processed, to obtain the triangular network corresponding to the table picture.
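The units above describe a point-by-point insertion scheme. As an illustration only, the sketch below uses the standard Bowyer-Watson incremental Delaunay algorithm, which follows the same pattern (an enclosing initial triangle, a circumscribed circle test for each new vertex, and re-triangulation of the resulting polygon). It is a substitute technique for illustration, not necessarily the exact procedure of this embodiment, and the function names and the size of the enclosing triangle are assumptions.

```python
def circumcircle_contains(tri, p, pts):
    """True if point p lies strictly inside the circumscribed circle of tri."""
    (ax, ay), (bx, by), (cx, cy) = (pts[i] for i in tri)
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if d == 0:  # collinear vertices: no finite circumscribed circle
        return False
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    radius_sq = (ax - ux) ** 2 + (ay - uy) ** 2
    return (p[0] - ux) ** 2 + (p[1] - uy) ** 2 < radius_sq


def triangulate(points):
    """Bowyer-Watson triangulation; returns sorted index triples."""
    pts = list(points)
    n = len(pts)
    xs, ys = [p[0] for p in pts], [p[1] for p in pts]
    span = max(max(xs) - min(xs), max(ys) - min(ys), 1.0)
    mx, my = (max(xs) + min(xs)) / 2.0, (max(ys) + min(ys)) / 2.0
    # Initial triangle that surrounds all center points (size is an assumption).
    pts += [(mx - 20 * span, my - span), (mx, my + 20 * span),
            (mx + 20 * span, my - span)]
    triangles = [(n, n + 1, n + 2)]
    for i in range(n):
        # Triangles whose circumscribed circle contains the new vertex.
        bad = [t for t in triangles if circumcircle_contains(t, pts[i], pts)]
        edge_count = {}
        for a, b, c in bad:
            for edge in ((a, b), (b, c), (c, a)):
                key = tuple(sorted(edge))
                edge_count[key] = edge_count.get(key, 0) + 1
        triangles = [t for t in triangles if t not in bad]
        # Sides seen once bound the polygon; connect them to the new vertex.
        triangles += [(a, b, i) for (a, b), k in edge_count.items() if k == 1]
    # Drop triangles that still use a vertex of the initial triangle.
    return [tuple(sorted(t)) for t in triangles if max(t) < n]
```

For the four points `(0, 0)`, `(4, 0)`, `(5, 4)`, `(0, 3)` the sketch produces two triangles sharing the diagonal between the second and fourth points. Points that are exactly collinear or lie on a common circle are degenerate cases this sketch does not handle specially.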
In one embodiment, the main direction obtaining module 15 further includes:
The gradient calculation unit is used for performing gradient calculation on all sides of the edge outer frame to obtain a gradient value corresponding to each side.
The histogram acquisition unit is used for creating a histogram based on the gradient values, wherein the histogram has two peak values, in the horizontal direction and the vertical direction.
The character direction acquiring unit is used for acquiring the character direction of the text box based on a preset character direction acquiring mode.
The main direction obtaining unit is used for determining two main directions of the edge outer frame based on the two peak values in the horizontal direction and the vertical direction of the histogram, and taking the main direction that is the same as the character direction as the horizontal main direction.
In one embodiment, the reconstruction module 16 further comprises:
The inclination degree calculating unit is used for calculating the inclination degree of the table picture, and if the inclination degree is greater than a preset inclination value, performing an angle adjustment operation on the table picture until the inclination degree of the table picture is less than or equal to the preset inclination value.
The scanning unit is used for randomly selecting one side of the edge outer frame as a scanning side, scanning based on the vertical main direction and the horizontal main direction respectively, and generating a cell whenever a straight line is scanned, until all sides are scanned.
The central point acquisition unit is used for acquiring the central points of all the cells.
The checking unit is used for checking the cells based on a preset checking direction, and if the distance between the center point of a cell and the center point of the text box corresponding to the cell is smaller than a preset distance value, combining two adjacent cells in the same direction as the preset checking direction, until the checking in the vertical main direction and the horizontal main direction is finished, to obtain the topological graph of the table picture.
The reconstruction unit is used for filling the character content corresponding to each text box into the corresponding cell of the topological graph to obtain a reconstructed table.
In one embodiment, the table reconstructing apparatus for the table picture further includes:
and the confidence evaluation module is used for carrying out confidence evaluation on the reconstructed table.
The terms "first" and "second" in the above modules/units serve only to distinguish different modules/units, and are not used to define which module/unit has a higher priority or to impose any other limitation. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or modules is not necessarily limited to those steps or modules explicitly listed, but may include other steps or modules not explicitly listed or inherent to such process, method, article, or apparatus. The division of modules presented in this application is merely a logical division, and may be implemented in other manners in practical applications.
For specific limitations of the table reconstructing apparatus for the table picture, reference may be made to the above limitations of the table reconstruction method for the table picture, and details are not repeated here. All or part of the modules in the table reconstructing apparatus of the table picture can be realized by software, hardware, or a combination thereof. The modules can be embedded in, or independent of, a processor in the computer device in hardware form, and can also be stored in a memory in the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.
In one embodiment, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 7. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing data involved in the table reconstruction method of the table picture. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a table reconstruction method for a table picture.
In one embodiment, a computer device is provided, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor. When the processor executes the computer program, the processor implements the steps of the table reconstruction method for the table picture in the above embodiments, such as steps S101 to S106 shown in fig. 2, together with other extensions of the method and extensions of the related steps. Alternatively, the processor, when executing the computer program, implements the functions of the respective modules/units of the table reconstructing apparatus of the table picture in the above-described embodiment, for example, the functions of the modules 11 to 16 shown in fig. 6. To avoid repetition, further description is omitted here.
The processor may be a Central Processing Unit (CPU), another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general purpose processor may be a microprocessor or any conventional processor. The processor is the control center of the computer device and connects the various parts of the overall computer device using various interfaces and lines.
The memory may be used to store the computer programs and/or modules, and the processor may implement various functions of the computer device by running or executing the computer programs and/or modules stored in the memory and invoking data stored in the memory. The memory may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data (such as audio data, video data, etc.) created according to the use of the computer device.
The memory may be integrated in the processor or may be provided separately from the processor.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. When executed by a processor, the computer program implements the steps of the table reconstruction method of the table picture in the above embodiments, such as steps S101 to S106 shown in fig. 2, together with other extensions of the method and extensions of the related steps. Alternatively, the computer program, when executed by the processor, implements the functions of the modules/units of the table reconstructing apparatus of the table picture in the above-described embodiment, for example, the functions of the modules 11 to 16 shown in fig. 6. To avoid repetition, further description is omitted here.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), Programmable ROM (PROM), Electrically Programmable ROM (EPROM), Electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchronous Link DRAM (SLDRAM), Rambus Direct RAM (RDRAM), Direct Rambus Dynamic RAM (DRDRAM), and Rambus Dynamic RAM (RDRAM).
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present invention, and are intended to be included within the scope of the present invention.