Method for constructing a regulation ontology model based on a knowledge graph
1. A method for constructing an ontology model of regulations based on a knowledge-graph, for processing a text of the regulations by means of the knowledge-graph to form an ontology model, comprising the steps of:
step S1, obtaining the regulation text and identifying the part of speech of each word in the regulation text, further dividing the paragraphs of the regulation text according to subject, object and predicate;
step S2, constructing the subject and the object into a first graph node and a second graph node respectively;
step S3, constructing the predicate and the conjunctions in the object into a relationship graph node;
step S4, respectively constructing a first relationship link for the first graph node and the second graph node, and linking with the relationship graph node, and further constructing a second relationship link for linking the first graph node and the second graph node;
step S5, determining whether the regulation text has an index number and, when it is determined that the index number exists, constructing the index number as an index graph node, and constructing a third relationship link for the index graph node, linking it respectively with the first graph node, the second graph node and the relationship graph node;
step S6, abstracting the regulation text and determining entity categories corresponding to the first graph node, the second graph node and the relationship graph node, and relationship categories corresponding to the first relationship link, the second relationship link and the third relationship link;
step S7, constructing the ontology model based on the entity categories, the relationship categories, and the regulation text.
2. The method for constructing a regulation ontology model based on a knowledge graph of claim 1,
wherein the step S7 includes the following sub-steps:
step S7-1, constructing a corresponding model file based on the entity category and the relationship category;
step S7-2, performing entity extraction and marking on the regulation text based on the entity category to obtain a marked text;
and step S7-3, importing the marked text into the model file to obtain the ontology model.
3. The method for constructing a regulation ontology model based on a knowledge graph of claim 1,
wherein there are two paths between the first graph node and the second graph node: one is an independent triple; the other is a predicate relationship.
Background
Airworthiness rules are among the most fundamental requirements in the aviation field. An aircraft design must fully comply with them from the initial design stage; otherwise the cost of later modification rises sharply. At present, compliance of aircraft designs with the rules is audited by domain experts. In industry, training a single domain expert requires a great deal of labor and time, domestic expert manpower is insufficient, and the complicated audit process has become a bottleneck in aircraft design.
A knowledge graph is a graph-based storage method that is mainly applied to search engines and intelligent question answering. There is also current research on applying knowledge graphs to natural language processing, for example in judicial assistance, where the main functions are: arranging the timelines and clues of legal documents for judicial officers to review; and, when several documents exist, integrating the logical relations among them to obtain the probability of a causal relation between two events.
However, when a knowledge graph is applied to airworthiness rules, the rules contain many clauses and the text structure is complex, so large passages of text end up as single graph nodes and the content cannot be refined. Conventional knowledge graphs also cannot model sequence, Boolean operations, or conditional logic, so the modeled content becomes even coarser when they are applied to airworthiness rules, and it is difficult to construct an ontology that fully expresses the airworthiness regulations.
Disclosure of Invention
In order to solve the above problems, the invention provides a method for constructing a regulation database by using a knowledge graph, applicable to fields such as database construction and intelligent question answering, which adopts the following technical scheme:
The invention provides a method for constructing a regulation ontology model based on a knowledge graph, which processes a regulation text through the knowledge graph so as to form an ontology model, and is characterized by comprising the following steps:
step S1, obtaining the regulation text and identifying the part of speech of each word in the regulation text, further dividing the paragraphs of the regulation text according to subject, object and predicate;
step S2, constructing the subject and the object into a first graph node and a second graph node respectively;
step S3, constructing the predicate and the conjunctions in the object into a relationship graph node;
step S4, respectively constructing a first relationship link for the first graph node and the second graph node and linking with the relationship graph node, and further constructing a second relationship link for linking the first graph node and the second graph node;
step S5, determining whether the regulation text has an index number and, when it is determined that the index number exists, constructing the index number as an index graph node, and constructing a third relationship link for the index graph node, linking it respectively with the first graph node, the second graph node and the relationship graph node;
step S6, abstracting the regulation text and determining entity categories corresponding to the first graph node, the second graph node and the relationship graph node, and relationship categories corresponding to the first relationship link, the second relationship link and the third relationship link;
step S7, constructing the ontology model based on the entity categories, the relationship categories, and the regulation text.
The method for constructing a regulation ontology model based on a knowledge graph provided by the invention may also have the technical feature that the step S7 comprises the following sub-steps: step S7-1, constructing a corresponding model file based on the entity category and the relationship category; step S7-2, performing entity extraction and marking on the regulation text based on the entity category to obtain a marked text; and step S7-3, importing the marked text into the model file to obtain the ontology model.
The method for constructing a regulation ontology model based on a knowledge graph provided by the invention may also have the technical feature that two paths exist between the first graph node and the second graph node: one is an independent triple; the other is a predicate relationship.
Action and Effect of the Invention
According to the method for constructing a regulation ontology model based on a knowledge graph, the paragraphs of the regulation text are divided into subject, object and predicate; the subject, the object, the predicate and the conjunctions in the object are all used as graph nodes; the relationship links among the graph nodes are constructed; the regulation text is then abstracted to determine the entity category corresponding to each graph node and the relationship category corresponding to each relationship link; and the ontology model corresponding to the regulation text is constructed. With this method, the regulation text can be processed into, and stored as, an ontology model that expresses in fine detail both the regulation content and the logical relationships between its parts, so that users, search engines, intelligent question answering systems and other programs can accurately query and evaluate the regulations on the basis of the ontology model, which provides a foundation for constructing airworthiness rules.
Drawings
FIG. 1 is a flow diagram of a method for constructing a regulatory ontology model based on a knowledge-graph in an embodiment of the present invention;
FIG. 2 is a schematic diagram of the structure of a conventional knowledge-graph in an embodiment of the present invention;
FIG. 3 is a structural diagram of an independent triple formed by adding an index (clause E) and changing a relationship C into a graph node in the embodiment of the present invention;
FIG. 4 is a schematic representation of the various entities and relationships in § 25.651 of the U.S. Federal Regulations in an embodiment of the present invention;
FIG. 5 is a schematic illustration of a knowledge graph constructed based on § 25.651 of the U.S. Federal Regulations in an embodiment of the present invention;
FIG. 6 is a diagram of an abstracted entity list table in an embodiment of the invention;
FIG. 7 is a diagram of an abstracted relationship class table in an embodiment of the invention.
Detailed Description
In order to make the technical means, inventive features, objectives and effects of the invention easy to understand, the method for constructing a regulation ontology model based on a knowledge graph is described in detail below with reference to the embodiment and the accompanying drawings.
< example >
FIG. 1 is a flow chart of a method for constructing a regulatory ontology model based on a knowledge-graph in an embodiment of the present invention.
As shown in fig. 1, the method for constructing a regulatory ontology model based on a knowledge graph specifically includes the following steps:
step S1, obtaining the regulation text and identifying the part of speech of each word in the regulation text, and further dividing the paragraphs of the regulation text according to subject, object and predicate.
In this embodiment, § 25.651 of Part 25 of the U.S. Federal Regulations, an airworthiness rule, is taken as an example:
(a) Limit load tests of control surfaces are required. These tests must include the horn or fitting to which the control system is attached.
(b) Compliance with the special factors requirements of §§ 25.619 through 25.625 and 25.657 for control surface hinges must be shown by analysis or individual load tests.
In step S1 of this embodiment, analyzing the above regulation with a conventional part-of-speech analysis method yields the following division:
The subjects are: § 25.651; limit load tests; control surface hinges.
The predicates are: require; include; compliance; shown.
The objects are: control surfaces; horn or fitting to which the control system is attached; the special factors requirements of §§ 25.619 through 25.625 and 25.657; analysis or individual load tests.
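The division produced by step S1 can be held in one small record per sentence. The following is only a minimal sketch, not the embodiment's implementation: the `Clause` class, its field names and the `collect` helper are hypothetical, and the values are entered by hand from paragraph (a) of § 25.651 (one possible reading of the sentences) rather than produced by a part-of-speech tagger.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Clause:
    """One subject/predicate/object division of a sentence (hypothetical record)."""
    subject: str
    predicate: str
    obj: str


# Hand-entered divisions for the two sentences of paragraph (a); a real
# pipeline would obtain these from the part-of-speech analysis of step S1.
clauses = [
    Clause("limit load tests", "require", "control surfaces"),
    Clause("limit load tests", "include",
           "horn or fitting to which the control system is attached"),
]


def collect(clauses):
    """Gather distinct subjects, predicates and objects in first-seen order."""
    subjects, predicates, objects = [], [], []
    for c in clauses:
        for value, bucket in ((c.subject, subjects),
                              (c.predicate, predicates),
                              (c.obj, objects)):
            if value not in bucket:
                bucket.append(value)
    return subjects, predicates, objects


subjects, predicates, objects = collect(clauses)
print(subjects)
print(predicates)
print(objects)
```

Keeping the three lists separate mirrors the way the subjects, predicates and objects are enumerated above before any graph node is created.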
In step S2, the subject and the object are constructed as a first graph node and a second graph node, respectively.
In step S3, the predicate and the conjunctions in the object are constructed as a relationship graph node.
In the conventional knowledge graph construction method, as shown in FIG. 2, the subject and the object are constructed as knowledge graph nodes (i.e., entity A and entity B), while the predicate is constructed as a relationship link between the two knowledge graph nodes (i.e., relationship C).
In contrast, in the knowledge graph construction method of this embodiment, as shown in FIG. 3, after the subject and the object are constructed as knowledge graph nodes (i.e., entity A and entity B), the predicate and the conjunctions in the object are also constructed as a knowledge graph node (i.e., relationship C).
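The difference between the two constructions can be sketched with plain tuples. This is an illustration under stated assumptions: the function names and the link name `entity_to_relation` are hypothetical, and each edge is written as `(source, label, target)`.

```python
def conventional(subject, predicate, obj):
    """Conventional form: the predicate is only the label of one edge."""
    return [(subject, predicate, obj)]


def with_relation_node(subject, predicate, obj, link="entity_to_relation"):
    """Form of this embodiment: the predicate becomes a node of its own,
    joined to both entity nodes by a generic link (the name is an assumption)."""
    return [(subject, link, predicate), (predicate, link, obj)]


plain = conventional("limit load tests", "include", "horn or fitting")
reified = with_relation_node("limit load tests", "include", "horn or fitting")

# In the second form "include" appears as the target of one edge and the
# source of the next, i.e. it is a node rather than an edge label.
print(plain)
print(reified)
```

Because the predicate is a node in the second form, further links can be attached to it, which an edge label does not allow.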
Step S4, a first relationship link is respectively constructed for the first graph node and the second graph node and linked with the relationship graph node, and a second relationship link for linking the first graph node and the second graph node is further constructed.
In this embodiment, the first relationship link is a custom relationship dedicated to linking a predicate relationship node with an original graph node. It may be named arbitrarily, as long as the name does not duplicate that of another relationship, for example "blank" or "entity-to-relationship entity"; its sole purpose is to connect an entity node to a relationship node. The relationship between the first graph node and the relationship graph node is the same as that between the second graph node and the relationship graph node, so two consecutive triples are formed: subject -> relationship -> predicate (or conjunction) -> relationship -> object.
In this embodiment, the second relationship link is the parent relationship of the relationship node (i.e., the predicate). For example, if the predicate is "up", "upper", or "not lower than", the relationship here is "position"; with a finer classification it may be "up". The parent relationship of the logical relationships "and", "or", and "if" may be "logic" or, in more detail, "Boolean" or "judgment". If the user does not need a parent relationship, or the relationships are already divided very finely, the content of the parent relationship is the same as that of the relationship node.
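One way to picture the parent relationship is as a lookup from predicate to parent, with the predicate itself as the fallback. The sketch below is hypothetical: the table `PARENT_OF` and the function `parent_relationship` are illustrative names, and the table holds only the example words given in this paragraph.

```python
# Predicate -> parent relationship, taken from the examples in the text.
PARENT_OF = {
    "up": "position",
    "upper": "position",
    "not lower than": "position",
    "and": "logic",
    "or": "logic",
    "if": "logic",
}


def parent_relationship(predicate):
    """Return the parent relationship of a predicate.

    When no parent is defined, the parent is the same as the relationship
    node itself, as described for users who do not need a parent relationship.
    """
    return PARENT_OF.get(predicate, predicate)


print(parent_relationship("not lower than"))
print(parent_relationship("include"))
```

The fallback keeps the second relationship link well defined for every predicate, even before a domain expert has grouped it.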
In this embodiment, two paths exist between the first graph node and the second graph node: one is an independent triple (i.e., the path is first graph node -> second relationship link -> second graph node); the other is a predicate relationship (i.e., the path is first graph node -> first relationship link -> relationship graph node -> first relationship link -> second graph node).
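The two paths can be checked on a toy graph. This is a sketch under assumptions, not the embodiment's code: `FIRST_LINK`, `build_edges` and `find_paths` are hypothetical names, and the parent relationship "composition" is an invented placeholder for whatever a domain expert would assign.

```python
from collections import defaultdict

FIRST_LINK = "entity_to_relation"  # hypothetical name of the first relationship link


def build_edges(subject, predicate, obj, parent):
    """Edges of step S4: two first relationship links through the relationship
    node, plus one second relationship link labelled with the parent."""
    return [
        (subject, FIRST_LINK, predicate),
        (predicate, FIRST_LINK, obj),
        (subject, parent, obj),
    ]


def find_paths(edges, start, goal):
    """Enumerate all simple directed paths from start to goal (depth-first)."""
    outgoing = defaultdict(list)
    for source, label, target in edges:
        outgoing[source].append((label, target))
    paths = []

    def walk(node, path, seen):
        if node == goal:
            paths.append(path)
            return
        for label, nxt in outgoing[node]:
            if nxt not in seen:
                walk(nxt, path + [label, nxt], seen | {nxt})

    walk(start, [start], {start})
    return paths


edges = build_edges("limit load tests", "include", "horn or fitting", "composition")
paths = find_paths(edges, "limit load tests", "horn or fitting")
for p in paths:
    print(" -> ".join(p))
```

The shorter path is the independent triple and the longer one is the predicate relationship that passes through the relationship node.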
In this embodiment, when § 25.651 of Part 25 of the U.S. Federal Regulations is processed, as shown in FIG. 4, the bold words are relationship graph nodes from which independent triples can be constructed, the words with a gray background are the entities (i.e., first graph nodes and second graph nodes) connected to the relationship graph nodes, and the words in gray type require manual judgment according to the actual semantics (or can be processed as a whole together with the connected entities).
Step S5, determining whether the regulation text has an index number and, when it is determined that the index number exists, constructing the index number as an index graph node, and constructing a third relationship link for the index graph node, linking it respectively with the first graph node, the second graph node and the relationship graph node.
In step S5 of this embodiment, if the regulation text contains a specific index number, an independent triple can be further constructed (if there is no index number, none is constructed), forming the structure shown in FIG. 3. Here the index number is the regulation number, such as "clause 21.123 of Part F of the U.S. Federal Regulations". When the independent triple is constructed, a graph node whose content is the regulation number is created from the index number and points to all related nodes in the clause; this is the uppermost graph node in FIG. 3 (i.e., clause E) together with its outgoing relationships (i.e., the third relationship links).
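Step S5 can be sketched as a function that adds one node for the index number and one link from it to every node already in the clause. The function name `add_index_node` and the link name `in_clause` are hypothetical, and edges are again written as `(source, label, target)`.

```python
def add_index_node(edges, index_number, link="in_clause"):
    """Return the edges plus one third relationship link from the index node
    to every existing node. If there is no index number, nothing is added."""
    if index_number is None:
        return list(edges)
    nodes = []
    for source, _label, target in edges:
        for node in (source, target):
            if node not in nodes:
                nodes.append(node)
    return list(edges) + [(index_number, link, node) for node in nodes]


clause_edges = [
    ("limit load tests", "entity_to_relation", "include"),
    ("include", "entity_to_relation", "horn or fitting"),
]
indexed = add_index_node(clause_edges, "25.651")
unindexed = add_index_node(clause_edges, None)
for edge in indexed:
    print(edge)
```

Because the index node points at the entity nodes and the relationship node alike, everything belonging to one clause can be retrieved from the regulation number alone.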
In this embodiment, after the processing of steps S1 to S5, the graph structure finally formed for § 25.651 of Part 25 of the U.S. Federal Regulations is shown in FIG. 5.
Step S6, abstracting the regulation text and determining the entity categories corresponding to the first graph node, the second graph node and the relationship graph node, and the relationship categories corresponding to the first relationship link, the second relationship link and the third relationship link.
In step S6 of this embodiment, a domain expert is required to abstract the text and determine the entity categories and relationship categories according to the airworthiness rules. The abstracted entity list table and relationship category table obtained when processing § 25.651 of Part 25 of the U.S. Federal Regulations are shown in FIG. 6 and FIG. 7, respectively.
Step S7, an ontology model is constructed based on the entity categories, the relationship categories, and the regulatory text. The step S7 specifically includes the following sub-steps:
step S7-1, constructing a corresponding model file based on the entity category and the relationship category;
step S7-2, performing entity extraction and marking on the regulation text based on the entity category to obtain a marked text;
and step S7-3, importing the marked text into the model file to obtain the ontology model.
In step S7 of this embodiment, after the entity and relationship definitions of step S6 are complete, an ontology model file is first built with an ontology building tool (e.g., Protégé); a marking tool (e.g., brat) is then used to mark the full text of the airworthiness rules manually (an entity extraction algorithm may instead be used for automatic marking, but its results are relatively poor); finally, the marked document is entered into the model file by a written script (or manually, when the amount of data is small).
In this embodiment, the data in the model file, i.e., the data in FIG. 5, is stored in an RDF/XML file. The ontology model finally constructed corresponds to the graph structure shown in FIG. 5 and is stored on a computer held by the user; when a program on that computer needs to interpret the airworthiness rules, it can read the ontology model and use it to perform functions such as retrieval and judgment accurately.
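As an illustration of storing graph data as RDF/XML, the sketch below serializes a few triples using only the Python standard library. It is not the script used in the embodiment: the namespace `http://example.org/regulation#`, the `uri` and `to_rdf_xml` helpers, and the link names are all assumptions, and a dedicated RDF library would normally be preferred for real data.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX_NS = "http://example.org/regulation#"  # hypothetical namespace for this sketch

ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("ex", EX_NS)


def uri(name):
    """Turn a node name into a URI in the example namespace."""
    return EX_NS + name.strip().replace(" ", "_")


def to_rdf_xml(triples):
    """Serialize (subject, link, object) triples as RDF/XML text.

    Each link name must be a valid XML element name (no spaces).
    """
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    for subject, link, obj in triples:
        description = ET.SubElement(
            root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": uri(subject)}
        )
        ET.SubElement(
            description, f"{{{EX_NS}}}{link}", {f"{{{RDF_NS}}}resource": uri(obj)}
        )
    return ET.tostring(root, encoding="unicode")


triples = [
    ("limit load tests", "entity_to_relation", "include"),
    ("include", "entity_to_relation", "horn or fitting"),
    ("limit load tests", "composition", "horn or fitting"),
]
xml_text = to_rdf_xml(triples)
print(xml_text)
```

Each triple becomes one `rdf:Description` element whose child element names the link, so a program can read the file back with any RDF/XML parser.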
Action and Effect of the Embodiment
According to the method for constructing a regulation ontology model based on a knowledge graph, the paragraphs of the regulation text are divided into subject, object and predicate; the subject, the object, the predicate and the conjunctions in the object are all used as graph nodes; the relationship links among the graph nodes are constructed; the regulation text is then abstracted to determine the entity category corresponding to each graph node and the relationship category corresponding to each relationship link; and the ontology model corresponding to the regulation text is constructed. With this method, the regulation text can be processed into, and stored as, an ontology model that expresses in fine detail both the regulation content and the logical relationships between its parts, so that users, search engines, intelligent question answering systems and other programs can accurately query and evaluate the regulations on the basis of the ontology model, which provides a foundation for constructing airworthiness rules.
In addition, in this embodiment, the ontology model constructed by the method expresses the logic in the regulation and adds, between two entity nodes, a further path that passes through a relationship node, which is equivalent to a quadruple and thereby makes up for the limited expressiveness of the conventional knowledge graph. Specifically, regulations often contain many synonyms, or expressions with the same meaning but different wording. The parent relationship of the ontology model groups such synonyms into one class, while the relationship nodes keep the different expressions distinct, so the method can be applied when a business need requires both at once, that is, synonyms must be merged and different expressions (for example, degrees of expression) must still be distinguished.
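The quadruple view can be illustrated with a short query sketch. Everything here is hypothetical: the tuple layout `(subject, parent, relation_node, object)`, the helper names, and the entity names "part A" through "part F" are placeholders; only the predicate words and their parents come from the examples in the description.

```python
# Each record: (subject, parent relationship, relationship node, object).
quads = [
    ("part A", "position", "up", "part B"),
    ("part C", "position", "not lower than", "part D"),
    ("part E", "logic", "and", "part F"),
]


def by_parent(quads, parent):
    """Merge synonyms: every record whose parent relationship matches."""
    return [q for q in quads if q[1] == parent]


def by_relation_node(quads, relation):
    """Distinguish expressions: only records with this exact relationship node."""
    return [q for q in quads if q[2] == relation]


print(by_parent(quads, "position"))
print(by_relation_node(quads, "not lower than"))
```

Querying by the parent returns both positional statements together, while querying by the relationship node returns only the one that uses that particular wording.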
The above-described embodiments are merely illustrative of specific embodiments of the present invention, and the present invention is not limited to the description of the above-described embodiments.
For example, the above embodiment constructs an ontology model of airworthiness rules, but the construction method of the present invention may also be applied to other regulations, for example to construct corresponding ontology models for traffic regulations, software development processes and the like, so as to facilitate functions such as intelligent question answering and intelligent retrieval in the corresponding fields.