Data query method, device and equipment

Document No.: 7702. Publication date: 2021-09-17.

1. A method for querying data, comprising:

receiving a target text input by a target user;

preprocessing the target text to obtain a feature vector of the target text;

determining a target cluster label according to the feature vector of the target text, wherein the target cluster label represents the classification to which the target text belongs;

and determining a query result corresponding to the target text based on the feature vector of the target text and the target cluster label.

2. The method of claim 1, wherein preprocessing the target text to obtain a feature vector of the target text comprises:

performing word segmentation on the target text to obtain a target word segmentation set;

and removing stop words in the target word segmentation set to obtain the feature vector of the target text.

3. The method of claim 1, further comprising, prior to determining a target cluster label from the feature vector of the target text:

acquiring a question-answer knowledge set from a target database; the question-answer knowledge set comprises a plurality of groups of question-answer data, and each group of question-answer data comprises a question text and an answer text;

preprocessing each group of question-answer data in the question-answer knowledge set to obtain feature vectors of the plurality of groups of question-answer data;

performing clustering analysis by using a K-means clustering algorithm according to the feature vectors of the plurality of groups of question-answer data to obtain a clustering result, wherein the clustering result comprises a plurality of cluster centers and each cluster center corresponds to one cluster;

determining a cluster label of each cluster according to the clustering result;

and obtaining a classification result of the question-answer knowledge set based on the clustering result and the cluster label of each cluster.

4. The method according to claim 3, wherein preprocessing each group of question-answer data in the question-answer knowledge set to obtain feature vectors of the plurality of groups of question-answer data comprises:

performing word segmentation on the question text and the answer text in each group of question-answer data to obtain a word segmentation set of each group of question-answer data;

and removing stop words in the word segmentation set of each group of question-answer data to obtain the feature vector of each group of question-answer data.

5. The method according to claim 3, wherein performing clustering analysis by using a K-means clustering algorithm according to the feature vectors of the plurality of groups of question-answer data to obtain a clustering result comprises:

calculating a two-dimensional TF-IDF weight matrix of the feature words in the feature vectors of each group of question-answer data;

and performing clustering analysis by using the K-means clustering algorithm according to the two-dimensional TF-IDF weight matrix of the feature words in each group of question-answer data to obtain the clustering result.

6. The method of claim 5, wherein determining a cluster label for each cluster according to the clustering result comprises:

arranging the feature words contained in a target cluster in descending order of their TF-IDF values;

and taking a preset number of the top-ranked feature words as the cluster label of the target cluster.

7. The method according to claim 5, further comprising, after obtaining the classification result of the question-answer knowledge set based on the clustering result and the cluster label of each cluster:

and displaying the classification result of the question-answer knowledge set in a retrieval interface.

8. A data query apparatus, comprising:

the receiving module is used for receiving a target text input by a target user;

the preprocessing module is used for preprocessing the target text to obtain a feature vector of the target text;

the first determining module is used for determining a target cluster label according to the feature vector of the target text; wherein the target cluster label is used for representing the classification to which the target text belongs;

and the second determining module is used for determining a query result corresponding to the target text based on the feature vector of the target text and the target cluster label.

9. A data query device comprising a processor and a memory for storing processor-executable instructions which, when executed by the processor, implement the steps of the method as claimed in any one of claims 1 to 7.

10. A computer-readable storage medium having stored thereon computer instructions which, when executed, implement the steps of the method of any one of claims 1 to 7.

Background

Banking services are diverse in type and extensive in content, and they continue to grow rapidly with the development of the internet. Even though intelligent customer service is now widely used, bank customer service staff still have to answer a large number of customer questions every day.

In the prior art, when answering a customer question, the required answer is usually retrieved by searching for keywords of the question or by searching relevant documents. However, answering questions places high demands on timeliness and accuracy, and because such question-answer knowledge tends to be highly scattered and lacks statistical analysis, the search results must be screened manually to find the correct answer. The final result is therefore partly subjective, and the time cost increases. As a result, the prior-art solutions cannot accurately determine the corresponding answer from the question input by the user.

In view of the above problems, no effective solution has been proposed.

Disclosure of Invention

The embodiments of the present specification provide a data query method, a data query apparatus and a data query device, which aim to solve the problem in the prior art that the corresponding answer cannot be accurately determined from the question input by a user.

An embodiment of the present specification provides a data query method, including: receiving a target text input by a target user; preprocessing the target text to obtain a feature vector of the target text; determining a target cluster label according to the feature vector of the target text, wherein the target cluster label is used for representing the classification to which the target text belongs; and determining a query result corresponding to the target text based on the feature vector of the target text and the target cluster label.

An embodiment of the present specification further provides a data query apparatus, including: a receiving module used for receiving a target text input by a target user; a preprocessing module used for preprocessing the target text to obtain a feature vector of the target text; a first determining module used for determining a target cluster label according to the feature vector of the target text, wherein the target cluster label is used for representing the classification to which the target text belongs; and a second determining module used for determining a query result corresponding to the target text based on the feature vector of the target text and the target cluster label.

The present specification also provides a data query device, which includes a processor and a memory for storing processor-executable instructions, and when the processor executes the instructions, the steps of any one of the method embodiments in the specification are implemented.

The present specification embodiments also provide a computer readable storage medium having stored thereon computer instructions which, when executed, implement the steps of any one of the method embodiments of the specification embodiments.

The embodiment of the specification provides a data query method, which can be used for preprocessing a target text input by a target user to obtain a feature vector of the target text. In order to determine the classification to which the target text belongs, the target cluster label may be determined according to the feature vector of the target text. Because the data volume stored in the database is large, and the question and answer knowledge often has the characteristics of high discreteness, lack of statistical analysis and the like, the query range can be determined through the target cluster label, the query is performed in the determined query range according to the feature vector of the target text, and the query result corresponding to the target text is determined. Therefore, the purposes of reducing the query range and improving the query efficiency can be achieved, and the accuracy of the query result can be improved.

Drawings

The accompanying drawings, which are included to provide a further understanding of the embodiments of the disclosure, are incorporated in and constitute a part of this specification, and are not intended to limit the embodiments of the disclosure. In the drawings:

FIG. 1 is a schematic diagram illustrating steps of a data query method provided in an embodiment of the present disclosure;

FIG. 2 is a schematic structural diagram of a data query device provided in an embodiment of the present disclosure;

FIG. 3 is a schematic structural diagram of a data query device provided in an embodiment of the present specification.

Detailed Description

The principles and spirit of the embodiments of the present specification will be described with reference to a number of exemplary embodiments. It should be understood that these embodiments are presented merely to enable those skilled in the art to better understand and to implement the embodiments of the present description, and are not intended to limit the scope of the embodiments of the present description in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.

As will be appreciated by one skilled in the art, implementations of the embodiments of the present description may be embodied as a system, an apparatus, a method, or a computer program product. Therefore, the disclosure of the embodiments of the present specification can be embodied in the following forms: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.

Although the flow described below includes operations that occur in a particular order, it should be appreciated that the processes may include more or less operations that are performed sequentially or in parallel (e.g., using parallel processors or a multi-threaded environment).

Referring to FIG. 1, this embodiment may provide a data query method. The method determines the query range through the target cluster label and queries within that range according to the feature vector of the target text to determine the query result corresponding to the target text, thereby narrowing the query range and improving query efficiency. The data query method may include the following steps.

S101: receiving a target text input by a target user.

In this embodiment, a target text input by a target user may be received. The target text may be a text input by the target user in an input box of the search interface when the target user needs to perform a query, and the target text may be used for representing a query purpose of the target user.

In this embodiment, the target text may be one or more keywords, may also be a sentence, or may also be a paragraph, which may be determined according to actual situations, and this is not limited in this embodiment of the present specification.

S102: preprocessing the target text to obtain a feature vector of the target text.

In this embodiment, the target text input by the target user may be incorrectly formatted or may contain irrelevant redundant characters. To determine the main purpose of the target user's query, the target text may therefore be preprocessed to obtain the feature vector of the target text.

In this embodiment, the feature vector of the target text may include at least one feature word; for example, the feature vector may be <Beijing, weather>. It is understood that the feature vector of the target text may also take other forms, which may be determined according to the actual situation and is not limited in this embodiment of the present specification.

In this embodiment, the preprocessing may include data cleaning, word segmentation, stop-word removal, and the like. Of course, the preprocessing is not limited to the above examples. Those skilled in the art may make other modifications in light of the technical spirit of the embodiments of the present specification, and such modifications are intended to fall within the scope of the embodiments of the present specification as long as the functions and effects they achieve are the same as or similar to those of the embodiments of the present specification.

S103: determining a target cluster label according to the feature vector of the target text; the target cluster label is used for representing the classification to which the target text belongs.

In this embodiment, the target cluster label may be determined according to the feature vector of the target text, so as to determine the classification corresponding to the target text. The target cluster label may be used to indicate the category to which the target text belongs.

In this embodiment, the historical question and answer data stored in the database may be clustered in advance to obtain a plurality of cluster centers, so as to classify the historical question and answer data. Correspondingly, when the user queries, the cluster corresponding to the text input by the user may be determined first.

In this embodiment, the similarity between the feature vector of the target text and the feature vector of each piece of historical question-answer data may be calculated, and the cluster containing the piece of historical question-answer data with the highest similarity may be taken as the cluster to which the feature vector of the target text belongs. The similarity between feature vectors may be determined by calculating the cosine similarity, the Minkowski distance, and the like. Of course, the method of calculating the similarity is not limited to the above examples. Those skilled in the art may make other modifications in light of the technical spirit of the embodiments of the present specification, and such modifications are intended to fall within the scope of the embodiments of the present specification as long as the functions and effects they achieve are the same as or similar to those of the embodiments of the present specification.
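As a non-limiting illustration, the similarity-based assignment described above might be sketched in Python as follows. The sketch assumes that feature vectors have already been converted into numeric weights (for example TF-IDF weights) over a shared vocabulary; the function names, the vocabulary and the toy data are hypothetical and are not taken from this specification.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity of two equal-length numeric vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def cluster_of_most_similar(query_vec, history):
    """history: list of (vector, cluster_id). Return the cluster of the most similar record."""
    _, best_cluster = max(history, key=lambda rec: cosine_similarity(query_vec, rec[0]))
    return best_cluster

# Toy data: weights over a hypothetical vocabulary ["loan", "deposit", "rate"].
history = [([1.0, 0.0, 0.5], "loan"), ([0.0, 1.0, 0.4], "deposit")]
print(cluster_of_most_similar([0.9, 0.1, 0.0], history))  # loan
```

Comparing against every historical record costs one similarity computation per record, which is one reason the nearest-cluster-center alternative may be preferable for large knowledge sets.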

In this embodiment, the Euclidean distance from the feature vector of the target text to each cluster center may also be calculated, and the target text may be assigned to the cluster whose center is nearest. Of course, it is understood that the cluster to which the feature vector of the target text belongs may also be determined in other ways.
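A minimal Python sketch of the nearest-center assignment is given below for illustration only. It assumes numeric feature vectors of the same dimension as the cluster centers; `nearest_center`, `centers` and `labels` are hypothetical names.

```python
import math

def nearest_center(vec, centers):
    """Return the index of the cluster center closest to vec by Euclidean distance."""
    return min(range(len(centers)), key=lambda k: math.dist(vec, centers[k]))

centers = [[1.0, 0.0], [0.0, 1.0]]
labels = [["loan"], ["deposit"]]   # hypothetical cluster labels, one per center
k = nearest_center([0.2, 0.9], centers)
print(k, labels[k])  # 1 ['deposit']
```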

In this embodiment, each cluster center may have a cluster label, and the cluster label may be used to identify a category of the cluster, and the cluster label may be one or more feature words. It is understood that the cluster labels may also be in other forms, such as a chart, etc., which may be determined according to practical situations, and this is not limited by the embodiments of the present disclosure.

S104: determining a query result corresponding to the target text based on the feature vector of the target text and the target cluster label.

In this embodiment, the query result corresponding to the target text may be determined based on the feature vector of the target text and the target cluster label. Therefore, the query can be carried out under the classification corresponding to the target cluster label in the database, the purposes of reducing the query range and improving the query efficiency are achieved, and the accuracy of the query result can be improved.

In the embodiment, because the data amount stored in the database is large, and the question and answer knowledge often has the characteristics of high discreteness, lack of statistical analysis and the like, the query range can be determined through the target clustering label, and the query is performed in the determined query range according to the feature vector of the target text.

In this embodiment, there may be one or more query results corresponding to the target text. In some cases the query result may be empty; default prompt information may then be fed back to the target user, asking whether the target text or the target cluster label needs to be modified. This may be determined according to the actual situation and is not limited in the embodiments of the present specification.
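For illustration only, a query restricted to the cluster indicated by the target cluster label might look like the Python sketch below. The specification does not fix how records are matched inside the cluster; matching on a shared feature word, the dictionary layout of `kb`, and the wording of the prompt are all assumptions made for this example.

```python
def query(feature_words, cluster_label, knowledge_base):
    """Return answers from the given cluster whose question shares a feature word with the query."""
    candidates = knowledge_base.get(cluster_label, [])
    wanted = set(feature_words)
    hits = [answer for question_words, answer in candidates if wanted & set(question_words)]
    if not hits:
        # Empty result: fall back to a default prompt, as the text describes.
        return ["No result found. Please revise the query text or the cluster label."]
    return hits

# Hypothetical knowledge base: cluster label -> list of (question feature words, answer text).
kb = {
    "loan": [(["loan", "rate"], "Loan rates are listed on the rates page.")],
    "deposit": [(["deposit", "term"], "Term deposits run from 3 to 60 months.")],
}
print(query(["rate"], "loan", kb))
print(query(["rate"], "deposit", kb))
```

Only the records stored under the selected label are scanned, which is how the cluster label narrows the query range.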

From the above description, it can be seen that the embodiments of the present specification achieve the following technical effects: the target text input by a target user can be received and preprocessed to obtain the feature vector of the target text. In order to determine the classification to which the target text belongs, the target cluster label may be determined according to the feature vector of the target text. Because the data volume stored in the database is large, and the question and answer knowledge often has the characteristics of high discreteness, lack of statistical analysis and the like, the query range can be determined through the target cluster label, the query is performed in the determined query range according to the feature vector of the target text, and the query result corresponding to the target text is determined. Therefore, the purposes of reducing the query range and improving the query efficiency can be achieved, and the accuracy of the query result can be improved.

In one embodiment, preprocessing the target text to obtain the feature vector of the target text may include: performing word segmentation on the target text to obtain a target word segmentation set. Further, stop words in the target word segmentation set may be removed to obtain the feature vector of the target text.

In this embodiment, because the target text input by the target user may be incorrectly formatted or may contain irrelevant redundant characters, the target text may be preprocessed to obtain the feature vector of the target text. Specifically, the target text may first be segmented to obtain a target word segmentation set of the target text. Stop words in the target word segmentation set may then be removed to obtain the feature vector of the target text.

In this embodiment, the word segmentation may be Chinese word segmentation, which splits a sequence of Chinese characters into individual words; that is, word segmentation recombines a continuous character sequence into a word sequence according to a certain specification. Stop words are characters or words that, in information retrieval, are automatically filtered out before or after natural language data (or text) is processed, in order to save storage space and improve search efficiency.

In this embodiment, a stop word list may be maintained, and stop words in the target word segmentation set may be removed according to the stop word list.
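By way of a minimal, non-limiting sketch, word segmentation followed by stop-word removal might be written in Python as follows. The regular-expression tokenizer only suits space-delimited text and stands in for a real word segmenter, which Chinese text would require and which this specification does not name; `STOP_WORDS`, `segment` and `preprocess` are hypothetical names.

```python
import re

# Hypothetical stop-word list; a real deployment would maintain its own.
STOP_WORDS = {"the", "a", "an", "is", "in", "of", "what", "and"}

def segment(text):
    """Split text into lowercase word tokens (stand-in for a real word segmenter)."""
    return re.findall(r"\w+", text.lower())

def preprocess(text, stop_words=STOP_WORDS):
    """Return the feature words of a text: segment it, then drop stop words."""
    return [w for w in segment(text) if w not in stop_words]

feature_vector = preprocess("What is the weather in Beijing?")
print(feature_vector)  # ['weather', 'beijing']
```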

In one embodiment, before determining the target cluster label according to the feature vector of the target text, the method may further include: acquiring a question-answer knowledge set from a target database, wherein the question-answer knowledge set comprises a plurality of groups of question-answer data and each group of question-answer data comprises a question text and an answer text. Each group of question-answer data in the question-answer knowledge set may be preprocessed to obtain feature vectors of the plurality of groups of question-answer data. Further, clustering analysis may be performed by using a K-means clustering algorithm according to the feature vectors of the plurality of groups of question-answer data to obtain a clustering result; the clustering result comprises a plurality of cluster centers, and each cluster center corresponds to one cluster. The cluster label of each cluster may be determined according to the clustering result, and the classification result of the question-answer knowledge set may be obtained based on the clustering result and the cluster label of each cluster.

In this embodiment, the question-answer knowledge sets in the target database may be classified in advance, where the target database may be a data source of query, and may be queried in the target database according to the target text to obtain a query result. The question and answer data in the obtained question and answer knowledge set may be historical question and answer data recorded in a target database, and may also include artificially set question and answer data that is frequently retrieved, which may be determined specifically according to actual situations, and this is not limited in this specification.

In this embodiment, the way of preprocessing each group of question and answer data in the question and answer knowledge set may be the same as the way of preprocessing the target text, and repeated parts are not described again.

In this embodiment, a K-means clustering algorithm may be used to perform clustering analysis on the feature vectors of the plurality of groups of question-answer data to obtain a plurality of cluster centers; each cluster center may correspond to one cluster, and each cluster may include at least one group of question-answer data. The K-means clustering algorithm is an iterative clustering analysis algorithm. The number of groups K into which the data is to be divided is fixed in advance, K objects are randomly selected as initial cluster centers, the distance between each object and each cluster center is calculated, and each object is assigned to the nearest cluster center. A cluster center together with the objects assigned to it represents a cluster. Each time a sample is assigned, the cluster center of the cluster is recalculated from the objects currently in the cluster. This process is repeated until a termination condition is met. The termination condition may be that no (or a minimum number of) objects are reassigned to different clusters, that no (or a minimum number of) cluster centers change, or that the sum of squared errors reaches a local minimum.
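The iteration described above can be illustrated with the short Python sketch below. It is a common batch variant that recomputes the centers after each full assignment pass rather than after each individual sample, and it stops when no object is reassigned; the function name, parameters and toy points are assumptions made for the example.

```python
import math
import random

def kmeans(points, k, max_iter=100, seed=0):
    """Plain K-means: returns (centers, assignment) for a list of numeric vectors."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]  # K random objects as initial centers
    assignment = None
    for _ in range(max_iter):
        # Assign every object to its nearest center.
        new_assignment = [min(range(k), key=lambda c: math.dist(p, centers[c])) for p in points]
        if new_assignment == assignment:
            break  # termination: no object was reassigned
        assignment = new_assignment
        # Recompute each center as the mean of the objects assigned to it.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old center if a cluster became empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, assignment

points = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
centers, assignment = kmeans(points, k=2)
print(assignment)
```

On this toy input the two points near the origin end up in one cluster and the two points near (10, 10) in the other; which cluster receives index 0 depends on the random initial centers.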

In this embodiment, in order to distinguish different clusters, a cluster label of each cluster may be determined according to the clustering result, and the cluster label may be used to distinguish different classifications. For example, the cluster label may be a feature word such as "loan" or "deposit", which may be determined according to the actual situation and is not limited in the embodiments of the present specification.

In this embodiment, the clustering result and the cluster label of each cluster may be used as the classification result of the question-answer knowledge set.

In one embodiment, preprocessing each group of question-answer data in the question-answer knowledge set to obtain feature vectors of the plurality of groups of question-answer data may include: performing word segmentation on the question text and the answer text in each group of question-answer data to obtain a word segmentation set of each group of question-answer data. Stop words in the word segmentation set of each group of question-answer data may then be removed to obtain the feature vector of each group of question-answer data.

In this embodiment, because the question-answer data may be incorrectly formatted or may contain irrelevant redundant characters, each group of question-answer data in the question-answer knowledge set may be preprocessed to obtain the feature vector of each group of question-answer data. Specifically, each group of question-answer data in the question-answer knowledge set may be segmented to obtain a word segmentation set of each group of question-answer data. Further, stop words in the word segmentation set of each group of question-answer data may be removed to obtain the feature vector of each group of question-answer data.

In one embodiment, performing clustering analysis by using a K-means clustering algorithm according to the feature vectors of the plurality of groups of question-answer data to obtain a clustering result may include: calculating a two-dimensional TF-IDF weight matrix of the feature words in the feature vectors of each group of question-answer data. Further, clustering analysis may be performed by using the K-means clustering algorithm according to the two-dimensional TF-IDF weight matrix of the feature words in each group of question-answer data to obtain the clustering result.

In this embodiment, TF-IDF (Term Frequency-Inverse Document Frequency) is a weighting technique commonly used in information retrieval and data mining, where TF is the term frequency and IDF is the inverse document frequency. TF-IDF is a statistical method for evaluating how important a word is to one document in a document set or corpus. The importance of a word increases in proportion to the number of times it appears in the document, but decreases in inverse proportion to the frequency with which it appears in the corpus. TF-IDF is calculated according to the following formula:

TF-IDF=TF×IDF

In this embodiment, the term frequency of each feature word in the feature vector of each group of question-answer data may be calculated first. The term frequency is calculated as follows:

$$tf_{i,j} = \frac{n_{i,j}}{\sum_{k} n_{k,j}}$$

where $tf_{i,j}$ is the term frequency of the i-th feature word in document j; $n_{i,j}$ is the number of times the i-th feature word appears in document j; $n_{k,j}$ is the number of times the k-th term appears in document j; and $\sum_{k} n_{k,j}$ is the total number of occurrences of all terms in document j.

In this embodiment, the inverse document frequency may be further calculated as follows:

$$idf_i = \log \frac{|D|}{|\{ j : t_i \in d_j \}|}$$

where $idf_i$ is the inverse document frequency of the i-th feature word; $|D|$ is the total number of question-answer data records in the target database; $t_i$ is the i-th feature word; and $|\{ j : t_i \in d_j \}|$ is the total number of documents containing the i-th feature word.

In this embodiment, the TF-IDF value of each feature word in the feature vector of each group of question-answer data can be obtained from the term frequency and the inverse document frequency, which in turn gives the two-dimensional TF-IDF weight matrix ω[i][j] of the feature words across the different documents.
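For illustration, the weight matrix might be computed as in the Python sketch below. It assumes the standard definitions of term frequency and inverse document frequency with a natural logarithm, treats each group of question-answer data as one document, and lays the matrix out with one row per document; the function name and the toy documents are hypothetical.

```python
import math
from collections import Counter

def tfidf_matrix(docs):
    """docs: list of feature-word lists. Returns (vocabulary, matrix) with one row per document."""
    vocab = sorted({w for doc in docs for w in doc})
    n_docs = len(docs)
    # Document frequency: number of documents containing each word.
    df = Counter(w for doc in docs for w in set(doc))
    matrix = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        row = []
        for w in vocab:
            tf = counts[w] / total if total else 0.0
            idf = math.log(n_docs / df[w])
            row.append(tf * idf)
        matrix.append(row)
    return vocab, matrix

docs = [["loan", "rate"], ["deposit", "rate"], ["loan", "loan", "term"]]
vocab, w = tfidf_matrix(docs)
print(vocab)  # ['deposit', 'loan', 'rate', 'term']
```

A word that appears in every document receives an inverse document frequency of zero under this definition, so it contributes nothing to the distance between samples.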

In this embodiment, unimportant feature words (TF-IDF is smaller than a preset threshold) may be filtered according to the TF-IDF weight two-dimensional matrix of each feature word in the feature vector of each set of question and answer data obtained through calculation, so as to retain the important feature words.
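The filtering step can be sketched as follows. The specification does not say how a word's TF-IDF values across several documents are compared with the threshold; this sketch assumes a word is kept when its largest weight in any row reaches the threshold. The function name, the threshold value and the toy matrix are hypothetical.

```python
def filter_features(vocab, matrix, threshold):
    """Keep only the feature words whose largest TF-IDF weight in any row reaches the threshold."""
    keep = [j for j in range(len(vocab)) if max(row[j] for row in matrix) >= threshold]
    return [vocab[j] for j in keep], [[row[j] for j in keep] for row in matrix]

vocab = ["deposit", "loan", "rate"]
matrix = [[0.00, 0.20, 0.05],
          [0.37, 0.00, 0.04]]
new_vocab, new_matrix = filter_features(vocab, matrix, threshold=0.1)
print(new_vocab)  # ['deposit', 'loan']
```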

In this embodiment, performing clustering analysis by using a K-means clustering algorithm according to the TF-IDF weight two-dimensional matrix of each feature word in each group of question-answer data to obtain a clustering result may include the following steps:

Step 300: set a clustering convergence threshold Delta and a maximum number of iterations N.

Step 301: randomly select one group of question-answer data from the question-answer knowledge set as the initial cluster center.

Step 302: for each row in ω[i][j], calculate the shortest distance to the currently existing cluster centers (that is, the Euclidean distance to the nearest cluster center), denoted D(x). Each row in ω[i][j] represents a sample, that is, one group of question-answer data.

Then calculate the probability of each sample point being selected as the next cluster center, and select the next cluster center by roulette wheel selection. Roulette wheel selection is a proportional selection method whose basic idea is that the probability of each individual being selected is proportional to its fitness function value.

Step 303: repeat step 302 until K cluster centers have been selected.

Step 304: taking the K selected cluster centers as initial values, calculate the distance from each remaining row in ω[i][j] to the K initial cluster centers, and assign each point to the one of the K centers at minimum distance. Record the minimum distance in an array A[k][m], where k is the index of the cluster center and m is the index of the sample point assigned to that cluster center.

Step 305: the centroids of the K clusters are recalculated. And respectively comparing the newly generated K centroids with the clustering center points calculated last time, if the distances are all smaller than a convergence threshold Delta, indicating that the whole process is converged, calculating the average distance a [ K ] from all sample points in the cluster to the center points according to the A [ K ] [ m ], calculating the distances of any two center points, judging whether the distances are smaller than the average distance in the two clusters, if so, combining the two clusters, and finishing the algorithm under the condition of convergence.

Otherwise, judging whether the iteration times are larger than N, if so, judging that the K value is unreasonable to select at the moment, obtaining partial local optimal solutions, and recalculating by increasing the K value to make the whole body more easily converged. The start step 301 may be re-executed after adjusting the cluster number K + K/2 until the algorithm ends.

If the iteration times are less than or equal to N, whether clusters needing to be combined exist can be further judged if the iteration times are converged, if the clusters need to be combined, the clusters are combined, and the algorithm is ended after the combination.
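The selection of initial cluster centers in steps 301 to 303 can be sketched in Python as shown below. The specification does not state the fitness value used for the roulette wheel; this sketch assumes a weight of D(x) squared, as in the k-means++ seeding scheme. The function name and the sample data are hypothetical.

```python
import math
import random

def choose_initial_centers(samples, k, seed=0):
    """Pick k initial centers: the first at random, the rest by roulette-wheel selection."""
    rng = random.Random(seed)
    centers = [rng.choice(samples)]
    while len(centers) < k:
        # D(x): distance from each sample to its nearest already-chosen center.
        d = [min(math.dist(x, c) for c in centers) for x in samples]
        # Assumption: selection weight is D(x)^2, as in k-means++.
        weights = [dist * dist for dist in d]
        if sum(weights) == 0:
            # All samples coincide with a chosen center; fall back to a uniform pick.
            centers.append(rng.choice(samples))
        else:
            centers.append(rng.choices(samples, weights=weights, k=1)[0])
    return centers

samples = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]]
print(choose_initial_centers(samples, k=2))
```

Because a sample that is already a center has weight zero, it cannot be drawn again, and samples far from the existing centers are the most likely to be chosen next.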

In the embodiment, when two clusters are 'close', the two clusters are combined, and when convergence cannot be achieved after iteration is performed for N times, the K value is enlarged to adjust the cluster center, so that a more reasonable cluster center value is obtained.
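As a hedged illustration of the merge test and of the enlargement of K, consider the Python sketch below. The text leaves open exactly how the distance between two centers is compared with the two average distances; the sketch assumes that two clusters are "close" when their centers are nearer than the smaller of the two average distances, and it assumes integer division in K + K/2. All function names and the toy data are hypothetical.

```python
import math

def average_radius(center, members):
    """a[k]: mean Euclidean distance from the members of a cluster to its center."""
    return sum(math.dist(m, center) for m in members) / len(members)

def pairs_to_merge(centers, clusters):
    """Return index pairs (p, q) whose centers are closer than both clusters' average radius."""
    radius = [average_radius(c, ms) for c, ms in zip(centers, clusters)]
    pairs = []
    for p in range(len(centers)):
        for q in range(p + 1, len(centers)):
            # Assumption: "close" means closer than the smaller of the two average radii.
            if math.dist(centers[p], centers[q]) < min(radius[p], radius[q]):
                pairs.append((p, q))
    return pairs

def grow_k(k):
    """Enlarge K after N iterations without convergence (K + K/2, integer division assumed)."""
    return k + k // 2

centers = [[0.0, 0.0], [1.0, 0.0], [10.0, 10.0]]
clusters = [
    [[-2.0, 0.0], [2.0, 0.0]],
    [[-1.0, 0.0], [3.0, 0.0]],
    [[10.0, 9.0], [10.0, 11.0]],
]
print(pairs_to_merge(centers, clusters))  # [(0, 1)]
print(grow_k(4))  # 6
```

With integer division, K = 1 would not grow, so a practical implementation would need a minimum increment.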

In one embodiment, determining a cluster label of each cluster according to the clustering result may include: and arranging the characteristic words contained in the target cluster in a descending order according to TF-IDF values of the characteristic words, and taking the characteristic words with preset number before the arrangement as cluster labels of the target cluster.

In this embodiment, since the TF-IDF value may be used to evaluate the importance of a word to one of a set of files or a corpus, the feature words included in the target cluster may be sorted in a descending order according to the TF-IDF value of the feature words, and a preset number of feature words before sorting may be used as cluster labels of the target cluster, so as to obtain the cluster labels of each cluster.

In this embodiment, the preset number may be a positive integer, for example 1, 3, or 6. It may be determined according to the actual situation and is not limited in the embodiments of the present specification.
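The label selection described above can be illustrated with a short Python sketch. It is hedged in two ways: the function names are hypothetical, and the way TF-IDF is computed here (each cluster treated as a single document, natural-logarithm IDF without smoothing) is an assumption made for the example, since this passage does not specify the formula. Ties are broken alphabetically, which is also an assumption.

```python
import math
from collections import Counter


def cluster_tfidf(clusters):
    """Compute a TF-IDF score per feature word for each cluster.

    `clusters` is a list of clusters; each cluster is a list of token lists.
    Assumption: every cluster is treated as one document for TF and IDF.
    """
    counts = [Counter(tok for doc in cluster for tok in doc) for cluster in clusters]
    n = len(clusters)
    df = Counter()
    for c in counts:
        df.update(c.keys())  # number of clusters in which each word occurs
    scores = []
    for c in counts:
        total = sum(c.values())
        if total == 0:
            scores.append({})
            continue
        scores.append({w: (f / total) * math.log(n / df[w]) for w, f in c.items()})
    return scores


def top_labels(tfidf, preset_number):
    """Sort feature words by descending TF-IDF and keep the first preset_number."""
    ranked = sorted(tfidf.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, _ in ranked[:preset_number]]
```

For example, `top_labels(cluster_tfidf(clusters)[0], 3)` would return up to three label words for the first cluster.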

In one embodiment, after obtaining the classification result of the question-answer knowledge set based on the clustering result and the cluster label of each cluster, the method may further include: displaying the classification result of the question-answer knowledge set in a retrieval interface.

In this embodiment, the classification result of the question-answer knowledge set may be displayed through a front-end page, so that the classification function supports query services for users or human customer service agents.

In this embodiment, the classification result may include a plurality of classifications, and the name of each classification may be represented by the cluster label of the corresponding cluster. In some embodiments, only the names of the classifications may be displayed in the retrieval interface, or the names may be displayed together with representative question-answer data, giving the user or human customer service agent an example so that the meaning of each classification can be understood more intuitively and clearly. Of course, the manner of displaying the classification result is not limited to the above examples. Those skilled in the art may make other modifications in light of the spirit of the embodiments of the present specification; as long as the functions and effects achieved are the same as or similar to those of the embodiments of the present specification, such modifications fall within the scope of the embodiments of the present specification.

Based on the same inventive concept, the embodiments of the present specification further provide a data query apparatus, as described in the following embodiments. Because the apparatus solves the problem on a principle similar to that of the data query method, its implementation can refer to the implementation of the method, and repeated descriptions are omitted. As used hereinafter, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the embodiments below is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated. Fig. 2 is a block diagram of a data query apparatus according to an embodiment of the present specification. As shown in Fig. 2, the apparatus may include a receiving module 201, a preprocessing module 202, a first determining module 203, and a second determining module 204, which are described below.

The receiving module 201 may be configured to receive a target text input by a target user;

the preprocessing module 202 may be configured to preprocess the target text to obtain a feature vector of the target text;

the first determining module 203 may be configured to determine a target cluster label according to the feature vector of the target text, wherein the target cluster label is used for representing the classification to which the target text belongs;

the second determining module 204 may be configured to determine a query result corresponding to the target text based on the feature vector of the target text and the target cluster label.
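How the four modules could cooperate is sketched below in Python. Everything specific in it is an assumption for illustration and is not taken from the specification: the `QAEntry` type, the whitespace word segmentation, the tiny stop-word list, the representation of the feature vector as a token set, the use of Jaccard similarity, and the `label_keywords` mapping from each cluster label to its keyword set.

```python
from dataclasses import dataclass

# Illustrative stop-word list only; a real system would use a full list.
STOP_WORDS = {"the", "a", "is", "how", "do", "i", "my", "to", "what"}


@dataclass
class QAEntry:
    question: str
    answer: str
    label: str  # cluster label assigned when the knowledge set was clustered


def preprocess(text):
    """Preprocessing module: segment on whitespace and drop stop words."""
    return {t for t in text.lower().split() if t not in STOP_WORDS}


def jaccard(a, b):
    """Similarity of two token sets (an assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def determine_label(features, label_keywords):
    """First determining module: choose the most similar cluster label."""
    return max(label_keywords, key=lambda lbl: jaccard(features, label_keywords[lbl]))


def determine_result(features, label, knowledge):
    """Second determining module: search only inside the chosen cluster."""
    candidates = [e for e in knowledge if e.label == label]
    best = max(
        candidates,
        key=lambda e: jaccard(features, preprocess(e.question)),
        default=None,
    )
    return best.answer if best is not None else None


def query(text, label_keywords, knowledge):
    """Receiving module entry point: returns (target cluster label, query result)."""
    features = preprocess(text)
    label = determine_label(features, label_keywords)
    return label, determine_result(features, label, knowledge)
```

Restricting the second step to entries that carry the target cluster label is what keeps the search space small; the similarity measure itself could be swapped for any other.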

The embodiments of the present specification further provide an electronic device. Fig. 3 is a schematic structural diagram of an electronic device based on the data query method provided by the embodiments of the present specification. As shown in Fig. 3, the electronic device may specifically include an input device 31, a processor 32, and a memory 33. The input device 31 may be specifically configured to input a target text. The processor 32 may be specifically configured to: receive the target text input by a target user; preprocess the target text to obtain a feature vector of the target text; determine a target cluster label according to the feature vector of the target text, wherein the target cluster label is used for representing the classification to which the target text belongs; and determine a query result corresponding to the target text based on the feature vector of the target text and the target cluster label. The memory 33 may be specifically configured to store parameters such as the feature vector of the target text and the target cluster label.

In this embodiment, the input device may be one of the main apparatuses for exchanging information between a user and a computer system. The input device may include a keyboard, a mouse, a camera, a scanner, a light pen, a handwriting input board, a voice input device, and the like; it is used to input raw data, and the programs that process the data, into the computer. The input device may also acquire and receive data transmitted by other modules, units, and devices. The processor may be implemented in any suitable way. For example, the processor may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, an embedded microcontroller, and so forth. The memory may specifically be a memory device used in modern information technology for storing information. The memory may include multiple levels: in a digital system, anything that can store binary data may be a memory; in an integrated circuit, a circuit that has a storage function but no physical form is also called a memory, such as a RAM or a FIFO; in a system, a storage device in physical form is also called a memory, such as a memory module or a TF card.

In this embodiment, the functions and effects specifically realized by the electronic device can be understood with reference to the other embodiments, and are not described herein again.

The embodiments of the present specification further provide a computer storage medium based on the data query method. The computer storage medium stores computer program instructions which, when executed, implement: receiving a target text input by a target user; preprocessing the target text to obtain a feature vector of the target text; determining a target cluster label according to the feature vector of the target text, wherein the target cluster label is used for representing the classification to which the target text belongs; and determining a query result corresponding to the target text based on the feature vector of the target text and the target cluster label.

In this embodiment, the storage medium includes, but is not limited to, a Random Access Memory (RAM), a Read-Only Memory (ROM), a cache, a Hard Disk Drive (HDD), or a memory card. The memory may be used to store computer program instructions. The network communication unit may be an interface for network connection and communication that is configured in accordance with a standard prescribed by a communication protocol.

In this embodiment, the functions and effects specifically realized by the program instructions stored in the computer storage medium can be understood with reference to the other embodiments, and are not described herein again.

It will be apparent to those skilled in the art that the modules or steps of the embodiments of the present specification described above may be implemented by a general-purpose computing device. They may be centralized on a single computing device or distributed over a network of multiple computing devices. Alternatively, they may be implemented by program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device. In some cases, the steps shown or described may be performed in an order different from that described herein, or the modules or steps may be fabricated separately as individual integrated circuit modules, or several of them may be fabricated as a single integrated circuit module. Thus, the embodiments of the present specification are not limited to any specific combination of hardware and software.

Although the embodiments of the present specification provide the method steps as described in the above embodiments or flowcharts, the method may include more or fewer steps based on conventional or non-inventive effort. For steps that have no logically necessary causal relationship, the order of execution is not limited to that provided by the embodiments of the present specification. When the method is executed in an actual apparatus or end product, it may be executed sequentially or in parallel according to the embodiments or the methods shown in the figures (for example, in a parallel-processor or multi-threaded environment).

It is to be understood that the above description is intended to be illustrative, and not restrictive. Many embodiments and many applications other than the examples provided will be apparent to those of skill in the art upon reading the above description. The scope of embodiments of the present specification should, therefore, be determined not with reference to the above description, but should instead be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.

The above describes only preferred embodiments of the present specification and is not intended to limit the embodiments of the present specification. It will be apparent to those skilled in the art that various modifications and variations can be made to these embodiments. Any modification, equivalent replacement, improvement, and the like made within the spirit and principle of the embodiments of the present specification shall fall within the protection scope of the embodiments of the present specification.
