Data processing method and device

Document No.: 7569; Publication date: 2021-09-17

1. A data processing method, comprising:

if a query request from a data access party for a basic data set is received, acquiring the alternative data provided by each of one or more data providers for a target data item in the basic data set in a target data providing period, to obtain one or more pieces of alternative data;

acquiring the historical rights and interests of each data provider, wherein the historical rights and interests of a data provider are obtained from the comprehensive rights and interests of that data provider in all historical data providing periods before the target data providing period;

determining the confidence level of each of the one or more pieces of alternative data, and using the alternative data with the highest confidence level as target data, wherein the confidence level of each piece of alternative data is determined according to the historical rights and interests of each of the one or more target data providers that provide that piece of alternative data;

and determining a basic data set according to the target data, and sending the basic data set to the data access party.

2. The method of claim 1, wherein determining the confidence level of each of the one or more pieces of alternative data comprises:

determining the one or more target data providers that provide each piece of alternative data;

acquiring the historical rights and interests of each of the one or more target data providers, to obtain one or more historical rights and interests;

and determining the confidence level of each piece of alternative data according to the one or more historical rights and interests.

3. The method according to claim 2, wherein the target data providing period is an Nth data providing period, N being an integer and N ≥ 2; and determining the one or more target data providers that provide each piece of alternative data comprises:

acquiring one or more data providers which provide each alternative data in each historical data providing period in the first N-1 historical data providing periods;

and taking all the data providers which provide each piece of alternative data and are acquired in the first N-1 data providing periods as one or more target data providers of each piece of alternative data.

4. The method of claim 1, further comprising:

determining one or more target data providers that provide the target data;

if the target data is not provided in all historical data providing periods and the number of one or more target data providers providing the target data in the target data providing periods is one, determining the traffic interest of the target data provider providing the target data based on the consumption of the target data by the data access party, wherein the traffic interest and the consumption of the target data by the data access party are in a positive correlation trend;

and determining the comprehensive rights of the target data provider in the target data providing period according to the traffic rights, wherein the comprehensive rights of the target data provider in the target data providing period and the traffic rights show a positive correlation trend.

5. The method of claim 4, wherein the target data provider is the data access party, the method further comprising:

determining the consumption rights of the target data provider based on the consumption amount of the target data provider to the target data, wherein the consumption rights and the consumption amount of the target data provider to the target data are in a positive correlation trend;

the determining the comprehensive rights of the target data provider in the target data providing period according to the traffic rights comprises the following steps:

and determining the comprehensive rights and interests of the target data provider in the target data providing period according to the traffic rights and the consumption rights.

6. The method of claim 1, further comprising:

determining one or more target data providers that provide the target data;

if the target data is not provided in all historical data providing periods and the number of one or more target data providers providing the target data in the target data providing periods is multiple, determining the traffic interest of the target data provider providing the target data based on the consumption of the target data by the data access party, wherein the traffic interest and the consumption of the target data by the data access party are in a positive correlation trend;

determining a composite interest of each of the one or more target data providers during the target data provision period based on the traffic interest and the number of target data providers.

7. The method of claim 1, further comprising:

if the target alternative data provided by the data provider for the target data item is acquired in any data providing period after the target data providing period, and the number of the data providers providing the target alternative data is multiple, acquiring the historical rights of each data provider providing the target alternative data in the data providing period providing the target alternative data;

determining the confidence of the target alternative data according to the historical rights of each data provider providing the target alternative data;

if the confidence degree of the target alternative data is greater than that of the target data, taking the target alternative data as the target data;

and determining a basic data set according to the target data, and sending the basic data set to the data access party.

8. The method of claim 7, further comprising:

deducting, from the one or more target data providers that provided the target data, the sum of their traffic rights and interests for the target data in all historical data providing periods prior to the data providing period in which the target alternative data becomes the target data;

and allocating the deducted traffic interest sum to each data provider providing the target alternative data.

9. The method of claim 1, wherein the target data providing period is an Nth data providing period, N being a positive integer, the method further comprising:

if the target data provided by a data provider is acquired in an Mth data providing period and the target data remains unchanged in the Mth data providing period, M being an integer greater than N,

determining a traffic interest of the one or more target data providers during the Mth data providing period based on the consumption amount of the target data by the data access party during the Mth data providing period, wherein the traffic interest of the target data providers during the Mth data providing period is zero.

10. A data processing apparatus, comprising:

an obtaining unit, configured to, if a query request from a data access party for a basic data set is received, acquire the alternative data provided by each of one or more data providers for a target data item in the basic data set in a target data providing period, to obtain one or more pieces of alternative data;

the obtaining unit being further configured to obtain the historical rights and interests of each data provider, wherein the historical rights and interests of a data provider are obtained from the comprehensive rights and interests of that data provider in all historical data providing periods before the target data providing period;

a determining unit, configured to determine the confidence level of each of the one or more pieces of alternative data, and use the alternative data with the highest confidence level as target data, wherein the confidence level of each piece of alternative data is determined according to the historical rights and interests of each of the one or more target data providers that provide that piece of alternative data;

and a sending unit, configured to determine a basic data set according to the target data and send the basic data set to the data access party.

Background

With the rapid development of Internet technology, a data platform can cooperate with multiple data providers and integrate their data into a relatively comprehensive and accurate basic data set for users, which improves the platform's competitiveness among similar products. A basic data set is data that can be disclosed and does not involve sensitive information. When a data platform integrates data from multiple data providers, data conflicts may arise. A data conflict means that different data providers supply different values for the same data item. For example, for the address of shopping mall A, provider A supplies "Block A" while provider B supplies "Block B". How to efficiently integrate the data supplied by multiple data providers has therefore become a research hotspot.

Disclosure of Invention

The embodiment of the application provides a data processing method and device, which can improve the efficiency of a data platform in fusing data.

In one aspect, an embodiment of the present application provides a data processing method, including:

if a query request aiming at a basic data set by a data access party is received, acquiring alternative data provided by each data provider in one or more data providers aiming at a target data item in the basic data set in a target data providing period to obtain one or more alternative data;

acquiring the historical interest of each data provider, wherein the historical interest is obtained according to the comprehensive interest of all the historical data providing periods of each data provider before the target data providing period;

determining the confidence degree of each alternative data in the one or more alternative data, and using the alternative data with the highest confidence degree as target data, wherein the confidence degree of each alternative data is determined according to the historical rights and interests of each target data provider in one or more target data providers providing each alternative data;

and determining a basic data set according to the target data, and sending the basic data set to the data access party.

In one aspect, an embodiment of the present application provides a data processing apparatus, including:

an obtaining unit, configured to, if a query request from a data access party for a basic data set is received, acquire the alternative data provided by each of one or more data providers for a target data item in the basic data set in a target data providing period, to obtain one or more pieces of alternative data;

the obtaining unit being further configured to obtain the historical rights and interests of each data provider, wherein the historical rights and interests of a data provider are obtained from the comprehensive rights and interests of that data provider in all historical data providing periods before the target data providing period;

a determining unit, configured to determine the confidence level of each of the one or more pieces of alternative data, and use the alternative data with the highest confidence level as target data, wherein the confidence level of each piece of alternative data is determined according to the historical rights and interests of each of the one or more target data providers that provide that piece of alternative data;

and a sending unit, configured to determine a basic data set according to the target data and send the basic data set to the data access party.

In one aspect, an embodiment of the present application provides a data platform, including:

a processor adapted to implement one or more computer programs;

a computer storage medium storing one or more computer programs adapted to be loaded and executed by the processor to:

if a query request aiming at a basic data set by a data access party is received, acquiring alternative data provided by each data provider in one or more data providers aiming at a target data item in the basic data set in a target data providing period to obtain one or more alternative data; acquiring the historical interest of each data provider, wherein the historical interest is obtained according to the comprehensive interest of all the historical data providing periods of each data provider before the target data providing period; determining the confidence degree of each alternative data in the one or more alternative data, and using the alternative data with the highest confidence degree as target data, wherein the confidence degree of each alternative data is determined according to the historical rights and interests of each target data provider in one or more target data providers providing each alternative data; and determining a basic data set according to the target data, and sending the basic data set to the data access party.

In one aspect, embodiments of the present application provide a storage medium storing one or more computer programs, the one or more computer programs being adapted to be loaded by a processor and executed to:

if a query request aiming at a basic data set by a data access party is received, acquiring alternative data provided by each data provider in one or more data providers aiming at a target data item in the basic data set in a target data providing period to obtain one or more alternative data; acquiring the historical interest of each data provider, wherein the historical interest is obtained according to the comprehensive interest of all the historical data providing periods of each data provider before the target data providing period; determining the confidence degree of each alternative data in the one or more alternative data, and using the alternative data with the highest confidence degree as target data, wherein the confidence degree of each alternative data is determined according to the historical rights and interests of each target data provider in one or more target data providers providing each alternative data; and determining a basic data set according to the target data, and sending the basic data set to the data access party.

In one aspect, embodiments of the present application provide a computer program product or a computer program, where the computer program product includes a computer program, and the computer program is stored in a computer storage medium; the computer program is read from a computer storage medium by a processor of the data platform, and the computer program is executed by the processor to cause the data platform to perform:

if a query request aiming at a basic data set by a data access party is received, acquiring alternative data provided by each data provider in one or more data providers aiming at a target data item in the basic data set in a target data providing period to obtain one or more alternative data; acquiring the historical interest of each data provider, wherein the historical interest is obtained according to the comprehensive interest of all the historical data providing periods of each data provider before the target data providing period; determining the confidence degree of each alternative data in the one or more alternative data, and using the alternative data with the highest confidence degree as target data, wherein the confidence degree of each alternative data is determined according to the historical rights and interests of each target data provider in one or more target data providers providing each alternative data; and determining a basic data set according to the target data, and sending the basic data set to the data access party.

According to the data fusion method and device, when the target data item has data conflict, the candidate data corresponding to the target data item are obtained, the historical rights and interests of one or more data providers corresponding to each candidate data item are determined, and then the confidence coefficient of each candidate data item is determined based on the historical rights and interests of the one or more data providers, so that data fusion is carried out, and the accuracy of the data provided for the user by the data platform is improved.

Drawings

To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can derive other drawings from them without creative effort.

FIG. 1a is a schematic diagram of a data flow in a data platform according to an embodiment of the present application;

FIG. 1b is a schematic diagram of a data processing system according to an embodiment of the present application;

FIG. 2 is a schematic flowchart of a data processing method according to an embodiment of the present application;

FIG. 3a is a schematic structural diagram of a blockchain according to an embodiment of the present application;

FIG. 3b is a schematic structural diagram of a blockchain network according to an embodiment of the present application;

FIG. 3c is a schematic diagram of a process of determining target data according to an embodiment of the present application;

FIG. 4 is a schematic flowchart of another data processing method according to an embodiment of the present application;

FIG. 5 is a schematic diagram of a data processing apparatus according to an embodiment of the present application;

FIG. 6 is a schematic structural diagram of a data platform according to an embodiment of the present application.

Detailed Description

Thanks to the development of Internet technology, a data platform can integrate information provided by multiple data providers (e.g., multiple service parties), so that users can obtain, on the data platform, information about the services of those service parties. For example, a medical data platform can aggregate information about a doctor live-streaming service, an online consultation service, a registration service, and other services. Illustratively, the doctor live-streaming service may include data such as the doctor's name, the doctor's job title, and the hospital address; the online consultation service may include data such as the doctor's name, the doctor's job title, and the disease name; and the registration service may include data such as the doctor's name, the hospital address, and the department name. The same data item may be used in different services. For example, the job title of doctor A may be used in both the registration service and the online consultation service. If the data is not unified across services, users may see different data for the same data item in different services (for example, the job title of doctor A shown in the registration service may differ from the one shown in the online consultation service), which harms the user experience. Data fusion therefore needs to be performed on the data provided by the service parties to obtain unified, high-confidence data for users. In practice, when a data platform has multiple data providers, data fusion may encounter conflicts. In that case, the data platform generally groups similar objects together according to certain characteristic values and lets each data access party judge and select the data it needs. However, while one data access party makes this judgment, data used by other data access parties may be modified, which affects their services.

The data platform includes one or more basic data sets. Each basic data set includes one or more pieces of data, different pieces of data correspond to different data items, and each data item may correspond to one or more pieces of data. Illustratively, a basic data set can be understood as a complete piece of information, such as doctor information (e.g., doctor A, ophthalmology, chief physician) or college information. A data item can be understood as a field of that complete piece of information, such as the doctor name or doctor job title in doctor information, or the college name or college address in college information. A data provider may be a service party that has a communication connection with the data platform and can supply the data platform with alternative data for the data items it owns. A data access party may be an application or client that uses the basic data sets in the data platform.
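The terms above can be sketched in code. The following minimal Python example is illustrative only: the tuple layout `(provider, data_item, value)`, the function name `find_conflicts`, and the sample identifiers are assumptions made for this sketch and are not taken from the application. It groups the values that providers submit for each data item and reports the items for which different providers supplied different values, i.e., the data conflicts described in the Background.

```python
from collections import defaultdict


def find_conflicts(submissions):
    """Group submitted values by data item and keep items with conflicting values.

    submissions: iterable of (provider, data_item, value) tuples (hypothetical layout).
    Returns {data_item: {value: [providers]}} for items with more than one distinct value.
    """
    values_by_item = defaultdict(lambda: defaultdict(list))
    for provider, item, value in submissions:
        values_by_item[item][value].append(provider)
    return {
        item: dict(values)
        for item, values in values_by_item.items()
        if len(values) > 1  # more than one distinct value means a conflict
    }


# Hypothetical submissions mirroring the mall-address example.
submissions = [
    ("provider_A", "mall_A.address", "Block A"),
    ("provider_B", "mall_A.address", "Block B"),
    ("provider_A", "mall_A.name", "Mall A"),
    ("provider_B", "mall_A.name", "Mall A"),
]
conflicts = find_conflicts(submissions)
```

Here only `mall_A.address` is reported, because both providers agree on `mall_A.name`.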

To solve the above problem, an embodiment of the present application provides a data processing scheme. When the data provided by the service parties is fused, the scheme constructs a traffic interest proof by calculating the traffic that each service party brings to the data platform, and unifies the data used by the service parties based on that proof, so that the data platform can efficiently collect relatively comprehensive data and become an open data platform. The general principle of the scheme is as follows. After receiving a query request from a data access party for a basic data set, the data platform acquires the alternative data that each cooperating data provider supplied for a target data item in the basic data set during a target data providing period. The target data providing period may be any data providing period, and in each data providing period the data platform may receive alternative data for any data item from one or more data providers. If, in the target data providing period, the target data item corresponds to multiple different pieces of alternative data, the data platform determines the data provider or providers of each piece, determines the confidence level of each piece from the historical rights and interests of its provider or providers, selects the piece with the highest confidence level, determines the queried basic data set according to it, and sends the basic data set to the data access party.
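The selection step in this principle can be sketched as follows. The text states only that the confidence level is determined according to the historical rights and interests of the providers; summing those values per piece of alternative data is an assumption made for this sketch, as are the function name `select_target_data` and the sample figures.

```python
def select_target_data(candidates, historical_interest):
    """Pick the alternative data with the highest confidence.

    candidates: {value: [provider ids that supplied this value]}
    historical_interest: {provider id: historical rights and interests}
    Confidence is ASSUMED here to be the sum of the providers' historical
    rights and interests; the source does not fix a formula.
    """
    confidence = {
        value: sum(historical_interest.get(p, 0.0) for p in providers)
        for value, providers in candidates.items()
    }
    target = max(confidence, key=confidence.get)
    return target, confidence


# Hypothetical inputs: two conflicting values for one target data item.
candidates = {"Block A": ["provider_A"], "Block B": ["provider_B", "provider_C"]}
historical_interest = {"provider_A": 5.0, "provider_B": 2.0, "provider_C": 1.5}
target, confidence = select_target_data(candidates, historical_interest)
```

With these figures, "Block A" (confidence 5.0) is chosen over "Block B" (confidence 3.5), even though more providers supplied "Block B". Under this assumed formula, accumulated rights and interests matter more than the number of providers.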

The target data item may be any data item in any basic data set on the data platform, where each basic data set includes one or more data items. For example, when the basic data set is college information that includes two data items, the college name and the college address, the target data item may be the college name or the college address; when the target data item is the college name, the alternative data may be XX Information College, XX Engineering College, and so on. The historical rights and interests of each data provider are related to the alternative data that the provider supplied in the historical data providing periods: the more of that alternative data was adopted, the higher the provider's historical rights and interests; the less was adopted, the lower they are.

For example, data may flow among the data provider, the data access party, and the data platform as shown in FIG. 1a. The data provider provides a basic data set to the data platform, and the data platform stores the produced basic data set (i.e., the basic data set provided by the data provider) in block form. After consuming an existing basic data set from the block in which it is stored, the data access party periodically reports its basic data set consumption details to the data platform and pays a fee to the data platform based on the traffic recorded in those details. After receiving the fee, the data platform can allocate economic benefits to the data provider according to the data provider's traffic rights and interests. For example, assume that the "school address of school X" comes from data provider B. If data access party A uses the "school address of school X", data access party A needs to provide the data platform with a traffic proof for that use and pay the data platform based on the traffic in the traffic proof; the data platform may then distribute the fee paid by data access party A to data provider B based on data provider B's traffic rights and interests in the target data providing period. Because the data access party pays the data platform according to traffic, it is discouraged from over-reporting traffic. To address under-reporting, the data platform can randomly access one or more services of the data access party and then check whether the traffic reported by the data access party has increased. If it has not, the data access party is under-reporting traffic, and the data platform can impose an appropriate penalty, which effectively deters under-reporting.
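The fee allocation and the under-reporting check described above can be sketched as follows. Both functions are illustrative assumptions: the text does not specify that the fee is split in proportion to traffic rights and interests, nor how large an increase the platform expects after its own accesses. The names `distribute_fee` and `underreports_traffic` are hypothetical.

```python
def distribute_fee(fee, traffic_interest):
    """Split a fee among providers in proportion to their traffic rights and interests.

    Proportional splitting is an ASSUMPTION of this sketch.
    """
    total = sum(traffic_interest.values())
    if total == 0:
        return {provider: 0.0 for provider in traffic_interest}
    return {
        provider: fee * interest / total
        for provider, interest in traffic_interest.items()
    }


def underreports_traffic(reported_before, reported_after, probe_accesses):
    """Flag under-reporting after the platform makes its own probe accesses.

    ASSUMPTION: reported traffic should rise by at least the number of
    probe accesses the platform itself performed.
    """
    return reported_after - reported_before < probe_accesses


shares = distribute_fee(100.0, {"provider_B": 3.0, "provider_C": 1.0})
flag_honest = underreports_traffic(1000, 1005, probe_accesses=5)
flag_cheat = underreports_traffic(1000, 1000, probe_accesses=5)
```

In this example `provider_B` receives three quarters of the fee, and only the access party whose reported traffic did not move after five probe accesses is flagged.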

In a specific implementation, the data processing scheme described above may be applied in the data processing system shown in FIG. 1b, which includes one or more data providers 11, a data platform 12, and one or more data access parties 13. Each data provider and each data access party may be a terminal or a client running on a terminal; the data platform may be a terminal or a server. Terminals may include, but are not limited to, smartphones, tablet computers, notebook computers, desktop computers, smart televisions, and the like. A user can log in to the client with an account and then provide alternative data to the server through the client. The server may be an independent physical server, a server cluster or distributed system formed by multiple physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDN (Content Delivery Network), and big data and artificial intelligence platforms.

Based on the data processing scheme and the related description of the data processing system, the embodiment of the application provides a data processing method, which can be executed by the data platform; referring to fig. 2, the data processing method includes:

S201, if a query request from a data access party for the basic data set is received, acquiring the alternative data provided by each of one or more data providers for a target data item in the basic data set in a target data providing period, to obtain one or more pieces of alternative data.

In practical applications, the data platform may establish a communication connection with one or more data providers (that is, at least one data provider) to obtain the alternative data they provide for the target data item. In the target data providing period, the data platform may acquire the alternative data provided by any data provider for the target data item as follows: after the data provider connects to the data platform, the data platform receives the private data that the data provider synchronizes to it, and then stores that data in the block corresponding to the target data providing period.

The data processing system may form a blockchain network, and the node devices in the blockchain network may include the data platform, at least one data provider, and at least one data access party. After one or more of the data providers send the alternative data they provide for the target data item to the data platform, the data platform may generate a block from that alternative data and publish the block to the blockchain network.

Taking the blockchain structure shown in FIG. 3a as an example, whenever new alternative data needs to be written to the blockchain, it is gathered into a block and appended to the end of the existing blockchain, and a consensus algorithm ensures that the newly added block is identical on every node. Each block records multiple data synchronization records and also contains the hash value of the previous block; all blocks store the hash value of the previous block in this way and are connected in sequence to form the blockchain. Because the hash value of the previous block is stored in the block header of the next block, a change to the target data of a data item in the previous block changes that block's hash value, which then no longer matches the value stored in the next block, so target data uploaded to the blockchain network is difficult to tamper with.

To aid understanding of the data processing method provided in the embodiments of the present application, the blockchain network used in the embodiments is described below. A blockchain is a new application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, where each data block contains information about a batch of network transactions and is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. A blockchain may include a blockchain underlying platform, a platform product service layer, and an application service layer.

The blockchain underlying platform may include processing modules for user management, basic services, smart contracts, operation monitoring, and the like. The user management module is responsible for identity information management of all blockchain participants, including maintaining public and private key generation (account management), key management, and maintaining the correspondence between users' real identities and blockchain addresses (authority management); where authorized, it supervises and audits the transactions of certain real identities and provides rule configuration for risk control (risk-control audit). The basic service module is deployed on all blockchain node devices to verify the validity of service requests and, after consensus on a valid request is completed, record it to storage; for a new service request, the basic service first performs interface adaptation parsing and authentication (interface adaptation), then encrypts the service information through a consensus algorithm (consensus management), transmits it completely and consistently to the shared ledger after encryption (network communication), and records and stores it. The smart contract module is responsible for contract registration and issuance, contract triggering, and contract execution; developers can define contract logic in a programming language and publish it to the blockchain (contract registration), where it is triggered by keys or other events and executed according to the logic of the contract terms, and the module also provides functions for upgrading and cancelling contracts. The operation monitoring module is mainly responsible for deployment during product release, configuration modification, contract settings, cloud adaptation, and visual output of real-time status during product operation, such as alarms, network condition monitoring, and node device health monitoring. In addition, the platform product service layer provides the basic capabilities and implementation framework of typical applications, and developers can implement business logic on the blockchain by combining these basic capabilities with the characteristics of their business. The application service layer provides blockchain-based application services for business participants to use.

Referring to fig. 3b, fig. 3b is a schematic structural diagram of a blockchain network provided in this embodiment of the present application. As shown in fig. 3b, the blockchain network includes at least one data provider 301, at least one first blockchain node device 302, and at least one second blockchain node device 303. The data provider 301 may be a client running in a terminal device. The first blockchain node device 302 may be any one of the blockchain node devices, and is the node device elected by all blockchain node devices in the blockchain network according to a consensus algorithm, where the consensus algorithm includes, but is not limited to, a Proof of Work (PoW) algorithm, a Proof of Stake (PoS) algorithm, a Delegated Proof of Stake (DPoS) algorithm, a Practical Byzantine Fault Tolerance (PBFT) algorithm, and the like. The second blockchain node devices 303 are the blockchain node devices in the blockchain network other than the first blockchain node device. The first blockchain node device can be elected periodically through the consensus algorithm, and the first blockchain node devices elected in different periods may be the same or different.

S202, acquiring the historical rights and interests of each data provider.

In practical applications, the historical rights of each data provider can be determined according to the comprehensive rights acquired by the data provider in all historical data providing periods before the target data providing period, where the comprehensive rights refer to the rights obtained by the data provider in any one historical data providing period. For example, if the target data providing period is the 3rd data providing period, the historical rights of data provider A may be determined according to all the comprehensive rights acquired by data provider A in the 1st and 2nd data providing periods (2 comprehensive rights, one for each data providing period). Specifically, assume that in the 1st data providing period the comprehensive rights obtained by data provider A through the data a and data b it provided are comprehensive rights A, and that in the 2nd data providing period the comprehensive rights obtained by data provider A through the data a, data b, and data c it provided are comprehensive rights B; then, in the 3rd data providing period, the historical rights of data provider A can be obtained according to comprehensive rights A and comprehensive rights B. Illustratively, the data platform may construct the comprehensive rights (also called a traffic rights proof) based on PoS (Proof of Stake), and the source of the rights proof on the blockchain is the aggregate of the profits of the generated blocks (i.e., the aggregate of the comprehensive rights of the respective historical data providing periods). It can be understood that the more rights a data provider holds, the easier it becomes to generate blocks, which in turn brings more rights to the data provider.
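As a minimal sketch of the relationship described above, the historical rights can be computed by aggregating the comprehensive rights of all periods before the target period. Plain summation, the function name `historical_rights`, and the numeric values (in percentage points) are assumptions for illustration; the embodiment only states that the historical rights are obtained according to the comprehensive rights of the historical periods.

```python
def historical_rights(comprehensive_by_period: dict, target_period: int) -> float:
    """Aggregate the comprehensive rights of all periods before target_period.

    comprehensive_by_period maps a period number to the comprehensive rights
    earned in that period (a value may be negative if rights were deducted).
    """
    return sum(
        rights
        for period, rights in comprehensive_by_period.items()
        if period < target_period
    )


# Hypothetical values: rights A = 2.0 in period 1, rights B = 3.0 in period 2.
provider_a = {1: 2.0, 2: 3.0, 3: 1.5}
print(historical_rights(provider_a, target_period=3))  # 5.0
```

Only periods 1 and 2 contribute when the target period is the 3rd one; the rights earned in period 3 itself are excluded.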

In one embodiment, when target alternative data replaces the original target data, the data platform may further deduct, from the one or more target data providers that provided the original target data, the sum of the traffic rights they obtained for that target data in all historical data providing periods before the data providing period in which the target alternative data becomes the target data, and allocate the deducted sum to the data providers that provided the target alternative data. It will be appreciated that the traffic rights of a data provider in the target data providing period may therefore be negative, i.e., the traffic rights associated with the original target data are deducted from the data provider in the target data providing period. For example, in fig. 3c, the data platform may deduct the sum of the traffic rights obtained by data provider A in data providing periods 1 to 3 and, in data providing period 4, distribute that sum to data providers B, C, and D, for example evenly.
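The deduction and redistribution in the fig. 3c example can be sketched as follows. The data layout, the function name `reassign_traffic_rights`, the even split, and the assumption that data provider A earned 1.0 (percentage point) per period are all hypothetical and serve only to make the arithmetic concrete.

```python
def reassign_traffic_rights(earned, old_providers, new_providers, period):
    """Move past traffic rights for one data item from old to new providers.

    earned maps provider -> {period: traffic rights earned for the data item}.
    The sum earned by each old provider before `period` is booked as a negative
    entry in `period`, and the total is split evenly among the new providers.
    """
    deducted = 0.0
    for provider in old_providers:
        past = sum(v for k, v in earned[provider].items() if k < period)
        earned[provider][period] = earned[provider].get(period, 0.0) - past
        deducted += past
    share = deducted / len(new_providers)
    for provider in new_providers:
        per_period = earned.setdefault(provider, {})
        per_period[period] = per_period.get(period, 0.0) + share
    return earned


earned = {"A": {1: 1.0, 2: 1.0, 3: 1.0}}  # assumed values for periods 1 to 3
reassign_traffic_rights(earned, ["A"], ["B", "C", "D"], period=4)
print(earned["A"][4], earned["B"][4])  # -3.0 1.0
```

Data provider A ends period 4 with a negative traffic-rights entry, which matches the remark that traffic rights in the target data providing period may be negative.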

Based on this, it can be understood that the historical rights of a data provider do not necessarily increase as the number of historical data providing periods increases (i.e., the historical rights of the data provider in the Nth historical data providing period, N being a positive integer, are not necessarily greater than those in the (N-1)th historical data providing period). For example, if the alternative data provided by the data provider in the Nth data providing period is sent to the data access party by the data platform as target data in a certain basic data set, the historical rights of the data provider increase (i.e., the historical rights of the (N-1)th historical data providing period are less than those of the Nth historical data providing period). Correspondingly, if the alternative data provided by the data provider in the Nth data providing period is erroneous data but was nevertheless sent to the data access party by the data platform as target data in a certain basic data set, the historical rights of the data provider decrease (i.e., the historical rights of the (N-1)th historical data providing period are greater than those of the Nth historical data providing period). If none of the alternative data provided by the data provider in the Nth to (N+2)th data providing periods is sent to the data access party by the data platform as target data in a basic data set, the historical rights of the data provider can remain unchanged (i.e., the historical rights of the (N+2)th data providing period can be the same as those of the Nth data providing period).

In one embodiment, the data provider may also be a data access party. Specifically, if, after the data provider accesses the data platform, the same data as that in the data platform is used in the services of the data provider, the data provider is considered to use the basic data set of the data platform, and the data provider is therefore a data access party at the same time. For example, assume that the data platform holds the school address of school X, that data provider A provides the telephone number of school X to the data platform, and that data provider A runs a school information query service in which the school address of school X is used; then the "school address of school X" used by data provider A is considered to come from the data platform, and data provider A is also considered to be a data access party of the data platform. As can be seen from the foregoing, the comprehensive rights of the data provider (or of the data access party acting as a data provider) in the target data providing period refer to the rights gained by the data provider during the target data providing period. When the data provider is also a data access party, the rights obtained by the data provider in the target data providing period (i.e., the comprehensive rights) may include traffic rights (also called rights for producing basic data sets) and consumption rights (also called rights for consuming basic data sets). It will further be appreciated that the historical rights of the data provider may be derived from the traffic rights of the data provider in all historical data providing periods and the consumption rights in all historical data providing periods.

The traffic rights are positively correlated with the amount of the target data provided by the data provider that is consumed by the data access parties (here, all data access parties in the data platform; since a data provider may also be a data access party, these include such data providers). That is, the more the data access parties consume the target data provided by the data provider, the greater the traffic rights of the data provider. The consumption amount can be understood as the traffic generated by all data access parties when consuming (for example, accessing or downloading) the target data provided by the data provider; illustratively, the consumption amount may be the access amount, the number of daily active users, the number of unique visitors, and the like. Further, when the consumption amount is the access amount, the traffic rights of the data provider can be understood as the benefit that accrues to the data provider when the data access parties access all the target data provided by the data provider; that is, the traffic rights of the data provider can be determined according to the access amount generated by all data access parties accessing all the target data provided by the data provider. For example, assume that a data provider provides a data value a for data item A, that 3 of the data access parties of the data platform access the data value a, and that the total number of accesses is 100; the consumption amount of the data value a is then 100, and the data platform can determine the traffic rights of the data provider that provided the data value a according to this access amount of 100.

Further, the above-mentioned consumption rights may refer to the benefit that accrues to the data access party (or to the data provider acting as a data access party) from consuming data in the data platform, and the consumption rights are positively correlated with the amount of target data consumed by the data access party; that is, the higher the consumption amount of the target data by the data access party, the greater the consumption rights of the data access party. For example, when the consumption amount is the access amount, the higher the access amount of the data access party to the respective target data in the data platform, the greater the consumption rights of the data access party. It should be noted that "the respective target data in the data platform" mentioned here refers to all data in the data platform that can be queried by a data access party.

In yet another embodiment, if the data provider does not consume any basic data set in the data platform, the comprehensive rights of the data provider can consist of the traffic rights alone, and the historical rights of the data provider can be derived from the traffic rights of the data provider in all historical data providing periods. In another embodiment, if the data access party does not provide any data for the basic data sets of the data platform, the comprehensive rights of the data access party can consist of the consumption rights alone, and the historical rights of the data access party can be obtained according to the consumption rights of the data access party in all historical data providing periods.

S203, determining the confidence of each candidate data in one or more candidate data, and taking the candidate data with the highest confidence as target data.

The target data can be understood as the data value that is sent to a data access party for the data access party to query; alternatively, it can be understood as the data value with the highest confidence among the candidate data. For example, if the candidate data corresponding to the job title information of doctor A includes "associate chief physician" and "chief physician", and the confidence of "associate chief physician" is higher than that of "chief physician", then the target data is "associate chief physician", and each data access party that acquires the job title information of doctor A from the data platform obtains "associate chief physician". The confidence of a candidate data may be determined based on the historical rights of each data provider providing the candidate data, and is positively correlated with those historical rights; for example, the confidence of the candidate data may be the result of a weighted summation of the historical rights of the data providers providing the candidate data.
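As a non-authoritative sketch of the selection in S203, the confidence of each candidate can be computed as a weighted sum of the historical rights of its providers, and the candidate with the highest confidence taken as the target data. The function names, the default unit weights, and the numeric historical rights (in percentage points) are assumptions for illustration.

```python
def confidence(provider_rights, weights=None):
    """Weighted sum of the historical rights of the providers of one candidate."""
    if weights is None:
        weights = [1.0] * len(provider_rights)  # assumed default: equal weights
    return sum(w * r for w, r in zip(weights, provider_rights))


def select_target(candidates):
    """Return the candidate value with the highest confidence.

    candidates maps a candidate value to the list of historical rights of the
    data providers that supplied it. On a tie, the first candidate wins.
    """
    return max(candidates, key=lambda value: confidence(candidates[value]))


candidates = {
    "associate chief physician": [5.0],   # one provider with 5% historical rights
    "chief physician": [2.0, 2.0],        # two providers with 2% each
}
print(select_target(candidates))  # associate chief physician
```

With equal weights the comparison is 5.0 against 4.0, so "associate chief physician" is selected; different weights could change the outcome, which is why the weights are exposed as a parameter.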

In one embodiment, the data platform may also store each alternative data that is not determined as target data, without allowing the data access party to use it. If, in any data providing period after the target data providing period, the confidence of a target alternative data (the target alternative data being any one of the alternative data not determined as target data) becomes higher than that of the target data, the data platform may use the target alternative data as the new target data for the data access party to use. Specifically, if the data platform acquires target alternative data provided by data providers for the target data item in any data providing period after the target data providing period, and there are multiple data providers providing the target alternative data, the data platform may acquire the historical rights of each such data provider in the data providing period in which it provided the target alternative data, and determine the confidence of the target alternative data according to the historical rights of each data provider providing the target alternative data.

For example, as shown in fig. 3c, in data providing period 1, "associate chief physician" is determined as the target data for doctor A, and in data providing periods 2, 3, and 4 after data providing period 1, data providers B, C, and D respectively provide the target alternative data "chief physician" for doctor A. The confidence of "chief physician" can then be determined according to the 2% historical rights of data provider B in data providing period 2, the 2% historical rights of data provider C in data providing period 3, and the 2% historical rights of data provider D in data providing period 4; for example, the data platform may determine the confidence based on the sum of the historical rights of the three data providers (i.e., 6%).

Further, if the confidence of the target alternative data is greater than that of the target data, the target alternative data is used as the target data. Specifically, as shown in fig. 3c, in data providing period 4, the sum of the historical rights of the data providers providing the target alternative data "chief physician" is greater than the historical rights of data provider A, which provided the target data "associate chief physician". Since the confidence of an alternative data is positively correlated with the historical rights of the data providers providing it, the confidence of the target alternative data "chief physician" in data providing period 4 is greater than that of the original target data "associate chief physician", and the data platform therefore determines "chief physician" as the new target data.
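The period-by-period re-evaluation in the fig. 3c example can be replayed with the sketch below. It assumes that confidence is the plain sum of the historical rights of all providers that have supplied a value so far, and that data provider A holds 5% historical rights; the source gives only the 2% figures for B, C, and D, so the 5% value and the names `RIGHTS`, `SUBMISSIONS`, and `replay` are hypothetical.

```python
# Historical rights in percentage points; the 5.0 for provider A is an assumption.
RIGHTS = {"A": 5.0, "B": 2.0, "C": 2.0, "D": 2.0}

# (period, provider, value submitted for the job title of doctor A), in period order.
SUBMISSIONS = [
    (1, "A", "associate chief physician"),
    (2, "B", "chief physician"),
    (3, "C", "chief physician"),
    (4, "D", "chief physician"),
]


def replay(submissions, rights):
    """Return the target data in force after each period."""
    support = {}   # value -> set of providers that have supplied it so far
    target = None
    history = {}
    for period, provider, value in submissions:
        support.setdefault(value, set()).add(provider)
        conf = {v: sum(rights[p] for p in ps) for v, ps in support.items()}
        # Switch only when the newly supported value is strictly more confident.
        if target is None or conf[value] > conf[target]:
            target = value
        history[period] = target
    return history


for period, target in replay(SUBMISSIONS, RIGHTS).items():
    print(period, target)
```

Under these assumptions the target stays "associate chief physician" through periods 1 to 3 (confidence 2.0 and then 4.0 against 5.0) and switches to "chief physician" in period 4, when the accumulated confidence reaches 6.0.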

And S204, determining a basic data set according to the target data, and sending the basic data set to the data access party.

After the data platform determines the target data, a basic data set can be determined according to the target data and stored in the data platform, and all data access parties are allowed to access or download the basic data set in the data platform. In addition, the data platform can charge the data access party according to its consumption of the basic data sets in the data platform, and distribute the fees charged to the target data providers. The more data a data access party consumes, the more value it contributes to the data platform; the more value it contributes, the greater its consumption rights, which in turn increases its historical rights.

After a plurality of alternative data for the same data item are obtained, the target data is determined by obtaining the historical rights of the data providers providing the alternative data. The historical rights of each data provider are positively correlated with the consumption amount of the target data it provided in the historical data providing periods: a high consumption amount means that the target data is accessed heavily, and a larger access amount brings greater traffic rights to the data provider. Each data provider is therefore motivated to provide as much accurate data as possible in order to increase its historical rights, so that the data platform can obtain a more comprehensive basic data set more efficiently.

Referring to fig. 4, fig. 4 is a schematic flowchart of a data processing method provided in an embodiment of the present application, where the data processing method can be executed by the aforementioned data platform, and as shown in fig. 4, the method includes:

S401, if a query request for the basic data set from the data access party is received, acquiring the alternative data provided, in a target data providing period, by each of one or more data providers for a target data item in the basic data set, to obtain one or more alternative data.

In one embodiment, for a target data item, different data providers may provide the same alternative data within the target data provision period, i.e., the same alternative data may be provided by one or more data providers within the target data provision period. For example: department names in department information in the medical data platform, such as dermatology, may be provided by one or more data providers within the target data provision period.

S402, acquiring the historical rights and interests of each data provider.

As can be seen from the description of step S202, the historical rights of each data provider are related to the comprehensive rights of the data provider in all historical data providing periods, and the comprehensive rights of the data provider in the target data providing period can be obtained according to the consumption rights and the traffic rights of the data provider in the target data providing period. Based on this, in one embodiment, the traffic rights may be determined as follows. If the target data was not provided in any historical data providing period (i.e., the target data appears for the first time in the target data providing period), then, when obtaining the comprehensive rights of the data providers in the target data providing period, the data platform can determine the one or more target data providers that provided the target data in the target data providing period. If there is exactly one target data provider, the data platform can take the consumption amount of the target data by all data access parties as the consumption amount attributed to that target data provider, and obtain the traffic rights of the target data provider based on that consumption amount. For example, assume that the target data providing period is the 3rd data providing period, so that the historical data providing periods are the 1st and 2nd data providing periods. If the target data "doctor A" exists in the 3rd data providing period, and only data provider A provided the data "doctor A" across the 1st, 2nd, and 3rd data providing periods, the consumption amount corresponding to "doctor A" may be taken as consumption amount of data provider A in the 3rd data providing period, and the traffic rights corresponding to that consumption amount may be taken as part of the traffic rights of data provider A.

Further, if the target data provider provides one or more target data in the target data providing period, the data platform may determine the traffic rights of the target data provider as follows: the data platform acquires the one or more target data of the target data provider in the target data providing period, and acquires the traffic rights corresponding to each of the one or more target data, so as to obtain the traffic rights of the target data provider in the target data providing period. In a specific implementation, the data platform may obtain the traffic rights corresponding to a target data as follows: the data platform acquires the weight value of the target data item corresponding to the target data and the consumption amount of the basic data set in which the target data is located; the data platform then determines the consumption amount of the target data based on that weight value and that consumption amount, obtains from it the traffic rights corresponding to each target data provided by the target data provider, and takes the sum of the traffic rights corresponding to all the target data as the traffic rights of the target data provider in the target data providing period.

The basic data set may include one or more data items, and the sum of the weight values of the one or more data items is 1. For example, if the basic data set of the target data "XX college" is "college information", the "college information" may include three data items, namely college name, college head count, and college address, and the sum of the weight values of these three data items is 1. For example, when the basic data set includes m data items (m ≥ 1, and m is an integer), the sum of the weight values of the data items in the basic data set can be as shown in formula 1:

$$\sum_{i=1}^{m} F_i = 1 \qquad \text{(formula 1)}$$

where $F_i$ represents the weight value of the i-th data item and i is an integer. The weight value of each data item in the college information can be allocated by the data platform, as shown in table 1:

TABLE 1

College information | Weight value
College name | F = 20%
College address | F = 50%
College head count | F = 30%

In a specific embodiment, assuming that the data items of j (j is a positive integer) target data among the one or more target data belong to the same basic data set, where different target data correspond to different data items, the data platform may obtain the sum of the weight values corresponding to the j target data included in the basic data set by the method shown in formula 2:

$$R' = \sum_{i=1}^{j} F_i \qquad \text{(formula 2)}$$

where $R'$ represents the sum of the weight values of the j target data in the basic data set, $F_i$ represents the weight value of the target data item corresponding to the i-th target data among the j target data, and i is an integer less than or equal to j. For example, the weight value of each target data item may be determined by the data platform based on the basic data set in which the target data item is located.

It is understood that the data provider provides the data platform with the alternative data, and does not necessarily bring traffic interest, and the traffic interest may be generated after the alternative data is used by the data access party, that is: assuming that target data corresponding to the data item is from the data provider A, the consumption amount corresponding to the target data and the traffic rights brought by the consumption amount belong to the data provider A; assuming that the data provider a provides alternative data of the target data item, but the alternative data is not selected as the target data for the data access party to use, the data provider a does not have the traffic rights brought by the target data corresponding to the target data item. For example, if data provider a provides alternative data a for the data item of the school name, data provider B provides alternative data B for the data item of the school name, and alternative data a is used as the target data of the school name, then data provider a will have the traffic interest generated when the data access side queries the school name, and data provider B will not have the traffic interest generated when the data access side queries the school name.

For example, the data platform may determine the consumption amount of j target data included in each base data set by the method shown in equation 3:

$$R = R' \times pv \qquad \text{(formula 3)}$$

where $R$ represents the consumption amount of the j target data in the basic data set, $R'$ represents the sum of the weight values of the j target data in the basic data set, and $pv$ represents the consumption amount corresponding to the basic data set. Illustratively, $pv$ may represent the total consumption amount corresponding to the target data of every data item in the basic data set.
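As a worked illustration of formulas 2 and 3 using the weight values of table 1, the sketch below computes the weight sum R' for the data items whose target data a provider supplied, and then the consumption amount R attributed to that provider for one basic data set. The function names, the choice of items, and the consumption figure of 1000 are assumptions; exact fractions are used only to avoid floating-point rounding in the example.

```python
from fractions import Fraction

# Weight values from table 1; they sum to 1 as required by formula 1.
WEIGHTS = {
    "college name": Fraction(20, 100),
    "college address": Fraction(50, 100),
    "college head count": Fraction(30, 100),
}


def weight_sum(provided_items, weights):
    """Formula 2: R' is the sum of the weights of the provider's target data items."""
    return sum(weights[item] for item in provided_items)


def consumption_amount(provided_items, weights, pv):
    """Formula 3: R = R' x pv, where pv is the consumption of the basic data set."""
    return weight_sum(provided_items, weights) * pv


items = ["college name", "college address"]   # target data supplied by one provider
print(weight_sum(items, WEIGHTS))                # 7/10
print(consumption_amount(items, WEIGHTS, 1000))  # 700
```

A provider whose data was chosen as target data for the college name and college address is thus credited with 70% of the 1000 accesses to the "college information" basic data set.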

Further, assuming that the data items of the one or more target data are data items of basic data sets included in the same group of basic data sets, where the group of basic data sets includes one or more basic data sets, the data platform may calculate the consumption amount of all target data provided by the target data provider, as shown in equation 4:

$$T = \sum_{i=1}^{a} R_i \qquad \text{(formula 4)}$$

where $a$ denotes the number of the one or more basic data sets, i denotes the i-th basic data set, a ≥ i, and a and i are integers; $R_i$ represents the consumption amount of the target data in the i-th basic data set; and $T$ represents the sum of the consumption amounts of all target data included in a group of basic data sets.

Further, assuming that the one or more target data exists in one or more sets of base data, the data platform may calculate the traffic rights of the target data provider based on the method shown in equation 5:

where S1 represents the traffic rights of the target data provider in the target data providing period, PV represents the sum of the consumption amounts of all basic data sets in the data platform, $T_i$ represents the sum of the consumption amounts corresponding to all target data included in the i-th group of basic data sets, and b is the number of the one or more groups of basic data sets, b being a positive integer. Further, after the data platform determines the traffic rights of the target data provider, the data platform may determine, according to the traffic rights, the comprehensive rights of the target data provider in the target data providing period; it can be understood that the comprehensive rights of the target data provider in the target data providing period are positively correlated with its traffic rights in that period. Illustratively, when the target data provider does not consume any basic data set in the data platform, the data platform may directly take the traffic rights as the comprehensive rights of the target data provider.
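The following sketch illustrates how the quantities defined for formulas 4 and 5 could fit together. The body of formula 5 is not reproduced in this text, so the ratio used here, S1 = (T_1 + ... + T_b) / PV, is an assumption inferred from the symbol definitions and from the percentage-valued traffic rights in the examples; the function names and numbers are likewise hypothetical.

```python
def group_consumption(r_values):
    """Formula 4: T is the sum of R_i over the basic data sets of one group."""
    return sum(r_values)


def traffic_rights(groups, platform_pv):
    """Assumed form of formula 5: S1 = (sum of T_i over the b groups) / PV.

    groups is a list of b groups, each a list of R_i values; platform_pv is the
    total consumption amount of all basic data sets in the data platform.
    """
    if platform_pv <= 0:
        raise ValueError("platform_pv must be positive")
    return sum(group_consumption(g) for g in groups) / platform_pv


# Hypothetical provider with target data in two groups of basic data sets.
groups = [[700, 300], [500]]
print(traffic_rights(groups, platform_pv=150000))  # 0.01, i.e. 1%
```

Under this assumed ratio, the traffic rights of all providers sum to at most 1, which is consistent with expressing them as percentages.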

In another embodiment, the target data provider is also a data access party at the same time, that is, the target data provider also uses the target data in the data platform. Considering that the data access parties are the main source of commercial value for the data platform, consumption rights can be allocated to the data access party (or to the data provider acting as a data access party), so that the data access party has a certain say in deciding the accuracy of the data. Specifically, the data platform may determine the consumption rights of the target data provider based on the amount of target data consumed by the target data provider; for example, the data platform may determine the consumption rights of the target data provider in the manner shown in formula 6:

where V represents the fee paid by the data access party to the data platform for consuming data in the data platform, p represents the number of data access parties in the data platform, and S2 represents the consumption rights of the data access party; it can be understood that when the data access party is also a data provider, S2 also represents the consumption rights of that data provider (e.g., the target data provider). Further, after obtaining the consumption rights of the target data provider, the data platform may determine the comprehensive rights of the target data provider in the target data providing period according to the traffic rights and the consumption rights; for example, when the target data provider is also a data access party, the data platform may determine the comprehensive rights of the target data provider by the method shown in formula 7.

$$S_i = x \times S1 + y \times S2 \qquad \text{(formula 7)}$$

where x + y = 1, and x and y are the weight of the traffic rights and the weight of the consumption rights, respectively. The data platform has the right to adjust the proportion of x and y; illustratively, in the early stage of the data platform, when there are few basic data sets, x can be increased appropriately, and in the later stage of the data platform, when there are more basic data sets, y can be increased appropriately. In practical applications, the data platform can distribute the value it generated in a data providing period to each data provider according to the data providing period (also called the operation period of the block) and the operating condition of the data platform itself. From the process of constructing the rights proof described above, it can be seen that traffic rights are positively correlated with the provision of correct alternative data: if the data is wrong, no one accesses it, so it brings no rights (such as an increase in traffic rights or historical rights) to the provider of the wrong alternative data.
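Formula 7 can be illustrated with the short sketch below. The function name `comprehensive_rights` and the sample values of S1, S2, and x are assumptions chosen for illustration; the constraint x + y = 1 is taken from the text, and y is derived from x.

```python
def comprehensive_rights(s1: float, s2: float, x: float) -> float:
    """Formula 7: combine traffic rights s1 and consumption rights s2.

    x is the weight of the traffic rights; the weight of the consumption
    rights is y = 1 - x, so that x + y = 1.
    """
    if not 0.0 <= x <= 1.0:
        raise ValueError("x must lie in [0, 1]")
    y = 1.0 - x
    return x * s1 + y * s2


s1, s2 = 1.0, 3.0  # hypothetical traffic and consumption rights, in percent

# Early-stage platform: few basic data sets, so production is weighted more.
print(comprehensive_rights(s1, s2, x=0.75))  # 1.5
# Later-stage platform: more basic data sets, so consumption is weighted more.
print(comprehensive_rights(s1, s2, x=0.25))  # 2.5
```

Setting x = 1 reproduces the case in which a provider that consumes no basic data set has its traffic rights taken directly as its comprehensive rights.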

In yet another embodiment, if the data platform determines that there are multiple target data providers providing the target data, and the target data was not provided in any historical data providing period, the data platform may first determine the traffic rights corresponding to the target data based on the amount of the target data consumed by the data access parties, and then determine the traffic rights of each of the target data providers in the target data providing period according to those traffic rights and the number of target data providers. For example, assuming that 2 data providers provide the target data a in the target data providing period, the data platform may first determine the consumption amount of the target data a, determine the traffic rights brought by the target data a based on that consumption amount (assumed to be b), and then allocate the traffic rights b to the 2 data providers according to the number of data providers (i.e., 2). For example, the data platform may divide the traffic rights equally among the multiple target data providers. Table 2 shows the traffic rights that the data platform allocates to the data providers after acquiring the same alternative data from different data providers in the target data providing period.

TABLE 2

Data provider | Traffic interest | Alternative data provided
P1            | S1 = 1%          | Zhang San, title: chief physician
P2            | S1 = 1%          | Zhang San, title: chief physician
P3            | S1 = 1%          | Zhang San, title: chief physician
P4            | S1 = 1%          | Zhang San, title: chief physician
P5            | S1 = 1%          | Zhang San, title: chief physician

It can be seen that, since data providers P1, P2, P3, P4 and P5 all provide the alternative data "Zhang San, title: chief physician", the data platform can distribute the traffic interest (5%) generated after "Zhang San, title: chief physician" is taken as the target data evenly among these 5 data providers, 1% each.

Based on the above description, if exactly one target data provider provides the target data in the target data providing period, and no data provider provided the target data in any historical data providing period before the target data providing period, the traffic interest corresponding to the consumption of the target data belongs exclusively to that one target data provider. For example, if there are N data providing periods and only data provider A provides alternative data 1 in those N periods, then after the data platform determines alternative data 1 as the target data, the traffic interest generated by every data access party accessing the target data is attributed to data provider A alone. Correspondingly, if one data provider provides the target data in the target data providing period and other data providers provided the same target data in the historical data providing periods before it, the traffic interest generated by the consumption of the target data can be allocated among all of these data providers. For example, if 1 data provider provides target data a in the target data providing period and 2 data providers provided target data a in the historical data providing periods, target data a has 3 target data providers, and the traffic interest corresponding to the consumption of target data a may, for example, be divided equally among them.
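The allocation cases above can be sketched as follows. This is only an illustration under the equal-split option mentioned in the text; the function name `split_traffic_interest` and the provider labels are hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def split_traffic_interest(total, current_providers, historical_providers=()):
    """Evenly split the traffic interest generated by one piece of target data.

    The target data providers are the union of the providers that supplied
    the value in the target period and those that supplied it in earlier
    periods. Equal division is one option named in the text, not the only one.
    """
    target_providers = sorted(set(current_providers) | set(historical_providers))
    if not target_providers:
        return {}
    share = Fraction(total) / len(target_providers)
    return {provider: share for provider in target_providers}

# Table 2: five providers share a 5% traffic interest, 1% each.
table_2 = split_traffic_interest(Fraction(5, 100), ["P1", "P2", "P3", "P4", "P5"])

# One provider in the target period plus two historical providers:
# three target data providers, each receiving one third of the total.
three_way = split_traffic_interest(Fraction(3, 100), ["A"], ["B", "C"])

# A single provider with no historical providers keeps everything.
exclusive = split_traffic_interest(Fraction(3, 100), ["A"])
```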

S403, one or more target data providers which provide each piece of alternative data are determined.

In one embodiment, if the target data providing period is the Nth data providing period, where N ≥ 2 and N is an integer, the data platform can acquire the one or more data providers that provided each alternative data in each of the first N-1 historical data providing periods. For example, assuming that N is 3, the target data item is the address of school A, and two addresses, "B Street No. 1" and "B Street No. 5", were provided for school A in the first 2 historical data providing periods, the data platform may obtain the data providers that provided "B Street No. 1" and the data providers that provided "B Street No. 5".

Further, the data platform may take all data providers acquired in the first N-1 data providing periods that provided a given piece of alternative data as the one or more target data providers of that alternative data. For example, assuming that the data providers that provided "B Street No. 1" in the above example are data provider 1 and data provider 2, data provider 1 and data provider 2 are the target data providers of the alternative data "B Street No. 1".
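A minimal sketch of step S403 follows, assuming each historical period is stored as a mapping from provider to the value it supplied. The function name `target_providers_by_value`, the storage layout and `provider_3` are hypothetical.

```python
from collections import defaultdict

def target_providers_by_value(history):
    """Map each alternative value to every provider that supplied it.

    `history` holds one dict per historical period (periods 1..N-1),
    each mapping a provider id to the value supplied in that period.
    """
    providers = defaultdict(set)
    for period in history:
        for provider, value in period.items():
            providers[value].add(provider)
    return dict(providers)

# N = 3, so the first two periods are scanned for the address of school A.
history = [
    {"provider_1": "B Street No. 1"},                                  # period 1
    {"provider_2": "B Street No. 1", "provider_3": "B Street No. 5"},  # period 2
]
result = target_providers_by_value(history)
```

Here `result["B Street No. 1"]` contains data provider 1 and data provider 2, matching the example in the text.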

S404, acquiring the historical rights of each target data provider in the one or more target data providers to obtain one or more historical rights.

S405, determining the confidence of each candidate data according to one or more historical rights, and taking the candidate data with the highest confidence as target data.

In one embodiment, the data platform may perform a weighted summation of the historical interests of the data providers that provide a piece of alternative data to obtain the confidence of that alternative data. For example, Table 3 shows a data conflict that the data platform may face when it acquires multiple alternative data for the same data item in the target data providing period and all historical data providing periods.

TABLE 3

As can be seen, the historical interest of P5 (10%) is greater than the sum of the historical interests of P1, P2, P3 and P4 (4%), so the data platform can determine that the confidence of "Zhang San, title: associate chief physician" is higher than the confidence of "Zhang San, title: chief physician".

With continued reference to table 4, table 4 also shows a situation that the data platform may face when acquiring multiple candidate data for the same data item in the target data providing period and all historical data providing periods.

TABLE 4

Data provider | Historical interest | Alternative data provided
P1            | S = 1%              | Zhang San, title: chief physician
P2            | S = 1%              | Zhang San, title: chief physician
P3            | S = 1%              | Zhang San, title: chief physician
P4            | S = 1%              | Zhang San, title: chief physician
P5            | S = 3%              | Zhang San, title: associate chief physician

As can be seen, P5 has the largest historical interest (3%) of any single data provider, but it is less than the sum (4%) of the historical interests of P1, P2, P3 and P4, so the data platform can determine that the confidence of "Zhang San, title: chief physician" is higher than that of "Zhang San, title: associate chief physician". Based on Table 3 and Table 4, when the same data item corresponds to several different alternative data, the data platform may take the alternative data whose providers have the greatest total historical interest as the target data, thereby resolving the data conflicts shown in Table 3 and Table 4. As this resolution shows, the traffic interest is rooted in traffic; resolving data conflicts on the basis of the proof of traffic interest gives the data platform an inherent ability to settle data conflicts, which makes it easier to open the platform to the outside on the internet.
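The conflict resolution of steps S404 and S405 can be sketched as below. The sketch assumes that every weight in the weighted summation equals 1, so that confidence is simply the sum of the providers' historical interests; the function name `pick_target_data` is hypothetical, and the behavior on ties (first value encountered wins) is an arbitrary choice, since the text does not address ties.

```python
from collections import defaultdict
from fractions import Fraction

def pick_target_data(submissions, historical_interest):
    """Return (target value, confidence per value).

    Confidence of a value = sum of the historical interests of the
    providers that supplied it (all weights assumed equal to 1).
    """
    confidence = defaultdict(Fraction)
    for provider, value in submissions.items():
        confidence[value] += historical_interest[provider]
    target = max(confidence, key=confidence.get)
    return target, dict(confidence)

submissions = {
    "P1": "chief physician", "P2": "chief physician",
    "P3": "chief physician", "P4": "chief physician",
    "P5": "associate chief physician",
}
pct = lambda n: Fraction(n, 100)

# Table 4: P5 holds 3%, less than the 4% held jointly by P1..P4.
table_4 = {"P1": pct(1), "P2": pct(1), "P3": pct(1), "P4": pct(1), "P5": pct(3)}
target_4, confidence_4 = pick_target_data(submissions, table_4)

# Table 3 scenario: P5 holds 10%. The 1% per provider for P1..P4 is an
# assumption; the text only states that their interests sum to 4%.
table_3 = {"P1": pct(1), "P2": pct(1), "P3": pct(1), "P4": pct(1), "P5": pct(10)}
target_3, confidence_3 = pick_target_data(submissions, table_3)
```

The sketch makes the point of the two tables explicit: the outcome depends on accumulated interest per value, not on the number of providers or on the single largest holder.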

Further, suppose the data platform acquires the target data from a data provider in an Mth data providing period and the target data remains unchanged in the Mth data providing period, where M is greater than N and M is an integer. The data platform may then determine the traffic interest of the one or more target data providers in the Mth data providing period based on the data access party's consumption of the target data in that period, while the traffic interest for the target data of the data provider that provides the target data in the Mth data providing period is zero. That is, for the same alternative data of the same target data item, the target data providers are the data providers that provided the alternative data in the target data providing period in which it was adopted as the target data or in any historical data providing period before it; a data provider that provides the alternative data in a subsequent data providing period cannot obtain the traffic interest generated by that alternative data.
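The rule for late providers can be sketched as follows, assuming an equal split among eligible providers. The function name `period_traffic_shares`, the provider labels and the period numbers are hypothetical.

```python
from fractions import Fraction

def period_traffic_shares(total, first_provided, adoption_period):
    """Split one period's traffic interest among eligible providers only.

    `first_provided` maps each provider to the period in which it first
    supplied the value. Providers that first supplied it after the
    adoption period (the Nth period) receive zero.
    """
    eligible = [p for p, period in first_provided.items() if period <= adoption_period]
    shares = {p: Fraction(0) for p in first_provided}
    if eligible:
        share = Fraction(total) / len(eligible)
        for p in eligible:
            shares[p] = share
    return shares

# The value is adopted in period N = 4; provider E supplies it only in period M = 6.
shares = period_traffic_shares(
    Fraction(6, 100),
    {"B": 2, "C": 3, "D": 4, "E": 6},
    adoption_period=4,
)
```

Providers B, C and D each receive a third of the traffic interest of the period, while E receives nothing for data that was already adopted.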

It is to be understood that alternative data may have been provided before the target data providing period without being taken as the target data, and is determined as the target data only in the target data providing period. That the alternative data was not taken as the target data before the target data providing period does not mean that it was invalid. As shown in fig. 3a, data providers B and C provide the alternative data "chief physician". Although the title information of doctor A in the data platform was not immediately changed from "associate chief physician" to "chief physician" when B and C provided it, data provider B and data provider C can each be understood as having voted for the alternative data "chief physician", and these votes affect the determination of the target data for the title information of doctor A. For example, after data provider D votes for "chief physician" in the 4th data providing period, the confidence of "chief physician" reaches 6%, which is greater than the 5% confidence of "associate chief physician". The data platform therefore changes the target data of doctor A's title information from "associate chief physician" to "chief physician" in the 4th data providing period, and the traffic interest arising from the consumption of "chief physician" can be shared among data providers B, C and D.
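The voting behavior in this example can be sketched as an accumulation over periods. The function name `first_switch_period` is hypothetical, and so are the per-vote interests of 2% and the periods in which B and C vote; the text only states that the total reaches 6% once D votes in the 4th period.

```python
from fractions import Fraction

def first_switch_period(incumbent_confidence, votes):
    """Return the period in which the challenger value overtakes the incumbent.

    `votes` is a chronological list of (period, historical interest of the
    voting provider). Returns None if the challenger never overtakes.
    """
    challenger = Fraction(0)
    for period, interest in votes:
        challenger += interest
        if challenger > incumbent_confidence:
            return period
    return None

pct = lambda n: Fraction(n, 100)
# Votes for "chief physician" by B, C and D (assumed 2% each).
votes = [(2, pct(2)), (3, pct(2)), (4, pct(2))]
# "associate chief physician" currently has a confidence of 5%.
switch = first_switch_period(pct(5), votes)
```

The votes of B and C do not change the target data by themselves, but they are not lost: they are what allows D's vote in the 4th period to tip the balance.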

S406, determining a basic data set according to the target data, and sending the basic data set to the data access party.

For the implementation of step S406, reference may be made to the description of step S204, which is not repeated here.

Based on the description of fig. 2 and fig. 4, in one embodiment the data platform may establish a security protection mechanism to prevent a data provider from stealing the data platform's data while using the platform's basic data sets; for example, the data platform may require the data provider's services to be hosted on the data platform.

In the embodiments of the application, the greater a data provider's historical interest, the more traffic interest it can earn and thus the greater its economic benefit. Because the data providers are equal and opaque to one another, they are in a mutual gaming scenario in the data platform. Maliciously providing false data (false alternative data) is therefore against a data provider's own interest: it not only reduces the provider's interest but is also penalized by the data platform. Constructing the composite interest with a PoS (Proof of Stake) algorithm thus objectively constrains the behavior of the data providers and encourages them to provide more correct data, so the data processing method provided in the embodiments of the application can speed up the data platform's collection of basic data sets and ensure that the collected basic data sets have high confidence.

Based on the description of the above data processing method embodiments, the present application also discloses a data processing apparatus, which may be a computer program (including program code) running in the above data platform. The data processing apparatus may perform the methods shown in fig. 2 and fig. 4. Referring to fig. 5, the data processing apparatus may include at least an obtaining unit 501, a determining unit 502 and a sending unit 503.

An obtaining unit 501, configured to obtain, if a query request for a basic data set from a data access party is received, the alternative data provided by each of one or more data providers for a target data item in the basic data set in a target data providing period, to obtain one or more alternative data;

the obtaining unit 501 is further configured to obtain a historical interest of each data provider, where the historical interest is obtained according to a composite interest of all historical data providing periods of each data provider before the target data providing period;

a determining unit 502, configured to determine a confidence of each candidate data in the one or more candidate data, and use the candidate data with the highest confidence as the target data, where the confidence of each candidate data is determined according to a historical interest of each target data provider of one or more target data providers that provide the each candidate data;

a sending unit 503, configured to determine a basic data set according to the target data, and send the basic data set to the data access party.
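The division of work among the three units can be sketched as a small class. This is a toy model under stated assumptions: the class name `DataProcessingApparatus`, its method names and the in-memory storage layout are hypothetical, interests are given as whole percentages, and confidence is taken as the plain sum of historical interests.

```python
from collections import defaultdict

class DataProcessingApparatus:
    """Toy model of units 501 to 503; the storage layout is hypothetical."""

    def __init__(self, submissions, historical_interest):
        # submissions: data item -> {provider id: alternative value}
        self.submissions = submissions
        # historical_interest: provider id -> interest in percent
        self.historical_interest = historical_interest

    def obtain(self, item):
        """Obtaining unit 501: fetch alternative data and provider interests."""
        candidates = self.submissions[item]
        interests = {p: self.historical_interest[p] for p in candidates}
        return candidates, interests

    def determine(self, candidates, interests):
        """Determining unit 502: the highest-confidence value becomes target data."""
        confidence = defaultdict(int)
        for provider, value in candidates.items():
            confidence[value] += interests[provider]
        return max(confidence, key=confidence.get)

    def send(self, item, target):
        """Sending unit 503: assemble the basic data set for the access party."""
        return {item: target}

    def handle_query(self, item):
        candidates, interests = self.obtain(item)
        return self.send(item, self.determine(candidates, interests))

apparatus = DataProcessingApparatus(
    {"Zhang San title": {
        "P1": "chief physician", "P2": "chief physician",
        "P3": "chief physician", "P4": "chief physician",
        "P5": "associate chief physician",
    }},
    {"P1": 1, "P2": 1, "P3": 1, "P4": 1, "P5": 3},
)
basic_data_set = apparatus.handle_query("Zhang San title")
```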

In an embodiment, when determining the confidence of each candidate data in the one or more candidate data, the determining unit 502 is specifically configured to perform:

determining one or more target data providers that provide the each of the alternative data;

acquiring the historical rights and interests of each target data provider in the one or more target data providers to obtain one or more historical rights and interests;

and determining the confidence level of each alternative data according to the one or more historical rights.

In yet another embodiment, the target data providing period is the Nth data providing period, N ≧ 2 and N is an integer; when determining one or more target data providers that provide each piece of candidate data, the determining unit 502 is specifically configured to perform:

acquiring one or more data providers which provide each alternative data in each historical data providing period in the first N-1 historical data providing periods;

and taking all the data providers which provide each piece of alternative data and are acquired in the first N-1 data providing periods as one or more target data providers of each piece of alternative data.

In yet another embodiment, the determining unit 502 may be further configured to:

determining one or more target data providers that provide the target data;

if the target data is not provided in all historical data providing periods and the number of one or more target data providers providing the target data in the target data providing periods is one, determining the traffic interest of the target data provider providing the target data based on the consumption of the target data by the data access party, wherein the traffic interest and the consumption of the target data by the data access party are in a positive correlation trend;

and determining the comprehensive rights of the target data provider in the target data providing period according to the traffic rights, wherein the comprehensive rights of the target data provider in the target data providing period and the traffic rights show a positive correlation trend.

In another embodiment, the target data provider is the data access party, and the determining unit 502 is further configured to perform:

determining the consumption rights of the target data provider based on the consumption amount of the target data provider to the target data, wherein the consumption rights and the consumption amount of the target data provider to the target data are in a positive correlation trend;

the determining the comprehensive rights of the target data provider in the target data providing period according to the traffic rights comprises the following steps:

and determining the comprehensive rights and interests of the target data provider in the target data providing period according to the flow rights and the consumption rights.

In yet another embodiment, the determining unit 502 may be further configured to perform:

determining one or more target data providers that provide the target data;

if the target data is not provided in all historical data providing periods and the number of one or more target data providers providing the target data in the target data providing periods is multiple, determining the traffic interest of the target data provider providing the target data based on the consumption of the target data by the data access party, wherein the traffic interest and the consumption of the target data by the data access party are in a positive correlation trend;

determining a composite interest of each of the one or more target data providers during the target data provision period based on the traffic interest and the number of target data providers.

In another embodiment, the obtaining unit 501 may be further configured to:

if the target alternative data provided by the data provider for the target data item is acquired in any data providing period after the target data providing period, and the number of the data providers providing the target alternative data is multiple, acquiring the historical rights of each data provider providing the target alternative data in the data providing period providing the target alternative data;

the determining unit 502 may be further configured to perform:

determining the confidence of the target alternative data according to the historical rights of each data provider providing the target alternative data;

if the confidence degree of the target alternative data is greater than that of the target data, taking the target alternative data as the target data;

the sending unit 503 is configured to perform: and determining a basic data set according to the target data, and sending the basic data set to the data access party.

In another embodiment, the data processing apparatus further includes a deduction unit 504 and an allocation unit 505, where the deduction unit 504 is configured to perform:

deducting, from the one or more target data providers providing the target data, the sum of their traffic interests for the target data in all historical data providing periods prior to the data providing period in which the target alternative data becomes the target data;

the allocation unit 505 is configured to perform: and allocating the deducted traffic interest sum to each data provider providing the target alternative data.
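The deduction and allocation performed by units 504 and 505 can be sketched as below. The text does not state how the deducted sum is divided, so the equal split here is an assumption; the function name `reallocate_on_replacement` and the provider labels are hypothetical.

```python
from fractions import Fraction

def reallocate_on_replacement(earned, old_providers, new_providers):
    """Return the interest adjustment per provider when target data is replaced.

    `earned` maps a provider to the traffic interest it accumulated for the
    replaced target data. That sum is deducted from the old providers and
    split equally (an assumption) among the providers of the new target data.
    """
    if not new_providers:
        raise ValueError("at least one provider of the new target data is required")
    zero = Fraction(0)
    adjustments = {p: -earned.get(p, zero) for p in old_providers}
    deducted = sum((earned.get(p, zero) for p in old_providers), zero)
    share = deducted / len(new_providers)
    for p in new_providers:
        adjustments[p] = adjustments.get(p, zero) + share
    return adjustments

earned = {"old_1": Fraction(4, 100), "old_2": Fraction(2, 100)}
adjustments = reallocate_on_replacement(earned, ["old_1", "old_2"], ["B", "C", "D"])
```

The adjustments sum to zero, so the operation only transfers interest from the providers of the replaced data to the providers of the replacement.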

In another embodiment, the target data providing period is an Nth data providing period, where N is a positive integer, and the determining unit 502 may be further configured to perform:

if the target data provided by a data provider is acquired in an Mth data providing period and the target data remains unchanged in the Mth data providing period, where M is greater than N and M is an integer,

determining the traffic interest of the one or more target data providers in the Mth data providing period based on the consumption of the target data by the data access party in the Mth data providing period, wherein the traffic interest for the target data of the data provider that provides the target data in the Mth data providing period is zero.

According to an embodiment of the present application, the steps involved in the data processing methods shown in fig. 2 and fig. 4 may be performed by the units of the data processing apparatus shown in fig. 5. For example, steps S201 and S202 shown in fig. 2 may be performed by the obtaining unit 501 of the data processing apparatus shown in fig. 5; step S203 may be performed by the determining unit 502; and step S204 may be performed by the sending unit 503. As another example, steps S401, S402 and S404 shown in fig. 4 may all be performed by the obtaining unit 501 of the data processing apparatus shown in fig. 5; steps S403 and S405 may both be performed by the determining unit 502; and step S406 may be performed by the sending unit 503.

According to another embodiment of the present application, the units of the data processing apparatus shown in fig. 5 are divided based on logical functions. Some or all of the units may be combined into one or several other units, or one or more of the units may be further split into several functionally smaller units; either arrangement achieves the same operations without affecting the technical effects of the embodiments of the present application. In other embodiments of the present application, the data processing apparatus may also include other units, and in practical applications these functions may be implemented with the assistance of other units or through the cooperation of multiple units.

According to another embodiment of the present application, the data processing apparatus shown in fig. 5 may be constructed, and the data processing method of the embodiments of the present application implemented, by running a computer program (including program code) capable of executing the steps of the method shown in fig. 2 or fig. 4 on a general-purpose computing device, such as a computer that includes processing elements and storage elements such as a central processing unit (CPU), a random access memory (RAM) and a read-only memory (ROM). The computer program may be recorded on, for example, a computer storage medium, and loaded into and run on the computing device via the computer storage medium.

After multiple alternative data are obtained for the same data item, the target data is determined from the historical interests of the data providers that provided the alternative data. A data provider's historical interest is positively correlated with the consumption of the target data it provided in the historical data providing periods: high consumption means the target data is accessed heavily, and heavier access brings the data provider more traffic interest. Each data provider is therefore motivated to provide as much accurate data as possible in order to raise its historical interest, so the data platform can obtain a more comprehensive basic data set more efficiently.

Based on the description of the above method embodiments and apparatus embodiments, the present application also provides a data platform, shown in fig. 6. The data platform includes at least a processor 601, an output interface 602 and a computer storage medium 603, which may be connected by a bus or by other means.

The computer storage medium 603 is a memory device in the data platform for storing programs and data. It is understood that the computer storage medium 603 may include a built-in storage medium of the data platform and may also include an extended storage medium supported by the data platform. The computer storage medium 603 provides a storage space that stores the operating system of the data platform. The storage space also stores one or more computer programs, which may be one or more pieces of program code, adapted to be loaded and executed by the processor 601. The computer storage medium may be a high-speed RAM or a non-volatile memory, such as at least one disk memory; optionally, it may also be at least one computer storage medium located remotely from the processor. The processor 601 (or CPU) is the computing core and control core of the data platform and is adapted to load and execute the one or more computer programs so as to implement the corresponding method flows or functions.

In one embodiment, one or more computer programs stored in the computer storage medium 603 may be loaded and executed by the processor 601 to implement the corresponding method steps described above in connection with the method embodiments shown in fig. 2 and 4; in a specific implementation, one or more computer programs in the computer storage medium 603 are loaded and executed by the processor 601 as follows:

if a query request aiming at a basic data set by a data access party is received, acquiring alternative data provided by each data provider in one or more data providers aiming at a target data item in the basic data set in a target data providing period to obtain one or more alternative data;

acquiring the historical interest of each data provider, wherein the historical interest is obtained according to the comprehensive interest of all the historical data providing periods of each data provider before the target data providing period;

determining the confidence degree of each alternative data in the one or more alternative data, and using the alternative data with the highest confidence degree as target data, wherein the confidence degree of each alternative data is determined according to the historical rights and interests of each target data provider in one or more target data providers providing each alternative data;

and determining a basic data set according to the target data, and sending the basic data set to the data access party.

In an embodiment, when determining the confidence level of each candidate data in the one or more candidate data, the processor 601 is specifically configured to load and execute:

determining one or more target data providers that provide the each of the alternative data;

acquiring the historical rights and interests of each target data provider in the one or more target data providers to obtain one or more historical rights and interests;

and determining the confidence level of each alternative data according to the one or more historical rights.

In yet another embodiment, the target data providing period is the Nth data providing period, N ≧ 2 and N is an integer; when determining one or more target data providers that provide each piece of candidate data, the processor 601 is specifically configured to load and execute:

acquiring one or more data providers which provide each alternative data in each historical data providing period in the first N-1 historical data providing periods;

and taking all the data providers which provide each piece of alternative data and are acquired in the first N-1 data providing periods as one or more target data providers of each piece of alternative data.

In yet another embodiment, the processor 601 is further operable to load and execute:

determining one or more target data providers that provide the target data;

if the target data is not provided in all historical data providing periods and the number of one or more target data providers providing the target data in the target data providing periods is one, determining the traffic interest of the target data provider providing the target data based on the consumption of the target data by the data access party, wherein the traffic interest and the consumption of the target data by the data access party are in a positive correlation trend;

and determining the comprehensive rights of the target data provider in the target data providing period according to the traffic rights, wherein the comprehensive rights of the target data provider in the target data providing period and the traffic rights show a positive correlation trend.

In another embodiment, the target data provider is the data access party, and the processor 601 is further configured to load and execute:

determining the consumption rights of the target data provider based on the consumption amount of the target data provider to the target data, wherein the consumption rights and the consumption amount of the target data provider to the target data are in a positive correlation trend;

the determining the comprehensive rights of the target data provider in the target data providing period according to the traffic rights comprises the following steps:

and determining the comprehensive rights and interests of the target data provider in the target data providing period according to the flow rights and the consumption rights.

In yet another embodiment, the processor 601 may be further configured to load and execute:

determining one or more target data providers that provide the target data;

if the target data is not provided in all historical data providing periods and the number of one or more target data providers providing the target data in the target data providing periods is multiple, determining the traffic interest of the target data provider providing the target data based on the consumption of the target data by the data access party, wherein the traffic interest and the consumption of the target data by the data access party are in a positive correlation trend;

determining a composite interest of each of the one or more target data providers during the target data provision period based on the traffic interest and the number of target data providers.

In yet another embodiment, the processor 601 may be further configured to load and execute:

if the target alternative data provided by the data provider for the target data item is acquired in any data providing period after the target data providing period, and the number of the data providers providing the target alternative data is multiple, acquiring the historical rights of each data provider providing the target alternative data in the data providing period providing the target alternative data;

determining the confidence of the target alternative data according to the historical rights of each data provider providing the target alternative data;

if the confidence degree of the target alternative data is greater than that of the target data, taking the target alternative data as the target data;

the output interface 602 is configured to perform: and determining a basic data set according to the target data, and sending the basic data set to the data access party.

In yet another embodiment, the processor 601 may be further configured to load and execute:

deducting, from the one or more target data providers providing the target data, the sum of their traffic interests for the target data in all historical data providing periods prior to the data providing period in which the target alternative data becomes the target data;

and allocating the deducted traffic interest sum to each data provider providing the target alternative data.

In another embodiment, the target data providing period is an Nth data providing period, where N is a positive integer, and the processor 601 is further configured to load and execute:

if the target data provided by a data provider is acquired in an Mth data providing period and the target data remains unchanged in the Mth data providing period, where M is greater than N and M is an integer,

determining the traffic interest of the one or more target data providers in the Mth data providing period based on the consumption of the target data by the data access party in the Mth data providing period, wherein the traffic interest for the target data of the data provider that provides the target data in the Mth data providing period is zero.

After multiple alternative data are obtained for the same data item, the target data is determined from the historical interests of the data providers that provided the alternative data. A data provider's historical interest is positively correlated with the consumption of the target data it provided in the historical data providing periods: high consumption means the target data is accessed heavily, and heavier access brings the data provider more traffic interest. Each data provider is therefore motivated to provide as much accurate data as possible in order to raise its historical interest, so the data platform can obtain a more comprehensive basic data set more efficiently.

The embodiments of the present application further provide a storage medium storing a computer program of the data processing method. The computer program includes program instructions which, when loaded and executed by one or more processors, implement the data processing method described in the embodiments; that description is not repeated here, and neither is the description of the beneficial effects of the same method. It will be understood that the program instructions may be deployed to be executed on one or more devices capable of communicating with each other.

It should be noted that according to an aspect of the embodiments of the present application, there is also provided a computer program product or a computer program, which includes computer instructions, and the computer instructions are stored in a computer readable storage medium. The processor in the data platform reads the computer instructions from the computer-readable storage medium and then executes the computer instructions, thereby enabling the data platform to perform the methods provided in the various alternatives described above in connection with the data processing method embodiments shown in fig. 2 and 4.

It will be understood by those skilled in the art that all or part of the processes in the methods of the above embodiments may be implemented by a computer program, which may be stored in a computer-readable storage medium, and may include the processes of the above embodiments of the data processing method when the computer program is executed. The computer readable storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.

While the embodiments of the present invention have been described in detail, they have been presented by way of example only and not limitation. This disclosure is not limited to the particular embodiments described herein, but covers other embodiments and equivalents within the spirit and scope of the invention as defined by the appended claims.
