Data acquisition method and device, storage medium and electronic equipment

Document No.: 7603. Publication date: 2021-09-17.

1. A method of data acquisition, comprising:

acquiring incremental data of a target relational database in at least one relational database through an open source project client, wherein the incremental data is changed data in the target relational database, and the open source project client supports the at least one relational database;

and writing the acquired incremental data of the target relational database into a target data source.

2. The method of claim 1, wherein prior to acquiring the incremental data of the target relational database through the open source project client, the method further comprises:

enabling an incremental acquisition configuration function of the target relational database, wherein the incremental acquisition configuration function allows the open source project client to acquire the incremental data of the target relational database.

3. The method of claim 1, wherein prior to writing the acquired incremental data of the target relational database into the target data source, the method further comprises:

acquiring an original data table structure of the target relational database, wherein the original data table structure is the table structure of the incremental data to be acquired.

4. The method of claim 3, wherein writing the acquired incremental data of the target relational database into the target data source comprises:

converting the table structure of the acquired incremental data of the target relational database from the original data table structure into a target data table structure; and

writing the incremental data having the target data table structure into the target data source.

5. The method of claim 4, wherein converting the table structure of the acquired incremental data of the target relational database from the original data table structure into the target data table structure comprises:

acquiring a first target instruction, wherein the first target instruction instructs conversion of the original data table structure; and

in response to the first target instruction, converting the table structure of the acquired incremental data of the target relational database from the original data table structure into the target data table structure.

6. The method of claim 1, wherein writing the acquired incremental data of the target relational database into the target data source comprises:

acquiring a second target instruction, wherein the second target instruction instructs a write operation on the incremental data; and

in response to the second target instruction, writing the incremental data of the target relational database into the target data source.

7. The method of any of claims 1 to 6, wherein the type of the incremental data comprises at least one of:

an addition type;

an update type;

a deletion type.

8. A data acquisition device, comprising:

an acquisition unit configured to acquire incremental data of a target relational database in at least one relational database through an open source project client, wherein the incremental data is changed data in the target relational database, and the open source project client supports the at least one relational database; and

a writing unit configured to write the acquired incremental data of the target relational database into a target data source.

9. A storage medium, comprising a stored program, wherein the program, when executed, controls an apparatus in which the storage medium is located to perform the method of any one of claims 1 to 7.

10. An electronic device comprising a processor, at least one memory connected to the processor, and a bus;

the processor and the memory complete mutual communication through the bus;

the processor is configured to invoke program instructions in the memory to perform the method of any of claims 1 to 7.

Background

At present, both the data framework canal and the parser Open repeater are open source projects that parse MySQL binlog files. The data framework canal provides incremental data subscription and consumption based on parsing of the database's incremental log, and is a wrapper built on binlog parsing. It supports incremental acquisition only for the relational database MySQL, not for other relational databases, and it does not support writing data into a target data source. The parser Open repeater likewise provides only binlog file parsing. When data from multiple relational databases needs to be acquired, code has to be developed separately for each relational database, so data acquisition efficiency is low.

No effective solution has yet been proposed for the technical problem of low data acquisition efficiency in the prior art.

Disclosure of Invention

The main object of the invention is to provide a data acquisition method, a data acquisition device, a storage medium and electronic equipment, so as to at least solve the technical problem of low data acquisition efficiency.

To achieve the above object, according to one aspect of the present invention, a data acquisition method is provided. The method can comprise the following steps: acquiring incremental data of a target relational database in at least one relational database through an open source project client, wherein the incremental data is changed data in the target relational database, and the open source project client supports the at least one relational database; and writing the acquired incremental data of the target relational database into a target data source.

Optionally, before acquiring, through the open source project client, the incremental data of the target relational database in the at least one relational database, the method further includes: enabling an incremental acquisition configuration function of the target relational database, wherein the incremental acquisition configuration function allows the open source project client to acquire the incremental data of the target relational database.

Optionally, before writing the acquired incremental data of the target relational database into the target data source, the method further includes: acquiring an original data table structure of the target relational database, wherein the original data table structure is the table structure of the incremental data to be acquired.

Optionally, writing the acquired incremental data of the target relational database into the target data source includes: converting the table structure of the acquired incremental data of the target relational database from the original data table structure into a target data table structure; and writing the incremental data having the target data table structure into the target data source.

Optionally, converting the table structure of the acquired incremental data of the target relational database from the original data table structure into the target data table structure includes: acquiring a first target instruction, wherein the first target instruction instructs conversion of the original data table structure; and, in response to the first target instruction, converting the table structure of the acquired incremental data of the target relational database from the original data table structure into the target data table structure.

Optionally, writing the acquired incremental data of the target relational database into the target data source includes: acquiring a second target instruction, wherein the second target instruction instructs a write operation on the incremental data; and, in response to the second target instruction, writing the acquired incremental data of the target relational database into the target data source.

Optionally, the type of the incremental data includes at least one of: an addition type; an update type; a deletion type.

Optionally, the target data source is a data warehouse tool.

In order to achieve the above object, according to another aspect of the present invention, there is provided a data acquisition apparatus. The apparatus may include: an acquisition unit configured to acquire incremental data of a target relational database in at least one relational database through an open source project client, wherein the incremental data is changed data in the target relational database, and the open source project client supports the at least one relational database; and a writing unit configured to write the acquired incremental data of the target relational database into a target data source.

In order to achieve the above object, according to another aspect of the present invention, there is provided a storage medium. The storage medium comprises a stored program, wherein when the program runs, the device where the storage medium is located is controlled to execute the data acquisition method of the embodiment of the invention.

In order to achieve the above object, according to another aspect of the present invention, an electronic apparatus is provided. The electronic equipment comprises a processor, at least one memory connected with the processor, and a bus; the processor and the memory complete mutual communication through a bus; the processor is used for calling the program instructions in the memory so as to execute the data acquisition method of the embodiment of the invention.

According to the method, the incremental data of the target relational database in the at least one relational database is acquired through the open source project client, wherein the incremental data are changed data in the target relational database, and the open source project client supports the at least one relational database; and writing the acquired incremental data of the target relational database into a target data source. That is to say, the method and the system realize the acquisition of at least one relational database through the open source project client, do not need to develop codes for each relational database separately, can realize the multiplexing of the codes, reduce the repeated labor, and write the data acquired in increments into the target data source, thereby achieving the technical effect of improving the data acquisition efficiency and further solving the technical problem of low data acquisition efficiency.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this application, illustrate embodiments of the invention and, together with the description, serve to explain the invention and not to limit the invention. In the drawings:

FIG. 1 is a flow chart of a method of data acquisition according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of a data acquisition device according to an embodiment of the present invention; and

FIG. 3 is a schematic diagram of an electronic device according to an embodiment of the invention.

Detailed Description

It should be noted that the embodiments and the features of the embodiments in the present application may be combined with each other where no conflict arises. The present invention will be described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.

In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only partial embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

It should be noted that the terms "first," "second," and the like in the description and claims of this application and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that the data so used may be interchanged under appropriate circumstances such that embodiments of the application described herein may be used. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.

Example 1

The embodiment of the invention provides a data acquisition method.

Fig. 1 is a flow chart of a data acquisition method according to an embodiment of the present invention. As shown in fig. 1, the method may include the steps of:

Step S102, acquiring incremental data of a target relational database in at least one relational database through the open source project client.

In the technical solution provided by step S102 of the present invention, the incremental data is changed data in the target relational database, wherein the open source project client supports at least one relational database.

In this embodiment, the open source project may be Debezium, so the open source project client of this embodiment may be a Debezium client, which supports at least one relational database. Debezium provides a low-latency streaming platform for change data capture (CDC). It can be installed and configured to monitor a database, after which an application can consume each row-level change made in the database. Only committed changes are visible, so the application does not need to handle transactions or changes that are rolled back. In addition, Debezium provides a uniform model for all database change events, so the application is shielded from the intricacies of each database management system. Because Debezium records the history of data changes in durable, replicated logs, the application can be stopped and restarted at any time without missing the events that occurred while it was not running, which ensures that all events are processed correctly and completely.

In this embodiment, the database may include at least one relational database. The open source project client is started first and is then used to monitor the target relational database in the at least one relational database. If incremental data is detected in the target relational database, the incremental data is acquired, that is, incremental acquisition is performed. The incremental data is the changed data in the target relational database and may be a structured data file. The relational database of this embodiment may include, but is not limited to, MySQL, Oracle, PostgreSQL, or SQL Server. This overcomes the limitation that the existing data framework supports incremental acquisition only for MySQL and not for Oracle, SQL Server, or PostgreSQL.
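As a non-limiting illustration of why a persisted position allows an application to stop and restart without missing events, the following minimal sketch keeps an offset file next to an append-only log. It does not reflect Debezium's internal implementation; the class `ResumableConsumer`, the offset file layout, and the in-memory list standing in for the log are all assumptions made for the example.

```python
import json
import os


class ResumableConsumer:
    """Reads events from an append-only log and remembers how far it got."""

    def __init__(self, offset_path):
        # Hypothetical location of the persisted offset.
        self._offset_path = offset_path

    def _load_offset(self):
        if not os.path.exists(self._offset_path):
            return 0
        with open(self._offset_path, encoding="utf-8") as fh:
            return json.load(fh)["next"]

    def _save_offset(self, next_index):
        tmp = self._offset_path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as fh:
            json.dump({"next": next_index}, fh)
        os.replace(tmp, self._offset_path)  # swap the offset file in one step

    def consume(self, log, handle):
        """Process every event not yet handled; return the number processed."""
        start = self._load_offset()
        for index in range(start, len(log)):
            handle(log[index])
            self._save_offset(index + 1)  # commit only after handling succeeds
        return max(len(log) - start, 0)
```

A new `ResumableConsumer` created after a restart with the same offset path continues from the stored position, so events appended while the application was stopped are still processed.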

Optionally, in this embodiment, code may be written to connect the open source project client to the target relational database, for example to the relational database MySQL, so as to perform incremental acquisition.

Optionally, in this embodiment, an open source project client is created and configured. After the configuration is completed, the open source project client connects to the target relational database and monitors its log. Once a change to the data in the target relational database is detected, the changed data can be acquired.

The open source project client of this embodiment can use the same code to acquire the incremental data of the target relational database in the at least one relational database. That is, the code for acquiring incremental data can be reused to acquire data from multiple relational databases, and no new code has to be written each time incremental data is acquired from a relational database, which reduces repeated labor.
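The code reuse described above can be illustrated with a minimal sketch, assuming the client is configured through Debezium connector properties. The connector class names and the `database.*` keys follow Debezium's documented connector properties; the helper `build_connector_config`, the naming scheme, the host name, and the credentials are hypothetical and serve only the illustration.

```python
# Hypothetical helper: one code path builds the client configuration for any
# supported relational database; only the connector class differs.
CONNECTOR_CLASSES = {
    "mysql": "io.debezium.connector.mysql.MySqlConnector",
    "postgresql": "io.debezium.connector.postgresql.PostgresConnector",
    "sqlserver": "io.debezium.connector.sqlserver.SqlServerConnector",
    "oracle": "io.debezium.connector.oracle.OracleConnector",
}


def build_connector_config(db_type, host, port, user, password):
    """Return a property dict for the target relational database."""
    try:
        connector_class = CONNECTOR_CLASSES[db_type.lower()]
    except KeyError:
        raise ValueError(f"unsupported relational database: {db_type}") from None
    return {
        "name": f"{db_type.lower()}-incremental-collector",  # assumed naming scheme
        "connector.class": connector_class,
        "database.hostname": host,
        "database.port": str(port),
        "database.user": user,
        "database.password": password,
    }


if __name__ == "__main__":
    # Placeholder host and credentials, for illustration only.
    for db, port in (("mysql", 3306), ("postgresql", 5432)):
        print(build_connector_config(db, "db.example.internal", port, "debezium", "debezium"))
```

A real deployment would need further connector-specific properties; the sketch only shows that switching the target relational database changes a lookup key rather than the acquisition code.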

Step S104, writing the acquired incremental data of the target relational database into a target data source.

In the technical solution provided by step S104 of the present invention, after the incremental data of the target relational database in the at least one relational database is collected by the open-source project client, the collected incremental data of the target relational database may be written into the target data source.

In this embodiment, the target data source may be a data warehouse tool, for example Hive. Hive is a data warehouse tool based on Hadoop that can map structured data files to database tables, provides a simple Structured Query Language (SQL) query function, and converts SQL statements into tasks of the MapReduce programming model for execution.

This embodiment writes the acquired incremental data of the target relational database into the target data source, thereby avoiding the problem that the existing data framework does not support writing data into a target data source.

Through the steps S102 to S104, acquiring incremental data of a target relational database in at least one relational database by using an open source project client, wherein the incremental data is changed data in the target relational database, and the open source project client supports the at least one relational database; and writing the acquired incremental data of the target relational database into a target data source. That is to say, the method and the system can realize the acquisition of at least one relational database through the open source project client, do not need to develop codes for each relational database separately, can realize the multiplexing of the codes, reduce repeated labor, and write the data acquired in increments into a target data source, thereby achieving the technical effect of improving the data acquisition efficiency and further solving the technical problem of low data acquisition efficiency.

The above-described method of this embodiment is further described below.

As an optional implementation manner, before acquiring, through the open source project client, the incremental data of the target relational database in the at least one relational database in step S102, the method further includes: enabling an incremental acquisition configuration function of the target relational database, wherein the incremental acquisition configuration function allows the open source project client to acquire the incremental data of the target relational database.

In this embodiment, the target relational database has an incremental acquisition configuration function that allows the open source project client to acquire incremental data. Therefore, before the open source project client acquires the incremental data of the target relational database in the at least one relational database, the target relational database is configured and its incremental acquisition configuration function is enabled, so that the target relational database supports incremental acquisition.

Optionally, when configuring the target relational database, this embodiment may enable the binlog of the database. For example, the binlog of MySQL is enabled with the following settings:

[mysqld]

log-bin=mysql-bin # add this line

binlog-format=ROW # select row (ROW) mode
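Whether a configuration file carries the two settings above can be checked with a short script. The following is a minimal sketch, assuming the configuration is available as text; the function name `supports_incremental_capture` is hypothetical, and the check inspects only the file content, not the state of a running server.

```python
import configparser


def supports_incremental_capture(cnf_text):
    """Return True if the [mysqld] section enables the binlog in ROW format."""
    parser = configparser.ConfigParser(
        allow_no_value=True,             # option files may contain bare flags
        inline_comment_prefixes=("#",),  # strip trailing "# ..." comments
        interpolation=None,              # treat "%" in values literally
        strict=False,                    # tolerate repeated options
    )
    parser.read_string(cnf_text)
    if not parser.has_section("mysqld"):
        return False
    # Accept both the `log_bin` and the `log-bin` spelling of option names.
    options = {key.replace("_", "-"): value for key, value in parser.items("mysqld")}
    binlog_enabled = "log-bin" in options
    row_format = (options.get("binlog-format") or "").strip().upper() == "ROW"
    return binlog_enabled and row_format
```

For the settings shown above the function returns True, while a file that sets `binlog-format=STATEMENT` or omits `log-bin` yields False.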

Optionally, in this embodiment, creating the user and granting privileges may be implemented as follows:

CREATE USER debezium IDENTIFIED BY 'debezium';

GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'debezium'@'%';

-- GRANT ALL PRIVILEGES ON *.* TO 'debezium'@'%';

FLUSH PRIVILEGES;

As an optional implementation manner, before writing the acquired incremental data of the target relational database into the target data source in step S104, the method further includes: acquiring an original data table structure of the target relational database, wherein the original data table structure is the table structure of the incremental data to be acquired.

In this embodiment, before writing the acquired incremental data of the target relational database into the target data source, an original data table structure of the target relational database to be acquired, that is, a table structure of a data table of the incremental data to be acquired may be acquired, so as to convert the table structure into a target data table structure supported by the target data source, for example, into a target data table structure supported by the data warehouse tool Hive, thereby implementing creation of the target data table structure.

As an optional implementation manner, step S104, writing the acquired incremental data of the target relational database into the target data source, includes: converting the table structure of the acquired incremental data of the target relational database from the original data table structure into a target data table structure; and writing the incremental data having the target data table structure into the target data source.

In this embodiment, after the original data table structure of the target relational database is acquired, the table structure of the acquired incremental data may be converted from the original data table structure into the target data table structure to obtain incremental data having the target data table structure. That incremental data is then written into the target data source, which achieves the purpose of mapping the incremental data into a database table and writing it into the target data source.

As an alternative embodiment, converting the table structure of the acquired incremental data of the target relational database from the original data table structure to the target data table structure, includes: acquiring a first target instruction, wherein the first target instruction is used for indicating the conversion of an original data table structure; and responding to the first target instruction to convert the table structure of the acquired incremental data of the target relational database from the original data table structure to the target data table structure.

In this embodiment, when the table structure of the acquired incremental data of the target relational database is converted from the original data table structure into the target data table structure, a first target instruction may be obtained first. The first target instruction may be program code that converts the original data table structure into a statement for creating the target data table structure supported by the target data source. That statement is then executed to create the target data table structure, which achieves the purpose of converting the table structure of the acquired incremental data of the target relational database from the original data table structure into the target data table structure.
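By way of example only, a first target instruction of this kind could generate a Hive table-creation statement from the original column definitions. The sketch below assumes MySQL as the source and a Hive version that supports transactional ORC tables; the type mapping `MYSQL_TO_HIVE` is deliberately small and, like the function names, is an assumption rather than part of the described method.

```python
# Assumed, deliberately small MySQL-to-Hive type mapping; extend as needed.
MYSQL_TO_HIVE = {
    "tinyint": "TINYINT",
    "smallint": "SMALLINT",
    "int": "INT",
    "bigint": "BIGINT",
    "float": "FLOAT",
    "double": "DOUBLE",
    "decimal": "DECIMAL",
    "varchar": "STRING",
    "char": "STRING",
    "text": "STRING",
    "date": "DATE",
    "datetime": "TIMESTAMP",
    "timestamp": "TIMESTAMP",
}


def to_hive_type(mysql_type):
    """Map e.g. 'varchar(64)' or 'decimal(10, 2)' to a Hive column type."""
    base, _, args = mysql_type.strip().lower().partition("(")
    base_name = (base.split() or [""])[0]          # drops modifiers such as "unsigned"
    hive = MYSQL_TO_HIVE.get(base_name, "STRING")  # unknown types fall back to STRING
    if hive == "DECIMAL" and args:
        precision_scale = args.split(")")[0].replace(" ", "")
        return f"DECIMAL({precision_scale})"
    return hive


def build_hive_ddl(table, columns):
    """columns: (name, mysql_type) pairs taken from the original table structure."""
    cols = ",\n".join(f"  `{name}` {to_hive_type(t)}" for name, t in columns)
    return (
        f"CREATE TABLE IF NOT EXISTS `{table}` (\n{cols}\n)\n"
        "STORED AS ORC\n"
        "TBLPROPERTIES ('transactional'='true')"
    )


if __name__ == "__main__":
    print(build_hive_ddl("orders", [("id", "bigint"), ("amount", "decimal(10,2)")]))
```

Executing the generated statement against the target data source would then create the target data table structure; the transactional table property is chosen here because the write step relies on transactional updates.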

As an optional implementation manner, step S104, writing the acquired incremental data of the target relational database into the target data source, includes: acquiring a second target instruction, wherein the second target instruction is used for indicating the write operation of the incremental data; and responding to a second target instruction to write the acquired incremental data of the target relational database into the target data source.

In this embodiment, when the acquired incremental data of the target relational database is written into the target data source, a second target instruction may be obtained first. The second target instruction instructs a write operation on the incremental data. It may instruct that the incremental data be converted into data supported by the target data source and that the converted incremental data then be written into the target data source, for example the data warehouse tool Hive, by using the target data source's support for transactional updates.

As an alternative embodiment, the type of the incremental data includes at least one of: an addition type; an update type; a deletion type.
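One possible shape of such a write operation is sketched below, assuming each change event carries an `op` code together with `before` and `after` row images, as in Debezium's event envelope, and that the target table supports transactional INSERT, UPDATE and DELETE. The function `build_write_statement`, the `?` placeholder style, and the single-column key are assumptions; the placeholder syntax in particular depends on the driver actually used.

```python
def build_write_statement(table, key, event):
    """Translate one change event into (sql, params) for a transactional table.

    `event` is assumed to look like {"op": ..., "before": ..., "after": ...}.
    """
    op = event["op"]
    if op in ("c", "r"):  # row created, or row read during an initial snapshot
        row = event["after"]
        cols = ", ".join(f"`{c}`" for c in row)
        marks = ", ".join("?" for _ in row)
        return f"INSERT INTO `{table}` ({cols}) VALUES ({marks})", list(row.values())
    if op == "u":  # row updated: set every non-key column from the new image
        row = event["after"]
        fields = [c for c in row if c != key]
        sets = ", ".join(f"`{c}` = ?" for c in fields)
        return (
            f"UPDATE `{table}` SET {sets} WHERE `{key}` = ?",
            [row[c] for c in fields] + [row[key]],
        )
    if op == "d":  # row deleted: only the old image is available
        return f"DELETE FROM `{table}` WHERE `{key}` = ?", [event["before"][key]]
    raise ValueError(f"unknown operation: {op!r}")
```

Returning the statement and its parameters separately leaves value quoting to the driver instead of building literal values into the SQL text.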

In this embodiment, after the open source project client is started, data collection is started, and the types of the collected incremental data may include an addition type, an update type, and a deletion type, where the addition type is used to indicate that the change form of the data is addition, the update type is used to indicate that the change form of the data is update, and the deletion type is used to indicate that the change form of the data is deletion.
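The three types can be told apart from the operation code of each change event. The following minimal sketch assumes events in the form of Debezium's envelope, where the `op` field is "c" for a created row, "u" for an updated row and "d" for a deleted row; the enum `ChangeType` and the function `classify` are hypothetical names introduced for the example.

```python
from enum import Enum


class ChangeType(Enum):
    ADDITION = "addition"
    UPDATE = "update"
    DELETION = "deletion"


# One-letter operation codes mapped to the three types named in the text.
_OP_TO_TYPE = {
    "c": ChangeType.ADDITION,
    "u": ChangeType.UPDATE,
    "d": ChangeType.DELETION,
}


def classify(event):
    """Return the ChangeType of a change event, or None for other operations."""
    return _OP_TO_TYPE.get(event.get("op"))
```

Events with other operation codes, such as rows read during an initial snapshot, are reported as None so that the caller can decide how to treat them.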

In the embodiment, incremental data of a target relational database in at least one relational database is acquired through an open source project client, wherein the incremental data is changed data in the target relational database, and the open source project client supports the at least one relational database; and writing the acquired incremental data of the target relational database into a target data source. That is to say, the method and the system realize the acquisition of at least one relational database through the open source project client, do not need to develop codes for each relational database separately, can realize the multiplexing of the codes, reduce the repeated labor, and write the data acquired in increments into the target data source, thereby achieving the technical effect of improving the data acquisition efficiency and further solving the technical problem of low data acquisition efficiency.

It should be noted that the steps illustrated in the flowcharts of the figures may be performed in a computer system such as a set of computer-executable instructions and that, although a logical order is illustrated in the flowcharts, in some cases, the steps illustrated or described may be performed in an order different than presented herein.

Example 2

The embodiment of the invention also provides a data acquisition device. It should be noted that the data acquisition apparatus of this embodiment may be used to execute the data acquisition method of the embodiment of the present invention.

Fig. 2 is a schematic diagram of a data acquisition device according to an embodiment of the present invention. As shown in fig. 2, the data acquisition device 20 may include: an acquisition unit 21 and a writing unit 22.

The acquisition unit 21 is configured to acquire incremental data of a target relational database in the at least one relational database through the open-source project client, where the incremental data is changed data in the target relational database, and the open-source project client supports the at least one relational database.

And the writing unit 22 is used for writing the acquired incremental data of the target relational database into the target data source.
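The cooperation of the two units can be pictured with a minimal sketch in which the acquisition unit and the writing unit are plain objects wired together by the device. The class names mirror the units described above, but the callables standing in for the open source project client and for the target data source are assumptions made for the example.

```python
from typing import Callable, Iterable, List


class AcquisitionUnit:
    """Acquires incremental data through a client (here: any callable source)."""

    def __init__(self, poll: Callable[[], Iterable[dict]]):
        self._poll = poll

    def acquire(self) -> List[dict]:
        return list(self._poll())


class WritingUnit:
    """Writes acquired incremental data into a target (here: any callable sink)."""

    def __init__(self, sink: Callable[[dict], None]):
        self._sink = sink

    def write(self, events: Iterable[dict]) -> int:
        count = 0
        for event in events:
            self._sink(event)
            count += 1
        return count


class DataAcquisitionDevice:
    """Combines the two units: acquire first, then write."""

    def __init__(self, acquisition_unit: AcquisitionUnit, writing_unit: WritingUnit):
        self.acquisition_unit = acquisition_unit
        self.writing_unit = writing_unit

    def run_once(self) -> int:
        return self.writing_unit.write(self.acquisition_unit.acquire())
```

For instance, `DataAcquisitionDevice(AcquisitionUnit(lambda: pending), WritingUnit(target.append)).run_once()` moves every event in a list `pending` into a list `target` and returns the number written.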

Optionally, the apparatus further comprises: an enabling unit configured to enable an incremental acquisition configuration function of the target relational database before the incremental data of the target relational database in the at least one relational database is acquired through the open source project client, wherein the incremental acquisition configuration function allows the open source project client to acquire the incremental data of the target relational database.

Optionally, the apparatus further comprises: an acquiring unit configured to acquire an original data table structure of the target relational database before the acquired incremental data of the target relational database is written into the target data source, wherein the original data table structure is the table structure of the incremental data to be acquired.

Optionally, the writing unit 22 includes: a conversion module configured to convert the table structure of the acquired incremental data of the target relational database from the original data table structure into a target data table structure; and a writing module configured to write the incremental data having the target data table structure into the target data source.

Optionally, the conversion module includes: a first determining module configured to acquire a first target instruction, wherein the first target instruction instructs conversion of the original data table structure; and a first response module configured to respond to the first target instruction so as to convert the table structure of the acquired incremental data of the target relational database from the original data table structure into the target data table structure.

Optionally, the writing unit 22 includes: a second determining module configured to acquire a second target instruction, wherein the second target instruction instructs a write operation on the incremental data; and a second response module configured to respond to the second target instruction so as to write the acquired incremental data of the target relational database into the target data source.

Optionally, the type of the incremental data comprises at least one of: an addition type; an update type; a deletion type.

Optionally, the target data source is a data warehouse tool.

In this embodiment, the acquiring unit 21 acquires incremental data of a target relational database in at least one relational database through an open source project client, where the incremental data is changed data in the target relational database, and the open source project client supports the at least one relational database; the acquired incremental data of the target relational database is written into the target data source through the writing unit 22. That is to say, the application realizes the collection of a plurality of relational databases through the open source project client, does not need to develop codes for the target relational database separately, can realize the multiplexing of the codes, can reduce the repeated labor, and can write the data acquired in increments into the target data source, thereby achieving the technical effect of improving the data acquisition efficiency and further solving the technical problem of low data acquisition efficiency.

Example 3

In this embodiment, the data acquisition apparatus includes a processor and a memory, the acquisition unit 21 and the writing unit 22 are both stored in the memory as program units, and the processor executes the program units stored in the memory to implement corresponding functions.

The processor includes a kernel, and the kernel calls the corresponding program unit from the memory. One or more kernels may be provided, and the technical effect of improving data acquisition efficiency is achieved by adjusting kernel parameters.

Example 4

An embodiment of the present invention provides a storage medium on which a program is stored, the program implementing the data acquisition method when executed by a processor.

Example 5

Fig. 3 is a schematic diagram of an electronic device according to an embodiment of the invention. As shown in fig. 3, the electronic device 30 includes at least one processor 301, at least one memory 302 connected to the processor 301, and a bus 303; the processor 301 and the memory 302 communicate with each other through the bus 303; the processor 301 is used to call program instructions in the memory 302 to perform the data acquisition method described above. The electronic device 30 herein may be a server, a PC, a tablet (PAD), a mobile phone, etc.

Example 6

The present application further provides a computer program product which, when executed on a data processing device, is adapted to execute a program initialized with the following method steps: acquiring incremental data of a target relational database in at least one relational database through an open source project client, wherein the incremental data is changed data in the target relational database, and the open source project client supports the at least one relational database; and writing the acquired incremental data of the target relational database into a target data source.

When executed on a data processing device, the computer program product is further adapted to execute a program initialized with the following method steps: before acquiring incremental data of a target relational database in at least one relational database through an open source project client, starting an incremental acquisition configuration function of the target relational database, wherein the incremental acquisition configuration function is used for allowing the open source project client to acquire the incremental data of the target relational database.

When executed on a data processing device, the computer program product is further adapted to execute a program initialized with the following method steps: before writing the acquired incremental data of the target relational database into the target data source, acquiring an original data table structure of the target relational database, wherein the original data table structure is the table structure of the incremental data to be acquired.

When executed on a data processing device, the computer program product is further adapted to execute a program initialized with the following method steps: converting the table structure of the acquired incremental data of the target relational database from the original data table structure into a target data table structure; and writing the incremental data having the target data table structure into the target data source.

When executed on a data processing device, the computer program product is further adapted to execute a program initialized with the following method steps: acquiring a first target instruction, wherein the first target instruction is used for indicating the conversion of the original data table structure; and responding to the first target instruction to convert the table structure of the acquired incremental data of the target relational database from the original data table structure to the target data table structure.
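The conversion from the original data table structure to the target data table structure can be sketched in Python as follows. The application does not specify the form of the first target instruction; this sketch assumes, purely for illustration, that it is a mapping from each original column name to a target column name and a conversion function. The names `ColumnRule` and `convert_row` are hypothetical.

```python
from typing import Any, Callable, Dict, Tuple

# Assumed form of one rule: (target column name, value converter).
ColumnRule = Tuple[str, Callable[[Any], Any]]


def convert_row(row: Dict[str, Any],
                instruction: Dict[str, ColumnRule]) -> Dict[str, Any]:
    """Convert one row of incremental data to the target table structure.

    `instruction` plays the role of the first target instruction. In this
    sketch, columns that the instruction does not mention are dropped.
    """
    converted: Dict[str, Any] = {}
    for source_column, value in row.items():
        if source_column not in instruction:
            continue
        target_column, cast = instruction[source_column]
        converted[target_column] = cast(value)
    return converted
```

With an instruction such as `{"ID": ("id", int), "NAME": ("name", str)}`, a row whose columns are `ID`, `NAME`, and an unmapped column is converted to a row containing only `id` and `name`, with `id` cast to an integer.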

When executed on a data processing device, the computer program product is further adapted to execute a program initialized with the following method steps: acquiring a second target instruction, wherein the second target instruction is used for indicating the write operation of the incremental data; and responding to the second target instruction to write the incremental data of the target relational database into the target data source.

It will be apparent to those skilled in the art that the modules or steps of the present invention described above may be implemented by a general purpose computing device, and may be centralized on a single computing device or distributed across a network of multiple computing devices. Alternatively, they may be implemented by program code executable by a computing device, so that they may be stored in a storage device and executed by the computing device; or they may be fabricated separately as individual integrated circuit modules, or multiple modules or steps among them may be fabricated as a single integrated circuit module. Thus, the present invention is not limited to any specific combination of hardware and software.

The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.
