Distributed data center based on Kafka and hash shared memory

Document No. 7633, published 2021-09-17

1. A distributed data center based on Kafka and hash shared memory, characterized by comprising:

a central database, configured to establish a data update table corresponding to a constant table and to write the added/modified data of the constant table into the data update table;

a newly added data reading server, connected to the central database and configured to periodically read the newly added data in the data update table and push the read data to Kafka;

the system comprises a central database, a kafka server side, a shared memory and a root data center server side, wherein the central database is connected with the kafka server side, the root data center server side is also connected with the shared memory, and is used for reading the whole data of the central database when the shared memory connected with the root data center server side is queried to have no data, continuously consuming the newly added data from the kafka after the whole data is read, writing the data into the shared memory, and directly consuming the newly added data in the kafka and writing the data into the shared memory when the shared memory connected with the root data center server side is queried to have the data;

a plurality of page node data center servers, each configured to synchronize the full data from the root data center server to its own node when the shared memory of its corresponding server is found, on query, to contain no data, then to consume the newly added data in Kafka and store it in the shared memory, and to directly consume the newly added data in Kafka and store it in the shared memory when the shared memory of its corresponding server is found, on query, to contain data.

2. The distributed data center based on Kafka and hash shared memory according to claim 1, wherein a data update table with the same fields as an existing constant table is built in the central database through a MySQL table-creation statement, and a trigger on the corresponding data table is created through a MySQL create-trigger statement so that it fires on insert and update operations on the MySQL data table, whereby newly added or modified data of the central database is inserted into the data update table by the trigger.

3. The distributed data center based on Kafka and hash shared memory according to claim 1, wherein the newly added data reading server periodically reads the first 128 rows of the data update table, reads the data found by a MySQL query statement, and pushes the read data to Kafka.

4. The distributed data center based on Kafka and hash shared memory according to claim 3, wherein the data read by the newly added data reading server is pushed to Kafka in JSON format.

5. The distributed data center based on Kafka and hash shared memory according to claim 3, wherein the newly added data reading server deletes the rows it has read from the data update table with an indexed MySQL delete statement, so as to avoid reading the same data repeatedly.

6. The distributed data center based on Kafka and hash shared memory according to claim 1, wherein the shared memory comprises a shared memory header and a data area; the header stores the number of data insertions, the number of data deletions, the number of data updates, the data version, and the data volume of the shared memory, and the data area is used to store the data read from the central database by the root data center server.

7. The distributed data center based on Kafka and hash shared memory according to claim 6, wherein the root data center server opens the shared memory through an interface created by the Boost library and queries whether data exists in it; when the root data center server finds no data in the shared memory, it reads the full data of the central database, stores the read data into the data area of the shared memory, and after the full read completes, consumes the updated data from Kafka and writes the consumed data into the data area through the Boost interface; and when the root data center server finds data in the shared memory, it directly consumes data from Kafka and writes the consumed data into the shared memory through the Boost interface.

8. The distributed data center based on Kafka and hash shared memory according to claim 1, wherein each page node data center server opens a shared memory through an interface created by the Boost library and queries whether data exists in it; when a page node data center server finds no data in the shared memory, the full data is synchronized from the root data center server to the server corresponding to that page node data center server by file synchronization, after which the newly added data in Kafka is consumed and stored in the shared memory through the Boost interface for inserting shared memory data; and when a page node data center server finds data in the shared memory, it directly begins consuming the newly added data from Kafka and stores it in the shared memory through the same Boost interface.

9. The distributed data center based on Kafka and hash shared memory according to claim 8, wherein when a page node data center server finds no data in the shared memory, it consumes the newly added data in Kafka by starting consumption from the Kafka offset value contained in the synchronized data;

and when a page node data center server finds data in the shared memory, it directly consumes the newly added data in Kafka from the corresponding offset value.

10. The distributed data center based on Kafka and hash shared memory according to any one of claims 1 to 9, wherein the shared memory is a hash shared memory.

Background

For a data center, storage is currently generally provided by redis (a key-value storage system) or by a database, with either a single central database or a database with separate read and write nodes. As the data volume grows, the central database becomes ever larger and the data center is written to and accessed more frequently, so the access pressure continually approaches the data center's limit. The central database therefore comes under excessive access pressure, and the performance of the data center in turn limits the continued growth of the services.

Accordingly, the prior art is deficient and needs improvement.

Disclosure of Invention

The invention aims to overcome the defects of the prior art by providing a distributed data center based on Kafka and hash shared memory.

The technical solution of the invention is as follows:

a distributed data center based on Kafka and hash shared memory, comprising:

a central database, configured to establish a data update table corresponding to a constant table and to write the added/modified data of the constant table into the data update table;

a newly added data reading server, connected to the central database and configured to periodically read the newly added data in the data update table and push the read data to Kafka;

the system comprises a central database, a kafka server side, a shared memory and a root data center server side, wherein the central database is connected with the kafka server side, the root data center server side is also connected with the shared memory, and is used for reading the whole data of the central database when the shared memory connected with the root data center server side is queried to have no data, continuously consuming the newly added data from the kafka after the whole data is read, writing the data into the shared memory, and directly consuming the newly added data in the kafka and writing the data into the shared memory when the shared memory connected with the root data center server side is queried to have the data;

a plurality of page node data center servers, each configured to synchronize the full data from the root data center server to its own node when the shared memory of its corresponding server is found, on query, to contain no data, then to consume the newly added data in Kafka and store it in the shared memory, and to directly consume the newly added data in Kafka and store it in the shared memory when the shared memory of its corresponding server is found, on query, to contain data.

Furthermore, a data update table with the same fields as an existing constant table is established in the central database through a MySQL table-creation statement, and a trigger on the corresponding data table is created through a MySQL create-trigger statement so that it fires on insert and update operations on the MySQL data table, whereby newly added or modified data of the central database is inserted into the data update table by the trigger.
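
The trigger statements themselves are not given in the text. As an illustrative sketch of the pattern, here is a version using Python's built-in sqlite3, whose trigger syntax is close to MySQL's; the table and column names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# A constant table and an update table with the same fields (names illustrative).
cur.execute("CREATE TABLE constant_tbl (id INTEGER PRIMARY KEY, val TEXT)")
cur.execute("CREATE TABLE update_tbl (id INTEGER, val TEXT)")

# Triggers mirror every insert and update on the constant table into the
# update table, as the text describes for the MySQL triggers.
cur.execute("""CREATE TRIGGER trg_ins AFTER INSERT ON constant_tbl
               BEGIN INSERT INTO update_tbl VALUES (NEW.id, NEW.val); END""")
cur.execute("""CREATE TRIGGER trg_upd AFTER UPDATE ON constant_tbl
               BEGIN INSERT INTO update_tbl VALUES (NEW.id, NEW.val); END""")

cur.execute("INSERT INTO constant_tbl VALUES (1, 'a')")
cur.execute("UPDATE constant_tbl SET val = 'b' WHERE id = 1")
rows = cur.execute("SELECT id, val FROM update_tbl ORDER BY rowid").fetchall()
print(rows)  # [(1, 'a'), (1, 'b')]
```

The update table thus accumulates every inserted and modified row, ready for the reading server to drain.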

Further, the newly added data reading server periodically reads the first 128 rows of the data update table, reads the data found by a MySQL query statement, and pushes the read data to Kafka.

Further, the data read by the newly added data reading server is pushed to Kafka in JSON format.

Further, the newly added data reading server deletes the rows it has read from the data update table with an indexed MySQL delete statement, so as to avoid reading the same data repeatedly.
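
The read-push-delete cycle described above can be sketched as follows. The in-memory list standing in for the MySQL table and the output list standing in for the Kafka producer are stand-ins for illustration, not part of the original design.

```python
import json

BATCH_SIZE = 128  # the text specifies reading the first 128 rows per poll

def poll_once(update_table, kafka_out):
    """Read one batch, push each row to Kafka as JSON, delete what was read."""
    batch = update_table[:BATCH_SIZE]      # SELECT ... LIMIT 128
    for row in batch:
        kafka_out.append(json.dumps(row))  # producer.send(...) stand-in
    del update_table[:len(batch)]          # delete the rows just read
    return len(batch)

table = [{"id": i, "val": f"v{i}"} for i in range(300)]
pushed = []
while poll_once(table, pushed):
    pass
print(len(pushed), len(table))  # 300 0
```

Deleting only the rows just read makes each poll idempotent with respect to rows that arrive mid-cycle.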

Furthermore, the shared memory comprises a shared memory header and a data area; the header stores the number of data insertions, the number of data deletions, the number of data updates, the data version, and the data volume of the shared memory, and the data area is used to store the data read from the central database by the root data center server.
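
A minimal model of the layout described above, with header counters plus a hash-keyed data area; the class and field names are assumptions for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class ShmHeader:
    insert_count: int = 0   # number of data insertions
    delete_count: int = 0   # number of data deletions
    update_count: int = 0   # number of data updates
    data_version: int = 0   # data version
    data_amount: int = 0    # data volume held in the data area

@dataclass
class HashSharedMemory:
    header: ShmHeader = field(default_factory=ShmHeader)
    data: dict = field(default_factory=dict)  # the hash-keyed data area

    def insert(self, key, value):
        self.data[key] = value
        self.header.insert_count += 1
        self.header.data_version += 1
        self.header.data_amount = len(self.data)

    def is_empty(self):
        # The servers decide full-read vs. tail-only by reading this field.
        return self.header.data_amount == 0

shm = HashSharedMemory()
print(shm.is_empty())                          # True
shm.insert("k1", {"v": 1})
print(shm.header.data_amount, shm.is_empty())  # 1 False
```

Keeping the data volume in the header is what lets a freshly started server answer "is there data?" without scanning the data area.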

Further, the root data center server opens the shared memory through an interface created by the Boost library and queries whether data exists in it; when the root data center server finds no data in the shared memory, it reads the full data of the central database, stores the read data into the data area of the shared memory, and after the full read completes, consumes the updated data from Kafka and writes the consumed data into the data area through the Boost interface; and when the root data center server finds data in the shared memory, it directly consumes data from Kafka and writes the consumed data into the shared memory through the Boost interface.
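
The cold-start/warm-start decision of the root data center server can be sketched as below, with plain dictionaries and callables standing in for the shared memory, the central database, and the Kafka consumer.

```python
def root_server_start(shm, read_full_db, consume_kafka):
    """Cold start: full DB read, then tail Kafka; warm start: tail Kafka only."""
    if not shm:                         # shared memory reports no data
        shm.update(read_full_db())      # full read from the central database
    for key, value in consume_kafka():  # then consume the newly added data
        shm[key] = value

central_db = {"a": 1, "b": 2}

# Cold start: empty shared memory forces a full read before tailing Kafka.
cold = {}
root_server_start(cold, lambda: central_db, lambda: iter([("c", 3)]))
print(cold)  # {'a': 1, 'b': 2, 'c': 3}

# Warm start: existing data means Kafka is consumed directly.
warm = dict(cold)
root_server_start(warm, lambda: central_db, lambda: iter([("d", 4)]))
print(warm["d"])  # 4
```

The warm path skips the expensive full read entirely, which is the point of persisting the shared memory across restarts.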

Furthermore, each page node data center server opens a shared memory through an interface created by the Boost library and queries whether data exists in it; when a page node data center server finds no data in the shared memory, the full data is synchronized from the root data center server to the server corresponding to that page node data center server by file synchronization, after which the newly added data in Kafka is consumed and stored in the shared memory through the Boost interface for inserting shared memory data; and when a page node data center server finds data in the shared memory, it directly begins consuming the newly added data from Kafka and stores it in the shared memory through the same Boost interface.

Further, when a page node data center server finds no data in the shared memory, it consumes the newly added data in Kafka by starting consumption from the Kafka offset value contained in the synchronized data;

and when a page node data center server finds data in the shared memory, it directly consumes the newly added data in Kafka from the corresponding offset value.
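
The offset handling described above can be sketched as follows; the snapshot-plus-offset pair returned by the root-server stand-in is an assumption about how the synchronized file carries the Kafka offset.

```python
def page_node_start(shm, kafka_log, sync_from_root):
    """Cold start: file-sync the snapshot (which carries a Kafka offset),
    then consume from that offset; warm start: resume from the stored offset."""
    if not shm["data"]:
        snapshot, offset = sync_from_root()  # full data synced as a file
        shm["data"].update(snapshot)
        shm["offset"] = offset
    for i in range(shm["offset"], len(kafka_log)):  # consume from the offset
        key, value = kafka_log[i]
        shm["data"][key] = value
        shm["offset"] = i + 1

log = [("a", 1), ("b", 2), ("c", 3)]  # the Kafka topic, as a list
node = {"data": {}, "offset": 0}
# The snapshot was taken after the first two messages had been applied.
page_node_start(node, log, lambda: ({"a": 1, "b": 2}, 2))
print(node["data"], node["offset"])  # {'a': 1, 'b': 2, 'c': 3} 3
```

Starting consumption at the snapshot's offset is what prevents both gaps and duplicate application of messages already folded into the synchronized data.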

Further, the shared memory is a hash shared memory.

By adopting the above solution, the invention provides the following beneficial effects:

1. by synchronizing the data of the central database into the shared memory of each server, the invention reduces the access pressure on the central database, effectively solves the problem of slow cross-region and cross-country access, realizes distributed querying of the data, greatly improves the speed of service queries and data verification, and meets the need of local services for fast querying and verification;

2. in the preferred scheme, the use of Kafka ensures that the services and server updates do not interfere with each other: a server can be updated without stopping its services, so the services keep running normally while the server is updated.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that the drawings in the following description show only some embodiments of the present invention, and that those skilled in the art can derive other drawings from them without creative effort.

Fig. 1 is a flowchart of the distributed data center based on Kafka and hash shared memory according to the present invention.

The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.

Detailed Description

The invention is described in detail below with reference to the figures and the specific embodiments.

Referring to Fig. 1, the present invention provides a distributed data center based on Kafka and hash shared memory, comprising:

a central database 1, which establishes a data update table corresponding to a constant table and writes the added/modified data of the constant table into the data update table through a trigger. Specifically, the central database 1 establishes, through a MySQL table-creation statement, a MySQL table with the same fields as an existing constant table (hereinafter the data update table), and creates a trigger on the corresponding data table through a MySQL create-trigger statement, so that the trigger fires on insert and update operations on the MySQL data table and the newly added or modified data of the central database 1 is inserted into the data update table by the trigger;

a newly added data reading server 2, which periodically reads newly added data from the corresponding data update table and pushes the read data to Kafka (a high-throughput distributed publish-subscribe messaging system) in JSON format. Specifically, the newly added data reading server 2 is connected to the central database 1 and periodically reads the first 128 rows of the data update table; when a MySQL query statement finds data in the data update table, the server 2 reads that data and pushes it to Kafka in JSON format. To ensure that data is not read repeatedly, the server 2 then deletes the rows it has read from the data update table with an indexed MySQL delete statement;

a root data center server 3, which first queries whether data exists in the shared memory connected to it. The shared memory is preferably a hash shared memory and is divided into a shared memory header and a data area: the header stores the number of data insertions, the number of data deletions, the number of data updates, the data version, and the data volume of the shared memory, while the data area stores the data that the root data center server 3 reads from the central database 1. If the hash shared memory contains no data, the root data center server 3 reads the full data from the full table in the central database 1 and, once that read completes, continuously consumes newly added data from Kafka and writes it into the hash shared memory; if the hash shared memory already contains data, the root data center server 3 directly consumes the newly added data from Kafka and writes it into the hash shared memory. In other words, the root data center server 3 decides whether it needs to read the full data from the central database 1 by checking whether the hash shared memory contains data. To prevent a server restart from forcing the root data center server 3 to read the full data from the central database 1 again, the server 3 periodically saves the data in the hash shared memory to a file, which also avoids the slow service such a re-read would cause.

Specifically, the root data center server 3 is connected to the central database 1, to Kafka, and to the hash shared memory. On startup it creates, through the Boost library, an interface for opening the hash shared memory and checks whether a hash shared memory with the same name as the interface exists. If not, the server 3 calls the Boost library to create a new hash shared memory; if it does exist, the server 3 opens it through the interface and determines whether it contains data by reading the data volume stored in the shared memory header. If the hash shared memory contains no data, the server 3 connects to the central database 1, reads all data in the MySQL data table by table name, and stores it into the data area of the hash shared memory; after the full read completes, it consumes the updated data from Kafka (i.e., the inserted or updated data of the central database 1) and writes the consumed data into the data area through the Boost interface. If the hash shared memory already contains data, the server 3 directly consumes data from Kafka and writes it into the hash shared memory through the Boost interface;
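
The open-or-create-by-name check described for the Boost interface has a close analogue in Python's multiprocessing.shared_memory, used here purely as an illustrative stand-in; the segment name is hypothetical.

```python
from multiprocessing import shared_memory

NAME = "dc_demo_shm_7633"  # hypothetical segment name

def open_or_create(name, size=64):
    """Attach to an existing named segment if present, else create it,
    mirroring the same-name check described for the Boost interface."""
    try:
        return shared_memory.SharedMemory(name=name), False  # segment existed
    except FileNotFoundError:
        return shared_memory.SharedMemory(name=name, create=True, size=size), True

shm, created = open_or_create(NAME)
shm.buf[0] = 42                            # writer fills the data area
peer, peer_created = open_or_create(NAME)  # a second open attaches, not creates
value = peer.buf[0]
print(created, peer_created, value)        # True False 42

peer.close()
shm.close()
shm.unlink()  # remove the segment when done
```

Because the segment is addressed by name, any process that opens it after the writer sees the same bytes, which is the property the data center servers rely on.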

a plurality of page node data center servers 4, each corresponding to one server equipped with a shared memory, preferably a hash shared memory. Each page node data center server 4 first queries whether data exists in the hash shared memory of its corresponding server. If not, the page node data center server 4 synchronizes the full data from the root data center server 3 to its own node by file synchronization, then begins consuming newly added data from Kafka according to the Kafka offset value contained in the synchronized data and stores it in the hash shared memory; if the hash shared memory of its corresponding server already contains data, the page node data center server 4 directly consumes the newly added data from Kafka starting at the corresponding offset value and stores it in the hash shared memory. That is, each page node data center server 4 decides whether it needs to synchronize the full data from the root data center server 3 by checking whether its local hash shared memory contains data.

Specifically, each page node data center server 4 is connected to the root data center server 3 and to Kafka. On startup it creates, through the Boost library, an interface for opening the hash shared memory and checks whether a hash shared memory with the same name as the interface exists. If not, the page node data center server 4 calls the Boost library to create a new hash shared memory; if it does exist, the server 4 opens it through the Boost interface and determines whether it contains data by reading the data volume stored in the shared memory header. If the hash shared memory contains no data, the server 4 connects to the root data center server 3, synchronizes the full data from it by file synchronization, then consumes the newly added data from Kafka according to the Kafka offset value in the synchronized data and stores it in the hash shared memory through the Boost interface for inserting shared memory data. If the hash shared memory already contains data, the server 4 directly consumes the newly added data from Kafka starting at the corresponding offset value and stores it in the hash shared memory through the same Boost interface.

Compared with the prior art, the invention has the following beneficial effects:

1. by synchronizing the data of the central database into the shared memory of each server, the invention reduces the access pressure on the central database, effectively solves the problem of slow cross-region and cross-country access, realizes distributed querying of the data, greatly improves the speed of service queries and data verification, and meets the need of local services for fast querying and verification;

2. in the preferred scheme, the use of Kafka ensures that the services and server updates do not interfere with each other: a server can be updated without stopping its services, so the services keep running normally while the server is updated.

The present invention is not limited to the above preferred embodiments; any modifications, equivalent substitutions, and improvements made within the spirit and principles of the present invention shall fall within its scope of protection.
