File operation IP tracing method and system
1. A file operation IP tracing method, characterized by comprising the following steps:
establishing a kernel debugging module, and mounting at least two different key functions on the kernel debugging module;
acquiring a PID dictionary and a socketfd dictionary of the processes that execute the key functions;
establishing a file monitoring module, and recording the PID of each file operation process and its file operation behavior;
mounting an execve function in the kernel debugging module, recording the function calls, and acquiring and returning a process tree;
comparing the PID of the file operation process with the returned process tree, finding the process tree in which that PID is located, and acquiring an ancestor process according to the process tree;
and comparing the PID of the ancestor process with the PID dictionary and the socketfd dictionary, acquiring the socketfd corresponding to the ancestor process, and parsing the IP from the socketfd of the ancestor process.
2. The file operation IP tracing method according to claim 1, wherein the method for parsing the IP from the socketfd of the ancestor process comprises: calling the kernel's native sockfd_lookup function, converting the socketfd acquired for the traced ancestor process into a sock structure through the sockfd_lookup function, and parsing the real IP from the sock structure.
3. The IP tracing method for file operations according to claim 1, wherein the kernel debugging module includes a kprobe module in a Linux system and a kretprobe module derived from the kprobe module, and the kernel debugging module tracks and records an execution state of a kernel function.
4. The IP tracing method for file operation according to claim 1, wherein the file monitoring module is connected to a monitor program, and the file monitoring module records file reading, file writing, file deleting, file creating and file executing operations, obtains PID of the file operation process, and returns PID of the file operation process and file operation behavior to the monitor program.
5. The IP tracing method for the file operation according to claim 4, wherein the kernel debugging module inputs a PID dictionary, a socketfd dictionary, all process PIDs and a process tree into the monitor program, the monitor program compares the PID of the file operation process with the process tree to obtain the process tree containing the PID of the file operation process, and traces back the ancestor process in the process tree.
6. The file operation IP tracing method according to claim 5, wherein the method for tracing the ancestor process comprises: after the monitor program obtains the process tree containing the PID of the file operation process, judging whether the process with that PID has called a key function; if so, obtaining the socketfd of the key function call and parsing the IP from it; if not, searching upward through the parent processes of the file operation process until an ancestor process that has called a key function is found.
7. The file operation IP tracing method according to claim 6, wherein after the IP is obtained by parsing the socketfd of the key function call, it is determined whether the parsed IP is an external IP, and if so, the external IP is returned as the visitor IP.
8. The file operation IP tracing method according to claim 6, wherein if the parsed IP is an internal IP, it is determined whether the file operation process corresponding to the parsed IP has called another, different key function; if so, the socketfd of that different key function is obtained and parsed to obtain the IP.
9. A file operation IP tracing system, characterized in that the system performs the file operation IP tracing method as claimed in any one of claims 1 to 8.
10. A computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and the computer program can be executed by a processor to perform the file operation IP tracing method according to any one of claims 1 to 8.
Background
File monitoring is a widely used security monitoring mechanism deployed on Linux servers. At present, many technical schemes are available on Linux for monitoring the reading, writing, deleting and executing of files, but all existing schemes share the same disadvantage: they can only determine the process that is currently operating on a file. They cannot trace back which visitor (file operator) caused the operation, nor the visitor's network location (IP address).
Disclosure of Invention
One object of the present invention is to provide a file operation IP tracing method and system that trace back to an ancestor process according to the process tree in which a file operation is located, and trace the file operator's IP through the socket acquired by the ancestor process. Because the trace follows the process chain and the socket chain of the file operation, the real IP can be identified effectively even when the IP address is disguised.
Another object of the present invention is to provide a file operation IP tracing method and system that obtain, through the key functions, a PID dictionary and a socketfd dictionary of the processes executing the key functions; these dictionaries are used to match PIDs during subsequent file operation tracing and to obtain the real IP.
Another object of the present invention is to provide a file operation IP tracing method and system that, while tracing through the ancestor process, convert the socketfd into a sock structure and then parse the real IP from the sock structure.
In order to achieve at least one of the above objects, the present invention further provides a file operation IP tracing method, including the steps of:
establishing a kernel debugging module, and mounting at least two different key functions on the kernel debugging module;
acquiring a PID dictionary and a socketfd dictionary of the processes that execute the key functions;
establishing a file monitoring module, and recording the PID of each file operation process and its file operation behavior;
mounting an execve function in the kernel debugging module, recording the function calls, and acquiring and returning a process tree;
comparing the PID of the file operation process with the returned process tree, finding the process tree in which that PID is located, and acquiring an ancestor process according to the process tree;
and comparing the PID of the ancestor process with the PID dictionary and the socketfd dictionary, acquiring the socketfd corresponding to the ancestor process, and parsing the IP from the socketfd of the ancestor process.
According to a preferred embodiment of the present invention, the method for parsing the IP from the socketfd of the ancestor process includes: calling the kernel's native sockfd_lookup function, converting the socketfd acquired for the traced ancestor process into a sock structure through the sockfd_lookup function, and parsing the real IP from the sock structure.
According to another preferred embodiment of the present invention, the kernel debugging module includes a kprobe module in the Linux system and a kretprobe module derived from the kprobe module, and the kernel debugging module tracks and records the execution state of the kernel function.
According to another preferred embodiment of the present invention, the file monitoring module is connected to the monitor program, and the file monitoring module records file reading, file writing, file deleting, file creating, and file executing operations, obtains the PID of the above operations, and returns the file operating process PID and the file operating behavior to the monitor program.
According to another preferred embodiment of the present invention, the kernel debugging module inputs the PID dictionary, the socketfd dictionary, all process PIDs and the process tree into the monitor program, and the monitor program compares the file operation process PID with the process tree to obtain the process tree containing the file operation process PID, and traces back the ancestor process in the process tree.
According to another preferred embodiment of the present invention, the method for tracing the ancestor process comprises: after the monitor program obtains the process tree containing the PID of the file operation process, judging whether the process with that PID has called a key function; if so, obtaining the socketfd of the key function call and parsing the IP from it; if not, searching upward through the parent processes of the file operation process until an ancestor process that has called a key function is found.
According to another preferred embodiment of the present invention, after the IP is obtained by parsing the socketfd of the key function call, it is determined whether the parsed IP is an external IP, and if so, the external IP is returned as the visitor IP.
According to another preferred embodiment of the present invention, if the parsed IP is an internal IP, it is determined whether the file operation process corresponding to the parsed IP has called another, different key function; if so, the socketfd of that different key function is obtained and parsed to obtain the IP.
In order to achieve at least one of the above objects, the present invention further provides a file operation IP tracing system, which executes the above file operation IP tracing method.
The invention further provides a computer-readable storage medium, which stores a computer program, wherein the computer program can be executed by a processor to perform the above file operation IP tracing method.
Drawings
Fig. 1 is a schematic flow chart showing a file operation IP tracing method according to the present invention.
Fig. 2 is a detailed flowchart of IP tracing of file operations according to the present invention.
Detailed Description
The following description is presented to disclose the invention so as to enable any person skilled in the art to practice the invention. The preferred embodiments in the following description are given by way of example only, and other obvious variations will occur to those skilled in the art. The basic principles of the invention, as defined in the following description, may be applied to other embodiments, variations, modifications, equivalents, and other technical solutions without departing from the spirit and scope of the invention.
It is understood that the terms "a" and "an" should be interpreted as "at least one" or "one or more"; that is, an element may be one in number in one embodiment and more than one in another embodiment, and the terms "a" and "an" should not be interpreted as limiting the number.
Referring to Fig. 1 and Fig. 2, which are flow charts of the file operation IP tracing method of the present invention, the method uses a process tree to find an ancestor process and finds the real IP from the socket of the ancestor process. Because the five-tuple of the socket in the ancestor process is not readily modified by the file operator, the traced IP has high credibility, so the real key information of the file operator can be obtained.
Specifically, a kernel debugging module is first established. The kernel debugging module comprises the kprobe module and the kretprobe module derived from it; kprobe is a lightweight debugging technique in the Linux system for tracking kernel function calls and capturing their state. A plurality of key functions are mounted in the kernel debugging module, the key functions comprising the recvfrom function and the accept function. The accept function generates a socket file descriptor socketfd when it establishes the connection for each request in the waiting queue, and this socketfd can be used to interact with the client. The recvfrom function receives data from a sender on a socket each time it is called; its arguments specify the socket file descriptor socketfd, the buffer for receiving data, the length of the buffer, and the receive flags. It should be noted that when the recvfrom function is executed, the PID (process identifier) of the process currently executing it is obtained, and the socketfd used by that call is recorded against the PID. A PID dictionary and a socketfd dictionary are obtained in this way for both key functions. The two dictionaries are used to judge whether the process executing a file operator's attack instruction has called a key function, so that the file operator IP can be queried from them.
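By way of illustration only, the PID dictionary and socketfd dictionary described above may be sketched in Python as follows. The event tuple format, the names `KEY_FUNCTIONS` and `build_dictionaries`, and the sample PIDs are assumptions made for the example; the sketch models only the user-space bookkeeping and not the kprobe handlers themselves.

```python
from collections import defaultdict

# Assumed event shape: (function_name, pid, socketfd), as a probe handler
# might report it to user space. This format is hypothetical.
KEY_FUNCTIONS = ("recvfrom", "accept")

def build_dictionaries(events):
    """Return {function_name: {pid: [socketfd, ...]}} for the key functions."""
    table = {name: defaultdict(list) for name in KEY_FUNCTIONS}
    for func, pid, socketfd in events:
        # Ignore functions that are not key functions; record each fd once.
        if func in table and socketfd not in table[func][pid]:
            table[func][pid].append(socketfd)
    return table

events = [
    ("accept", 1200, 7),
    ("recvfrom", 1200, 7),
    ("recvfrom", 1200, 7),  # repeated call on the same socket, recorded once
    ("recvfrom", 1350, 4),
    ("openat", 1350, 3),    # not a key function, ignored
]
dicts = build_dictionaries(events)
print(dict(dicts["recvfrom"]))  # {1200: [7], 1350: [4]}
print(dict(dicts["accept"]))    # {1200: [7]}
```

Keeping one table per key function mirrors the description, in which the recvfrom dictionaries and the accept dictionaries are queried separately.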
Furthermore, the execve function is also mounted in the kernel debugging module. The execve function is the core function through which user-mode programs in the system are executed, so by mounting it in the kernel debugging module, the PID of the process behind each file operation can be obtained. It should be noted that all processes in the same process tree share a common ancestor process, so from the PID of any child process in the tree it is possible to trace back to that ancestor process.
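The relationship between a child PID and its ancestors can be illustrated with a minimal Python sketch. It assumes that each recorded event yields a (pid, parent pid) pair; the function names and sample PIDs are hypothetical and are not taken from the embodiment.

```python
def build_parent_map(exec_events):
    """Map each PID to its parent PID from (pid, ppid) records."""
    return {pid: ppid for pid, ppid in exec_events}

def ancestor_chain(pid, parent_of):
    """Return [pid, parent, grandparent, ...] up to the root of the tree."""
    chain = [pid]
    seen = {pid}
    while chain[-1] in parent_of:
        parent = parent_of[chain[-1]]
        if parent in seen:  # guard against a malformed, cyclic record
            break
        chain.append(parent)
        seen.add(parent)
    return chain

# Assumed sample records: (pid, ppid)
exec_events = [(1200, 1), (1350, 1200), (1410, 1350), (2000, 1)]
parent_of = build_parent_map(exec_events)
print(ancestor_chain(1410, parent_of))  # [1410, 1350, 1200, 1]
```

A flat parent map is sufficient here because tracing only ever walks upward from a child to its ancestors.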
Specifically, the kernel debugging module is connected to a monitor program. The kernel debugging module acquires the PID (process identifier) from each execve call, derives the corresponding process tree by analysis, and passes the process tree and the PID that executed the execve function to the monitor program. The kernel debugging module also acquires the PID dictionary and the socketfd dictionary of the processes executing key functions such as the recvfrom function and the accept function, and passes these dictionaries to the monitor program so that the monitor program can trace the ancestor process and query the file operator IP. The monitor program stores the process tree and the PIDs of the corresponding functions.
The monitor program is further connected to a file monitoring module, which identifies file operation behaviors and obtains the PID of each file operation process. The file operation behaviors that the file monitoring module can identify and monitor include file writing, file reading, file deleting, file creating and file executing. The file monitoring module passes the PID of the file operation process to the monitor program so that the monitor program can find the IP of the file operator.
In the monitor program, after the current file operation behavior and the PID of the file operation process are obtained from the file monitoring module, the process tree lists are queried to judge whether that PID exists in the current process tree list; if not, the next process tree list is queried. If the PID of the file operation process is found in a process tree list, the file operation belongs to that process tree, and it is further judged whether a recvfrom event exists for the file operation process, that is, whether the process has called the recvfrom function. If a recvfrom event exists, the corresponding socketfd is queried from the socketfd dictionary and the PID dictionary of the recvfrom function, and the IP of the file operator is parsed from that socketfd. If the current PID has no recvfrom call, the parent process of the current process is looked up in the process tree recorded through the execve function, and the search continues upward in a loop through the ancestors of the file operation process; at each step the PID of the ancestor process is obtained and it is judged whether that ancestor has called the recvfrom function. If such an event exists, the PID dictionary and the socketfd dictionary of the recvfrom function are queried with the PID of that ancestor process to obtain the socketfd of the ancestor process that produced the recvfrom event, and the file operator IP is parsed from that socketfd.
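The upward search described in this paragraph may be sketched as a simple loop. The sketch below is illustrative only: `find_socketfd`, the dictionary layout, and the choice of the most recently recorded socketfd are assumptions, not details given in the embodiment.

```python
def find_socketfd(pid, parent_of, fd_by_pid):
    """Walk from pid towards the root; return (ancestor_pid, socketfd) or None.

    parent_of: {pid: parent_pid}, as recorded from process-creation events.
    fd_by_pid: {pid: [socketfd, ...]} for one key function, e.g. recvfrom.
    """
    seen = set()
    while pid is not None and pid not in seen:
        seen.add(pid)
        fds = fd_by_pid.get(pid)
        if fds:
            # Assumption: use the most recently recorded socketfd.
            return pid, fds[-1]
        pid = parent_of.get(pid)  # None once the root has been passed
    return None

parent_of = {1410: 1350, 1350: 1200, 1200: 1}
recvfrom_fds = {1200: [7]}
print(find_socketfd(1410, parent_of, recvfrom_fds))  # (1200, 7)
print(find_socketfd(2000, parent_of, recvfrom_fds))  # None
```

In this example the file operation process 1410 has no recvfrom call of its own, so the search passes through 1350 and stops at the ancestor 1200.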
Because the kernel debugging module is mounted with a plurality of key functions, the IP tracing of a file operation is more accurate and is less likely to fail in an extreme network environment. Furthermore, some file operations are internal operations rather than external ones, so the present invention further judges whether the IP parsed from the ancestor process socketfd is an internal IP. Because internal IPs are limited in number and relatively fixed, this can be judged by comparing the parsed IP with the internal IP segments. When the IP parsed from the ancestor process socketfd is judged to be an external IP, that IP belongs to the external visitor responsible for the file operation, so information about external visitors performing file operations can be monitored effectively.
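The comparison against internal IP segments can be illustrated with Python's standard `ipaddress` module. The segment list below is an assumption chosen for the example; an actual deployment would supply its own internal segments.

```python
import ipaddress

# Assumed internal segments, for illustration only.
INTERNAL_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
    ipaddress.ip_network("127.0.0.0/8"),
]

def is_internal(ip):
    """Return True if ip falls inside any configured internal segment."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in INTERNAL_NETWORKS)

print(is_internal("10.1.2.3"))     # True
print(is_internal("203.0.113.9"))  # False
```

An explicit segment list is used rather than a generic private-address test because the description treats the internal IPs as a fixed, site-specific set.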
When the IP parsed from the socketfd of the ancestor process is an internal IP, it is judged whether an accept event exists for the process corresponding to the PID of the file operation process, that is, whether that process has called the accept function. If so, the socketfd of the accept call is found from the PID dictionary and the socketfd dictionary of the accept function, and the IP is parsed from that socketfd. If the process corresponding to the PID of the file operation process has no accept event, the process tree recorded through the execve function is looked up by the PID of the file operation process and traced back to the ancestor process; the PID of the ancestor process is obtained, the PID dictionary and the socketfd dictionary of the accept function are queried to find the socketfd corresponding to that ancestor process, and the IP is parsed from the socketfd of the ancestor process that produced the accept event. It is then judged whether this IP is an internal IP; if not, the IP is output as that of the external file operation visitor.
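The fallback from the recvfrom function to the accept function may be sketched as follows. This is a simplified illustration under stated assumptions: each PID maps to a single socketfd, the `peer_ip` table stands in for the kernel-side parsing of the socketfd, and all names and sample values are hypothetical.

```python
import ipaddress

# Assumed internal segment, for illustration only.
INTERNAL_NETWORKS = [ipaddress.ip_network("10.0.0.0/8")]

def is_internal(ip):
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in INTERNAL_NETWORKS)

def nearest_call(pid, parent_of, fd_by_pid):
    """Return (pid, socketfd) for the nearest process, walking upward, that called the function."""
    seen = set()
    while pid is not None and pid not in seen:
        seen.add(pid)
        if pid in fd_by_pid:
            return pid, fd_by_pid[pid]
        pid = parent_of.get(pid)
    return None

def trace_visitor_ip(pid, parent_of, dicts, peer_ip):
    """Try recvfrom first, then accept; return the first external IP found."""
    for func in ("recvfrom", "accept"):
        hit = nearest_call(pid, parent_of, dicts.get(func, {}))
        if hit is None:
            continue
        ip = peer_ip.get(hit)  # stand-in for parsing the IP from the socketfd
        if ip is not None and not is_internal(ip):
            return ip
    return None

parent_of = {1410: 1350, 1350: 1200}
dicts = {"recvfrom": {1350: 4}, "accept": {1200: 7}}
peer_ip = {(1350, 4): "10.0.0.5", (1200, 7): "203.0.113.9"}
print(trace_visitor_ip(1410, parent_of, dicts, peer_ip))  # 203.0.113.9
```

Here the recvfrom lookup yields an internal address, so the accept dictionaries are consulted and the external address is returned as the visitor IP.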
It should be noted that the invention uses the native sockfd_lookup function of the Linux system. At run time, sockfd_lookup converts the socketfd of the ancestor process into a sock structure, in which the five-tuple data is stored; the five-tuple conventionally consists of the source IP address, source port, destination IP address, destination port and transport protocol. The file operation visitor IP can be obtained from the five-tuple data. Since the sock structure of the ancestor process is unlikely to be tampered with, a file operator IP with high confidence can be obtained.
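For illustration, decoding five-tuple data into readable form may be sketched in Python. The 13-byte record layout below is purely hypothetical and does not reproduce the kernel's sock structure; it assumes the kernel-side module has already copied the fields out in network byte order, and the field names follow the conventional five-tuple.

```python
import socket
import struct
from typing import NamedTuple

class FiveTuple(NamedTuple):
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str

# Standard IP protocol numbers for TCP and UDP.
PROTOCOLS = {6: "TCP", 17: "UDP"}

def decode_five_tuple(raw):
    """Decode a hypothetical record: src addr, dst addr, src port, dst port, protocol."""
    saddr, daddr, sport, dport, proto = struct.unpack("!4s4sHHB", raw)
    return FiveTuple(
        socket.inet_ntoa(saddr), sport,
        socket.inet_ntoa(daddr), dport,
        PROTOCOLS.get(proto, str(proto)),
    )

raw = (socket.inet_aton("203.0.113.9") + socket.inet_aton("10.0.0.2")
       + struct.pack("!HHB", 51514, 22, 6))
ft = decode_five_tuple(raw)
print(ft.src_ip, ft.src_port, ft.protocol)  # 203.0.113.9 51514 TCP
```

In this example the source address of the decoded record would be reported as the file operation visitor IP.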
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section, and/or installed from a removable medium. The computer program, when executed by a Central Processing Unit (CPU), performs the above-described functions defined in the method of the present application. It should be noted that the computer readable medium mentioned above in the present application may be a computer readable signal medium or a computer readable storage medium or any combination of the two. The computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wire segments, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. 
In the present application, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wireline, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
It will be understood by those skilled in the art that the embodiments of the present invention described above and illustrated in the drawings are given by way of example only and not by way of limitation, the objects of the invention having been fully and effectively achieved, the functional and structural principles of the present invention having been shown and described in the embodiments, and that various changes or modifications may be made in the embodiments of the present invention without departing from such principles.