Arrival Unix timestamp % ring buffer size    (3)

where, in equation (3):

  • the arrival Unix timestamp is the time of arrival of the first IP packet 110A, and
  • the ring buffer size may indicate the total number of data structures in the second set of data structures 120. For example, the size of the ring buffer may be 64.

The above equation may be used to select the data structure from the second set of data structures 120 for all the IP packets associated with the first set of IP packets 110, the second set of IP packets 112, and the Nth set of IP packets 114.

Consider a first example in which the first IP packet 110A, which may be the SYN packet, may arrive at a first timestamp. The first timestamp may be, for example, 1704799501. The processor 202 may employ equation (3) to determine the data structure from the one or more data structures associated with the second set of data structures 120. For example, if the ring buffer size is 64, computing equation (3) as arrival Unix timestamp % ring buffer size (1704799501 % 64) may yield 13. The determined data structure of the second set of data structures 120 will then be the 13th data structure.
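The slot selection of equation (3) may be sketched in a few lines of Python. The function name `ring_slot` and the constant `RING_BUFFER_SIZE` are illustrative assumptions and do not appear in the disclosure; the value 64 is the example ring buffer size given above.

```python
RING_BUFFER_SIZE = 64  # example size from the text; other sizes are possible


def ring_slot(arrival_unix_ts: int, ring_size: int = RING_BUFFER_SIZE) -> int:
    """Return the index of the data structure selected by equation (3)."""
    if ring_size <= 0:
        raise ValueError("ring_size must be positive")
    return arrival_unix_ts % ring_size


print(ring_slot(1704799501))  # 13
print(ring_slot(1704799502))  # 14
```

Because timestamps one second apart map to adjacent slots, a ring of 64 data structures returns to the same slot every 64 seconds.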

In another embodiment, the first IP packet 110A associated with the first set of IP packets 110 may be stored at index zero of the data structure selected from the third set of data structures 122. In the exemplary case mentioned above, the first IP packet 110A, which may be the SYN packet, may be stored in the data structure of the third set of data structures 122 corresponding to the 13th data structure of the second set of data structures 120.

Further, the second IP packet 110B, which may be the SYN-ACK packet associated with the first set of IP packets 110, may arrive at the same timestamp as the first IP packet 110A. In that case, the index at which the second IP packet 110B may be stored may be determined by adding a numeric value, for example, 12, to the length of the first IP packet 110A. Considering the first example, the SYN packet may be stored at the 0th index of the data structure of the third set of data structures 122 corresponding to the 13th data structure of the second set of data structures 120. The index within that same data structure at which the SYN-ACK packet arriving at the same timestamp may be stored may be determined by adding a numeric value, for example, 12, to the length of the SYN packet.
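The index arithmetic for two packets sharing a timestamp may be illustrated as follows. This is a minimal sketch: the name `next_index`, the constant `RECORD_OVERHEAD`, and the 1500-byte packet length are assumptions chosen for illustration, and the text does not state what the numeric value 12 represents.

```python
RECORD_OVERHEAD = 12  # example numeric value from the text; its meaning is assumed


def next_index(current_index: int, packet_length: int,
               overhead: int = RECORD_OVERHEAD) -> int:
    """Return the index at which the next packet of the same second is stored."""
    return current_index + packet_length + overhead


syn_index = 0                                    # the SYN packet sits at index zero
syn_ack_index = next_index(syn_index, 1500)      # hypothetical 1500-byte SYN packet
print(syn_ack_index)  # 1512
```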

In another embodiment, the second IP packet 110B, which may be the SYN-ACK packet associated with the first set of IP packets 110, may arrive at the second timestamp. In that case, the second IP packet 110B may be stored in a data structure of the third set of data structures 122 that is selected according to the arrival time of the second IP packet 110B.

Continuing the first example, the SYN-ACK packet may arrive at the second timestamp, which may be 1704799502. The data structure of the third set of data structures 122 where the SYN-ACK packet may be stored may be determined by computing equation (3). The processor 202 may compute equation (3) as (1704799502 % 64), and the result may be 14. The second payload information associated with the SYN-ACK packet arriving at the second timestamp may be stored in the data structure of the third set of data structures 122 corresponding to the 14th data structure of the second set of data structures 120. Further, in an embodiment, the system 102 may send an ACK (acknowledgment) packet to the server 106 acknowledging the response of the SYN-ACK packet. The processor 202 may compute equation (3) to determine the location to store the payload information associated with the ACK packet in the data structure of the third set of data structures 122.

In one embodiment, once the IP packets stored in the data structures of the third set of data structures 122 have been stored in the memory of the system 102, the value of the index of the data structures of the third set of data structures 122 may be set to zero.

In another embodiment, the first data structure 122A, the second data structure 122B, up to the Nth data structure 122N associated with the third set of data structures 122 may each have a specific capacity and time limit. In one example, the set of IP packets 108 that may have been in the first data structure 122A of the third set of data structures 122 for a pre-set time period may be dumped to the memory of the system 102 as the raw file. The raw file may be associated with each of the third set of data structures 122. This may help prevent the first data structure 122A of the third set of data structures 122 from overflowing and may ensure that the set of IP packets 108 may be continually processed. The pre-set time period may be, for example, 32 seconds.
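The capacity limit, the time limit, and the index reset may be sketched as a small buffer class. Everything here is a hypothetical stand-in: the class `SlotBuffer`, its methods, and the `sink` callback (which takes the place of writing the raw file) are assumptions for illustration; only the 32-second period comes from the text.

```python
from typing import Callable, List, Optional

FLUSH_AFTER_SECONDS = 32  # example pre-set time period from the text


class SlotBuffer:
    """Hypothetical stand-in for one data structure of the third set."""

    def __init__(self, capacity: int) -> None:
        self.data = bytearray(capacity)
        self.index = 0                              # next free position
        self.first_write_ts: Optional[int] = None   # time of the oldest stored packet

    def append(self, payload: bytes, now: int) -> bool:
        """Store payload at the current index; return False if it does not fit."""
        end = self.index + len(payload)
        if end > len(self.data):
            return False
        self.data[self.index:end] = payload
        self.index = end
        if self.first_write_ts is None:
            self.first_write_ts = now
        return True

    def due(self, now: int) -> bool:
        """True once the oldest stored packet has waited the pre-set period."""
        return (self.first_write_ts is not None
                and now - self.first_write_ts >= FLUSH_AFTER_SECONDS)

    def flush(self, sink: Callable[[bytes], None]) -> None:
        """Hand the stored bytes to the sink, then reset the index to zero."""
        sink(bytes(self.data[:self.index]))
        self.index = 0
        self.first_write_ts = None


written: List[bytes] = []
buf = SlotBuffer(capacity=4096)
buf.append(b"SYN", now=100)
if buf.due(now=132):
    buf.flush(written.append)
```

In this sketch the buffer is reused after each flush, which mirrors the index being set back to zero once the packets have been written out.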

FIG. 5 is a block diagram 500 that illustrates the structure of an internet protocol (IP) packet, in accordance with an embodiment of the disclosure. FIG. 5 is explained in conjunction with elements of FIG. 1, FIG. 2, FIG. 3, and FIG. 4. With reference to FIG. 5, there is shown the block diagram 500 of the structure of the IP packet. The IP packet may include two sections: an IP packet header 502 and an IP packet data 504. The IP packet header 502 may further include a packet length 502A, a subsequent IP packet file name 502B, and a packet offset 502C.

In one embodiment, the first IP packet 110A associated with the first set of IP packets 110 may be an example of the internet protocol (IP) packet. The IP packet may contain 14 bytes for the IP packet header 502. The IP packet header 502 may contain essential information necessary for the proper routing and delivery of the IP packet from the server 106 to the system 102. The IP packet header 502 may further contain a time-to-live (TTL) field that may ensure that the IP packet does not circulate indefinitely within the communication network 104. The IP packet header 502 may further contain protocol information that may be responsible for further processing of the data, such as a Transmission Control Protocol (TCP), and a user datagram protocol (UDP).

The IP packet header 502 may further include the packet length 502A, which may have a size of 2 bytes. The packet length 502A may specify the total length of the IP packet, including both the IP packet header 502 and the IP packet data 504. The packet length 502A may allow the system 102 to determine the IP packet size and may help differentiate between the IP packet header 502 and the IP packet data 504. Based on the packet length 502A, the system 102 may validate that the IP packet data 504 has been transmitted in its entirety and that no part of the IP packet data 504 has been lost during the transmission.

The IP packet header 502 may further include the subsequent IP packet file name 502B. The subsequent IP packet file name 502B may be, for example, of 4 bytes. In an exemplary embodiment, the subsequent IP packet file name 502B of the first IP packet 110A associated with the first set of IP packets 110 may store the file name of the second IP packet 110B associated with the first set of IP packets 110. The first set of IP packets 110 may be associated with the first network session. In one example, based on the determination of the file name of the second IP packet 110B that may be stored in the subsequent IP packet file name 502B of the first IP packet 110A, it may be easier for the user 116 associated with the system 102 to retrieve the second IP packet 110B information without needing to iterate through all the raw files.

The IP packet header 502 may further include the packet offset 502C. The packet offset 502C may be, for example, of 8 bytes. The packet offset 502C may be used to determine the location of the IP packet in the raw file. In one example, the packet offset 502C may indicate the location from where the first IP packet 110A associated with the first set of IP packets 110 begins within the raw file. Once the packet offset 502C is determined, the need for extensive data parsing may be eliminated and the process may become more efficient in high-speed network environments.

The IP packet may further include the IP packet data 504. The IP packet data 504 may be the data that may be transmitted from the server 106 to the system 102 over the communication network 104. The IP packet data 504 may allow end-to-end communication between the server 106 and the system 102. The IP packet data 504 may be text, files, images, commands, and the like. In one example, the IP packet data 504 may contain HTML content of web pages.
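The three header fields described with reference to FIG. 5 (2 bytes, 4 bytes, and 8 bytes) add up to the 14 bytes stated for the IP packet header 502. One possible binary layout may be sketched with Python's `struct` module. The byte order, the function names, the use of the Unix-timestamp portion of the next raw file name as the 4-byte field, and the zero sentinel for "NULL" are all assumptions; the disclosure does not specify an encoding.

```python
import struct
from typing import Tuple

# "<" selects little-endian with no padding (an assumed byte order):
# H = 2-byte packet length, I = 4-byte next-file field, Q = 8-byte packet offset.
HEADER_FORMAT = "<HIQ"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)
NO_NEXT_FILE = 0  # assumed sentinel standing in for "NULL"


def pack_header(packet_length: int, next_file_ts: int, packet_offset: int) -> bytes:
    """Serialize the three header fields into 14 bytes."""
    return struct.pack(HEADER_FORMAT, packet_length, next_file_ts, packet_offset)


def unpack_header(raw: bytes) -> Tuple[int, int, int]:
    """Read the three header fields back from the start of a record."""
    return struct.unpack(HEADER_FORMAT, raw[:HEADER_SIZE])


header = pack_header(1500, 1634567890, 96000)
print(HEADER_SIZE)             # 14
print(unpack_header(header))   # (1500, 1634567890, 96000)
```

A 2-byte length field limits the recorded packet length to 65535, which accommodates the example lengths used in this description.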

FIG. 6 is a block diagram 600 that illustrates the method for accessing the raw data files 602, in accordance with an embodiment of the disclosure. FIG. 6 is explained in conjunction with elements of FIG. 1, FIG. 2, FIG. 3, FIG. 4, and FIG. 5. FIG. 6 may include raw data files 602. The raw data files 602 may include a first raw file 602A, a second raw file 602B, a third raw file 602C, a fourth raw file 602D, a fifth raw file 602E, a sixth raw file 602F, a seventh raw file 602G, an eighth raw file 602H, up to an Nth raw file 602N.

In one embodiment, the raw data file 602 may be a file format and data organization method that may store the set of IP packets 108. The raw files such as the first raw file 602A, the second raw file 602B, to the Nth raw file 602N may be created based on one file per second per IP probe. In another embodiment, the first raw file 602A, the second raw file 602B, up to the Nth raw file 602N may follow a specific naming convention.

In one embodiment, the raw file may be created based on one file per second per IP probe. The IP probe may be a monitoring device (such as the system 102 or the server 106) that may be designed to capture and analyze network traffic. In one embodiment, the system 102 may be an exemplary embodiment of the monitoring device. In another embodiment, the creation of the raw file may follow a specific naming convention. For example, the name of the raw file may be in a specific format “yyyy/mm/dd/hh/mi/probe-id_unix-ts.raw”. The prototype of the path of the raw file may be “user-defined-root-path/year/month/day/hour/minutes/probe-id_unixTimestampInSeconds.raw”. In one example, the name of the raw file may be /var/lib/raw/2021/8/30/13/2/0_1630308751.raw. In another example, the name of the raw file may be /var/lib/raw/2021/8/30/13/2/0_1630308752.raw and the like.
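The naming convention may be sketched as a small path builder. The function name `raw_file_path` and its parameters are illustrative assumptions. The text does not say which time zone the directory components use; the timestamp 1630308751 corresponds to 07:32:31 UTC on 30 August 2021, so the example path with hour 13 and minute 2 is reproduced here under an assumed UTC+5:30 offset.

```python
from datetime import datetime, timedelta, timezone, tzinfo


def raw_file_path(root: str, probe_id: int, unix_ts: int, tz: tzinfo) -> str:
    """Build root/year/month/day/hour/minutes/probe-id_unix-ts.raw without zero padding."""
    dt = datetime.fromtimestamp(unix_ts, tz=tz)
    return (f"{root}/{dt.year}/{dt.month}/{dt.day}/{dt.hour}/{dt.minute}/"
            f"{probe_id}_{unix_ts}.raw")


ASSUMED_TZ = timezone(timedelta(hours=5, minutes=30))  # assumption, not stated in the text

print(raw_file_path("/var/lib/raw", 0, 1630308751, ASSUMED_TZ))
# /var/lib/raw/2021/8/30/13/2/0_1630308751.raw
```

Since the file name carries the timestamp in seconds, two packets captured by the same probe in the same second resolve to the same raw file.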

In yet another embodiment, the set of IP packets 108 that may be captured by the system 102 may be segmented based on the time of arrival. The system 102 may timestamp the first IP packet 110A and the second IP packet 110B associated with the first set of IP packets 110 to record the time of arrival of the first set of IP packets 110. Based on the time of arrival of the first IP packet 110A and the second IP packet 110B, the first IP packet 110A and the second IP packet 110B may be stored in the raw files. The raw data files 602 may be associated with the third set of data structures 122. For example, the first raw file 602A may be associated with the first data structure 122A of the third set of data structures 122. Similarly, the second raw file 602B may be associated with the second data structure 122B of the third set of data structures 122.

In one exemplary embodiment, the subsequent IP packet file name 502B associated with the IP packet header 502 of the first IP packet 110A associated with the first set of IP packets 110 in the first raw file 602A may contain the file name of the raw file that stores the second IP packet 110B associated with the first set of IP packets 110. The second IP packet 110B associated with the first set of IP packets 110 may be stored in the fifth raw file 602E. The subsequent IP packet file name 502B associated with the IP packet header 502 of the Nth IP packet 110N associated with the first set of IP packets 110 may contain “NULL”, as the Nth IP packet 110N may be the last packet of the first set of IP packets 110.

Traditionally, writing the set of IP packets 108 to their corresponding data file may not be possible. There may have been a large number of input/output operations performed on the storage disk. Various software may have been used to write IP packets to the PCAP files based on the time of arrival and the number of IP packets. The PCAP files may not contain the session metadata. In one condition, when the user 116 associated with the system 102 may want to retrieve the IP packets associated with the first set of IP packets 110, the user 116 may have to parse all the PCAP files of all the network sessions sequentially, starting from the first file, in order to retrieve the files containing IP packets associated with the first set of IP packets 110, where the first set of IP packets 110 may be associated with the first network session of the set of network sessions. The traditional technique may increase the time complexity to O(n).

In the disclosed IP packet indexing technique, the IP packet header 502 may contain the information of the next IP packet that may belong to the same network session. The disclosed technique of IP packet indexing may make it easier for the user 116 associated with the system 102 to retrieve the IP packets associated with the same network session.

In an exemplary embodiment, where the user 116 associated with the system 102 may want to retrieve the first IP packet 110A and second IP packet 110B associated with the first set of IP packets 110, the system 102 may determine the location of the first IP packet 110A in the first data structure of the first set of data structures 118. Further, the user 116 may determine the location of the second IP packet 110B in the subsequent IP packet file name 502B associated with the IP packet header 502 of the first IP packet 110A. For example, the first IP packet 110A may be stored in the first raw file 602A and the second IP packet 110B may be stored in the fifth raw file 602E.

In an exemplary case, the system 102 may receive the first set of IP packets 110, where the first set of IP packets 110 may include the SYN packet, the SYN-ACK packet, the ACK packet, and a FIN packet. The first set of IP packets 110 may be associated with the first network session. The FIN packet may be the last packet received from the server 106. The FIN packet may be sent to close a connection between the server 106 and the system 102. The SYN packet and the SYN-ACK packet may be received at the first timestamp. The ACK packet may be received at the second timestamp, 1 second after the arrival of the SYN packet and the SYN-ACK packet, and the FIN packet may be received at the third timestamp, 1 second after the arrival of the ACK packet. The SYN packet and the SYN-ACK packet may be stored in the first raw file 602A. The name of the first raw file 602A may be, for example, 0_1634567890.raw. Similarly, the ACK packet may be stored in the second raw file 602B. The name of the second raw file 602B may be, for example, 0_1634567891.raw. Further, the FIN packet may be stored in the third raw file 602C. The name of the third raw file 602C may be, for example, 0_1634567892.raw. The IP packet header 502 of the SYN packet may include the packet length 502A that may be, for example, 1500 bytes. Further, the IP packet header 502 of the SYN packet may include the SYN-ACK packet file name, which may be 0_1634567890.raw. The IP packet header 502 of the SYN packet may further include the packet offset 502C, which may be, for example, 96000.

Further, the IP packet header 502 of the SYN-ACK packet may include packet length 502A which may be, for example, 1400 bytes. Further, the IP packet header 502 of the SYN-ACK packet may include the ACK packet file name which may be 0_1634567891.raw. The IP packet header 502 of the SYN-ACK packet may further include the packet offset 502C which may be, for example, 16000.

The IP packet header 502 of the ACK packet may include the packet length 502A that may be, for example, 1600 bytes. Further, the IP packet header 502 of the ACK packet may include the FIN packet file name, which may be 0_1634567892.raw. The IP packet header 502 of the ACK packet may further include the packet offset 502C, which may be, for example, 4000.

Further, the IP packet header 502 of the FIN packet may include the packet length 502A that may be, for example, 2000 bytes. Further, the IP packet header 502 of the FIN packet may include the subsequent IP packet file name 502B, which may be NULL, as the FIN packet may be the last packet received for the first network session. The IP packet header 502 of the FIN packet may further include the packet offset 502C that may be, for example, 0.

In an exemplary embodiment, in a case where the user 116 may want to retrieve the packets associated with the first network session, the processor 202 may enable the user 116 to retrieve the filename of the SYN packet that may be stored in the first set of data structures 118. The filename of the SYN-ACK packet may be retrieved from the IP packet header 502 of the SYN packet. Further, the filename of the ACK packet may be retrieved from the IP packet header 502 of the SYN-ACK packet. The filename of the FIN packet may be retrieved from the IP packet header 502 of the ACK packet.
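The retrieval walk may be sketched by modelling the three example raw files in memory and following the subsequent-file-name field from header to header. This is a simplified illustration: the `Header` type, the `RAW_FILES` table, and the per-file cursor that reads a session's headers in stored order are assumptions, since the text does not spell out how the next packet is located inside the named file. The header values are the example values given above.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Header(NamedTuple):
    packet_length: int
    next_file: Optional[str]  # subsequent IP packet file name 502B; None stands for "NULL"
    packet_offset: int


# Headers of the first network session, grouped by the raw file that stores them.
RAW_FILES: Dict[str, List[Header]] = {
    "0_1634567890.raw": [Header(1500, "0_1634567890.raw", 96000),   # SYN
                         Header(1400, "0_1634567891.raw", 16000)],  # SYN-ACK
    "0_1634567891.raw": [Header(1600, "0_1634567892.raw", 4000)],   # ACK
    "0_1634567892.raw": [Header(2000, None, 0)],                    # FIN
}


def walk_session(first_file: str) -> Tuple[List[Header], List[str]]:
    """Follow next_file links from the first packet; return headers and files touched."""
    cursor = {name: 0 for name in RAW_FILES}  # next unread session header per file
    headers: List[Header] = []
    files: List[str] = []
    current: Optional[str] = first_file
    while current is not None:
        header = RAW_FILES[current][cursor[current]]
        cursor[current] += 1
        headers.append(header)
        if current not in files:
            files.append(current)
        current = header.next_file
    return headers, files


session_headers, session_files = walk_session("0_1634567890.raw")
print(session_files)
# ['0_1634567890.raw', '0_1634567891.raw', '0_1634567892.raw']
```

Only the three files that hold packets of the session are opened in this sketch; no other raw file is scanned.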

Following the above technique, the user 116 associated with the system 102 may just have to parse the raw data files 602 which may contain the first IP packet 110A and the second IP packet 110B associated with the first set of IP packets 110. The above-mentioned technique may bring down the time complexity to O(1). Therefore, the raw data files 602 may allow efficient retrieval of the set of IP packets 108 from the set of network sessions. Instead of searching through all the raw files associated with the indexed raw data files 602, the user 116 may retrieve the IP packets by accessing only the concerned data files. The disclosed system 102 may provide an optimized and efficient technique for indexing and storing the IP packets, in order to streamline the network packet management process for network infrastructure.

FIG. 7 is a flowchart that illustrates an exemplary method for indexing IP packets, in accordance with an embodiment of the disclosure. FIG. 7 is explained in conjunction with elements from FIGS. 1, 2, 3, 4, 5, and 6. With reference to FIG. 7, there is shown the flowchart 700. The operations of the exemplary method may be executed by any computing system, for example, by the system 102 of FIG. 1 or the processor 202 of FIG. 2. The operations of the flowchart 700 may start at 702.

At 702, the first internet protocol (IP) packet 110A associated with the first network session of the set of network sessions is received at the first timestamp. In an embodiment, the processor 202 may be configured to receive the first internet protocol (IP) packet associated with the first network session of a set of network sessions at the first timestamp. Details of receiving the first IP packet 110A are provided in FIG. 1.

At 704, the first metadata associated with the first network session is determined. In an embodiment, the processor 202 may be configured to determine the first metadata associated with the first network session. The first metadata includes the first set of data structures 118 associated with the received first IP packet. Details of determining the first metadata are provided in FIG. 2.

At 706, using the second set of data structures 120, the first identifier associated with the first data structure of the third set of data structures 122 is determined. In an embodiment, the processor 202 may be configured to determine, using the second set of data structures 120, the first identifier associated with the first data structure of the third set of data structures 122.

At 708, the first payload information associated with the first IP packet in the first data structure of the third set of data structures 122 is stored based on the determined first identifier. In an embodiment, the processor 202 may be configured to store the first payload information associated with the first IP packet in the first data structure of the third set of data structures 122 based on the determined first identifier. Details of storing the first payload information are provided in FIG. 4.

At 710, the second IP packet associated with the first network session is received at the second timestamp. In an embodiment, the processor 202 may be configured to receive the second IP packet associated with the first network session at the second timestamp. Details of receiving the second IP packet 110B are provided in FIG. 1.

At 712, the second payload information associated with the received second IP packet is stored. In an embodiment, the processor 202 may be configured to store the second payload information associated with the received second IP packet in one of the first data structure 122A or the second data structure 122B of the third set of data structures 122 based on the first timestamp and the second timestamp. Details of storing the second IP packet 110B are provided in FIG. 4.
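The operations 702 to 712 may be tied together in a compact sketch. The class `PacketIndexer`, its attributes, and the string session identifier are hypothetical; in particular, the first set of data structures 118 is reduced here to a dictionary that records the arrival timestamp of each session's first packet, and each data structure of the third set is reduced to a Python list. Only the ring size of 64 and the example timestamps come from the text.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Tuple

RING_SIZE = 64  # example ring buffer size from the text


class PacketIndexer:
    """Hypothetical stand-in for the flow of operations 702 to 712."""

    def __init__(self) -> None:
        # Stand-in for the first metadata: session id -> timestamp of its first packet.
        self.session_meta: Dict[str, int] = {}
        # Stand-in for the third set of data structures: slot -> stored payloads.
        self.buffers: DefaultDict[int, List[Tuple[str, bytes]]] = defaultdict(list)

    def receive(self, session_id: str, arrival_ts: int, payload: bytes) -> int:
        """Store one packet's payload and return the slot that was chosen."""
        # 704: record metadata the first time the session is seen.
        self.session_meta.setdefault(session_id, arrival_ts)
        # 706: derive the identifier of the target buffer from the timestamp.
        slot = arrival_ts % RING_SIZE
        # 708 and 712: store the payload in the selected buffer.
        self.buffers[slot].append((session_id, payload))
        return slot


indexer = PacketIndexer()
first_slot = indexer.receive("session-1", 1704799501, b"SYN")
second_slot = indexer.receive("session-1", 1704799502, b"SYN-ACK")
print(first_slot, second_slot)  # 13 14
```

A second packet arriving at the same timestamp as the first would land in the same buffer, while one arriving a second later lands in the next buffer, which matches the choice between the first and second data structures at 712.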

Accordingly, blocks of the flowchart 700 support combinations of means for performing the specified functions and combinations of operations for performing the specified functions. It will also be understood that one or more blocks of the flowchart 700, and combinations of blocks in the flowchart 700, can be implemented by special-purpose hardware-based computer systems which perform the specified functions, or combinations of special-purpose hardware and computer instructions.

Alternatively, the system 102 may comprise means for performing each of the operations described above. In this regard, according to an example embodiment, examples of means for performing operations may comprise, for example, the processor and/or a device or circuit for executing instructions or executing an algorithm for processing information as described above.

Various embodiments of the disclosure may provide a non-transitory computer-readable medium and/or storage medium having stored thereon, instructions executable by a machine and/or a computer to operate a system (e.g., the system 102) to index internet protocol (IP) packets. The instructions may cause the machine and/or computer to perform operations including receiving, at a first timestamp, a first IP packet associated with a first network session of a set of network sessions. The operations may further include determining first metadata associated with the first network session. The first metadata may include a first set of data structures 118 associated with the received first IP packet 110A. The operations may further include determining, using a second set of data structures 120, a first identifier associated with a first data structure of a third set of data structures 122. The operations may further include storing a first payload information associated with the first IP packet 110A in the first data structure 122A of the third set of data structures 122 based on the determined first identifier. The operations may further include receiving a second IP packet 110B associated with the first network session at a second timestamp. The operations may further include storing a second payload information associated with the received second IP packet 110B in one of the first data structure 122A or a second data structure 122B of the third set of data structures 122 based on the first timestamp and the second timestamp.

Many modifications and other embodiments of the inventions set forth herein will come to mind to one skilled in the art to which these inventions pertain having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the inventions are not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Moreover, although the foregoing descriptions and the associated drawings describe example embodiments in the context of certain example combinations of elements and/or functions, it should be appreciated that different combinations of elements and/or functions may be provided by alternative embodiments without departing from the scope of the appended claims. In this regard, for example, different combinations of elements and/or functions than those explicitly described above are also contemplated as may be set forth in some of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.