Statistical Data Handling Program of Wireshark Analyzer and Incoming Traffic Research

The identification of the distribution laws of intervals is a particularly difficult problem, and at the same time traffic, as a random process, is constantly changing. Therefore it is important to know the numerical characteristics of these intervals, that is, their moments. In this paper we propose to use the Wireshark analyzer to determine such characteristics. The paper presents a plugin for the Wireshark traffic analyzer that calculates the moments of a random variable, the interval between packets of incoming traffic. The article also presents an analytical solution for the average waiting time in an H2/M/1 queuing system (QS), where H2 is the second-order hyperexponential distribution law of the input flow time intervals. The final result is obtained as a solution of Lindley's integral equation using the method of spectral decomposition. It is shown that in this case the distribution laws of the intervals between input flow requirements can be approximated at the level of their first three moments. The joint use of these results makes it possible to fully analyze the incoming traffic by queuing methods. The obtained results show that the classical M/M/1 system gives optimistic results in comparison with the considered system. Therefore, the approach can be successfully applied in modern teletraffic theory, where packet delays in the incoming traffic are significant.


Introduction
The identification of the distribution laws of intervals is a particularly difficult problem, and at the same time traffic, as a random process, is constantly changing. It is known that queuing theory is based on the distribution laws of the intervals between incoming requirements and of their service times. Therefore it is important to know the numerical characteristics of these intervals, that is, their moments. In this paper we propose to use the Wireshark analyzer to determine such characteristics [[1]].

Description of the program Wireshark
Wireshark (formerly Ethereal) is a traffic analyzer for Ethernet and some other computer networking technologies. In June 2006 the project was renamed Wireshark due to trademark issues [[1]]. The functionality provided by Wireshark is very similar to that of the tcpdump program, but Wireshark has a graphical user interface and additional features for sorting and filtering information. The program allows the user to view all the traffic passing through the network in real time by switching the network card to promiscuous mode (Fig. 1). Wireshark can display the structure of a wide variety of network protocols and therefore allows network packets to be parsed, showing the value of each protocol field at any level. Because Wireshark relies on the Pcap packet capture library, it can capture data only from networks supported by that library. However, Wireshark can read multiple input data formats and open data files captured by other programs, which extends its capture capabilities. The features include:
• deep analysis of hundreds of protocols, with new ones added regularly;
• capture of network traffic in real time, followed by analysis at any time;
• standard three-pane packet browser;
• cross-platform support: there are versions for most types of UNIX, including Linux, Solaris, FreeBSD, NetBSD, OpenBSD, Mac OS X, as well as for Windows;
• viewing of the captured network data through the graphical user interface or the TTY-mode utility TShark;
• the most powerful sorting and filtering in the industry;
• rich VoIP analysis;
• export of output data to XML, PostScript®, CSV, or plain text.
CSV is one of the data export formats that is convenient for viewing (Fig. 2). Such a file can be opened in any text editor or spreadsheet editor for analysis and calculation of performance characteristics.
However, with intense traffic it is difficult to process the data even in a spreadsheet editor. Furthermore, the traffic data may be stored in more than one file. This article describes a software solution for calculating the numerical characteristics of packet arrival intervals. The main advantage of this analyzer is that it works on a small time scale (microseconds), in contrast to similar programs such as NetFlow Analyzer, which records a packets-per-minute rate.

Determination of the moments of the interarrival time of incoming traffic
The program developed by the authors of the present paper makes it possible, in addition to the analyzer, to retrieve the packet arrival times and to isolate the incoming traffic from the entire data set captured by Wireshark. Next, the moment characteristics of the intervals are determined using well-known formulas of mathematical statistics. We use statistics up to the third order, which give an idea of the distribution of the intervals. For example, the coefficient of variation shows how far the traffic departs from a Poisson flow, and together with the asymmetry it indicates the weight of the distribution tails. The characteristics computed are the average value of the interval between adjacent packets, the second initial moment, the coefficient of variation, and the asymmetry.
If a large amount of data is divided into several blocks, these characteristics are first determined for each block, and then their mean values are taken.
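The calculation described above can be illustrated with a minimal Python sketch. It is not the authors' program: the function names `interval_moments` and `block_averaged` are hypothetical, and the sketch assumes the usual sample definitions (mean, second initial moment, coefficient of variation as the ratio of the standard deviation to the mean, asymmetry as the third central moment divided by the cube of the standard deviation).

```python
import math

def interval_moments(intervals):
    """Sample characteristics of inter-packet intervals (assumed standard definitions)."""
    n = len(intervals)
    if n < 2:
        raise ValueError("at least two intervals are required")
    mean = sum(intervals) / n                             # average interval
    m2 = sum(t * t for t in intervals) / n                # second initial moment
    mu2 = sum((t - mean) ** 2 for t in intervals) / n     # second central moment
    mu3 = sum((t - mean) ** 3 for t in intervals) / n     # third central moment
    cv = math.sqrt(mu2) / mean                            # coefficient of variation
    skew = mu3 / mu2 ** 1.5                               # asymmetry
    return {"mean": mean, "m2": m2, "cv": cv, "skew": skew}

def block_averaged(blocks):
    """Compute the characteristics per block, then average them over the blocks."""
    per_block = [interval_moments(block) for block in blocks]
    keys = per_block[0].keys()
    return {k: sum(r[k] for r in per_block) / len(per_block) for k in keys}

if __name__ == "__main__":
    # Illustrative numbers only, not measured traffic.
    print(interval_moments([1.0, 1.0, 1.0, 5.0]))
    print(block_averaged([[1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0, 5.0]]))
```

For exponentially distributed intervals (a Poisson flow) these definitions give a coefficient of variation of 1 and an asymmetry of 2, which is the reference point against which measured traffic is compared.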

Time data analysis software and Results
To calculate the moments of the interval between adjacent packets, we developed a program that selects only the data related to inbound packets from the input file containing the captured network traffic and calculates the intervals and moments. Its features include:
• selection of the arrival times of data packets received by the specified host;
• calculation of the time intervals between the incoming packets;
• calculation of the moment characteristics of the intervals between received packets;
• saving of the packet arrival times in binary and text formats;
• saving of the packet arrival intervals in binary and text formats;
• output and saving of the moment characteristics in text format.
The program handles text files containing data as shown in Fig. 2 or similar. Two classes (in terms of object-oriented programming) were developed for the program:
• TrafficLogParams stores the packet arrival times and their intervals and calculates the moment characteristics. It also provides methods to save the data to files and load it from them;
• LogParser is a static class that analyzes the input file and adds data to the TrafficLogParams class.
The input of the main LogParser method is the file name and the IP address of the host. Each line of the source file is processed, and the time and two IP addresses, those of the sender and the recipient, are extracted from it. If the recipient field matches the host IP address, the packet arrival time is added to the array of such times in the TrafficLogParams class. The method checks whether the input symbol is a separator, "." or ",". Such testing is important only for the time data, since in some countries (for example, in Russia) the fractional part is separated by a comma rather than a point. For this reason, the standard method of the programming language, whose behavior depends on the regional settings, is not used to convert the string representation of a number to the real number denoting the time; a modified version is used instead.
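The selection of incoming packets and the separator handling can be sketched as follows. This is an illustrative Python sketch, not the authors' LogParser: the column names "Time" and "Destination" are assumed to match a Wireshark CSV export like the one in Fig. 2, and the function names are hypothetical. Addresses are compared here as plain strings.

```python
import csv

def parse_time(text):
    """Convert a time string to float, accepting '.' or ',' as the decimal separator."""
    return float(text.strip().replace(",", "."))

def incoming_arrival_times(lines, host_ip):
    """Collect arrival times of packets whose recipient is host_ip.

    `lines` is any iterable of CSV text lines with a header row
    (assumed columns: "Time" and "Destination").
    """
    times = []
    for row in csv.DictReader(lines):
        if row["Destination"].strip() == host_ip:
            times.append(parse_time(row["Time"]))
    return times

def arrival_intervals(times):
    """Intervals between adjacent arrival times."""
    return [later - earlier for earlier, later in zip(times, times[1:])]

if __name__ == "__main__":
    # Hypothetical sample lines; real input would come from open(path, newline="").
    sample = [
        '"No.","Time","Source","Destination"',
        '"1","0,000000","10.0.0.2","10.14.0.11"',
        '"2","0,000250","10.14.0.11","10.0.0.2"',
        '"3","0.001000","10.0.0.3","10.14.0.11"',
    ]
    times = incoming_arrival_times(sample, "10.14.0.11")
    print(times, arrival_intervals(times))
```

Replacing the comma explicitly, rather than relying on locale-aware conversion, keeps the result the same regardless of the regional settings of the machine that runs the analysis.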
When the IP address of the host is compared with the IP address on the current line of the log file, both addresses are first reduced to a common form. In other words, the IP address 010.014.000.011 is considered equal to 10.14.0.11. The program was used to analyze a data file of the traffic coming to the proxy server of the university, with a data set almost an hour long. The input file contains more than 2150000 rows, which could not be processed manually. The following results were obtained (Fig. 3).
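The reduction of IP addresses to a common form mentioned in this section can be sketched in Python as shown below. The helper names `normalize_ipv4` and `same_host` are hypothetical, and the sketch assumes dotted IPv4 addresses whose octets may carry leading zeros.

```python
def normalize_ipv4(text):
    """Reduce a dotted IPv4 address to a form without leading zeros."""
    parts = text.strip().split(".")
    if len(parts) != 4:
        raise ValueError("expected four octets: %r" % text)
    octets = [int(part, 10) for part in parts]   # "010" -> 10, "000" -> 0
    if any(octet < 0 or octet > 255 for octet in octets):
        raise ValueError("octet out of range: %r" % text)
    return ".".join(str(octet) for octet in octets)

def same_host(address_a, address_b):
    """Compare two addresses after normalization."""
    return normalize_ipv4(address_a) == normalize_ipv4(address_b)

if __name__ == "__main__":
    print(normalize_ipv4("010.014.000.011"))
    print(same_host("010.014.000.011", "10.14.0.11"))
```

The octets are parsed explicitly as decimal integers so that zero-padded values are not rejected or misread as another base.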

Research of queuing system H2/M/1
The data indicate that the analyzed traffic differs from a Poisson flow (coefficient of variation c = 3.43 instead of 1), and the asymmetry value As = 10.25 indicates that the distribution of intervals between the packets belongs to the heavy-tailed distributions. For comparison, As = 2 for a Poisson flow. Calculating the characteristics of such traffic requires an appropriate mathematical apparatus. For the analysis of such traffic the authors of [[2]] proposed new results for the H2/M/1 system. We describe the basic results of that article here. It is known, for example from [[3]], that Lindley's integral equation (1) is used to study G/G/1 queuing systems (QS). To solve (1), a spectral method is used, which reduces to taking an expression and representing it as a product of two factors that would give a rational function of s [[3]]. Thus, to find the waiting time distribution, a spectral decomposition is used. Expression (2) is then defined for the distributions (5) and (6) using (7).
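For orientation, the relations named above can be written in their standard textbook form. The notation below is an assumption and may not match the paper's numbered expressions: W(y) is the waiting time distribution function, C(u) is the distribution function of the difference between a service time and an interarrival time, A*(s) and B*(s) are the Laplace transforms of the interarrival and service time densities, p, λ1, λ2 are the H2 parameters, and μ is the service rate.

```latex
% Lindley's integral equation for the G/G/1 waiting time distribution
W(y) = \int_{-\infty}^{y} W(y-u)\, dC(u), \quad y \ge 0; \qquad W(y) = 0, \quad y < 0.

% Spectral decomposition: the left-hand side is factored into a ratio of two functions
A^{*}(-s)\, B^{*}(s) - 1 = \frac{\psi_{+}(s)}{\psi_{-}(s)}.

% H2 input flow and exponential service (system H2/M/1)
a(t) = p\,\lambda_1 e^{-\lambda_1 t} + (1-p)\,\lambda_2 e^{-\lambda_2 t}, \qquad
b(t) = \mu e^{-\mu t},

A^{*}(s) = p\,\frac{\lambda_1}{s+\lambda_1} + (1-p)\,\frac{\lambda_2}{s+\lambda_2}, \qquad
B^{*}(s) = \frac{\mu}{s+\mu}.
```

With these transforms the left-hand side of the decomposition is a rational function of s, which is what makes the factorization tractable for H2/M/1.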

Practical use of the results
Consider the result (10) for the example of an input distribution with a heavy tail (Fig. 3).
Using the Laplace transform (7) we can determine the initial moments of the distribution (5):
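Assuming that distribution (5) is the H2 density with parameters p, λ1, λ2, differentiating its Laplace transform at s = 0 gives the first three initial moments p/λ1 + (1-p)/λ2, 2[p/λ1² + (1-p)/λ2²] and 6[p/λ1³ + (1-p)/λ2³]. The Python sketch below evaluates them and, for comparison with M/M/1, computes the mean waiting time from the classical GI/M/1 result W = σ/(μ(1-σ)), where σ is the root of σ = A*(μ(1-σ)) in (0, 1). This classical relation is used here in place of the paper's expression (10); the function names and the parameter values are hypothetical and do not come from the measured traffic.

```python
def h2_moments(p, lam1, lam2):
    """First three initial moments of the H2 distribution."""
    q = 1.0 - p
    m1 = p / lam1 + q / lam2
    m2 = 2.0 * (p / lam1 ** 2 + q / lam2 ** 2)
    m3 = 6.0 * (p / lam1 ** 3 + q / lam2 ** 3)
    return m1, m2, m3

def h2_laplace(s, p, lam1, lam2):
    """Laplace transform of the H2 density."""
    return p * lam1 / (s + lam1) + (1.0 - p) * lam2 / (s + lam2)

def mean_wait_h2m1(p, lam1, lam2, mu):
    """Mean waiting time in H2/M/1 from the classical GI/M/1 root equation."""
    m1, _, _ = h2_moments(p, lam1, lam2)
    if 1.0 / (m1 * mu) >= 1.0:
        raise ValueError("system load must be below 1")
    lo, hi = 0.0, 1.0 - 1e-9          # the root sigma lies strictly inside (0, 1)
    for _ in range(200):              # bisection on A*(mu(1 - sigma)) - sigma
        mid = 0.5 * (lo + hi)
        if h2_laplace(mu * (1.0 - mid), p, lam1, lam2) - mid > 0.0:
            lo = mid
        else:
            hi = mid
    sigma = 0.5 * (lo + hi)
    return sigma / (mu * (1.0 - sigma))

def mean_wait_mm1(lam, mu):
    """Mean waiting time in M/M/1."""
    rho = lam / mu
    return rho / (mu * (1.0 - rho))

if __name__ == "__main__":
    p, lam1, lam2, mu = 0.9, 2.0, 0.2, 2.0    # illustrative values only
    m1, m2, m3 = h2_moments(p, lam1, lam2)
    print("moments:", m1, m2, m3)
    print("H2/M/1 wait:", mean_wait_h2m1(p, lam1, lam2, mu))
    print("M/M/1 wait: ", mean_wait_mm1(1.0 / m1, mu))
```

For the same mean arrival rate, the H2 input with a coefficient of variation above 1 yields a longer mean wait than M/M/1, which is consistent with the statement that M/M/1 gives optimistic results.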