Effective Use of Resources of a Distributed Cloud Computing Platform for Providing Quality Multimedia Services

Existing approaches to the use of cloud computing resources are not efficient. Modern multimedia services require significant computing power, which is not always available. In this paper, we introduce an approach that allows more efficient use of limited resources by dynamically scheduling the distribution of data flows at several levels: between physical computing nodes, virtual machines, and multimedia applications. … of the developed cloud system simulator.


Introduction
Information flows between computing nodes in local and global networks have been steadily increasing each year. This is true not only for large data processing centers but also for local data centers (DCs) specializing in industry, the economy, health care and so on. Education is an important area of use for local DCs. Universities are increasingly using their own DCs to support integrated automated information systems (IAIS) that provide end users with network multimedia services. The need for more resources is one of the problems of highly loaded IAIS: resource consumption grows exponentially, unlike the available volumes [5]. Analysis of request flows to IAIS services shows that their structure is heterogeneous [1]. Modern IAIS services are based on the concept of cloud computing. However, the problem of the limited resources available to cloud systems remains relevant [4].

Virtualization and cloud computing make it possible to consolidate several online services on virtual machines (VMs), which reduces the number of physical servers. However, to deploy applications on VMs effectively, it is necessary to solve the problem of resource planning based on variable loads and the service level agreement (SLA) [3]. The most flexible cloud computing architecture is infrastructure as a service (IaaS). This architecture allows the user to control a pool of computing resources, which can include starting operating systems and applications and creating virtual machines and networks. Thus, cloud computing leads to significant cost savings due to the increased load density [2]. However, this is not enough to consolidate computing power, reduce the infrastructure overheads and reach optimal performance of cloud systems. To use the cloud infrastructure effectively, new methods and algorithms should be developed to control the components of cloud systems. This requires determining the formal structure of a cloud system [6].

Model of resource virtualization of cloud systems
In our research, we have developed a model of the computing resources of cloud systems. The concept of virtualization of computing resources is based on abstractions representing tuples of relations between the interconnected elements of subsets. The cloud system can be represented as a set of interconnected objects: computing nodes ($S_{node}$), system storages ($S_{stg}$), network attached storages ($S_{nas}$) and scheduling servers ($S_{rasp}$). The number of objects and the content of each set may vary depending on the cloud's size and its use. Each compute node can run multiple instances of virtual machines, represented as a set:

$$S_{node_i} = \{VM_{i,1}, VM_{i,2}, \ldots, VM_{i,k}\}, \quad (1)$$

where $k$ is the number of virtual machines on compute node $i$, $i = 1, \ldots, l$ ($l$ is the number of nodes). Each virtual machine belonging to the set (1) can support several applications and services represented as a set:

The work of the entire cloud system is performed using the planning system for certain operations defined by the scheduling servers:

$$S_{rasp} = \{Rtask_1, Rtask_2, \ldots, Rtask_f\}. \quad (5)$$

The distributed storage system usually consists of failover RAID arrays $S_{stg_f} = \{RDsik_1, RDsik_2, \ldots, RDsik_d\}$ containing the information for multimedia services:

$$RDsik_d = \{Data_1, Data_2, \ldots, Data_s\}. \quad (6)$$

In addition, the cloud system contains virtual and physical switches for interconnection between all the components in a network. Each component of a cloud system $S_{hcn} = \{S_{node}, S_{nas}, S_{rasp}, S_{stg}, VM, \ldots\}$ has the following characteristics:
- $State \in \{\text{"on"}, \text{"off"}\}$ is the state of the component;
- $Mem \in \mathbb{N}$ is the size of RAM;
- $Disk \in \mathbb{N}$ is the disk capacity for storage;
- $Diskn \in \mathbb{N}$ is the number of storage devices;
- $Core \in \mathbb{N}$ is the number of processor cores;
- $Lan \in \mathbb{N}$ is the maximum bandwidth of the network adapter.
The set of virtual machines can be divided into subsets $VM_{node} = \{S_{node}, S_{nas}, S_{stg}, \ldots\}$ to isolate the computing resources of different services from each other.
The cloud system is a dynamic object that changes over time $t$. Its state can be formalized as a directed graph in which $Node(t) = \{Node_1, Node_2, \ldots\}$ are the active elements included in one of the sets $S_{node_i}$, $S_{stg_j}$, $S_{nas_k}$, $S_{rasp_m}$; $Connect(t) = \{Connect_1, Connect_2, \ldots\}$ are the active user connections to the virtualized applications; and $App(t) = \{App_1, App_2, \ldots, App_n\}$ are the active instances of applications running on virtual resources. This determines the structure of a cloud system and the mechanisms by which its components interact. In such a system, simultaneously servicing heterogeneous user requests is not a trivial task.
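The component model above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class names, the `free_mem` and `active_apps` helpers, and the measurement units are assumptions made for illustration, while the field names follow the characteristics State, Mem, Disk, Diskn, Core and Lan listed in the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Component:
    """Characteristics shared by every cloud component (node, storage, VM)."""
    name: str
    state: str = "off"   # State: "on" or "off"
    mem: int = 0         # Mem: RAM size (unit assumed: MB)
    disk: int = 0        # Disk: storage capacity (unit assumed: GB)
    diskn: int = 0       # Diskn: number of storage devices
    core: int = 0        # Core: number of processor cores
    lan: int = 0         # Lan: maximum adapter bandwidth (unit assumed: Mbit/s)


@dataclass
class VirtualMachine(Component):
    # Applications and services supported by this virtual machine.
    apps: List[str] = field(default_factory=list)


@dataclass
class ComputeNode(Component):
    # Virtual machines hosted on this node, as in set (1).
    vms: List[VirtualMachine] = field(default_factory=list)

    def free_mem(self) -> int:
        """RAM not yet claimed by running virtual machines."""
        return self.mem - sum(vm.mem for vm in self.vms if vm.state == "on")


def active_apps(nodes: List[ComputeNode]) -> List[str]:
    """App(t): application instances on running VMs of running nodes."""
    return [app
            for node in nodes if node.state == "on"
            for vm in node.vms if vm.state == "on"
            for app in vm.apps]
```

A scheduler built on such a structure could, for example, query `free_mem()` on every running node before placing a new virtual machine.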
To optimize the mechanism of access to information system resources, it is necessary to analyze the main data flows transferred within the cloud system.

Model of data flows in highload information systems based on cloud computing
For the flow analysis in our study, we used the information systems of educational institutions. The most popular multimedia services were determined for the analysis. The research considered distance education systems (DES) consisting of different interactive applications.

We have built a level classification of the applications:
- Level 1: the subsystem for monitoring the students' knowledge in real time;
- Level 2: the subsystem of the electronic library;
- Level 3: the subsystem of webcasts and webinars.
In our study, we have determined the general features of the use of the local DC's equipment:
- the load on the key resources is periodic and irregular;
- requests to multiple types of resources come at the same time;
- load distribution is not optimal, which results in loss of service at peak loads;
- up to 90% of the load is predetermined, as pre-registration is used for access to resources;
- up to 70% of the load arises due to multimedia educational resources.
Information flows at each level have their own characteristics. The intensity of servicing request flows in the information system depends on the target application level. In the study, we use statistical analysis of the load on the most popular applications used in the information systems of the university. Evaluating the timing of requests to various applications makes it possible to forecast flows and ensure efficient allocation of resources. We use Pearson's chi-square goodness-of-fit test to test hypotheses about the distribution laws of the incoming request flow. In general, the intensity of the incoming flow and of request servicing for each class of applications is determined by a distribution function, which follows these distribution laws:
- for level 1, the chi-squared distribution;
- for level 2, the Weibull distribution;
- for level 3, the Pareto distribution.
Flows of data transmitted in the IAIS are usually processed in several phases. In each phase, several similar elements can be used to provide balancing and load sharing between the components of the information system. The number of components in each phase depends on the functionality of the information system and the number of applications it includes. Suppose an information system consists of components $S_i$, $i = 1, \ldots, r$, each of which processes data on the basis of the incoming flow of user requests, where $r$ is the total number of components of the information system. The number of phases $f$ in the flow path of user requests in an information system depends on its architecture.
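The goodness-of-fit check mentioned above can be sketched with the Python standard library alone. This is an illustration under stated assumptions, not the authors' procedure: the sample is synthetic, the Weibull scale and shape (2.0 and 1.5) are hypothetical and treated as known rather than estimated, and the helper names are invented for this example. Bins are chosen to be equiprobable under the hypothesized distribution, so every expected count is the same.

```python
import math
import random
from typing import Callable, List, Sequence


def pearson_statistic(observed: Sequence[float], expected: Sequence[float]) -> float:
    """Pearson chi-square statistic: sum of (O - E)^2 / E over all bins."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def equiprobable_counts(samples: Sequence[float],
                        cdf: Callable[[float], float],
                        bins: int) -> List[int]:
    """Count samples per bin, where bins have equal probability under `cdf`."""
    counts = [0] * bins
    for x in samples:
        index = min(int(cdf(x) * bins), bins - 1)  # clamp the case cdf(x) == 1.0
        counts[index] += 1
    return counts


def weibull_cdf(scale: float, shape: float) -> Callable[[float], float]:
    """CDF of the Weibull distribution, F(x) = 1 - exp(-(x / scale)^shape)."""
    return lambda x: 1.0 - math.exp(-((x / scale) ** shape)) if x > 0 else 0.0


# Synthetic stand-in for level-2 (electronic library) request intervals.
rng = random.Random(42)
samples = [rng.weibullvariate(2.0, 1.5) for _ in range(1000)]

bins = 10
observed = equiprobable_counts(samples, weibull_cdf(2.0, 1.5), bins)
expected = [len(samples) / bins] * bins
statistic = pearson_statistic(observed, expected)

# Critical value of chi-square with bins - 1 = 9 degrees of freedom at the
# 0.05 significance level (valid here because no parameter is estimated).
CRITICAL_9_DF_005 = 16.919
print("statistic:", round(statistic, 3), "reject:", statistic > CRITICAL_9_DF_005)
```

If the distribution parameters were estimated from the same data, the degrees of freedom would have to be reduced by the number of estimated parameters, and the critical value would change accordingly.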

The purpose of each phase, according to its location in the processing sequence, is as follows:
- the first phase distributes data flows between the IAIS resources in the cloud;
- the second phase dynamically scales the computing resources in the cloud;
- the third phase processes data in user applications using storage systems and databases.
The components of the third phase include nodes of storage systems and database management systems that provide access to multimedia services in the cloud.
In detail, the set of components of an information system is grouped by phase, where $S_i^j$ is the $i$-th component of phase $j$ and $m, n, k \in \mathbb{N}$ are the numbers of components included in the system for the respective phases $f$. We also introduce input components $S_i^0$, which transmit data flows into the information system, and output components $S_i^4$, which receive data flows from the cloud infrastructure. The set describing the information system is extended accordingly, where $p, l \in \mathbb{N}$ are the numbers of components at the input and output of the cloud information system.
The service path of each flow can change dynamically, and the number of unique flows depends on the number of components in each phase. The set of incoming flows at each phase $j$ contains $n_j$ flows, where $j$ is the number of the service phase. Consequently, all incoming flows of the information system are described over the $f$ service phases, and similar conditions hold for the output flows. To serve the user requests that form data flows in the information system effectively, there must be a single-valued mapping of the form $R: X \to Y$. In addition, to service any request at each moment of time, a matrix $H$ of transitions between the service phases is constructed depending on the class of the request and the current load of the system. The graph of transitions between phases can be built using a function of the service phase $j$.
The flows leaving element $S_i^j$ toward another element can then describe the incoming and outgoing flows of phase $j$, respectively. In real systems, outgoing flows can overlap and be serviced on the same computing node, which results in the formation of internal queues at each service phase.
To describe this process, it is necessary to determine the connections between the output flows of component $S_i^j$ at phase $j$ and all the components at phase $j+1$.
Considering the above, the set $Y_j^*$ is redefined to include these connections. Two functions are introduced to describe intersecting incoming flows within one phase; with them, the input data flow arriving at component $S_i^j$ at phase $j$ from all the components at phase $j-1$ can be expressed. Two further functions describe the intersecting flows leaving the phase. Together, these describe the data flows of an information system within a cloud. Data flows and their characteristics may change over time, so the representation should also include the time $t$. The description of an information system should cover both internal and external factors, so a parameter of external influence $F$ is introduced. Data flows in a cloud system are then described as a function of the time $t$ and the external influence $F$.
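The effect of overlapping flows described above can be made concrete with a small Python sketch. It does not reproduce the paper's flow functions. It only assumes that each flow follows a path with one component per phase and carries a request intensity, and all names and numbers are hypothetical. A pair `(j, i)` stands in for component $S_i^j$.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Component = Tuple[int, int]          # (phase j, index i), standing in for S_i^j
Flow = Tuple[List[Component], float]  # (path through the phases, intensity)


def incoming_load(flows: List[Flow]) -> Dict[Component, float]:
    """Sum the intensity of all flows that pass through each component."""
    load: Dict[Component, float] = defaultdict(float)
    for path, intensity in flows:
        for component in path:
            load[component] += intensity
    return dict(load)


def overlapping(flows: List[Flow]) -> List[Component]:
    """Components served by more than one flow, where internal queues can form."""
    hits: Dict[Component, int] = defaultdict(int)
    for path, _ in flows:
        for component in path:
            hits[component] += 1
    return sorted(c for c, n in hits.items() if n > 1)


# Hypothetical flows: each path visits one component in phases 1, 2 and 3;
# the second value is the request intensity (requests per second, assumed).
flows = [
    ([(1, 1), (2, 1), (3, 1)], 40.0),
    ([(1, 1), (2, 2), (3, 1)], 25.0),
    ([(1, 2), (2, 2), (3, 2)], 10.0),
]

print(incoming_load(flows))
print(overlapping(flows))
```

In this toy input, components `(1, 1)`, `(2, 2)` and `(3, 1)` each receive two flows, so they are the places where a queue would build up first.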

Cloud system virtual resources control algorithm
The above models make it possible to determine the most appropriate computing nodes of the information system and the virtual machines that contain the required instances of multimedia applications. The control system should provide uninterrupted user service and effective control of virtual resources when physical resources are limited. The main task of the control system is to schedule computing resources at each moment of time. For highload information systems, effective scheduling is important because the load on the services may vary greatly within short time intervals. In a cloud system, resource consumption must be planned optimally to prevent resource exhaustion for the applications already running.
Unlike in other information systems, the flow of user requests in the educational environment is predictable because of the subscriptions to multimedia services. The control algorithm for user access to virtual information resources consists of two interconnected processes. The first process is scheduling. The scheduling algorithm collects data on the incoming requests and classifies them by levels determined by the priorities of the applications for business processes. The input data for the algorithm are the applications, described according to a template that includes a virtual machine image with the given configuration of hardware and software and the user session characteristics. Based on this template and an analysis of the connection data, the algorithm calculates the configuration needed to deploy the required service. If the sets of VM software are identical, the already stored images are used. To optimize the use of computing resources, the algorithm generates three variants of virtual machine configurations. The first variant provides reserve performance in case of an unexpected increase in the number of users. The scaling factor in this case is calculated dynamically. The second variant provides a predetermined low performance of virtual machines for the given number of users. This approach is most effective for small special-purpose user groups. It reduces the overhead when few users are working although the number of subscribers is large. The third variant uses user-predetermined characteristics, including a fixed number of running virtual machine instances regardless of the number of users. In this case, the algorithm is only used to limit the computing resources: it calculates the maximum number of virtual machines that are available in the configuration selected by the user. The second process within the algorithm is the direct servicing of user requests and resource scaling while the applications are running.
The algorithm considers the total number of requests from each source, which makes it possible to predict the load on the applications running within the cloud. The algorithm then migrates virtual machines between computing nodes based on the collected data in accordance with a predetermined plan, thereby scaling the work of the applications. For efficient use of resources within the above processes, additional virtual machine instances are created in the online image storage to support the applications, providing access for a minimum number of users. If a load increase is predicted for a certain service, the algorithm deploys a full image of the media resource and analyzes the incoming user requests. If the load does not exceed the number of queries in an ordinary flow, the algorithm switches the load to the appropriate image and turns off the virtual machine. The scheme of the integrated approach to optimization using cloud computing is presented in Figure 1.
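The three configuration variants described above can be sketched as a sizing function. This is one possible reading, not the authors' algorithm: the names `Variant` and `vm_count`, the per-VM user capacity `users_per_vm`, the way the scaling factor is applied, and the interpretation of the second variant as sizing for the currently active users without reserve are all assumptions made for illustration.

```python
import math
from enum import Enum


class Variant(Enum):
    RESERVE = 1   # headroom for an unexpected increase in users
    MINIMAL = 2   # predetermined low performance for a known user group
    FIXED = 3     # user-defined number of instances, only capped by resources


def vm_count(variant: Variant, users: int, users_per_vm: int,
             max_vms: int, scale: float = 1.0, requested: int = 0) -> int:
    """Number of VM instances to deploy, never exceeding the physical cap."""
    if users_per_vm <= 0:
        raise ValueError("users_per_vm must be positive")
    base = math.ceil(users / users_per_vm)
    if variant is Variant.RESERVE:
        # `scale` (>= 1) stands in for the dynamically calculated factor.
        wanted = math.ceil(base * scale)
    elif variant is Variant.MINIMAL:
        # Keep at least one instance so the service stays reachable.
        wanted = max(1, base)
    else:
        # The algorithm does not size this variant, it only limits it.
        wanted = requested
    return min(wanted, max_vms)
```

For example, under these assumptions `vm_count(Variant.FIXED, users=0, users_per_vm=100, max_vms=20, requested=50)` is capped at 20 instances, which mirrors the statement that the third variant only limits the computing resources.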

Fig. 1. Scheme of optimizing access to an information system based on cloud computing
Our approach takes into account the physical limitations of computing resources and organizes the work of a cloud information system by adjusting the number of running application instances based on the incoming flow of user requests.

Experimental part
We have studied the work of the cloud information system with different parameters to evaluate the effectiveness of our virtual resource control algorithm. The standard algorithms of the OpenStack cloud system [5] were used as the reference for comparison in the experiment. In the experiment, we used a flow of requests similar to the real flow within the information system of distance learning. The number of concurrent requests received by the system was about 10,000, which is equal to the maximum number of potential users of the system. All user requests are classified into six user groups corresponding to the types of user behavior. Requests from the first three user groups are directed to the allocated application while other applications are used at the same time. Groups 4 to 6 simulate the work of the application when computing resources are short because of an excess number of concurrent requests. The intensity of use of the system components (video portal, testing system, and electronic library) and the amount of requested data were assigned for each user group. The experiment lasted one hour, which corresponds to the longest period of peak load in the real system. The experimental results are presented in Table 1.
The results of the experiments show a 12-15% decrease in the number of service denials when accessing multimedia services with limited resources. Within the experiment in the OpenStack cloud system, we compared the consumption of virtual resources by the number of virtual servers for each of the subsystems. Our control algorithm provides collaborative work of all running application instances in accordance with user requirements through the optimal allocation of resources on each computing node. As a result, the optimization algorithms may release 20 to 30% of the allocated resources (virtual servers) (Fig. 2).