Background
Many companies are now moving towards the private cloud, following a
SaaS (Software as a Service) model. The company centralizes all its applications
and provides some means for users to access them. A very popular
choice is to use a cluster of hypervisor servers (such as VMware ESXi or Microsoft
Hyper-V) to host a farm of Citrix servers that in turn host the required applications. In
such cloud systems, various algorithms are at play at different layers: the
hypervisors, the virtual machines and the application virtualization layer. This
makes it nearly impossible to deterministically predict end-user performance under
different conditions.
However, if IT engineers want to optimize such an infrastructure in terms of server
consolidation and power consumption, they need to predict application
performance under different circumstances with a small margin of error. To
make the process practical, they also have to be able to predict application
performance, with high confidence, from server metrics alone rather than by
polling individual users for feedback. Hypervisor manufacturers do provide some
clustering mechanisms to share resources fairly amongst virtual machines, but the
algorithms behind these mechanisms are very simple. The typical approach is to
statistically measure the current load (resource consumption) on each server. Virtual
machines are then migrated between clustered servers when the difference in load
between servers crosses a pre-determined threshold. The major drawback is that
such an approach is inherently reactive and cannot be used to provide guaranteed
performance.
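The threshold-based approach described above can be illustrated with a minimal sketch. This is not any vendor's actual algorithm. The Host class, the rebalance_once function, the rule for choosing which virtual machine to move, and the load figures are all assumptions made for illustration. The sketch also shows why the approach is reactive: nothing happens until the imbalance has already been measured.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Host:
    """Hypothetical hypervisor host; vm_loads maps VM name -> load share (0..1)."""
    name: str
    vm_loads: Dict[str, float] = field(default_factory=dict)

    @property
    def load(self) -> float:
        # Host load is taken to be the sum of the loads of its VMs (an assumption).
        return sum(self.vm_loads.values())


def rebalance_once(hosts: List[Host], threshold: float) -> Optional[Tuple[str, str, str]]:
    """Perform at most one migration.

    Returns (vm_name, source_host, target_host), or None if no move is made.
    """
    busiest = max(hosts, key=lambda h: h.load)
    idlest = min(hosts, key=lambda h: h.load)
    gap = busiest.load - idlest.load

    # Reactive rule: act only once the measured imbalance crosses the threshold.
    if gap <= threshold or not busiest.vm_loads:
        return None

    # Illustrative choice: move the VM whose load is closest to half the gap,
    # since moving exactly half the gap would equalize the two hosts.
    vm = min(busiest.vm_loads, key=lambda v: abs(busiest.vm_loads[v] - gap / 2))

    # Moving a VM at least as large as the gap would not reduce the imbalance.
    if busiest.vm_loads[vm] >= gap:
        return None

    idlest.vm_loads[vm] = busiest.vm_loads.pop(vm)
    return vm, busiest.name, idlest.name


if __name__ == "__main__":
    cluster = [
        Host("host-1", {"vm-a": 0.5, "vm-b": 0.3}),
        Host("host-2", {"vm-c": 0.1}),
    ]
    print(rebalance_once(cluster, threshold=0.25))
```

Because the decision uses only the load measured at that moment, the sketch says nothing about how end users will experience the applications after the move, which is the gap the proposed study addresses.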
In hybrid clouds, the term cloudbursting refers to the scenario where an
application scales into the public cloud in response to a resource crunch in the private
cloud caused by a surge in demand. This adds another dimension to the
problem because companies have to pay for using public cloud services.
Cloudbursting can also be optimized if application performance is well understood.
In public clouds, a major concern is to provide the guaranteed level of performance
while allocating as few resources as possible.
Thus there are multiple goals that can be achieved only if application performance
can be understood and guaranteed by considering server metrics alone. Yet such a
complete understanding of application performance under different load conditions
eludes even the most experienced IT engineers.
Research Aim
A major obstacle in the proposed study is that it would probably require recurrent
neural networks to obtain the best results. Recurrent networks are notoriously difficult
to train, and I do not have much experience with them.
Conclusion
I intend to statistically relate server metrics to end-user application performance in
a cloud infrastructure. Such relationships can be used to optimize cloud
infrastructure. The major benefits are reduced costs and guaranteed performance
levels.
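As a rough illustration of what relating a server metric to end-user performance could look like, the sketch below fits a straight line by ordinary least squares. This is a deliberately simple baseline, not the recurrent-network approach mentioned earlier. The function names fit_line and predict, the choice of CPU utilization as the metric, and the sample values are all hypothetical; the numbers are synthetic and are not measurements.

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for one predictor; returns (slope, intercept)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("the server metric must vary across samples")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope: float, intercept: float, x: float) -> float:
    """Predicted end-user metric for a given server metric value."""
    return slope * x + intercept


if __name__ == "__main__":
    # Synthetic, made-up samples: CPU utilization (%) vs. response time (ms).
    cpu_percent = [10.0, 20.0, 30.0, 40.0]
    response_ms = [120.0, 140.0, 160.0, 180.0]
    slope, intercept = fit_line(cpu_percent, response_ms)
    print(predict(slope, intercept, 50.0))
```

A single linear predictor ignores the interactions between layers described in the Background, so it would serve only as a reference point against which richer models could be compared.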