Over the last few years the trade press, industry analysts and management consultants have paid a lot of attention to cloud computing. Nicholas Carr, for example, recently wrote a book [1] in which he declared unequivocally that cloud computing was the future of IT.
In contrast to the recent hyperbole, one of the interesting facts about cloud computing is that it is not a fundamentally new technology. Rather, cloud computing is a combination of a number of technological advances and innovations, including virtualization in its various forms, cluster computing, SaaS, Web 2.0, SOA, and web services.
Two of the leading cloud computing vendors are Google and Amazon, companies that traditionally have focused on the consumer market. Proponents of cloud computing assert that the massive scale of these vendors’ infrastructure justifies investments in advanced technologies that the typical IT organization could never afford. According to the proponents, these investments lead to dramatic improvements in operational efficiency, agility, and manageability. Geir Ramleth, the CIO of Bechtel, believes in the potential cost efficiencies associated with cloud computing. According to Ramleth, Bechtel pays fifty times more for a megabit per second of WAN bandwidth than YouTube does, and almost forty times as much for a gigabyte of storage as Amazon charges.
The goal of this edition of Performance First Insights (PFI) is to describe what is commonly meant by cloud computing and to identify some of the associated management challenges.
There are three general classes of cloud computing: public, private and hybrid. Cloud computing service providers (CCSPs) that provide their services over the public Internet are considered to be part of the Public Cloud. CCSPs such as Amazon generally have significant expertise in building and managing large-scale, virtualized data centers.
Enterprises such as Bechtel are pursuing a cloud computing strategy that calls for them to adopt cloud computing concepts within their internal IT environments. A cloud infrastructure internal to the enterprise is typically referred to as a Private Cloud. Private clouds have the advantage of not being burdened by the potential security exposures that are associated with public clouds. As will be discussed, the management of a private cloud is easier than the management of a public cloud. However, the private cloud may not be able to deliver the same economies of scale and elasticity of resources that CCSPs can.
Where an enterprise IT department uses a mixture of public and private cloud services, the result is referred to as a Hybrid Cloud. The hybrid cloud approach can offer the scalability of the public cloud coupled with the higher degree of control offered by the private cloud. In particular, a hybrid cloud might prove useful for enterprises that could benefit from offloading application workloads to the public cloud during transient spikes in demand. Hybrid clouds also readily support disaster recovery solutions and provide an evolutionary path to more complete outsourcing of IT resources to the public cloud.
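To make the demand-spike scenario concrete, the following is a minimal Python sketch of how a hybrid cloud might split a workload: the private cloud absorbs demand up to its capacity and only the excess is offloaded to the public cloud. The function name split_workload, the capacity figure, and the hourly demand values are illustrative assumptions, not taken from any vendor's product.

```python
def split_workload(demand, private_capacity):
    """Return (private_share, public_share) for one interval of demand.

    Hypothetical policy: fill the private cloud first, then burst
    whatever is left over to the public cloud.
    """
    if demand < 0 or private_capacity < 0:
        raise ValueError("demand and capacity must be non-negative")
    private_share = min(demand, private_capacity)
    public_share = demand - private_share
    return private_share, public_share


# Illustrative hourly demand, in arbitrary workload units (assumed values).
hourly_demand = [40, 55, 120, 90, 60]

for demand in hourly_demand:
    private, public = split_workload(demand, private_capacity=100)
    print(f"demand={demand:>3}  private={private:>3}  public={public:>3}")
```

Only the third interval exceeds the assumed private capacity of 100 units, so it is the only one in which any work is sent to the public cloud. A real placement decision would also have to weigh data location, security policy, and cost.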
Cloud Computing Services
This section of the PFI will discuss the three primary classes of cloud computing services: Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS).
Infrastructure services comprise the basic compute, storage, and interconnect services that are required to run applications. An example of a cloud-based data storage service is Amazon’s Simple Storage Service (S3). According to Amazon, “Amazon S3 provides a simple web services interface that can be used to store and retrieve any amount of data, at any time, from anywhere on the web. It gives any developer access to the same highly scalable, reliable, fast, inexpensive data storage infrastructure that Amazon uses to run its own global network of web sites. The service aims to maximize benefits of scale and to pass those benefits on to developers.”
Another possibility is that the cloud-based service provides an extensive network of virtualized and/or physical compute devices. For example, an independent software vendor (ISV) could use an IaaS such as Amazon’s Elastic Compute Cloud (EC2) to access a large network of virtual web servers that facilitate the development and testing of distributed applications. An enterprise engineering department could use an IaaS to gain access to a large compute cluster that consists of hundreds or even thousands of servers executing a parallel processing high-performance computing (HPC) application.
Any web application can be considered to be a cloud application in the sense that it resides on the Internet. Salesforce.com is an example of a popular application that many firms access over the Internet, and it is what most people think of when they hear the phrase Software as a Service. Going forward, however, a true SaaS application is likely to be one that has been adapted to the platform interfaces of one or more IaaS vendors in order to become a cloud-based SaaS. Using PaaS interfaces, an ISV can adapt its existing SaaS software to the cloud’s IaaS offerings and can develop new cloud-enabled SaaS applications. By cloud-enabling the SaaS, the ISV no longer needs to dedicate part of its own data center resources to SaaS application delivery because the application can now be delivered from the cloud. Some examples of cloud-based SaaS applications include Google’s Gmail and Google Docs, and IBM DB2 database software packaged as an Amazon Machine Image (AMI) for Amazon EC2. Cloud-enabled SaaS applications may be accessed by either individual users or enterprise users.
As I have pointed out in the last few PFIs, effectively managing application performance is a complex task. A cloud computing approach to providing IT services will inherit most of the existing management challenges and will add a number of new ones.
Although most IT organizations do not use the phrase Private Cloud, the majority of IT organizations have begun to implement some of the basic concepts of private cloud computing. In particular, most IT organizations have:
While these initiatives have resulted in numerous benefits, they have also produced some new management challenges. For example, after implementing server virtualization an IT organization typically loses visibility into the traffic between the virtual machines (VMs) on a particular physical server. Because it is easy to set up VMs, some IT organizations have experienced VM sprawl, a situation in which the IT organization loses visibility and control over the VMs that it supports.
While server virtualization complicates some management tasks, it also can improve a few key management processes. For example, a production VM can be transferred to a different physical server without service interruption [2]. This enables workload management and optimization across the virtual infrastructure as well as zero-downtime maintenance. This capability also helps to streamline the provisioning of new applications as well as to improve backup and restore operations.
In most cases today, however, the ability to move VMs between different physical servers is not highly automated and is very complex. Part of the complexity stems from the need to ensure that when a VM is moved to a new server it maintains the same security and storage access, as well as QoS configurations and policies that it had previously. The success of cloud computing depends on the deployment of management software that effectively automates the provisioning and orchestration of virtualized IT environments.
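The pre-migration check described above can be sketched as a simple comparison between what a VM requires and what a candidate server offers. This is a minimal Python illustration under stated assumptions: the function missing_capabilities, the three categories, and every identifier such as "vlan-110" or "lun-7" are hypothetical and do not correspond to any particular hypervisor or orchestration product.

```python
def missing_capabilities(vm_requirements, host_capabilities):
    """Return, per category, the VM requirements the target host cannot meet.

    Both arguments map a category name (for example "security") to a list
    of identifiers. An empty result means the host satisfies every listed
    requirement.
    """
    gaps = {}
    for category, needed in vm_requirements.items():
        offered = set(host_capabilities.get(category, ()))
        unmet = sorted(set(needed) - offered)
        if unmet:
            gaps[category] = unmet
    return gaps


# Hypothetical VM and target server descriptions (assumed values).
vm = {
    "security": ["vlan-110", "fw-policy-web"],
    "storage": ["lun-7"],
    "qos": ["gold"],
}
target_host = {
    "security": ["vlan-110"],
    "storage": ["lun-7", "lun-9"],
    "qos": ["gold", "silver"],
}

gaps = missing_capabilities(vm, target_host)
print("safe to migrate" if not gaps else f"migration blocked: {gaps}")
```

In this example the move is blocked because the target server lacks one of the VM's security policies. Automating checks of this kind, and then applying the missing configuration, is the sort of orchestration that management software would need to provide.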
In addition, once IT organizations have virtualized their servers, they need to modify all of their management processes to work in a virtualized environment. Performance monitoring, baselining, and thresholding are examples. These capabilities are used to identify systemic deviations from normal performance and to update alert thresholds automatically. The same capabilities are needed in a virtualized environment so that IT organizations can monitor the performance of individual VMs.
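One common way to implement baselining and automatic thresholding is to derive the alert threshold from a rolling window of recent samples. The Python sketch below assumes a threshold of the window mean plus k standard deviations; the function names, the window size, the value of k, and the sample response times are all illustrative assumptions rather than a description of any specific monitoring product.

```python
from statistics import mean, stdev


def alert_threshold(samples, k=3.0):
    """Baseline the samples and return mean + k * standard deviation."""
    if len(samples) < 2:
        raise ValueError("need at least two samples to compute a baseline")
    return mean(samples) + k * stdev(samples)


def find_deviations(history, window=5, k=3.0):
    """Return (index, value, threshold) for each sample above its threshold.

    The threshold for each sample is recomputed from the `window` samples
    that precede it, so it updates automatically as the baseline drifts.
    """
    deviations = []
    for i in range(window, len(history)):
        threshold = alert_threshold(history[i - window:i], k)
        if history[i] > threshold:
            deviations.append((i, history[i], threshold))
    return deviations


# Hypothetical per-VM response times in milliseconds (assumed values).
response_ms = [100, 102, 98, 101, 99, 100, 180, 103]

for index, value, threshold in find_deviations(response_ms):
    print(f"sample {index}: {value} ms exceeds threshold {threshold:.1f} ms")
```

With these values only the 180 ms sample is flagged. Note that the spike then enters the window and inflates the next threshold, which is one reason production tools often exclude flagged samples from the baseline.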
Given the likely scale of a CCSP, public cloud services will be more challenging to manage than private cloud services. For example, a CCSP must implement management software that supports a higher degree of resource elasticity and agility than is required in private clouds. The CCSP must also implement management software that enables usage-based pricing and the ability to rapidly provision additional capacity to meet transient customer requirements. All of this functionality has to operate securely in a complex computing environment within a multi-tenant data center.
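Usage-based pricing in a multi-tenant environment comes down to metering each tenant's consumption and applying a rate per resource. The following Python sketch shows the idea under stated assumptions: the tenant names, resource names, and rates (in cents per unit) are invented for illustration and do not reflect any provider's actual price list.

```python
from collections import defaultdict


def bill_by_tenant(usage_records, rates_cents):
    """Sum metered usage into a per-tenant charge, in whole cents.

    usage_records: iterable of (tenant, resource, quantity) tuples.
    rates_cents:   mapping of resource name to price per unit, in cents.
    """
    totals = defaultdict(int)
    for tenant, resource, quantity in usage_records:
        totals[tenant] += quantity * rates_cents[resource]
    return dict(totals)


# Hypothetical rates and metering records (assumed values).
rates_cents = {"cpu_hours": 10, "gb_stored": 15}
usage_records = [
    ("tenant_a", "cpu_hours", 12),
    ("tenant_a", "gb_stored", 50),
    ("tenant_b", "cpu_hours", 3),
]

charges = bill_by_tenant(usage_records, rates_cents)
for tenant, cents in sorted(charges.items()):
    print(f"{tenant}: ${cents / 100:.2f}")
```

Keeping amounts in integer cents avoids floating-point rounding in the totals. The harder problem for a CCSP is collecting accurate, tamper-resistant metering records in the first place.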
In the case of public cloud services there are at least three separate management domains:
There will be more than three management domains if the enterprise subscribes to more than one public cloud provider. Effective end-to-end management requires detailed, consistent management data to be gathered from each of the management domains. That is not likely to happen without coordinated planning.
To realize the full potential of cloud computing, CCSPs must be able to offer IT organizations management and security capabilities that can span the breadth and depth of the cloud. These capabilities should cover IaaS/PaaS/SaaS solutions from a single vendor, as well as highly heterogeneous solutions that integrate services from multiple vendors at the different levels of the cloud and/or multiple vendors within a single level of the cloud. Ideally, these management tools would be extensions of the tools that enterprises use today to manage services delivered by a private virtualized data center.
Dennis Brouwer, the Vice President of Global Network Solutions for Savvis, states that enterprise IT organizations that adopt public cloud services, such as application hosting services, go through three phases. In the first phase, they get comfortable moving one or more applications to a hosted data center. In the second phase, they move applications to two or more hosted data centers but, in order to ensure acceptable performance, the IT organization specifies the hosted data centers. By the time an IT organization reaches the third phase of adopting cloud services, it is not concerned with how the cloud services are implemented. Its only concern is how the services perform.
From Brouwer’s perspective, the primary management challenge associated with cloud services is the requirement to proactively manage network latency so that it appears to the user of the service that “the application is just around the corner.” Brouwer goes on to say, “Latency is the defining characteristic, and managing latency is critical.” He adds that IT organizations that adopt public cloud services would not do so if they did not trust the CCSPs. Brouwer asserts that, independent of that trust, IT organizations need a view into the end-to-end network latency so they can verify that they are receiving the contracted level of service.
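Verifying a contracted latency level requires a measurement from every management domain on the path. The Python sketch below adds up per-domain latency, compares the total with a contracted figure, and reports which domain contributes most. The domain names, the measurements, and the 60 ms contracted value are hypothetical, and simple addition of per-domain figures is itself a simplifying assumption.

```python
def check_latency(domain_latency_ms, contracted_ms):
    """Compare end-to-end latency with the contracted level of service.

    domain_latency_ms: mapping of management domain name to the latency
                       measured inside that domain, in milliseconds.
    """
    if not domain_latency_ms:
        raise ValueError("at least one domain measurement is required")
    total = sum(domain_latency_ms.values())
    largest = max(domain_latency_ms, key=domain_latency_ms.get)
    return {
        "total_ms": total,
        "within_contract": total <= contracted_ms,
        "largest_contributor": largest,
    }


# Hypothetical measurements from three management domains (assumed values).
measurements = {
    "enterprise_network": 4,
    "wan_provider": 38,
    "cloud_provider": 21,
}

report = check_latency(measurements, contracted_ms=60)
print(report)
```

Here the end-to-end figure of 63 ms misses the assumed 60 ms target, and the breakdown points to the domain to investigate first. The calculation is only as good as its inputs, which is why consistent data from each domain matters.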
If the enterprise’s private cloud uses management tools that are consistent with those of public cloud vendors, the enterprise should have the flexibility to seamlessly migrate applications, or even an entire virtual data center, back and forth between the public cloud providers and the private cloud. In this case, there are at least four management domains:
There will be more than four management domains if the enterprise subscribes to more than one public cloud provider.
The unrealistic hyperbole that surrounds cloud computing obscures the fact that many of the components of cloud computing have already been deployed by enterprises and service providers. There is no doubt that these deployments are simple when compared with the long-term vision of cloud computing. These deployments demonstrate the basic tenets of the cloud computing value proposition, but they also highlight some of the challenges that must be overcome before cloud computing can live up to the hyperbole that surrounds it.
For example, as IT organizations migrate to private clouds by implementing virtualized servers or to public cloud services by utilizing application hosting services, they are realizing that most of the management challenges in a traditional environment also exist in a cloud computing environment. It is also becoming clear that cloud computing introduces a number of new management challenges.
Managing latency is important in a traditional IT environment. Brouwer highlights the importance of proactively managing latency in a cloud computing environment when he states, “Latency is the defining characteristic and managing latency is critical.” While managing latency in a private cloud environment is challenging, it is not as challenging as managing latency in either a public or hybrid cloud environment. The fundamental difficulty associated with managing latency in either a public or hybrid cloud environment stems from the fact that in the best case there will be three management domains. In most instances, there will be more than three management domains. In any case, obtaining detailed, consistent management data from each management domain requires coordinated planning.
CA | NetQoS - Network Performance Management products and services for the world's largest networks. © 2001-2010 CA | NetQoS, Inc. All rights reserved.