Choosing a Cloud Provider: Building and Launching Servers (Part 2 of 2)

Note: The purpose of this post is to explain how we go about choosing a cloud provider and to share some of the lessons we’ve learned along the way. For context, while cloud vendors provide a wide and growing range of services, SOASTA’s primary requirement is acquiring scalable, pay-per-use compute power. Part 1 discusses what to look for as you get started.

After we get familiar with the cloud provider’s UI and have built our base image, it’s usually pretty easy to build and launch servers. In most cases, getting a public IP address is either straightforward or automatic. We have seen some vendors bring up servers only with a non-routable IP address, and then require multiple steps to tie the server to a publicly facing address. Most applications need a public-facing IP address, so this is inconvenient, at best.

Performance is important in every application, and understanding the requirements of a particular workload may influence which vendor and/or configuration options are best. Because everything happens in memory for SOASTA during a test, high performance I/O, which is challenging for cloud vendors, is not much of a factor. For some, I/O speed may be critical, in which case the range of available storage options, both for performance and persistence, becomes a much more important part of the decision criteria.

For SOASTA, bandwidth in and out of the data center and speed to provision are important. Bandwidth matters for any application that has high traffic. In some cases, available bandwidth is tied to the size of the compute instance being used: the bigger the instance, the greater the available bandwidth. In other cases, bandwidth allocations are associated with a deployment, which may consist of multiple instances servicing a workload. There are also data centers with very restrictive pipes, which in our case limits the amount of load we can drive from that location. Another consideration, as always, is cost. Bandwidth in, bandwidth out and traffic between a single provider’s data centers can all have different price points.

For any application workload that regularly starts and stops servers, or that leverages elasticity to handle spikes or unexpected growth, provisioning speed is critical. How long it takes a server to launch is the first critical factor SOASTA looks at before committing to use an API. In addition, increasing the number of servers launched has a greater negative impact for some vendors than others. Amazon and Rackspace, for example, can launch a single server in a matter of minutes and, even if we request many servers at once, we’ve seen minimal impact on overall launch time. They handle parallel operations very well.

Some vendors ask that you launch only a small number of instances at a time, and in some cases don’t even support batch requests for servers. SOASTA’s implementation of those APIs takes this into account, making it easier for our customers and performance engineers. Still, if too many servers are launched at once, it can overwhelm the provisioning system, with a much higher likelihood of bad instances or instances that time out, which then need to be replaced. All providers queue requests so that the infrastructure can handle requests for new instances; however, some have exceedingly low thresholds before forcing their customers to get in line. This means that, depending on how long the queue is, even if the actual deployment is relatively fast, you could be waiting quite a while for your servers to even start to spin up.
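To make the batching idea concrete, here is a minimal sketch of how a client might stay under a vendor's per-request limit while still reaching the total it needs. It is an illustration only, not SOASTA's implementation: `launch_batch` is a hypothetical stand-in for whatever call a provider exposes, and the retry budget is an assumed value.

```python
from typing import Callable, List


def launch_in_batches(
    total: int,
    max_batch: int,
    launch_batch: Callable[[int], List[str]],
) -> List[str]:
    """Request `total` servers, never asking for more than `max_batch` at once.

    `launch_batch(n)` is a hypothetical provider call that returns the IDs of
    the servers that actually started, which may be fewer than n.
    """
    if max_batch < 1:
        raise ValueError("max_batch must be at least 1")
    started: List[str] = []
    attempts = 0
    # Assumed retry budget so a failing provider cannot loop forever.
    max_attempts = 3 * (total // max_batch + 1)
    while len(started) < total and attempts < max_attempts:
        wanted = min(max_batch, total - len(started))
        started.extend(launch_batch(wanted))
        attempts += 1
    return started


# A fake provider that loses one server from every multi-server batch.
_counter = {"n": 0}


def fake_launch_batch(n: int) -> List[str]:
    ids = []
    for _ in range(n):
        _counter["n"] += 1
        ids.append(f"srv-{_counter['n']}")
    return ids[:-1] if n > 1 else ids


servers = launch_in_batches(10, 4, fake_launch_batch)
print(len(servers), "servers started")
```

The loop simply asks again for whatever is still missing, so a batch that comes back short is made up in a later request rather than by raising the batch size.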

We also evaluate the UI based on the operations that are available for managing a server, which again is an indicator of what’s exposed in the API. Starting and stopping instances is universal, of course, but status, error reporting, reboot, resize and rebuild are not, and all of them can be useful and should be supported through the UI. The information you can get while in the UI also varies from vendor to vendor. You can always see how many servers are running, what locations they are running in, the configuration of the servers and their state (running, stopped, provisioning, etc.). But not all clouds will show how long a server has been running, who launched it and other metadata that can be helpful in managing cloud deployments.

After spending time working through the UI, we always peruse the tech support forums to see what kind of questions are asked. If there are many posts asking for help stopping, deleting or rebooting an instance, we assume that the tools available to consumers to manage their own instances are not quite complete. In addition, we consider how often the vendor has maintenance windows, how proactive it is in notifying customers about maintenance, and what the impact is. For SOASTA, being locked out of starting instances for a few hours is incredibly inconvenient, and there are a number of vendors who still require downtime for an entire data center when they perform maintenance.

When support is needed, it’s important to look at the ease of use and comprehensiveness of the knowledge base and other support resources, as well as how easy it is to actually talk to someone. As you can see, at SOASTA we consider a broad range of criteria in choosing our cloud vendors, but sometimes it comes down to a simple conversation to make a partnership work.

Choosing a Cloud Provider: Start with the UI (Part 1 of 2)

Note: The purpose of this post is to explain how we go about choosing a cloud provider and to share some of the lessons we’ve learned along the way. For context, while cloud vendors provide a wide and growing range of services, SOASTA’s primary requirement is acquiring scalable, pay-per-use compute power.

In a recent blog post we described how SOASTA leverages the APIs of cloud providers to quickly and reliably provision and terminate many servers across a wide range of vendors. Before writing to a new API, we evaluate each provider for the features that are important to us. We start with the user interface (UI) because we’ve found it to be a great way to understand the capabilities of potential cloud partners.

Every cloud infrastructure provider has a self-service UI, a dashboard that allows you to interactively manage instances. Working with the UI is a quick way to get a feel for the cloud provider’s overall implementation, and helps determine if you want to take the time to integrate with the provider’s API. In most cases, this UI is built on the same API exposed for all users … and if it’s not, that’s probably a sign that the API is not quite ready.

There’s a lot you can learn before diving into the API. Beginning with the sign-on process, we’ve found that companies that haven’t made it really easy to sign up and get started are typically the ones who’ve not quite made the shift from a managed-services-centric approach to doing business in the cloud. If the support for signing up, should you need it, is less than stellar, that’s certainly another indicator that the provider’s heart is not yet in the cloud.

Before we get into the technology, it’s important to note that there are key differences in how different vendors do business. There are a number of account management features that may prove important for applications or use of the cloud. These include:

  • The ability to have master and child accounts. This might be useful for internal billing or possibly managing the visibility of instances for different users.
  • Requiring unique accounts for each data center. If your application can take advantage of using servers from multiple locations, perhaps for failover or performance reasons, you may not want to deal with the overhead of managing multiple accounts.
  • Role and user-based access to the account(s), which can be important if you don’t have a single set of requirements and need to control how the account is used; or need to charge for utilization by department, while maintaining centralized control.
  • Detailed accounting and tracking of all services from bandwidth and servers to IP addresses and VPN services.  Some vendors do a far better job of itemizing exactly what resources are being consumed.
  • Requirements for pre-paid, pre-allocated Internet Protocol (IP) addresses. This is problematic for an application like ours, which is constantly starting and stopping variable numbers of servers. Any application that takes advantage of auto-scaling or instantiates servers on a temporary basis for specific jobs (such as the massive, on-demand calculation and analysis applications in the financial and pharmaceutical industries) will find this inconvenient at best.
  • Disclosure of available IP ranges in cases where publicly used cloud resources need to be whitelisted.
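
On the last point, once a provider does disclose its ranges, checking a server's address against them is straightforward. The sketch below uses Python's standard `ipaddress` module; the CIDR blocks shown are reserved documentation ranges used as placeholders, not any vendor's real allocation.

```python
import ipaddress
from typing import Iterable


def in_published_ranges(address: str, cidr_ranges: Iterable[str]) -> bool:
    """Return True if `address` falls inside any of the given CIDR blocks."""
    ip = ipaddress.ip_address(address)
    return any(ip in ipaddress.ip_network(cidr) for cidr in cidr_ranges)


# Placeholder ranges (reserved for documentation), standing in for a
# provider's published list.
published = ["203.0.113.0/24", "198.51.100.0/25"]

print(in_published_ranges("203.0.113.7", published))
print(in_published_ranges("198.51.100.200", published))
```

A firewall team can only build a whitelist like `published` if the vendor makes the ranges available, which is why the disclosure matters.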

After sign-up, the first experience will be interacting with the portal. Some are just plain slow, require too many clicks to achieve relatively simple goals, artificially limit what you can do (typically to protect some weakness on the back end), timeout too quickly and/or for no reason, or provide little to no feedback after a request has been made. Virtually every new provider, while still working out the kinks, suffers from some of these challenges. However, these issues tend to shake out, given time.

After gaining familiarity with the UI, we check out the available base images.  For some vendors it’s simply a set of common, stripped-down Linux and Windows operating system images. The range of images and the OS options that are supported may be an important consideration. Some vendors allow for publishing of public images and, as a result, have relatively large libraries of available starting points: the more robust the set of choices, the better.

We also consider the range of available server configuration options. Again, the more the better. The SOASTA application uses a slightly different configuration for each cloud provider. Because our tests often span providers, we try to normalize memory and CPU so that each instance across clouds is roughly comparable.
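
One simple way to think about normalizing across clouds is to pick, from each provider's catalog, the configuration closest to a target CPU and memory profile. The sketch below is an assumption about how that could be done, not a description of CloudTest; the instance names and sizes are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class InstanceType:
    name: str
    vcpus: int
    memory_gb: float


def closest_match(
    target_vcpus: int, target_memory_gb: float, catalog: List[InstanceType]
) -> InstanceType:
    """Pick the catalog entry with the smallest relative CPU + memory gap."""

    def distance(t: InstanceType) -> float:
        cpu_gap = abs(t.vcpus - target_vcpus) / target_vcpus
        mem_gap = abs(t.memory_gb - target_memory_gb) / target_memory_gb
        return cpu_gap + mem_gap

    return min(catalog, key=distance)


# Hypothetical catalog for one provider.
catalog = [
    InstanceType("small", 1, 2.0),
    InstanceType("medium", 2, 7.5),
    InstanceType("large", 4, 15.0),
]

choice = closest_match(2, 8.0, catalog)
print(choice.name)
```

Running the same function against each provider's catalog gives one roughly comparable configuration per cloud, which is the goal when a single test spans vendors.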

Once you’ve created an image you want it to be easily deployed; at SOASTA, we regularly deploy instances across data centers. Some providers do a great job of helping to keep distributed images manageable and consistent. Others force the customer to make sure they stay in sync. As an aside, not all ‘cloud’ providers allow you to build and save your own images. In some cases, if you want to instantiate versions of your own image, you need to keep it running and ready to be duplicated. This makes speedy deployments impossible unless you’re willing (or need) to have an instance running at all times. Having to rebuild an image each time we want to deploy is enough to eliminate those vendors from SOASTA’s consideration.

In Part 2 of Choosing a Cloud Provider, we’ll cover Building and Launching Servers.

 

How SOASTA Leverages Cloud APIs

Our position as one of the first enterprise cloud platforms has given us a great view of the evolution of the cloud. On any given day at SOASTA we may be looking for thousands of cloud servers to simulate millions of virtual web site visitors. As a result, we’ve done more cloud testing across more cloud providers than anyone else.

In doing so, SOASTA is a consumer of Infrastructure-as-a-Service (IaaS) and a provider of Software-as-a-Service. Amazon was among the first to make IaaS available as an elastic, pay-as-you-go cloud service and has since been joined by many other providers such as Rackspace, IBM, Microsoft and GoGrid. Along with these public providers, companies like Eucalyptus and Nimbula offer behind-the-firewall cloud platform solutions.

SOASTA’s CloudTest platform depends on the swift provisioning and releasing of servers, and we must quickly identify bad instances and bring up replacements, while ensuring that each server plays its role in a distributed, multi-vendor architecture. To do this, and best take advantage of the capability to easily and affordably start, stop and manage servers, SOASTA uses cloud vendor APIs for automation.

Using Cloud APIs

Elasticity is most commonly associated with changes in supply and demand based on price. The application of elasticity in the cloud is not much different. An elastic API refers to an infrastructure vendor’s ability to respond to demand by allowing customers to quickly and automatically spin up servers, and just as quickly take them down. For applications such as performance testing this is incredibly important.

SOASTA evaluates cloud vendors based on a number of criteria. Our initial experience is always through the provider’s web-based user interface. If servers come up fast, the configuration options are appropriate, the GUI is capable, and the business model fits, we switch our attention to the cloud vendor’s API.

SOASTA’s unique grid provisioning technology uses the APIs to automate test setup. CloudTest is a sophisticated, distributed test platform that is controlled through a browser. Testers use a wizard (shown below) to quickly select provider(s), location(s), number of servers and type of servers so that within minutes they can execute large-scale performance tests.

The API should support functions such as start, stop, reboot and resize. In addition, using an API highlights the importance of intelligently handling errors and providing clear notifications. As always, meaningful error messages go a long way toward troubleshooting when there’s a problem.

No cloud vendor is immune to instances occasionally failing upon launch. Some vendors have fewer issues than others, but when it does happen you want to delete that instance and automatically replace it through the API. Also, you want to be able to do it no matter what state the instance is in. An instance stuck in provisioning mode, or that can’t be stopped at all, is inconvenient, at best, and expensive if not caught.
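
The replace-on-failure logic can be sketched in a few lines. This is a simplified illustration under stated assumptions: the state names, the timeout, and the `delete` and `launch` callables are all hypothetical and do not correspond to any particular vendor's API.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Instance:
    id: str
    state: str  # assumed states: "running", "provisioning", "error"
    requested_at: float  # seconds since some reference time


def find_bad(
    instances: List[Instance], now: float, provisioning_timeout_s: float
) -> List[Instance]:
    """Flag instances that errored or have been provisioning for too long."""
    bad = []
    for inst in instances:
        stuck = (
            inst.state == "provisioning"
            and now - inst.requested_at > provisioning_timeout_s
        )
        if inst.state == "error" or stuck:
            bad.append(inst)
    return bad


def replace_bad(
    instances: List[Instance],
    now: float,
    provisioning_timeout_s: float,
    delete: Callable[[str], None],
    launch: Callable[[], str],
) -> List[str]:
    """Delete every bad instance and launch one replacement for each."""
    replacements = []
    for inst in find_bad(instances, now, provisioning_timeout_s):
        delete(inst.id)  # hypothetical provider call
        replacements.append(launch())  # hypothetical provider call
    return replacements


deleted: List[str] = []
fleet = [
    Instance("a", "running", 0.0),
    Instance("b", "error", 0.0),
    Instance("c", "provisioning", 0.0),
    Instance("d", "provisioning", 900.0),
]
new_ids = replace_bad(fleet, 1000.0, 600.0, deleted.append, lambda: "new")
print(deleted, new_ids)
```

Note that the sketch only works if the provider's delete call succeeds regardless of instance state, which is exactly the capability the paragraph above asks for.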

If, like SOASTA, you have a requirement or can benefit from going cross-cloud, you’ll have to deal with the fact that every API is still quite different. Many vendors will tell you they have an AWS-like or compatible API, but that doesn’t mean you can just unplug from one and plug into another. We take care of this behind the scenes for our customers using CloudTest.
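
A small example of what "quite different" means in practice: even something as basic as instance state tends to use a different vocabulary on each cloud, so a cross-cloud client needs a translation layer. The sketch below shows the idea; the vendor names and state strings are made up for illustration and are not taken from any real API.

```python
from enum import Enum
from typing import Dict


class State(Enum):
    PENDING = "pending"
    RUNNING = "running"
    STOPPED = "stopped"
    FAILED = "failed"


# Illustrative vocabularies only; real providers each define their own.
STATE_MAPS: Dict[str, Dict[str, State]] = {
    "vendor_a": {
        "pending": State.PENDING,
        "running": State.RUNNING,
        "stopped": State.STOPPED,
        "failed": State.FAILED,
    },
    "vendor_b": {
        "BUILDING": State.PENDING,
        "ACTIVE": State.RUNNING,
        "OFF": State.STOPPED,
        "ERROR": State.FAILED,
    },
}


def normalize_state(vendor: str, raw_state: str) -> State:
    """Translate a vendor-specific state string into the common State enum."""
    try:
        return STATE_MAPS[vendor][raw_state]
    except KeyError:
        raise ValueError(f"unknown state {raw_state!r} for {vendor!r}") from None


print(normalize_state("vendor_a", "running"))
print(normalize_state("vendor_b", "ACTIVE"))
```

The same pattern extends to operations, error codes and instance sizes, and each of those needs its own mapping per provider.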

In addition to the proprietary APIs, there are ongoing collaborative efforts at creating standards within the cloud infrastructure community. Efforts like libCloud, DMTF’s Open Cloud Standards and OpenStack have gotten traction, the latter probably more so than any other to date. While still not ready for primetime, OpenStack, initiated by Rackspace and NASA Ames, is supported by players such as Dell, Cisco and Citrix, among others, and serves as the basis for the upcoming HP cloud offering. It’s also garnering interest from enterprises as the basis for internal deployments.

Cloud-based infrastructure services have matured dramatically in the last three years, with greatly increased reliability and capacity. Today, there are dozens of providers with locations around the world providing access to hundreds of thousands of affordable server resources. This access allows individual developers to exercise their creativity, and companies like SOASTA, using the available APIs, to provide services at a speed and cost that was impossible just a few short years ago.

Stop Cheating on your Tests!

I suppose we could have used a less inflammatory title for our recent webinar. It makes it sound like testers have been purposely doing something wrong. Perhaps we could have titled the webinar “Now you can execute more accurate and informative tests!” But the folks in marketing were right, and the intriguing title attracted our largest group of attendees ever. For those of you who didn’t attend, or if you did and would like to review the messages, you can watch the webinar here. This was the first in SOASTA’s latest webinar series, “Cloud Testing – Rewriting the Rules of Performance Testing”. Future webinars include “Run More Tests and Find More Issues” on October 27th and “Test On Your Schedule across the Lifecycle” on November 15th.

In this webinar, Scott Barber, President and CTO of PerfTestPlus, joins SOASTA’s VP of Performance Engineering, Rob Holcomb, to discuss what performance engineers have done in the past to measure performance and find and fix issues, and why some of those techniques no longer reflect best practices. The focus is on web and mobile testing and why the higher scale, more distributed and often complex nature of that traffic is not well served by traditional testing tools or techniques.

After an introduction by SOASTA’s Brad Johnson, Scott, in his inimitable style, speaks to great effect about the four most common ‘cheats’ that performance testers have leveraged to overcome the constraints of inflexible test hardware, poor tool scalability, expensive pricing models and the lack of real-time information while testing. Scott begins his presentation by talking about the practice of modifying think times, typically to overcome licensing and/or hardware limitations imposed by the high cost of traditional load testing. His primary assertion: the only way to simulate production…is to simulate production. Interestingly, during the webinar a question came up suggesting that we’re testing computers, not humans, so why is accurately simulating user activity so important. In response, it was noted that it has become clear that variance in use absolutely has an impact on what happens to the infrastructure.
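
A bit of arithmetic shows why shortening think times is tempting, and what it hides. The sketch below uses the standard interactive response time relationship (request rate = users / (response time + think time)); the user counts and timings are invented for illustration and were not part of the webinar.

```python
def request_rate(users: int, response_s: float, think_s: float) -> float:
    """Requests per second generated by `users` who each wait `think_s`
    seconds between a response arriving and their next request."""
    return users / (response_s + think_s)


# Assumed scenario: 1,000 real users pausing 28 seconds between requests...
realistic = request_rate(users=1000, response_s=2.0, think_s=28.0)

# ...versus 100 virtual users with think time cut to 1 second.
shortcut = request_rate(users=100, response_s=2.0, think_s=1.0)

print(f"realistic: {realistic:.1f} req/s with 1000 concurrent sessions")
print(f"shortcut:  {shortcut:.1f} req/s with 100 concurrent sessions")
```

Both runs produce about 33 requests per second, but the shortcut holds only a tenth of the concurrent sessions, so anything that scales with the number of connected users (sessions, connections, per-user memory) sees a very different load.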

The second point discussed is the common practice of extrapolating results from a staging environment to predict what will happen in production. Architectures can be complicated, and the impact of those differences along with the additional complexities of ‘the real world’ make extrapolation problematic, at best. The best way to validate that your production environment will handle expected load is to test in production as part of your overall test strategy. (For more on testing in production check out this SOASTA webinar). Modeling user flows incorrectly is the third point addressed by Scott, reinforcing the notion that we’re not functionally testing the application, but need to make sure we’re putting a realistic load on the back end.

Finally, Scott presents a very interesting problem to illustrate the challenges associated with measuring performance, and how it can be as much an art as a science. Rob follows Scott’s presentation and, using SOASTA’s CloudTest, illustrates how we can use modern tools to, well, stop cheating. We hope you enjoy(ed) the webinar.

Cover Your Bases When Performance Testing

With the release of CloudTest Pro, SOASTA has reinforced a central theme of our Performance Test Methodology: expand your scope to combine tried and true traditional methods of testing with the benefits of cloud testing: scale, speed, accuracy and flexibility. CloudTest Pro makes it much easier to execute both internal and external testing without sacrificing control over your testing processes. It’s also not just a matter of where you generate the load; you must test with a purpose. We believe it’s not enough to simply measure end user experience or find out when bandwidth is saturated. Important information? Of course, but hardly the only keys to confidence.

To gain confidence you need to identify the things that are in your control and that you can address.  Can you improve the performance of the ecommerce shopper who’s at the end of a DSL line on a 3-year-old computer while the Roku box is streaming Netflix?  Of course, but not by buying all of your end users faster Internet access and better computers.  It’ll be by doing the same things you’d do to improve performance for everyone: ensuring that your web pages, applications, infrastructure, network and third party providers are performing optimally.

The main tenets of SOASTA’s Performance Test Methodology include:

  • Testing both in the lab and live, web-based applications in production
  • Leveraging the Cloud to test at both typical and peak traffic levels, from hundreds of users to millions
  • Responding to the acceleration of development cycle times by making agile performance testing a realistic alternative
  • Generating geographically dispersed load for a valid representation of real-world traffic
  • Generating both internal and external load and using both the lab and production environments for the most efficient and effective results
  • Analyzing Performance Intelligence in real time to speed problem resolution

We’ve advocated external testing because it’s important to understand the performance from different geographic regions.  It’s also the most realistic and valid way to generate user traffic.  A complete suite of testing includes tests designed to identify specific issues or establish baseline metrics.  Comprehensive testing will often include stress, spike, endurance, failover, capacity and break-point tests in addition to understanding response times.  Make sure you’re getting a 360 degree view of performance.  For more thoughts on the topic, check out this post from last year: What Does Performance Testing Mean? http://www.soasta.com/2010/05/06/what-does-performance-testing-mean/

The Significance of SOASTA CloudTest Grid

With the release of CloudTest Pro, SOASTA’s customers now have direct access to one of my favorite capabilities as a SOASTA performance engineer: the CloudTest Grid. The CloudTest Grid continues our success in reducing the cost, complexity and time to results for performance testing websites and applications.

Like any other engineer, a performance engineer needs timely access to professional tools before they can add real tangible value; i.e. delivering evidence as to whether or not a website or application will meet target responsiveness and scalability requirements, and help identify bottlenecks and their root cause. Not only does the engineer need access to professional tools, the tools need to be suitable for the job.

Most performance testing tools require significant capital investment in licensed software and hardware infrastructure, as well as ongoing maintenance costs, just to test what is typically a scaled-down version of a production environment. This may have been appropriate for the internally facing applications of the past, where user numbers were counted in the tens or hundreds. However, today’s web applications are delivering the next generation of dynamic content to an extraordinary number of global users every day. Keeping pace with the speed and agility of today’s web development initiatives requires a new generation of testing and performance management solutions.

Before joining SOASTA I found that it was becoming increasingly difficult to help organizations justify investing in traditional performance testing tools, simply due to the cost and complexity of these offerings. Often, by the time a business case had been constructed and a flexible commercial model agreed upon, the initial requirement for testing had passed.

These challenges have led to cloud testing as a way to reduce costs and provide the benefits of scale and more accurate testing. However, there remained the challenge of how to best leverage the distributed nature of the cloud and optimize deployment to be as efficient as possible.

With CloudTest Grid, performance engineers can deploy hundreds, or even thousands of load servers around the globe in a matter of minutes. The servers are automatically spread across locations and cloud providers to meet the requirements of each specific test, and the Grid takes care of all error checking and fail-over to replace any bad instances or get additional capacity from other locations, as needed. Once deployed, the servers are monitored throughout testing to track their health and ensure the Grid is operating as expected.
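
To give a feel for what spreading servers across locations involves, here is a minimal sketch that distributes a requested server count round-robin across locations, skipping any location that has reached its capacity. It is only an illustration of the general idea; the location names and capacities are invented, and the real Grid takes many more factors into account.

```python
from typing import Dict


def allocate(total: int, capacity: Dict[str, int]) -> Dict[str, int]:
    """Spread `total` servers across locations without exceeding capacity."""
    allocation = {loc: 0 for loc in capacity}
    remaining = total
    while remaining > 0:
        open_locs = [loc for loc in capacity if allocation[loc] < capacity[loc]]
        if not open_locs:
            raise RuntimeError("not enough capacity across all locations")
        for loc in open_locs:
            if remaining == 0:
                break
            allocation[loc] += 1
            remaining -= 1
    return allocation


# Hypothetical locations and per-location limits.
limits = {"us-east": 5, "eu-west": 2, "ap-south": 10}
plan = allocate(10, limits)
print(plan)
```

When one location fills up, the remaining servers flow to the others, which is a simple form of getting additional capacity from other locations as needed.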

With SOASTA CloudTest Grid it is possible to provision a complete test environment when required, and just for the period of time that is needed, which significantly reduces the time to test. We routinely perform tests involving tens of thousands or hundreds of thousands of simulated real users. Combined with our industry-leading performance analytics, SOASTA CloudTest delivers increased business value over traditional testing tools at a fraction of the cost, with significantly improved time to value.

Performance Testing Mobile Devices

Whether you’re talking about smartphones or tablets, the sales of mobile devices are skyrocketing. While Apple’s iPad and iPhone and Android dominate the news, and sales, there are a huge number of options. By their very nature they’re connected to the web, and use the Internet for much richer applications than just email and web browsing. Everything from games to video is available on these devices, putting a strain on bandwidth and the back end servers that are providing the content.

That makes performance testing very important and companies such as Netflix, Glu, Leapfrog, Capcom and more have come to SOASTA to make sure they can handle the load. So, how do we do it?

When testing any web application, whether browser-based or not, CloudTest captures, replays and reports on every message between the client and the application infrastructure. Unlike other tools, CloudTest does not depend on a browser plug-in to record web traffic, an approach that limits you to the browsers for which plug-ins exist. Instead, we use a web proxy to record traffic. An agent, called Conductor, is installed on any system that you want to have act as a proxy. This can be the same laptop used to record browser-based traffic or a completely different system.

This means that all messages from any device (browser, phone, tablet, toy, etc) that you point to the proxy for web traffic can be recorded into a CloudTest test case. Once recorded, the requests can be parameterized, manipulated and otherwise massaged to create the complex usage patterns and high volumes of traffic needed for a load test.
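
As a rough picture of what parameterizing a recorded request means, the sketch below takes one captured request body, marks the values that should vary, and expands it into one body per simulated user. It uses Python's standard `string.Template` and is not how CloudTest does it internally; the field names and values are invented.

```python
from string import Template
from typing import Dict, List


def parameterize(recorded_body: str, rows: List[Dict[str, str]]) -> List[str]:
    """Expand a recorded body containing $placeholders into one body per row."""
    template = Template(recorded_body)
    return [template.substitute(row) for row in rows]


# A recorded request body with the user-specific values replaced by
# placeholders (hypothetical field names).
recorded = "user=$user&item=$item"

test_data = [
    {"user": "u1", "item": "7"},
    {"user": "u2", "item": "19"},
]

bodies = parameterize(recorded, test_data)
print(bodies)
```

Feeding many rows of test data through one recording is what turns a single captured session into the varied, high-volume traffic a load test needs.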

Recording HTTP/HTTPS traffic with SOASTA’s Conductor agent acting as a proxy. Regardless of the device, browser, application or operating system, CloudTest can record the traffic as long as it travels over HTTP or HTTPS and the client supports a configurable proxy.

In some cases, such as games running on the phone or tablet, the application is simply making web services calls. We record that message traffic just as easily as we do for any other application. CloudTest automatically parses WSDLs, has built-in OAuth support and features for making RESTful web services testing simple.

CloudTest is not testing the functional capabilities of the device. When we play the test back, we’re emulating the traffic just as the device would communicate to the infrastructure. This simulated load puts the same stress on the back end as the devices themselves would, at whatever levels of use you want to test.

CloudTest Analytics captures and displays in real-time the performance metrics on the message traffic, such as response times, bandwidth usage, error rates, time to first byte, etc. If monitoring is in place on the infrastructure we’ll also capture and display metrics such as CPU utilization, memory consumption, heap size and process counts on the same time-line as the performance metrics, helping our customers to quickly find the bottlenecks that might impact the user experience.
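
For readers new to these metrics, the sketch below shows how a few of them can be derived from raw per-request samples. It is a simplified illustration, not CloudTest Analytics; the sample fields and the rule that any status of 400 or above counts as an error are assumptions.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class Sample:
    response_ms: float
    status: int
    bytes_received: int


def summarize(samples: List[Sample]) -> Dict[str, float]:
    """Compute average response time, error rate and total bytes received."""
    if not samples:
        raise ValueError("no samples to summarize")
    errors = sum(1 for s in samples if s.status >= 400)  # assumed error rule
    return {
        "avg_response_ms": mean(s.response_ms for s in samples),
        "error_rate": errors / len(samples),
        "total_bytes": sum(s.bytes_received for s in samples),
    }


samples = [
    Sample(100.0, 200, 10),
    Sample(200.0, 500, 20),
    Sample(300.0, 200, 30),
]
summary = summarize(samples)
print(summary)
```

Plotting summaries like this per time window, alongside server-side metrics such as CPU and memory, is what makes it possible to line up a response time spike with its likely cause.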
