Performance Matters

Stop Cheating on your Tests!

I suppose we could have used a less inflammatory title for our recent webinar. It makes it sound like testers have been purposely doing something wrong. Perhaps we could have titled the webinar “Now you can execute more accurate and informative tests!” But the folks in marketing were right, and the intriguing title attracted our largest group of attendees ever. If you didn’t attend, or if you did and would like to review the key messages, you can watch the webinar here. This was the first in SOASTA’s latest webinar series, “Cloud Testing – Rewriting the Rules of Performance Testing”. Future webinars include “Run More Tests and Find More Issues” on October 27th and “Test On Your Schedule across the Lifecycle” on November 15th.

In this webinar, Scott Barber, President and CTO of PerfTestPlus, joins SOASTA’s VP of Performance Engineering, Rob Holcomb, to discuss what performance engineers have done in the past to measure performance and to find and fix issues, and why some of those techniques no longer reflect best practices. The focus is on web and mobile testing, and on why the higher-scale, more distributed and often complex nature of that traffic is not well served by traditional testing tools or techniques.

After an introduction by SOASTA’s Brad Johnson, Scott, in his inimitable style, speaks to great effect about the four most common ‘cheats’ that performance testers have used to work around inflexible test hardware, poor tool scalability, expensive pricing models and the lack of real-time information while testing. Scott begins his presentation with the practice of modifying think times, typically to overcome licensing and/or hardware limitations imposed by the high cost of traditional load testing. His primary assertion: the only way to simulate production…is to simulate production. Interestingly, a question came up during the webinar suggesting that we’re testing computers, not humans, so why is accurately simulating user activity so important? The response was that variance in how users behave clearly does have an impact on what happens to the infrastructure.
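To see why shortening think times changes the test, a rough sketch using Little's Law for a closed system (N = X × (R + Z)) may help. The throughput, response-time and think-time figures below are hypothetical numbers chosen for illustration, not values from the webinar, and the function name is our own.

```python
def required_users(throughput_rps: float, response_time_s: float, think_time_s: float) -> float:
    """Little's Law for a closed system: N = X * (R + Z).

    Returns the number of concurrent virtual users needed to sustain
    `throughput_rps` requests per second, given the average response
    time and the average think time between requests.
    """
    return throughput_rps * (response_time_s + think_time_s)


# Hypothetical workload: 100 requests/second at a 0.5 s response time.
realistic = required_users(100.0, 0.5, 29.5)  # users pausing ~30 s between requests
shortened = required_users(100.0, 0.5, 2.5)   # think time cut to fit a license limit

print(f"realistic think time: {realistic:.0f} concurrent users")
print(f"shortened think time: {shortened:.0f} concurrent users")
```

Under these assumed numbers the request rate is identical, but the shortened-think-time test holds roughly a tenth as many concurrent users, so anything that scales with the number of open sessions or connections is exercised far less than it would be in production.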

The second point discussed is the common practice of extrapolating results from a staging environment to predict what will happen in production. Architectures can be complicated, and the differences between staging and production, along with the additional complexities of ‘the real world’, make extrapolation problematic at best. The best way to validate that your production environment will handle expected load is to test in production as part of your overall test strategy. (For more on testing in production, check out this SOASTA webinar.) Modeling user flows incorrectly is the third point Scott addresses, reinforcing the notion that we’re not functionally testing the application but need to make sure we’re putting a realistic load on the back end.

Finally, Scott presents a very interesting problem to illustrate the challenges associated with measuring performance, and how it can be as much an art as a science. Rob follows Scott’s presentation and, using SOASTA’s CloudTest, illustrates how we can use modern tools to, well, stop cheating. We hope you enjoy(ed) the webinar.

2 Responses

  1. Very nice presentation and a key topic: getting closer to the real workload, and understanding the system resource consumption needed to support that workload. I have come across many production performance issues that occurred even though the team had run a performance test.
    The most common reason is that they did not have the right workload; for instance, they left out a key transaction that really represented 20% of the production workload.
    They also did not look past response time. They ran a test, saw an acceptable response time and stopped testing (often “acceptable” is loosely defined). They did not realize the systems were at 80% utilization or more.

    Thanks

  2. Still cheating, but we are getting closer. There are no two ways about it: the scalability of the cloud eliminates a lot of the extrapolation that people working from smaller models and under tight budgets had to do in the past. That said, there is still a long way to go to get close to realistic load for large web traffic volumes. Think time itself is only one part of wait time (the other being what academics call active wait time). Active wait time (Barford and Crovella 1998) is the time the browser takes between calls for embedded references (GIF files, etc.). Typically this is not calibrated by load drivers, and simulating it accurately would need a time-driven (non-blocking) architecture. It’s not a big deal, but most load generators will grab the embedded references as quickly as they can. It’s like saying we can simulate 1,000 requests a second when in practice the driver issues all 1,000 in the first 100 milliseconds and the remaining 9/10ths of a second is relatively idle.

    The second cheat is that the browser, at some point while retrieving embedded references, will open another TCP connection (possibly up to 4 if configured in the browser and allowed by the HTTP server). The time at which the browser opens another TCP connection will vary depending on the browser and the other load on the client (as well as the server response times). Again, the difference from this dynamic run-time behavior is measured in milliseconds, but the total number of TCP connections coming in from a typical load tool is often fixed (typically 1) per user equivalent. That is another cheat, because in practice the number of TCP connections is dynamic even for a fixed number of user equivalents. Then there’s the whole issue of returning users and cached resources, which will vary not just with the behavior of returning users but also with the amount of cache available to the browser and the total browsing history of the user. Don’t get me started, but that’s a whole other video!

    Basically, we all cheat and will continue to do so. The pragmatic question is the value of our models against our objectives, and the cloud, together with tools provided by companies like SOASTA, will get us closer to a more useful conclusion.
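
The burst behavior described in the second response (a nominal 1,000 requests a second that actually arrive in the first 100 milliseconds) can be illustrated with a small sketch. The figure of 100 microseconds per request is an assumption chosen to match that description, and the function names are our own; this is not how any particular load tool schedules requests.

```python
def burst_offsets_us(n_requests: int, per_request_us: int) -> list[int]:
    """Send times (microseconds) when requests are fired back to back."""
    return [i * per_request_us for i in range(n_requests)]


def paced_offsets_us(n_requests: int, window_us: int) -> list[int]:
    """Send times (microseconds) when requests are spread evenly over the window."""
    interval_us = window_us // n_requests
    return [i * interval_us for i in range(n_requests)]


def count_before(offsets_us: list[int], cutoff_us: int) -> int:
    """Number of requests sent strictly before the cutoff."""
    return sum(1 for t in offsets_us if t < cutoff_us)


# Assumed: a blocking driver needs about 100 microseconds per request.
burst = burst_offsets_us(1000, 100)
paced = paced_offsets_us(1000, 1_000_000)

first_100_ms = 100_000
print("burst driver, requests sent in first 100 ms:", count_before(burst, first_100_ms))
print("paced driver, requests sent in first 100 ms:", count_before(paced, first_100_ms))
```

Both schedules average 1,000 requests per second, but under these assumptions the burst driver sends every request in the first tenth of the second, while the paced driver sends only a tenth of them in that time.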
