The Performance Beacon

The web performance, analytics, and optimization blog

The Effects of Using Think Time to Adjust Level of Load

Some load tests are run with restricted resources, either because of a shortage of load-generation muscle or because of licensing constraints in the load-testing tool being used. Sometimes the performance engineer simply wants to start a test and ramp up to peak traffic (however that may be defined) very quickly. In these situations, it is tempting to use the think times in your test case scenarios as the knob for adjusting the “level of load” that you drive, where “level of load” is defined as the number of HTTP requests per second.

To see where I’m going with this, suppose you have set up your test cases and workload to generate your target load with 500 virtual users and 15-second think times between pages. Now suppose you need to generate the same target load, but with only 100 users. It is tempting to assume that dividing the think times by 5 will increase each user’s throughput by a factor of 5, thereby allowing you to divide the number of virtual users by 5 while generating the same total load. The assumption can be stated as follows:

Assumption: If 500 virtual users with 15 sec think times yields x requests/sec, then 100 virtual users with 3 sec think times should also yield x requests/sec.

This assumption is flawed for a couple of big reasons:

  1. Response times are not being taken into account – both think times and response times determine how much throughput your virtual users will generate.
  2. It requires a second assumption – that your application’s response times are unaffected by how the load is generated. That is, you are assuming response times will be the same under the 500-user workload and the 100-user workload. It also assumes the application cares only about the average request rate and nothing else.

Point #1 above can be illustrated with basic math. An “average HTTP requests per second” figure can be calculated with the following equation:

average requests/sec = (number of virtual users × requests per page) / (think time per page + response time per page)

We have two workloads in question. For now, let’s go with the assumption that average response times will be the same for each workload. Say we are dealing with a 2.5-second average response time per page, and that each user makes 10 requests per page, on average (the HTML document + 9 page assets). Then we have the following:

500-user workload with 15 sec think times

(500 users × 10 requests/page) / (15 sec + 2.5 sec) ≈ 286 requests/sec

100-user workload with 3 sec think times

(100 users × 10 requests/page) / (3 sec + 2.5 sec) ≈ 182 requests/sec

Clearly the two workloads above are not equivalent in terms of the requests/sec they generate. Now that we are taking response times into account (and assuming they will be the same in each workload), we can set x = think time per page and solve (100 × 10) / (x + 2.5) = 286, which gives x ≈ 1 second. How natural is it for a user to pause for only about a second between pages? At the very least, it’s significantly different user behavior from the original workload.
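If you want to experiment with these numbers yourself, the arithmetic is easy to script. Here is a minimal Python sketch of the same calculation; the workload figures (users, requests per page, think time, response time) are simply the assumed values from the example above:

```python
def avg_requests_per_sec(users, requests_per_page, think_time_s, response_time_s):
    """Average HTTP requests/sec for a closed workload: each virtual user
    repeatedly loads a page (requests_per_page requests, taking response_time_s
    in total) and then pauses for think_time_s before the next page."""
    return users * requests_per_page / (think_time_s + response_time_s)

# 500-user baseline vs. the naive "divide think time by 5" 100-user workload
baseline = avg_requests_per_sec(500, 10, 15.0, 2.5)   # ~286 requests/sec
naive = avg_requests_per_sec(100, 10, 3.0, 2.5)       # ~182 requests/sec

# Think time the 100-user workload actually needs to match the baseline:
# 100 * 10 / (x + 2.5) = baseline  =>  x = 100 * 10 / baseline - 2.5
required_think = 100 * 10 / baseline - 2.5            # ~1.0 second

print(f"500-user baseline: {baseline:.0f} req/s")
print(f"naive 100-user workload: {naive:.0f} req/s")
print(f"think time needed for 100 users to match the baseline: {required_think:.1f} s")
```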

The second point above can be illustrated with a real-world experience of mine before I had the benefit of leveraging CloudTest Pro’s ability to scale to enormous numbers of users. I was working with a tool that is licensed by the number of concurrent virtual users in use. There were periods of time when multiple teams within the company were simultaneously making use of that license, thus constraining the number of users that each team could use.

I was testing a back-office application that essentially had two app layers and a database layer. The test case was simple: it consisted of two web service calls – one to get a sales tax calculation and one to place an order – with think time in between. The sales tax request was made directly to the first app layer, which made several outbound third-party calls and then forwarded the request to the second app layer, which managed connections to the database. The second app layer would then make a JDBC connection to the database and read the corresponding sales tax data. My standard workload consisted of 800 virtual users with think times modeled on data from production logs. Through a bit of calibration testing, I figured out that I could reduce the number of virtual users to 160, shorten the think times significantly, and still have a workload that generated the same number of sales tax and order placement requests per second.

After multiple rounds of testing with both workloads, I had a discrepancy. The 800-user and the 160-user workloads generated the same transaction rate against the application, but only the former hit a bottleneck. With the 160-user workload, the application ran flawlessly. With the 800-user workload, the database CPU pegged at 100% and the connection pool became completely saturated. It turns out that longer think times can consume more resources than you might expect: in this case, they caused the app layers to maintain and manage more concurrent connections, specifically database connections. Enough concurrent connections were created to max out the database connection pool, causing a bottleneck and forcing incoming requests to queue.
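For the curious, Little’s Law gives a rough way to reason about why the two workloads put such different pressure on the connection pool even at the same request rate: the average number of connections in use equals the request rate multiplied by how long each connection is effectively held. The request rates were identical, so the difference had to come from hold time – for example, if the pool’s timeout settings keep a connection open and counted against the pool across part of a user’s think time. The sketch below is a back-of-the-envelope illustration under that assumption; the request rate and hold times are made-up figures, not measurements from my test:

```python
def avg_connections_in_use(request_rate_per_s, hold_time_s):
    """Little's Law: average concurrency = arrival rate * average time each
    item (here, a pooled database connection) stays in use or held open."""
    return request_rate_per_s * hold_time_s

# Both workloads drove the same request rate; 35 req/s is an illustrative figure.
rate = 35.0

# Hypothetical hold times: with short think times the connection is in use only
# for the duration of the request, but with long think times (and a pool timeout
# longer than the think time) it effectively stays held much longer.
short_think_hold_s = 0.5
long_think_hold_s = 0.5 + 20.0

print(avg_connections_in_use(rate, short_think_hold_s))  # ~18 connections
print(avg_connections_in_use(rate, long_think_hold_s))   # ~720 connections
```

If the pool is capped below that higher figure, requests start queuing for connections, which is consistent with the saturation I saw in the 800-user runs.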

The underlying theme here is that there are so many variables involved in any web architecture that it’s best to model your virtual users as closely as possible to real-world users – in particular, model your think times carefully. In my case, the JBoss connection pooling configuration and connection timeout values were the unseen variables that responded to changing think times. Who knows what it will be in your case?

About the Author

SOASTA Marketing
