What You Don’t Know About 3rd Party Scripts Can Hurt You!

Your website is probably one of – if not the – primary revenue drivers in your company. But do you know what’s really happening to your visitors? If you have third party components on your site, how do you manage the risk and the ROI of the tools provided by these parties?
Join the discussion with Web Performance experts: Scott D. Lowe & Buddy Brewer, SVP of Products at SOASTA, to get a better understanding of the impact 3rd party scripts can have on your site. We will be joined by Jason Trester, CBS Interactive’s Senior Director of Site Engineering. Jason will share with the audience his experience with SOASTA and how it’s transformed their audience experience.


Scott: Good afternoon, or good morning or good evening, depending on where you’re from. Thank you for joining us today as we talk to you about what you don’t know about 3rd party scripts that can hurt you on your website. This webinar is brought to you by SOASTA and ActualTech Media. Today we have some really good information for you. One of the things I love is customer case studies. Today we’re going to hear from Jason Trester, who is the Senior Director of Site Engineering for CBS Interactive. Jason, thank you for joining us today to talk about your experience with SOASTA and CBS Interactive.


Jason: Thanks for having me Scott.


Scott: We’re also joined by Buddy Brewer, who is the Senior Vice President of Products for SOASTA. Buddy thank you for joining us today as well.


Buddy: Thanks for having me Scott.


Scott: My name is Scott Lowe, I’m a partner and Co-Founder of ActualTech Media, and we’re going to teach you a lot today about what you need to know about maintaining performance levels on your website.


You know, if we think about the problem, where you have a website, especially for those that are using the website for revenue generation-type activities, upward of 75% of the requests on that site can take place via third party scripts. That’s not necessarily a good situation for many organizations because you don’t always have insight into or control over what those scripts are actually doing. You need to have the right tools to be able to understand what those scripts are doing so that you can optimize the experience for your users and make sure you’re not leaving money on the table when it comes to potentially poor user experience.


Do you really know what kind of experience your visitors are having when they come to your site? More importantly, even if you do know what kind of experience they’re having, how do you manage the risks and the return on investment of the tools that are provided by the third parties that you’ve chosen to integrate with your sites?


To that end we need ways that we can measure 3rd party scripts to make sure that they’re not driving your customers away. That’s key. Anybody can have a bad experience on a website; I’m sure we’ve all done it. It’s taking too long to load, so you hit the close button and go somewhere else. We don’t want that to happen. We need to truly understand the impact of every script on visitors. We have to have a way that we can get really detailed and really granular so that we can make sure that we are optimizing the tools we’re using for our users. Probably just as importantly, we have to make sure that decision makers can have detailed, critical but simple and actionable metrics so that they can make good decisions based on what users are seeing when they’re visiting the sites.


I’m going to jump right in and introduce once again Jason Trester the Senior Director of Site Engineering for CBS Interactive. He maintains sites including CNET, ZDNet and TechRepublic. Jason, why don’t you take it away and talk to us about what your experience has been with SOASTA and some of the tools you have in place to measure user experience?


Jason: Thanks Scott, will do. Do I have control of the presentation?


Scott: You should.


Jason: Cool. All right, so starting back in 2014, I believe, we went through a pretty extensive site architecture overhaul. We re-platformed, moved from one code base to another, re-platformed our CMS; everything and anything to do with the site, we changed. One of the primary goals from an engineering perspective at that time was to make sure that wherever we moved to was significantly more performant, much faster than what we had at that time.


This is just a high-level overview of the size, the volume of traffic we do, the number of page views. The thing that’s going to really jump out at you more than likely will be the number of requests we make per page load. Right now we have 200 million page views per month roughly. On average we have 50 million unique visitors per month, and this is just for CNET alone. This doesn’t include ZDNet and TechRepublic. Then the average number of requests we make per page is 350 to 400. Now the crazy piece about that is, regarding the number that Scott threw out previously, that on typical media sites upwards of 75% of requests can be 3rd party requests: for us, roughly 80% of all requests are 3rd party requests, and I’ll get into that shortly.


When we first started back in 2014 our goal was to get to 8-second page load times, then our goal was to get to 6-second page load times and then we got down to 4-second page load times.


How do we do it? How do we get from 8 seconds, to 6 to 4 to 2? What have we done over that time? How has performance become an integral part of our culture? How do we get the product team? How do we involve the product team? How do we involve non-engineering teams in this process? How do we set certain thresholds, performance budgets? Et cetera, et cetera. Those are some of the things we’re going to speak to and what SOASTA has allowed us to do, and not only SOASTA but some of the ways we’re using SOASTA which I think may be unique to our situation. Go ahead and go to the next slide.


It begins by knowing your numbers. Like I said, we set the goals at a high level for page load times, and then when we went through the re-platforming we did an audit of our site. What we wanted to do is understand what every request did to our site, not only our requests but also 3rd party requests. What was the cost, the performance cost, of everything on our site?


Obviously our product team and our business team, they knew the return on investment from a business perspective so what we wanted to do is identify the cost of those things. Then obviously we needed to leverage tools to enable that and we wanted to develop a data-driven optimization process that we not only used on the engineering side but we also have a data team that this plays into. Can you go to the next slide please?


This is a … This is what I see every day when I come into my office. This is SOASTA. They have a ton of different dashboards you can use and we use a lot of them. This is the one that … What this is, is it shows the page load time for our top, what is that? 12 highest priority page types. You can see like I said earlier our average page load time is in the two and a half second range. Every day I come in my office you can see that I’m watching on the 60-minute rolling dashboard what our page load times are across the site.


You can also see in the background we have … That’s a 50-inch monitor in my office. We also have a 60-inch monitor over the developers and engineers’ section so that they can monitor SOASTA and various metrics on our site. Can you go to the next slide please?


Scott: Sure, before we do that, Jason, I just want to make sure the audience is aware that if you have questions for Jason or Buddy, please use the Questions panel in the GoToWebinar control panel, and we’ll get to as many questions as we can at the end of the event.


Jason: SOASTA mPulse enables CBS Interactive staff to put site metrics in front of a non-technical audience. The last image you saw, where you could see all of the page load times across the site, that was really the first thing that we were able to put in front of our GMs and business teams so that they could very easily understand, this is what our current page load times are.


Everyone, non-engineering and engineering, everyone hears what the industry’s trying to move to from a performance standpoint. Whenever we started the same exercise for ZDNet, I put up in front of them 8-second, 12-second, 14-second page load times. Now we’re down to around roughly 4- to 5-second page load times, because that exercise for ZDNet came after the CNET exercise. It’s really easy for non-technical teams to understand when they see, in black and white, this is how long it takes for your page to load. It’s real-time user monitoring, so this is exactly what users are experiencing on average.


Then there’s SpeedCurve, which is one of SOASTA’s products. SpeedCurve allows us to do synthetic tests, and with the synthetic tests we generate HAR files, and in those HAR files we’re able to dive into the very detailed and granular data that I’m going to be showing you shortly. That allows us to start to systematically tackle, one by one, the various requests on the site, the various services on the site, the various things we load in those 350 to 400 requests, so that we can gradually drive that overall page load time down over time. Next slide please.


Again this is just to recap, and hopefully you can actually see the data here. You can see for our home page it’s 2.3 seconds; for our article page, at this time, 3 seconds; et cetera, et cetera, for the review page. Another cool thing about this is you can see the number of page loads for each page type. In this example you would see 570,000 page loads for our article page and roughly 250,000 page loads for our home page. Again, this is what we used to put in front of non-technical folks. Obviously I look at it every day as well. Next slide.


This shows our overall page load time. This is another dashboard provided by SOASTA. You can see it’s 2.48 seconds at the time of this report. Then we tried to tie it to various business metrics: bounce rate, completed sessions, page views, average session length. These are the KPIs that we’re monitoring as they relate to page load times, and over time you could see to a certain degree that the corresponding KPIs improved as page load time came down. You can also get page load time for each individual page; you can get it by browser, by operating system, by country, et cetera, et cetera. The amount of data provided is pretty much endless. Next slide.


This is another aha moment for a lot of folks. When I first shared this with our product team, they were kind of blown away. I don’t know if they should be, because every day it seems that we have another request to add yet another 3rd party service to our site, but I think it kind of gets lost over time when you don’t realize how many services you’re actually adding to the site. If you look at this report, the red is the percentage of requests being made by a 3rd party on our site and the green is the percentage of requests that we make for our own content. It’s pretty staggering. It’s pretty astonishing that you can see on average, let’s say, only 20% of the requests on our site are ones we actually have control of. Everything else is a 3rd party.
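The first-party versus 3rd party split described above can be approximated from any list of request URLs. The sketch below is only an illustration in Python: the domain suffixes, the sample URLs and the function names are hypothetical and are not taken from the CNET dashboards.

```python
from urllib.parse import urlparse

# Hypothetical domains that the site owner controls (assumption for illustration).
FIRST_PARTY_SUFFIXES = ("example.com", "examplecdn.com")

def is_first_party(url: str) -> bool:
    """True when the URL's host is one of our domains or a subdomain of one."""
    host = urlparse(url).hostname or ""
    return any(host == s or host.endswith("." + s) for s in FIRST_PARTY_SUFFIXES)

def third_party_share(urls) -> float:
    """Percentage of requests that go to hosts we do not control."""
    if not urls:
        return 0.0
    third = sum(1 for u in urls if not is_first_party(u))
    return 100.0 * third / len(urls)

# Made-up request list standing in for one page load.
sample = [
    "https://www.example.com/",
    "https://static.examplecdn.com/app.js",
    "https://ads.adnetwork.test/bid",
    "https://track.analytics.test/pixel.gif",
    "https://tags.tagmanager.test/tag.js",
]
print(f"{third_party_share(sample):.0f}% of requests are 3rd party")
```

Matching on a domain suffix with a leading dot avoids counting a lookalike host such as notexample.com as first party.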


It becomes sort of daunting when you first look at this and you think, “Wow, how in the world are we going to be able to make an impact to performance whenever we only control 20% of the puzzle, 20% of the pie?” We did reach out and I will kind of go through some of these examples. We did reach out and worked directly with a lot of 3rd party vendors to either replace their service or have them optimize their service. We made recommendations for their services. We worked together to identify what pieces of their services did we actually need and use. Then we worked to decouple some of the things we didn’t actually use.


Another cool thing about this particular graph is you can see roughly around midnight on April 1st that the percentage of internal requests significantly increased. The reason for that increase is because at that point in time we started serving direct sold ad campaigns. As we know, exchange ad campaigns make a lot more requests to 3rd parties because they’re not even serving the creative themselves. They are typically making multiple downstream requests. You can see exactly at what point in time the exchanges stopped serving and direct sold campaigns started. Next slide please.


This is a great … this is a great overview of what SOASTA provides via SpeedCurve. With SpeedCurve, we are able to see individual runs. Each deployment release number is an actual deployment that we made to the site, and what we can do is, if we do see the times go up in SpeedCurve, we can try to tie that to one of our deployments to easily identify which deployment may have caused an increase, or whether the increase is related to a deployment at all.
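As a rough illustration of tying a slowdown to a deployment, the sketch below finds the first synthetic run that exceeds a baseline and then looks up the most recent release before it. The data, the 20% tolerance and the function names are assumptions for the example, not how SpeedCurve itself reports this.

```python
from bisect import bisect_right

def first_regression(runs, baseline_s, tolerance=0.2):
    """Timestamp of the first run slower than baseline * (1 + tolerance), or None."""
    limit = baseline_s * (1 + tolerance)
    for ts, load_s in runs:
        if load_s > limit:
            return ts
    return None

def suspect_release(deployments, regression_ts):
    """Label of the release deployed most recently at or before regression_ts.

    `deployments` must be sorted by timestamp. Returns None if nothing was
    deployed before the regression.
    """
    times = [ts for ts, _ in deployments]
    i = bisect_right(times, regression_ts)
    return deployments[i - 1][1] if i else None

# Hypothetical data: timestamps in minutes, load times in seconds.
deployments = [(0, "release-101"), (120, "release-102"), (300, "release-103")]
runs = [(30, 2.3), (90, 2.4), (150, 3.4), (210, 3.5)]

ts = first_regression(runs, baseline_s=2.3)
print(ts, suspect_release(deployments, ts))
```

A match like this only points at a candidate; the increase may still be unrelated to the deployment.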


You can also see with this report, this is the first time you start to see fully loaded time. 23.4 seconds probably jumps out at you when I’m sitting here talking about 2-second page load times and then I show you a 23.4-second page load time. Well, that’s where we get into the 250 additional, 300 and some additional requests that we make. What we do is we prioritize, and I’m going to get into that in a moment. We prioritize our requests in such a way that for initial page load, the initial visual rendering of the page and what a user experiences or perceives to be the loading of the page, we do the absolute best job we possibly can to minimize the impact on what our users experience on initial page load. That’s where we come up with the 2.3-second page load time, which typically ties really closely, couples really closely, with DOM complete time.


Then the 23.4 seconds is a lot of stuff that we defer until after page load. Now that we have optimized page load on CNET to this degree, one of the next steps in this process (like I said earlier, it started in 2014, it’s still ongoing and it will continue indefinitely) is to reduce the fully loaded time from that 23.4-second time on CNET in particular. We want to continue to chip away at that next. Next slide please.


One of the things we do, once you get down to the 2.3-second range or whatever your defined range is, whatever your goal is, once you can get there now the question becomes, “How do we stay there?” because as we know every day, or at least every week it seems, there’s going to be something else that someone wants to add to the site because it’s going to bring in more money. What we’ve developed is an A/B testing process that allows us to side-by-side compare every 3rd party script that is being requested to be added to the site.


The cool thing about SOASTA is that, along with some other tools that we use, it allows us to side-by-side compare the exact impact of every 3rd party script that we add to the site. Now, there’s a couple ways you can do this. If it’s a 3rd party script that’s extremely heavy, and one of the use cases I’m going to show you is header bidding, if it’s something that’s really heavy you can put in front of product folks exactly what it does to your overall page load time.


Now let’s just say that it’s not that heavy yet we know there is a performance cost to it but it’s in, let’s say, the hundred or the thousand millisecond range right? There is a performance cost to that but since page load times vary greatly, that’s when you get into … but you still want to be able to show the difference and the impact of that 3rd party script. That’s where you get into the synthetic testing that SpeedCurve offers. I’ll dive into a little bit about how we extract that data and present that in a way that’s easy to consume but first, next slide please, first we’ll probably jump into the most recent header bidding example that we went through.


You can see on December 7th we implemented, and this is SOASTA’s report, it’s similar to what I showed you earlier, we implemented header bidding on our site. For those of you who don’t know, header bidding is a cool way for us to get more money for the same ad slots, where we allow advertisers to compete against each other and bid against each other whenever we do not have direct sold ad campaigns in those ad slots. What we don’t want to do as a business is leave money on the table, but in order to not leave money on the table, we also had to consider the ramifications and the impact of adding these types of external services, which are very resource-intensive and make a lot of requests on your site.


Back in December we added a header bidding service to our site. You can see our page load time for our article pages was 3.65 seconds. They gave us a couple of different timeout variables. They allowed users to bid for the ad slot for 900 milliseconds in one group and 1500 milliseconds in another group. You can see that this service literally added 6 seconds to page load time. This is real-time user monitoring, control versus non-control, in an A/B test. Obviously that’s a pretty staggering number and that’s something that we took back to our revenue ops team. We took it back to executive VPs et cetera and showed them that yes, we can bring in what they propose is more revenue, but we can also impact our page load times by 6 seconds, with a 267% increase in page load time over our control. Next slide please.
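A control-versus-variant comparison like the one described can be reduced to a few lines. This is a minimal sketch that assumes you already have per-visit page load samples for each group; it uses the median as the summary statistic, which is a choice made for this example rather than something stated in the talk, and the sample values are invented.

```python
from statistics import median

def compare_groups(control_ms, variant_ms):
    """Summarize the load-time cost of the variant relative to the control group."""
    c = median(control_ms)
    v = median(variant_ms)
    return {
        "control_ms": c,
        "variant_ms": v,
        "added_ms": v - c,
        "pct_increase": 100.0 * (v - c) / c,
    }

# Invented page load samples in milliseconds (not the CNET measurements).
control = [2000, 2100, 1900, 2200, 2000]
variant = [3000, 3100, 2900, 3200, 3000]

result = compare_groups(control, variant)
print(result)
```

The median is less sensitive than the mean to a handful of very slow visits, which matters because page load times vary widely between visitors.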


What we did was we dove into what that service was doing exactly, using SpeedCurve and the HAR files that are generated by SpeedCurve, and using a patent-pending script that we wrote that goes through and analyzes all of the bandwidth consumed by a particular service, analyzes all the requests and rolls everything up, whether it be synchronous or asynchronous, and gives us a bandwidth consumption cost of that service.


The reason we use bandwidth consumption cost as one of the three primary determining factors as to whether or not we add a service to the site is because every service you work with, every 3rd party you work with, is going to come back and tell you, “Hey, we load this asynchronously. It’s not really going to do anything to performance because it’s loaded asynchronously.” Well, that would be true if browsers could run an infinite number of requests in parallel, but that’s not the case. That would also be true if the devices that we use had an infinite amount of processing power, which is not true.


The reality is, while these requests may run in parallel, we can’t run every request in parallel simultaneously, right? At a certain point, there is a cost to your performance. The bandwidth consumption time in milliseconds, which I’m going to show you, is the middle column here in this document. We don’t look at it as exactly how much time it adds to the bottom line, but it is the performance cost in terms of milliseconds for all 33 requests that your service is making.
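The point about limited parallelism can be made concrete with a small simulation. The sketch below assumes requests start in order and that at most a fixed number can be in flight at once; the limit of 6 and the request durations are illustrative assumptions, and real browsers schedule requests in more complicated ways.

```python
import heapq

def finish_time_ms(durations_ms, max_parallel):
    """Time at which the last request finishes when at most max_parallel run at once."""
    if max_parallel < 1:
        raise ValueError("max_parallel must be at least 1")
    slots = [0] * max_parallel  # time at which each connection slot becomes free
    heapq.heapify(slots)
    for d in durations_ms:
        start = heapq.heappop(slots)      # earliest free slot
        heapq.heappush(slots, start + d)  # slot is busy until the request ends
    return max(slots)

requests = [100] * 30  # thirty hypothetical "asynchronous" requests of 100 ms each

print(finish_time_ms(requests, max_parallel=30))  # every request truly in parallel
print(finish_time_ms(requests, max_parallel=6))   # only six in flight at a time
```

With unlimited parallelism the thirty requests would cost one request's worth of time; with six slots they queue up in five rounds, so the asynchronous work still takes time the page's own requests could have used.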


This is a pretty cool example right here. What we did was we measured … These are the three things we look at. We look at the number of requests made. We look at time consumed in milliseconds synchronously, and we look at the size in kilobytes for the service that we’re looking at adding to the site. We then compared that to the entire JavaScript library that we own, that we load on our site for each page to work. Now this is an example on our front door.
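The three measures mentioned here (request count, time in milliseconds, size in kilobytes) can be rolled up per service from a HAR file. The script the team wrote is not shown in the webinar, so the following is only a sketch of the general idea: the host-to-service mapping and the sample entries are hypothetical, and it relies on the standard HAR fields `request.url`, `time` and `response.bodySize`.

```python
from collections import defaultdict
from urllib.parse import urlparse

def roll_up(har, host_to_service):
    """Sum request count, elapsed time (ms) and body size (KB) per service."""
    totals = defaultdict(lambda: {"requests": 0, "time_ms": 0.0, "size_kb": 0.0})
    for entry in har["log"]["entries"]:
        host = urlparse(entry["request"]["url"]).hostname or ""
        service = host_to_service.get(host, "unmapped")
        t = totals[service]
        t["requests"] += 1
        t["time_ms"] += max(entry.get("time", 0), 0)
        # HAR uses -1 for an unknown body size; count that as zero.
        t["size_kb"] += max(entry["response"].get("bodySize", 0), 0) / 1024
    return dict(totals)

# Tiny made-up HAR fragment; a real file would be loaded with json.load().
har = {"log": {"entries": [
    {"request": {"url": "https://bidder.vendor.test/loader.js"},
     "time": 120.0, "response": {"bodySize": 20480}},
    {"request": {"url": "https://bidder.vendor.test/config"},
     "time": 80.0, "response": {"bodySize": 1024}},
    {"request": {"url": "https://www.example.com/app.js"},
     "time": 50.0, "response": {"bodySize": -1}},
]}}
host_to_service = {
    "bidder.vendor.test": "header bidding",
    "www.example.com": "first party",
}

costs = roll_up(har, host_to_service)
print(costs)
```

The summed time is a total across requests, not wall-clock time, which matches the talk's caveat that the figure is a cost indicator rather than the exact amount added to page load.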


I put that in front of the product team and say, this service that you want to add to the site takes up roughly three times more bandwidth to run than everything you want to do on the site with JavaScript. I think that’s a very powerful tool to put in front of folks so that they understand the cost of some of these services that they want to add to the site, and rightfully so. We’re here to make money; that’s what we’re trying to do as a business, but we also have to do it in a way that does not impact our users too much. Let’s see here if there’s anything else.


Another cool thing to point out about this is the sample tool made 31 requests, but the 3rd party service itself that we use only made five requests. They then made 27 additional 3rd party requests on their side. Going back to what SpeedCurve does: it runs automatically and gives us a HAR file, and then we are able to extract this data from that HAR file so that we can assign these performance costs to each service that they want to add to the site. Next slide please.
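Charging downstream requests back to the service that triggered them needs some record of which request initiated which. The sketch below assumes that record is available as a simple URL-to-initiator mapping (how you obtain it depends on your tooling, and this mapping is invented for the example); it walks each request up to its root and counts requests per root.

```python
def root_of(url, initiated_by):
    """Follow the initiator chain from url up to the request that started it."""
    seen = set()
    while url in initiated_by and url not in seen:
        seen.add(url)  # guard against a cyclic mapping
        url = initiated_by[url]
    return url

def count_by_root(urls, initiated_by):
    """Number of requests attributed to each root request (the root included)."""
    counts = {}
    for u in urls:
        r = root_of(u, initiated_by)
        counts[r] = counts.get(r, 0) + 1
    return counts

# Hypothetical chain: a vendor loader calls two bidders, one of which fires a pixel.
initiated_by = {
    "https://bidder-a.test/bid": "https://vendor.test/loader.js",
    "https://bidder-a.test/pixel": "https://bidder-a.test/bid",
    "https://bidder-b.test/bid": "https://vendor.test/loader.js",
}
urls = ["https://vendor.test/loader.js", *initiated_by]

counts = count_by_root(urls, initiated_by)
print(counts)
```

In this made-up case the vendor makes one request directly, but four requests are attributed to it once its downstream calls are included.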


That was on 12/7, and then we worked with this 3rd party provider for roughly a month … No, sorry, roughly almost two months, you can see here. On 1/26 they came back with, “Hey, this is our new improved service. We’ve implemented the recommendations and suggestions that you guys have made. We changed some things on our site.” They removed some of what they call the bad bidders. I don’t want to call out names, but they removed some of the other 3rd parties that they were calling, because we identified them as being the most resource-intensive services that they were calling. They excluded them from the program for us, and they were able to improve performance and get it down to where our page load time was 8.1 seconds. In that same amount of time, we were able to continue to improve our site, so again it was sort of a wash, but you could see that they made significant improvements over the requests that they made during the time that we have been working with them. Can you go to the next slide please?


Now that we have this, what do we do with it, right? We develop data-driven optimization processes. We use the results of these performance metrics to make decisions based on the performance of 3rd party scripts. We use the information to make critical decisions, and this is where we’re going to get into what I was speaking about earlier with determining load priority: when we want to load things on the site, what’s a higher priority, and establishing our load order for requests.


Then another thing, and you’re going to see it in a second, is that when you put your load priority in front of folks it allows them to understand, “Wow, this is what’s most important to us.” Then it allows you to make some pretty important decisions regarding, is that actually the case? Should we be loading this thing first, or should we be loading this other thing first because it’s a higher priority? Next slide please.


Before we started doing the load priority order, the thing we wanted to do, and this goes back to what we did with the A/B test for whenever new services get added, was go through every request on the site, every single request, and every single service on the site, and then roll up every request that relates to that service, or that is called by or made by that service, so that we could assign a performance cost to every single service on our site.


Right now you can see that we have 14 3rd party services on our site. This doesn’t include the GPT requests made with ads et cetera, but this is things like ad insights, ad targeting, related content, ad framework, revenue, user surveys, heat maps, all the cool tools, service manager, all the cool stuff, DW tracking, Omniture tracking, comScore tracking, Nielsen tracking, et cetera, et cetera, all the things that everyone for the most part has to have on their site if they’re at scale. What we did over that two-year time period was, we went through and analyzed every 3rd party tracking service on our site, and again the way we did that was with SpeedCurve, the HAR files that were generated, and then extracting the data from the HAR files in a meaningful manner and establishing these performance costs.


What we do is we have established around a thousand milliseconds as our ideal. If we’re going to add a 3rd party script to the site, it obviously depends on what they’re doing and how much revenue they bring to the table. Our goal for the most part is to stay below that 1,000-millisecond threshold of total bandwidth consumed. You can see on this spreadsheet that we have quite a few that are above that threshold, but it has been deemed by our business team and us in engineering that these services are critical to our business and we need to have them on the site. In addition to that we also say, “What can we do to work with these third parties to reduce the performance cost?” All right, can you go to the next slide please?
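A budget like the 1,000-millisecond threshold is easy to check mechanically once each service has a rolled-up cost. In this sketch only the 1,000 ms figure comes from the talk; the service costs and the function name are made up for illustration.

```python
BUDGET_MS = 1000  # threshold mentioned in the talk

def over_budget(costs_ms, budget_ms=BUDGET_MS):
    """Return (service, cost) pairs strictly above the budget, most expensive first."""
    flagged = [(name, ms) for name, ms in costs_ms.items() if ms > budget_ms]
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)

# Hypothetical per-service costs in milliseconds.
costs_ms = {
    "ad targeting": 1800,
    "user surveys": 650,
    "heat maps": 1200,
    "related content": 1000,
}

flagged = over_budget(costs_ms)
for name, ms in flagged:
    print(f"{name}: {ms} ms (over by {ms - BUDGET_MS} ms)")
```

A list like this is a starting point for the conversation described above: a service over budget may still stay on the site if the business considers it critical, but it becomes a candidate for working with the vendor to reduce its cost.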


All right, so here you can see how we’ve set everything according to load priority. Here you can see that what we decided is, obviously, the thing most important to us is to get text, the HTML, the CSS, in front of our users. It’s also important to us. One of the things we got into, and I’ll speak to this in a little bit, when we first started this exercise was that we were outsourcing our fonts to a 3rd party. That created quite a few problems, since fonts needed to load very early in the process, otherwise we experience what’s called font flash.


You can see that we have one that’s struck out here, under heat maps. That’s because we replaced that with another service that, for a number of reasons, but performance being one of them, was a better service to go with. You can also see here that we have these grey lines that give you a rough estimate of how we batch load these requests on our site. Visually rendered, RequireJS loaded: these are ways for us to group certain 3rd parties together, and not only 3rd parties but also our internal requests, and we set and establish priorities based on what they do for the business, what they do for user experience, and what our priorities as the business are.


You can see that ads, for example, we load with RequireJS. RequireJS is a part of the JavaScript framework that we use. It’s the tool that allows us to sort of group JavaScript based on pages and the way it’s used, and make sure that we’re not loading too much JavaScript or unnecessary JavaScript on pages et cetera. You can see here that we currently load ads in with Require. Well, something that we’re looking to do, probably this week, is remove the GPT ad call from there and move it up, to try to make that call sooner in order to increase revenue, et cetera, et cetera.


You can also see that we defer a lot of requests to after DOM complete: user surveys, social promotion, tracking, light boxes, affiliate link revenue, et cetera. We do a lot of this through a tag manager, Tealium. Certain groups wanted to use Tealium to insert the tags so it didn’t require engineering to insert them. Once we realized that we could also use it as one of the tools in our tool belt to sequence these requests, load them in groups and sort of defer them until after a certain point, that became very powerful, because then we could continue to drive that DOM complete, that page load time, down even further.
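A load priority list of this kind is essentially data: each item belongs to a phase and has a rank within it. The sketch below shows one way to model and order such a list in Python. The phase names, ranks and items are illustrative assumptions loosely based on the groupings mentioned in the talk, and it does not show how Tealium or RequireJS are actually configured.

```python
# Phases in the order they occur during a page load (labels are assumptions).
PHASES = ["initial render", "framework loaded", "after page load"]

def load_plan(items):
    """Order (name, phase, rank) items by phase, then by rank within the phase."""
    phase_order = {phase: i for i, phase in enumerate(PHASES)}
    ordered = sorted(items, key=lambda item: (phase_order[item[1]], item[2]))
    return [name for name, _phase, _rank in ordered]

# Hypothetical services; a lower rank loads earlier within its phase.
items = [
    ("user surveys", "after page load", 2),
    ("ads", "framework loaded", 1),
    ("html and css", "initial render", 1),
    ("fonts", "initial render", 2),
    ("tracking", "after page load", 1),
]

plan = load_plan(items)
print(plan)
```

Keeping the list as plain data makes it easy to review with non-engineering teams and to move an item, such as the ad call, from one phase to an earlier one.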


What we do is meet regularly with non-engineering teams, with our product teams, to go through the load priority. We do this once a month or so to make sure that the load priorities are in line with what we’ve established to be business priorities. Also, once a service goes through the A/B test and we decide yes, it’s performant enough to add to the site, yes, it adds enough revenue to the mix, yes, it’s what we want to do as a business, at that point in time we talk about load priority and we decide where it should be loaded within that order. We don’t do that just on the engineering team. We do that with the business team. Can you go to the next slide please?


Back to what I was talking about with fonts, and I just want to call that example out because I think that really speaks to how much time, how much dedication, how much effort we spend focused on performance and how serious we are about it. When we did this back in 2014, when we did the site redesign and we moved over to a new architecture, one of the things that happened at that time was the design team went out and said, “We want to have new fonts and we’re going to use this 3rd party that hosts fonts, and they’re going to load the fonts on our site and they’re going to serve the fonts,” et cetera, et cetera. We were like, “Cool, sounds good.”


Well, what we found out when we dove into the performance analysis, when we started going through this exercise post-redesign, was that that particular font provider was doing a lot of stuff that we didn’t really see as a priority. They did, because it was techniques like font smoothing et cetera. They saw this as a priority because it made their fonts in certain edge cases a bit prettier, I guess, but from a page rendering, page load perspective it had a huge impact.


What we decided to do was work directly with that provider and try to establish some requirements and make sure that they were meeting certain thresholds, just like we do with all the other providers at this point. But the reality became, they really couldn’t break it out and do that for us in a way that would be efficient, so there was no way to get around the impact to our site’s performance if we continued to use that vendor. We said, well, why don’t we just host the fonts ourselves, because obviously you can get Google fonts et cetera, et cetera?


Unfortunately in this particular use case, our design team and product team had selected a font that was initially a print font, and the font provider had worked out a deal with the print font company where they had created some fonts, which work with their service only, that could be used on websites. Then we found ourselves in a pretty strange predicament where it was, okay, either we change the fonts across the site, which our design team did not want to do, or we try to recreate these fonts. What we did was we reached out to the original font foundry that created the fonts for this other company, and that’s where the six-month process came in.


I learned a ton about fonts, we all did. We learned to love fonts and we worked with them over a long period of time to create font files specifically for us that were very performant and then we hosted those fonts ourselves. Now obviously I don’t recommend that to anyone. What I would recommend is you go out, you find web fonts that work for your site and you buy those fonts and you host them yourself if you can. That’s where we got to now.


The way SOASTA enables these efforts is a lot of what I’ve already been speaking to. It was pretty easy, once we got into the data that SpeedCurve and mPulse provide, to understand exactly where the issues were. Before then we didn’t know that fonts were adding, let’s say, 20% or 30% to our overall page load time, and they were. Those are some of the things that are tied into this. Can you go to the next slide please?


I’ll give you a quick overview again of the tools that we use and why they matter. We obviously use SOASTA. That is by far the most powerful tool that we use from a performance standpoint. We use mPulse, one of the products. We also use SpeedCurve. We use both of these things religiously. Every day, like I said, I come in and that’s what we look at. It’s not just me. It’s my team. We also use Grafana, Sitespeed.io and PhantomJS. Jonathan Lee is a great engineering lead on my team who goes out, discovers these things, implements them for us, shows us how we can use them, how we can benefit from them. Grafana was the cool interface that you saw earlier in the slide that shows that 80% of our requests are 3rd party requests.


We use Tealium and RequireJS, though differently. They’re not the tools that help us to uncover the data, but they are the tools that allow us to set load priorities and determine when things get loaded on our site.


Buddy: Thanks Scott and Jason, that was great. I mean before I move forward with some of this other content, I just wanted to make a couple of comments about everything that Jason just presented on behalf of the CNET team. In 15 years of working in performance, I’ve worked with … I couldn’t tell you how many companies … Hundreds of companies, most of them pretty large ones across all industries, media, retail, finance, all this stuff, and it’s just … It’s rare that you find someone who is as cutting edge and on top of everything as Jason and the CNET team are. You read all the tech press and you hear all these clichés thrown around, and vendors do it too. They talk about digital business, and everyone talks about all these silos and how important it is for IT and the line of business to come together and all the rest.


If anybody wants to know what that looks like, just go back and look at the operationalizing intelligence content that Jason just went over because I think that’s a perfect example of bringing IT and a line of business together. You’ve got … Here’s a team that tracks in a single view, right, all these technical metrics like bandwidth and the number of requests and the payload for each of these 3rd parties and everything. There’s also how they’re loaded. Are they directly coded in to the top of the page or the bottom of the page, or are they loaded via a tag manager like Tealium?


Those are all the technical metrics, but then right next to that is all of this information about how valuable that actually is to the business. What’s the use case? Not just is the request coming from such and such a host, but what do we actually use it for? Is this for ad insights, for user surveys, for affiliate links? What’s the business value? It’s only by intersecting those things, because it’s never the case that the 3rd party that’s most valuable to your business is also the fastest. There are always tough decisions that people have to make. The only way to make those tough decisions is to bring all that information together exactly the way that CNET’s doing it.


The topic that I did want to mention though as we wrap up the presentation today is that CNET’s an interesting and somewhat unique example of someone who is very dialed in around performance. They understand what their performance goals are, everyone is motivated to go after them, and they know where that is in their priority list. However, what often happens with a lot of the companies that we work with is they’re great, they’ve built really big successful businesses, they have a lot of things going for them, but one of the things that they don’t know exactly how to do is how to get started working on performance, or where it actually fits in their priority list.


One of the things that we’ve noticed, that I’ve noticed in all the years of working with companies, and this is just a truism that everyone on this call I’m sure knows, is that there’s never ever enough time to do all of the things that you know you need to do. Understanding the things that you need to fix is one thing, but knowing how to fit that in to all of the other activities that you have going on that perhaps have nothing to do with performance, like new feature development or things like that, is important. There’s all the stuff that we provide to our customers that Jason just went over, but I wanted to cover something else that we do for people that are just trying to figure out how this all fits in the context of their overall business. The first one that I think is critically important is everyone ought to have a really clear idea of exactly how fast their website needs to be.


A lot of times I’ll ask people this question, “Do you know how fast you are?” and a lot of times people know. There are a lot of really great tools in the market that will measure how fast your website is. We think we have one of the best ones in SOASTA mPulse but there are a lot of tools. A lot of people I’ll talk to and they … fast they are. Then the second question is, “How fast should you actually be?” and oftentimes I find that the answer comes from different places and they can all be improved. Someone will say, “Well, we have this number because we had a meeting and we just kind of qualitatively felt like two and a half seconds seems snappy enough so we decided to go with that. We needed a goal so we had a meeting and we picked a goal.” That’s one way.


Another way is people will refer to a report that was done by one of the large analyst firms and they’ll say, “Well, Gartner said or Forrester said or whoever said that such and such percent of people will leave my website if it takes longer than this amount of time.” It’s a survey. It’s about other people, not that site’s customers, but it’s better than just picking a number out of thin air so that’s what they go with. A variation on that is they’ll go with a study that another company did, and there are several of these out there. Unfortunately, I’m not sure of any that are in the media space, but in retail, Walmart did a study, and in search, Google and Bing did studies. They had data on how delays affect people.


However, none of this stuff is actually your users, the people that are on your website. What we do is we collect all the information about all the visitors passing through your site using real user measurement, using RUM. We collect how long they’re waiting and we collect how that waiting impacts their behavior. For media sites like CNET what we usually recommend is looking at the number of pages per session, and that’s what this chart that we’re looking at right now illustrates. The X-axis is page load time tending towards 10 seconds, and the Y-axis, which is invisible here because I removed it, shows you how many pages per session for this particular website that we looked at. What you can see is that the slower the site is, and this is intuitive, the fewer pages people are looking at.


What you can also find is you can find the specific maximum. In this case, in this visible range it’s around 2.4 seconds or so where there’s a little peak in the green where people are looking at the most pages in their session and if you’re a media company who’s primarily generating revenue from ad impressions, then that’s the target you want to hit. You want your visitors looking at as many pages as possible. By using this type of analysis, we’re able to tell you exactly how fast you need to be with a backing methodology using data that’s based on your users, not other people’s users or not opinions. Once we’ve got that piece done, we know how fast we are and we know how fast we need to be, then we can start figuring out how we plot a course to get there. The next slide please, Scott.
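As a rough illustration of the analysis Buddy describes here, the sketch below groups sessions into load-time buckets, averages pages per session in each bucket, and picks the bucket with the highest average. It is not the mPulse methodology; the function names, the half-second bucket width, and the sample sessions are all made up for the example.

```python
from collections import defaultdict
from statistics import mean


def pages_per_session_by_load_time(sessions, bucket_width=0.5):
    """Average pages per session within each load-time bucket.

    `sessions` is an iterable of (load_time_seconds, pages_viewed) pairs.
    Returns {bucket_lower_edge_seconds: mean_pages_per_session}.
    """
    buckets = defaultdict(list)
    for load_time, pages in sessions:
        lower_edge = int(load_time // bucket_width) * bucket_width
        buckets[lower_edge].append(pages)
    return {edge: mean(pages) for edge, pages in sorted(buckets.items())}


def peak_bucket(curve):
    """Return the bucket whose sessions viewed the most pages on average."""
    return max(curve, key=curve.get)


# Made-up sessions for illustration only: (load time in seconds, pages viewed).
sample_sessions = [
    (1.2, 4), (1.4, 5),
    (2.1, 6), (2.4, 8),
    (2.6, 7), (2.9, 6),
    (4.3, 3), (4.8, 2),
    (9.1, 1),
]

curve = pages_per_session_by_load_time(sample_sessions)
target = peak_bucket(curve)
print(curve)
print("Target load-time bucket starts at", target, "seconds")
```

With real traffic you would want far more sessions per bucket before trusting the peak, since a handful of unusual sessions can move it.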


A lot of times what people do is they’ll make a big mistake. They’ll look at … Because no one really makes the whole website faster. What you really do almost all the time, when you actually sit down and make improvements, is you make improvements on a particular type of page. You might embark on an initiative to make your article pages better or maybe you’re going to work on your home page. How do you know where to begin? It goes back to that whole idea that there is more stuff that you could do than there is time to do all of those things in. You have some tough decisions to make about what you work on, and what I’ve seen a lot of times is people will make that decision by measuring all of the different types of pages on their website and then they’ll start with the slowest one and they’ll work on it. They say, “Well, once I fix the slowest page then I’ll look at the second slowest, and then the third slowest.”


What we see when we actually dig into the data is that that is almost always a mistake. It’s frequently the case that the slowest page on your website actually isn’t the page that you should work on first, if what you care about is some other metric that’s relevant to your business like ad revenue. It’s a little bit complicated to describe and I’ll talk through this chart just a little bit. We created an algorithm that produces a metric that we call the activity impact score. What we do is we collect every user experience passing through your site and we look at how long they had to wait on all of the pages within their session, and of course they go to different types of pages. Then for those, we also look at how the little delays or speed-ups on each type of page affected their likelihood to continue to engage with the website.


We find that depending on the type of the page, sometimes visitors are more or less patient. Earlier we talked about your whole site should target this load time. When we start to decompose that a bit and pick it apart into our action plan, we find that you need different goals for different types of pages and that different pages should be worked on first because there might be certain pages that while they’re not the slowest, they’re actually driving your users away the most because that’s the part of your site where they’re most sensitive to performance.


On this view we see that the green line is actually showing you the load times for all of the different types of pages on this particular site. Somewhere about 60% of the way across, we see a page where the blue bar, which is that activity impact score, is really low. That page, where the green dot is the highest, is the slowest page on this website. The naïve view would be to say, “Let’s go to that page and work on it because it’s the slowest page,” but in reality, the page to start with is the one all the way over on the left, which in this case is the home page, and right behind that is politics. In terms of load time those are somewhere in the middle, neither the fastest nor the slowest pages on the site. However, they’re the ones where users are most sensitive to performance.


If you were going to start working on perf and you only had time to work on two pages out of your website, you would skip the slowest pages and you would work on those two instead because the data shows that that’s where you’re going to get the highest ROI if what you’re targeting is user engagement. That’s the activity impact score. We’re trying to help you prioritize all of the things that you could do by ROI order.
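The activity impact score itself is SOASTA’s own algorithm and is not spelled out in the talk. The sketch below is only a simplified stand-in for the idea behind it: for each page type, compare how often visitors carried on to another page after a faster-than-median view versus a slower-than-median view, then rank page types by that gap instead of by raw load time. The function name, the median split, and the sample data are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean, median


def sensitivity_by_page_type(page_views):
    """Estimate how strongly each page type's speed relates to continued browsing.

    `page_views` is an iterable of (page_type, load_time_seconds, continued),
    where `continued` is True if the visitor went on to view another page.
    """
    by_type = defaultdict(list)
    for page_type, load_time, continued in page_views:
        by_type[page_type].append((load_time, continued))

    report = {}
    for page_type, views in by_type.items():
        cutoff = median(t for t, _ in views)
        fast = [int(c) for t, c in views if t <= cutoff]
        slow = [int(c) for t, c in views if t > cutoff]
        if not fast or not slow:
            continue  # not enough spread in load times to compare
        report[page_type] = {
            "median_load_time": cutoff,
            # Continuation rate after fast views minus rate after slow views.
            "sensitivity": mean(fast) - mean(slow),
        }
    return report


# Made-up page views for illustration only.
sample_views = [
    ("home", 1.0, True), ("home", 2.0, True),
    ("home", 3.0, False), ("home", 4.0, False),
    ("politics", 2.0, True), ("politics", 3.0, True),
    ("politics", 4.0, True), ("politics", 5.0, False),
    ("gallery", 6.0, True), ("gallery", 7.0, False),
    ("gallery", 8.0, True), ("gallery", 9.0, False),
]

report = sensitivity_by_page_type(sample_views)
ranked = sorted(report, key=lambda p: report[p]["sensitivity"], reverse=True)
slowest = max(report, key=lambda p: report[p]["median_load_time"])
print("Work on first:", ranked)
print("Slowest page type:", slowest)
```

In this toy data the gallery pages are the slowest but show no difference in behavior between fast and slow views, so they land at the bottom of the ranking, which mirrors the point being made about the chart.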


Should I be working on or do I have the time to work on performance at all right now? Maybe you have competing priorities within your team like you’ve got demands to build new features or whatever it is that you’re responsible for in addition to performance. There’s the question of is working on performance right now the highest ROI to my business?


As part of our analytics we give people the ability to make revenue forecasts of how a given speed-up in performance would affect their revenue. If you’re conversion-based, you’re in retail or you’re in media but you have a conversion funnel that’s very important to you and has a dollar value associated with it when someone completes that conversion, the math is … I can describe it in a straightforward way. The underlying algorithms are actually kind of sophisticated but generally what we do is we take people that are waiting a long time or waiting a little time and we put them all in buckets, right? We figure out within each of those buckets, the four-second load times, the five seconds, the six seconds, the seven seconds, what is their conversion rate?
That we collect right in the website with our mPulse analytics and pull it back, and we can associate the revenue with those conversions and then we can make … Usually what happens is that your performance will show up as a histogram. Everyone knows just because your average or your median is two and a half seconds doesn’t mean everyone gets that experience. You’ve got people that are getting faster and slower experiences. On any given day you’re delivering good and bad experiences at the same time.


This is one of those places where you don’t want to look at a summary metric. You want to look at the whole distribution. What we do is we plot that distribution and then our algorithm has little sliders where you can move it to the left and we’ll shift that whole population. We compute, if they were all just a little bit faster and hit whatever the target load time is, what that would mean in terms of the cascading events that go through to conversion rate and then order value, and then ultimately we produce a revenue metric for you. You can literally move these sliders around and forecast, “What would happen if I sped the website up by a second? How much additional money would I make per month?” You can also do risk analysis this way by sliding it to the right, “If I slowed down by a second how much would I be leaving on the table?” That’s on the conversion side. For traffic and ad revenue we do a similar approach.
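The slider idea can be sketched in a few lines, with the caveat that Buddy describes the real algorithms as more sophisticated than this. The sketch assumes whole-second load-time buckets with no gaps, assumes that sessions moved into a faster bucket simply take on that bucket’s observed conversion rate, and uses invented session counts, conversion rates, and order value.

```python
def forecast_revenue(buckets, average_order_value, speedup_seconds=0):
    """Estimate revenue if every session were `speedup_seconds` faster.

    `buckets` maps a whole-second load time to (sessions, conversion_rate).
    Bucket keys are assumed to be consecutive integers. A negative speedup
    models a slowdown. Sessions that would move past the fastest or slowest
    observed bucket are clamped to that bucket.
    """
    fastest, slowest = min(buckets), max(buckets)
    total = 0.0
    for load_time, (sessions, _) in buckets.items():
        target = min(max(load_time - speedup_seconds, fastest), slowest)
        _, conversion_rate = buckets[target]
        total += sessions * conversion_rate * average_order_value
    return total


# Made-up monthly numbers for illustration only.
sample_buckets = {
    2: (1000, 0.05),
    3: (2000, 0.04),
    4: (1000, 0.02),
}

baseline = forecast_revenue(sample_buckets, average_order_value=100)
one_second_faster = forecast_revenue(sample_buckets, 100, speedup_seconds=1)
one_second_slower = forecast_revenue(sample_buckets, 100, speedup_seconds=-1)

print("Baseline revenue:", round(baseline))
print("Gain if 1s faster:", round(one_second_faster - baseline))
print("Loss if 1s slower:", round(baseline - one_second_slower))
```

Sliding the other way, with a negative `speedup_seconds`, gives the risk view: how much would be left on the table if the site slowed down.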


The user experience … for the user of the mPulse tool is the same. It’s these sliders and it kicks out a revenue projection. Our methodology is a little different because we’re not going through a conversion to get there. What we’re looking at, and this is back to that session length, is that users who get faster experiences look at more pages. We figure out, as performance changes among those users who got two-second, three-second and four-second load times, how many pages did they look at in their session?


We produce that histogram, then we can start to mess with it and say well, “If everybody got a little bit faster experience, which bucket would that put all of these different people in? What would their aggregate number of pages per session be?” Suppose that means I go from a million page views to a million and a half page views, so a 50% lift. Knowing what my CPM is in terms of ad revenue and everything, and knowing how many ad impressions there are, we count all that stuff up and we produce an estimate that tells you what we think your additional ad revenue would be as a result of speeding the web page up.
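The arithmetic in that example can be worked through as a small sketch. It reuses the million to a million and a half page view figures from the talk; the bucket contents, the two ads per page, and the $5 CPM are invented for illustration, and the bucket-shifting rule is a simplification rather than the mPulse method.

```python
def forecast_page_views(buckets, speedup_seconds=0):
    """Estimate total page views if every session were `speedup_seconds` faster.

    `buckets` maps a whole-second load time to (sessions, pages_per_session).
    Bucket keys are assumed to be consecutive integers; sessions that would
    move past the fastest or slowest observed bucket are clamped to it.
    """
    fastest, slowest = min(buckets), max(buckets)
    total = 0
    for load_time, (sessions, _) in buckets.items():
        target = min(max(load_time - speedup_seconds, fastest), slowest)
        _, pages_per_session = buckets[target]
        total += sessions * pages_per_session
    return total


def ad_revenue(page_views, ads_per_page, cpm_dollars):
    """CPM is the price per thousand ad impressions."""
    impressions = page_views * ads_per_page
    return impressions * cpm_dollars / 1000


# Made-up numbers chosen to reproduce the 1.0M -> 1.5M example.
sample_buckets = {
    2: (50_000, 10),   # 50k sessions at 2s viewing 10 pages each
    3: (100_000, 5),   # 100k sessions at 3s viewing 5 pages each
}

views_now = forecast_page_views(sample_buckets)
views_faster = forecast_page_views(sample_buckets, speedup_seconds=1)
revenue_now = ad_revenue(views_now, ads_per_page=2, cpm_dollars=5)
revenue_faster = ad_revenue(views_faster, ads_per_page=2, cpm_dollars=5)

print("Page views:", views_now, "->", views_faster)
print("Ad revenue:", revenue_now, "->", revenue_faster)
```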


The whole reason for doing this is so that you can say at the end of the day working on performance this month or this quarter or whatever it is means you’re playing for a, let’s say, $15 million revenue opportunity. We want you to be able to quantify that so then you can sit down with all the other initiatives, “We could do this, or we could build this new feature. How much do we think that’s going to generate? Or we could do these other activities within the business.” Then you can normalize it all down to the revenue impact, what you think the return on all those investments will be. Oftentimes we find that there’s so much opportunity to improve your business by making your website faster that this performance initiative frequently shuffles to the top of the list.


You’ll end up working on this first a lot of times, but at least you can do that secure in the knowledge that by doing so you’re having the most positive impact you can on your business. That’s really what we’re all about with all the things that we do at SOASTA. We want to give you the tools to fix and analyze the problems but we want to do so in the context of your overall business because we’re seeing more and more of our customers who are on that path to get to where CNET is today, where they’re taking all of their performance operations and they’re doing it in the context of what that actually means to CNET’s revenue.


Thanks Scott for running the slides. That’s it for this piece that I wanted to go over. I guess in closing, if you’re interested in SOASTA it’s really easy to get started. You can just hit that link or come to our website. The product is called mPulse and you can actually get it through the website. We have plans that start at $99 a month. We collect everything about what your users are doing, connect it to how it’s impacting their behavior to help you put performance in the context of your business.


Scott: Buddy and Jason, thank you very much. That was an excellent presentation. We have a number of questions from the audience.


One of the questions that came in was a two-part question, I believe. He asked, “How are you … I’m interested in how you overlap real user monitoring and synthetic in your organization. Are you actually using anything apart from SpeedCurve for synthetic? For example, how do you do journey monitoring and alerting?” He follows up with that and says, “My query is more about the operational angle, how you monitor and react if one of your 3rd parties experiences an issue, for example while you’re asleep.”


Jason: Can you break that down? Let’s do a piece at a time. I think I heard how do we monitor and how do we react? We use … Primarily we’re not using SOASTA for the monitoring piece of it as far as the large scale. We have other services Pingdom for example, Rigor for example that we use for the monitoring and alerts as far as that goes. We mainly use SOASTA for the actual granular reporting of page speed data, 3rd party services, et cetera. We also use New Relic. I think we’re using pretty much every performance service that exists but what were some of the other parts of the question because that was kind of … I mean you were cutting out a little bit?


Scott: Sure. He asks how you overlap real user monitoring and synthetic.


Jason: Okay so the synthetic monitoring is continuous. It’s ongoing and that was the piece of it that we really loved about SpeedCurve. What it does is it runs I think it’s every … Let’s just say it’s every … It runs a test every six hours, so four times a day it runs a synthetic test for specific page types that we want to report on, and with those tests that’s how we get the HAR files for those tests, so it’s essentially a waterfall, a breakdown of every request that was made on a synthetic test.
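To make the HAR point concrete, here is a minimal sketch of how a HAR capture can be turned into the kind of “share of requests that are 3rd party” number mentioned earlier. It relies only on the standard HAR layout (`log.entries[].request.url`); the function name, the first-party domain list, and the tiny inline capture with example hosts are assumptions for illustration, not anything taken from CNET’s setup.

```python
from urllib.parse import urlsplit


def third_party_share(har, first_party_domains):
    """Fraction of requests in a HAR capture that go to 3rd party hosts.

    `har` is a parsed HAR document (a dict, as json.load would return).
    `first_party_domains` lists the domains you treat as your own;
    subdomains of those domains also count as first party.
    """
    entries = har["log"]["entries"]
    if not entries:
        return 0.0
    third_party = 0
    for entry in entries:
        host = urlsplit(entry["request"]["url"]).hostname or ""
        is_first_party = any(
            host == domain or host.endswith("." + domain)
            for domain in first_party_domains
        )
        if not is_first_party:
            third_party += 1
    return third_party / len(entries)


# A tiny hand-written capture for illustration only; a real HAR file
# carries many more fields per entry (timings, sizes, headers).
sample_har = {
    "log": {
        "entries": [
            {"request": {"url": "https://www.example.com/article"}},
            {"request": {"url": "https://ads.example.net/tag.js"}},
            {"request": {"url": "https://fonts.example.org/font.woff"}},
            {"request": {"url": "https://survey.example.net/widget.js"}},
            {"request": {"url": "https://analytics.example.org/beacon"}},
        ]
    }
}

share = third_party_share(sample_har, first_party_domains=["example.com"])
print(f"{share:.0%} of requests are 3rd party")
```

The same loop could sum each entry’s timing or size fields instead of counting requests if payload or time is the number you care about.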


We also load SOASTA and run mPulse within our site. It’s an embedded script just like one of the other 3rd parties and we make that call every time we load a page, and that’s the RUM monitoring, the real user monitoring. They can work together very easily in parallel. You’ve just got to think of it this way: SpeedCurve is how we’re able to dive into all the requests that are made with the synthetic tests that run, like I said, four times a day.


We can also use SOASTA, and we have used SOASTA, to put embedded timers in at particular points so that if we want to determine exactly when an ad, I’m sorry, exactly when our video player starts or can start, or we want to determine exactly when certain things happen within the page, we can start to group various actions in buckets to determine how long they take. Like I said earlier, that’s with the RUM monitoring that SOASTA provides, that mPulse provides within SOASTA.


Scott: Buddy, this is a question for you. Can the SpeedCurve licensing include mPulse?


Buddy: It’s the other way round. For our customers who join up with SOASTA and purchase mPulse then we can provide them access to SpeedCurve as part of that.


Jason: I want to speak a little bit about SpeedCurve, a little piece that I left out. The reason that we went to SpeedCurve is that I’m sure a lot of folks on the call that have done this type of performance reporting before have used WebPageTest, but what SpeedCurve does is it automates those tests so that we’re not manually running tests each time. We don’t have to get in a queue. We don’t have to rely on another service. We use SpeedCurve because it’s much easier, much faster, much more efficient for us to run the type of reports we need to get to the very granular data that we need with the HAR files.


Scott: Someone else says they perform better with RUM as well, though I think they meant that differently.


Buddy: Yeah I can speak to that. First of all, if you’re talking about you personally performing better with rum, I agree. I can recommend one if you’re looking for a good one. It uses the solera method, I think similar to what they use for sherry. Until recently I had a bottle in my office. If you’re talking about RUM on your website, and how it performs better, like you get faster load times on your website with RUM versus synthetic, that’s not uncommon for us to see either.


A lot of times it has to do with caching effects. The standard methodology for synthetic agents that come to your website is to arrive with an empty cache, so it’s showing you the worst-case scenario page load time, whereas with RUM, because of the methodology and the fact that we’re measuring real users, if that user was just on your site 15 minutes ago, if you have very passionate loyal visitors with high return rates for example, then they may indeed get faster load times and perform better because they have a warm browser cache.


Jason: I can add to that as well. One other thing that you’re going to … A lot of times you’re going to notice, like if that person said that they see faster RUM times than they do synthetic times, a lot of times the synthetic monitors throttle bandwidth, and the reason they throttle bandwidth is to allow for the number of synthetic requests that are made so that users are not consuming a ton of their bandwidth. Now the cool thing about SpeedCurve is you can set the bandwidth to whatever you want and then that way you can throttle it to … With SOASTA and with mPulse you can determine the average bandwidth of your users, so then you can set the SpeedCurve bandwidth, the bandwidth rate, to the same as what your actual users are seeing.


Scott: Very good. We have one more question. There’s a bunch more but this is the last one we have time for. This is more around infrastructure, not necessarily software, but it’s an interesting question. They’re wondering what role things like flash drives play in page load times. Jason, do you have any insight into that?


Buddy: Yeah I would say that first of all you want to look at where users are spending most of their time, right? Are they using … Are they spending most of their time on the front end, in other words after the base page arrives, parsing the content, downloading all the CSS, 68 to 70% of the 3rd parties and everything that Jason was talking about earlier? Is that where they’re spending most of their time? Then flash drives, at least on your servers, probably aren’t going to help.


If on the other hand you have areas of your site where the back end is taking a long time, well now you have a little bit more digging to do. You might go to some of your other tools as well. For example, there’s a whole ecosystem of APM tools, assuming the time is on your back end versus your front end. In CNET’s case Jason’s very focused on the front end because that’s where users are spending a lot of time. If you’ve got a back end problem, then you can dig in and you want to pick apart, “Am I spending too much time on my back end because my database queries aren’t optimized? Do I have inefficient code running in my application servers?” That’s where that server-level instrumentation comes into play.


If you find that you’re I/O-bound on a particular server that’s serving content, then there are a couple of things you could consider, right? Obviously one is, take a look at your CDN, use a CDN, or figure out why they’re not able to accelerate that content on your behalf. There are a lot of really great CDNs out there that use flash drives to deliver content for exactly that reason. If there’s some reason where you’re I/O-bound on a server and you want to continue serving that content out and that’s the bottleneck, then sure, that’s a case I could think of where flash drives could help. It’s kind of a narrow case though.


Scott: It comes down to … It sounds to me like it comes down to knowing where your bottleneck is. You have to identify where you’re having a problem before you can identify a solution.


Buddy: Absolutely. Our view on that is it starts from measuring the user experience, and even before we get involved in how long people are waiting, let’s figure out, again going back to the point I was making earlier, how long do they actually need to wait? The users will tell you. The data will tell you how patient your users are and what speeds you need to target. Then you go find the places where you’re violating what your users expect and then you dig in from there and then yeah, maybe that leads you to this particular I/O-bound case on the server where the solution to that is to install flash drives. Maybe it leads you somewhere else, but the path should always begin with understanding the user.


Buddy: Thank you everyone.