How I saved Super Bowl 50 for Amazon

CBS Sports Super Bowl 50 Streaming Ad

Since today is the Super Bowl, I figured it would be a good moment to reminisce about something that happened a few years ago when I was working at Amazon. 

Part 1: The oops

It all started because someone at Amazon forgot to return an email.

It was December 2015 and CBS Sports was beginning their run-up to Super Bowl 50, which, along with the NFL playoff games, was to be streamed for free online. Here’s an article that ran at the time: 

Fierce Wireless: As Super Bowl preps its largest live stream ever, the OTT industry scurries to catch up to broadcast

February 3, 2016

…CBS will make the Super Bowl stream available on desktops and tablets via its website, as well as on its CBS Sports app on Apple TV, Roku, and Xbox One. (Verizon smartphone users will be the only viewers able to live-stream the game on their mobile phones, using the NFL Mobile App.)

The broadcaster's digital unit has been preparing to stream the Super Bowl for the better part of a year, according to Jeff Gerttula, SVP and general manager of CBS Interactive.

"What we've had to do is go through … every piece of the stack and work to optimize it. From a stability standpoint we're making sure we're eliminating any single points of failure, making sure we have redundancies built everywhere, and from a latency standpoint we're making sure that there's no unnecessary inefficiencies," Gerttula said. "For us, we're inserting ad tech, we're inserting viewer tracking. There's a lot of data components with this and making sure we're testing each of those and optimizing along the way is really important to reducing latency."

[Emphasis mine.]

By the end of 2015, Fire TV had been out for over a year and a half and it was really quite successful. In fact, by then it was already the most popular streaming media box in use, according to our internal numbers. And yet, CBS Sports was going to be streaming to Roku, Apple TV and the Xbox but not Fire TV. It seemed pretty crazy that CBS was simply ignoring the platform or had balked at supporting Fire TV for some business reason - especially since they were streaming the games for free.

Only they hadn’t. From what I understand, the previous summer, someone at CBS Sports had reached out to someone at Amazon to talk about their plans to live stream the NFL Playoffs and the Super Bowl and wanted to include Fire TV as one of the streaming media devices supported. Only no one got back to them.

Fast forward to mid-December and CBS streamed their first playoff game and started promoting the Super Bowl with ads and a press release about the free streams and all the devices you can use to view them on - with the noticeable lack of support for Fire TV. That was the weekend when I first learned about it. What the hell? Amazon hadn’t done a deal with CBS? Or did they not consider Fire TV an important platform to support? What was going on?

At the time, I worked for the web tech group at Amazon Lab126, which ported the Chromium browser engine to Fire tablets and Fire TV. (If you’re unfamiliar with it, Lab126 is based in Silicon Valley and is the devices division of Amazon that also made the Kindle and Echo.) The resulting Amazon WebView (AWV) is an important library which powers a ton of apps, including BBC and Pluto, among many other big name media companies and hundreds of smaller media channels. I worked as the Technical Evangelist for the group, where I would do stuff like write up articles and blog posts, create sample projects and give presentations about how to best use the WebView to create web-powered apps for both our Fire tablets and Fire TV.

So as soon as I got into the office on Monday after the first live stream, I went to my manager to ask what the deal was. Why wasn’t CBS streaming the Super Bowl to Fire TV? 

I can’t remember if he already knew about the debacle or if he went off to find out more info, but my general understanding was that I wasn’t the only one at Amazon who had noticed and shit was already hitting the proverbial fan. Especially since the Seahawks were in the playoffs, and Amazon execs up in Seattle weren’t able to watch the streams on their Fire TVs at home.

It seems someone had definitely screwed the pooch by not following up with CBS. Worse, it turned out that CBS Sports didn’t even have an app on Fire TV to stream the games even if somehow a rush deal could be signed. Given that the Super Bowl was just over a month away, it looked like Amazon was going to be the only platform not invited to the viewing party. 

This is where I come in. 

One of the things I did in my job as Tech Evangelist was to come up with lots and lots of prototypes and ideas for how to create WebView apps. A year or so earlier I had helped come up with the idea for a template app that mobile app developers could use to get started making apps for TVs. We had been talking to devs from Grokker - a tablet app that played yoga videos - and they had no idea how to port their content to a TV interface. Going from a touch-based interface to using a remote control is filled with gotchas, and these and other mobile devs were clueless about where to begin. The idea turned into the Web App Starter Kit for Fire TV, a batteries-included open source web project which generated complete media apps with support for streaming video stored on a web server, or serving channels using the YouTube API. By the end of 2015, I had developed dozens of example apps - including helping with some live products. 

So, after learning that Amazon was basically screwed, I decided to see if I could come up with a quick solution. How hard could this be? It’s all web tech - if CBS is streaming to PC browsers, then there was no reason it couldn’t stream to a Fire TV using an app with AWV - especially since our media player supported all the common streaming systems used (something only our WebView could do at the time). I went back to my desk and started poking around the CBS Sports website and found the micro site where they were hosting the playoff streams. By this time, they had already streamed at least one game, so they had a “Come back next Sunday” message on the page that would contain the media player during the live streams.

CBS Sports stream placeholder

Doing some spelunking in the source of the web page and scripts on that page, I found the endpoints for the HTTP Live Streaming (HLS) URLs which would be used during the games. With a bit of experimentation, I was able to find a test stream that was broadcast in a loop, which I assume was used during the development of the various streaming clients. Perfect! I quickly made a test app using the stream (with a little hacking since our Starter Kit hadn’t yet been modified to support live video) and it worked as expected. I wasn’t even surprised.

So then I looked on YouTube and sure enough, CBS Sports had its own channel filled with content. I created a new YouTube API dev key which would allow me to use their API to pull in a list of the CBS Sports videos, playlists, etc. from their YT channel, then pulled in some of the graphics from their website for the background and icon and within an hour or so had a prototype for a fully branded CBS Sports app, filled with the latest CBS Sports videos and support for the live NFL playoff streams. Easy peasy.

I then called over my manager and said, “Check this out.” You might think he would have had his eyes bug out or had to pick up his jaw off the floor, but by this time I had come up with this sort of sleight of hand a few times before, so he just nodded at me, smiled and said, “Send that to me,” and walked away. Since AWV-powered apps are essentially just Android apps wrapping a browser engine displaying a web app, hosted on a regular web server, I sent him a link to where I was hosting it and sat back and waited while he ran the app up the chain of corporate execs.

I don’t know what sort of mad scramble of phone calls, legal wrangling, promises or begging, etc. this set off, but within a day or so a deal was signed, and a group was tasked to launch the app for real in the next few weeks using my prototype as the foundation.

There was, as you can imagine, work to be done to convert a prototype into an app that could be robust enough to be used for such an important event. Before it launched, it had to be approved by the techies at CBS who were incredibly nervous about this last-minute addition to their streaming targets. They had never streamed to such a large audience before and were terrified that some rogue client would take down their servers somehow and cause mass chaos on Super Bowl Sunday.

This is foreshadowing.

Part 2: The app

Fire TV CBS Sports web app for streaming Super Bowl 50

Converting my prototype to an app that could be used by tens of thousands of customers in just a few weeks required a bit of work. 

A few days after the deal was signed with CBS, I received a single PDF document which laid out the requirements for the app, including official URL endpoints for the media stream to be used for testing and production. I had already figured out most of what they sent, but it was nice to get confirmation, as well as officially sanctioned access. 

Again, to be clear, web-based media apps on the Fire TV are simple Android wrapper apps which contain a browser engine which pulls the HTML, CSS and JavaScript from a web server for the user interface. So it’s basically a standard web page. The main difference is that your controller acts as a keyboard sending up/down/right/left/enter key commands to the page rather than a mouse, so the UI has to be made to highlight the various buttons and menus using just those buttons. When a video is playing, then the play/pause/rewind/fast-forward keys are used as well. The Web App Starter Kit (WASK) my group created has all this pre-made, so really all you need to do is modify the images and text to get a media app up and running. In addition, the YouTube API support allowed the app to automagically get all this stuff from a specific YouTube account and fill it all in for you.
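To make the remote-as-keyboard idea concrete, here is a minimal sketch of moving a highlight around a grid of tiles. It is written in Python for brevity rather than the JavaScript the Starter Kit actually used, and the grid layout, the function name and the specific key codes (the standard browser arrow-key values) are assumptions for illustration, not the WASK code.

```python
# Browser keyCode values the remote's D-pad is assumed to arrive as.
KEY_LEFT, KEY_UP, KEY_RIGHT, KEY_DOWN = 37, 38, 39, 40


def move_focus(row, col, key, rows, cols):
    """Return the (row, col) of the highlighted tile after one key press.

    The highlight stops at the edges of the grid instead of wrapping,
    and any key that is not a direction leaves it where it is.
    """
    if key == KEY_LEFT:
        col = max(col - 1, 0)
    elif key == KEY_RIGHT:
        col = min(col + 1, cols - 1)
    elif key == KEY_UP:
        row = max(row - 1, 0)
    elif key == KEY_DOWN:
        row = min(row + 1, rows - 1)
    return row, col
```

With no mouse pointer, every screen needs exactly one highlighted item at all times, which is why the whole UI hangs off a small function like this.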

So the first thing I did was set up an AWS account for the app, create an S3 bucket and appropriate public gateway services to serve the static pages. This could easily serve thousands of requests per second, and so was a simple, scalable way to host the web app. 

First on the CBS requirements list was that the app had to have a kill switch embedded in the app which CBS could use to shut down the streams for all the Fire TVs in case something went wrong and it caused stream issues for the millions of other viewers expected to be viewing it on their PCs, smartphones and other set top boxes. That was relatively simple to implement: During playback, every few seconds the app would query a plain text document on the web server and if it contained “false”, it would shut down the media player, killing the stream for any Fire TV that was viewing it. CBS was super twitchy about the stability of their streams given the high-profile nature of the event and having never done anything on this scale before. The late addition of another device using the stream must have given the techies there fits. So, if CBS decided something was up with the Fire TV CBS Sports app, they could pull the plug and the game would go dark for Amazon customers. Honestly, I wasn’t worried about this much at all - the Android media player was solid and well tested at this point (it had been used for major apps for over a year), so to me, this was just a formality.
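The kill switch logic is small enough to sketch. This is a Python illustration rather than the app's JavaScript, and the function names are hypothetical. The fetch and stop callbacks stand in for the web request and the media player, and the choice to keep playing when the flag file can't be reached is an assumption, not something the CBS requirements spelled out.

```python
from typing import Callable


def stream_allowed(flag_text: str) -> bool:
    """The stream stays up unless the flag file contains "false"."""
    return "false" not in flag_text.lower()


def poll_once(fetch_flag: Callable[[], str],
              stop_player: Callable[[], None]) -> bool:
    """Run one kill switch check; return True if playback continues."""
    try:
        flag_text = fetch_flag()
    except OSError:
        # Assumption: an unreachable flag file should not kill the stream.
        return True
    if stream_allowed(flag_text):
        return True
    stop_player()
    return False
```

The player would call `poll_once` on a timer every few seconds during playback and stop polling once it returns False.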

Next up was captions. This was a bit more difficult and out of my hands. Captions were a legal requirement for any stream being broadcast publicly on American networks, even if people are accessing them via the internet. CBS wasn’t using web standard WebVTT captions, but instead embedding them into the binary data within the MP4 based media being streamed. 

The first issue was that the basic WASK app template didn’t have caption support, so the user interface to turn them on and off needed to be added to the UI. But secondly, the Fire TV’s native media player used by the browser didn’t support the FCC mandated CEA-708 embedded captions in the stream. I’m not a low-level developer by any stretch, so this work was done by another developer on the team who would modify the player to support these types of captions, wire it up to the browser so the standard media player caption APIs would work and then wrap it all up into an Android media app. To give ample credit where it is due, this was a non-trivial task for the Android devs who - from what I understand - had to figure out the vagaries of this odd caption standard, get it working reliably in the player, and then get it all compiled into a custom browser engine which would be bundled into the Android wrapper app. All I had to do was update the web UI with a button and menu - a much easier task. 

Next up was metrics. CBS of course wanted to know everything about the viewing of the game. How many users viewed the pre-game show and the game itself, for how long, and most importantly, which ads they viewed. CBS wanted two reports, a preliminary count immediately after the game completed (I assume for PR purposes) and a more complete report after (I assume for monetary reasons). 

I implemented this using Amazon Mobile Analytics - the AWS commercial metrics service which is designed for exactly this type of data tracking. I wired up all the events you can imagine: App start, app end, player start, player end, ad start/end, YouTube videos viewed, etc. This was done using a small chunk of JavaScript provided by Analytics, which would pull the appropriate library from their server and provide an API to use to create various events to capture, pinging their servers with the data which would collect it all and present charts and graphs. If there was a connection problem, it would cache the events so nothing was lost. In other words, it was designed to be a robust, reliable way of keeping track of a mobile app’s usage, which is essentially what a Fire TV app is. 

For the privacy conscious out there, this is all done anonymously - Amazon has strict rules about associating user accounts with analytics. Each app would have its own unique ID created when the app is installed, but it’s not tied to user accounts in any way. So you can tell that user #12345 viewed such and such video, but there’s no way of knowing who exactly that user is or what Amazon account they use to purchase things. In other words, the app wasn’t spying on anyone in order to sell them more stuff, it was just keeping track of viewership in the same way Nielsen or any other web tech does. If you uninstalled the app or wiped its storage, a whole new ID would be created.

The Analytics service was a bit opaque in terms of raw data, and the results could take a while to be aggregated. So in the same JavaScript function that posted user events to the Amazon Mobile Analytics servers, I added my own web request as well, sending the same data to my own database, which I had set up on AWS using Lambda and a DynamoDB table. This way I could keep track of the app events “live” and know what was going on as the apps were being used. Unlike Analytics, my data would just be a lengthy list of raw events, but I felt it was useful to keep track. 
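The shape of that double reporting can be sketched as below, in Python rather than the original JavaScript. The names are made up for illustration, and wrapping each destination in its own error handler so that one failing service can't block the other is my assumption about a sensible design here, not a copy of the real function.

```python
from typing import Callable, Dict, List

Event = Dict[str, object]
Sink = Callable[[Event], None]


def record_event(event: Event, sinks: List[Sink]) -> List[bool]:
    """Send one event to every sink and report which sends succeeded.

    Each sink is isolated: an exception from one destination is
    swallowed so the remaining destinations still get the event.
    """
    results = []
    for sink in sinks:
        try:
            sink(event)
            results.append(True)
        except Exception:
            results.append(False)
    return results
```

In the app, the two sinks would be the Analytics library call and the web request to the Lambda endpoint.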

Come back when the stream is ready.

I then modified the app’s media player which was going to be used for the live stream, disabling the fast-forward button, and giving a message to users who tried to tune in before or after the stream was being broadcast. In addition, I copied the CBS Sports countdown timer they had on their Super Bowl mini site, which showed the number of days, hours and minutes until the Big Game. 

Finally, I had to make sure that all these apps didn’t blow through the meager YouTube API quota provided by Google. Google allows apps and websites to embed videos using their JavaScript player without any limitations. You can view as many YouTube videos as you want for as many times as you want and they don’t keep track or care, which makes sense as any advertisements are embedded in the video player itself. Though, because Fire TVs don’t identify themselves as standard browser User Agents - just as custom Android WebView apps - advertisers didn’t target it back then and app users never saw ads. 

The YouTube API, however, was limited. You signed up for a free developer ID from Google per app and could use it to get lists of videos, playlists, channels or do searches. Each request cost a certain number of credits and was rate-limited to a certain number of requests per second. The number of credits an app gets is numbered in the low thousands, so not a lot to work with. There was a contact email provided online which supposedly allowed devs to request a higher quota of credits, but I’ve never heard of an app actually getting any. I assume this low number was to make sure that Google’s YouTube app couldn’t be recreated completely by using the API to simply get a list of every video or for competitors to use to scrape YouTube’s library of videos. (As if there’d be any other service out there that would be able to deal with the millions of videos uploaded every minute). Additionally, it was against the Terms of Service to cache any API requests on a server, so you couldn’t, for example, blow through all your daily credits with some automated script and then serve that data from your server. 

All this meant that apps had to be very judicious in how they handled API requests. Searches would always need to be live, but the app would need to be set up so that it did a single set of API requests (usually filling out a channel requires several to get a list of all the playlists and videos) only once a day, and to re-request the list in case the requests per minute limit was hit and the API barfed at the app the first time. The WASK template was just meant as an example app, so didn’t have any of this stuff built in, so I had to add it to make sure the thousands of CBS Sports apps that were used to watch the game didn’t tap out the number of credits. To help lower the requests, I hard-coded the playlist IDs in the app, which would have failed if CBS suddenly decided to update their YouTube channel, but back then they only added a few videos every couple days, so I was confident it would be fine. I’d update the code after the big game to pull the live list again once there weren’t so many people using it at once.
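The once-a-day-with-retry behaviour can be sketched like this. It is a Python illustration of the idea, not the code that shipped: the class name, the three attempts and the two second pause are all assumed values, and the cache lives in the client app itself, not on a server.

```python
import time
from typing import Callable, Optional

SECONDS_PER_DAY = 24 * 60 * 60


class DailyCache:
    """Fetch a value at most once a day, retrying if the fetch fails."""

    def __init__(self, fetch: Callable[[], object], max_attempts: int = 3,
                 retry_delay: float = 2.0,
                 clock: Callable[[], float] = time.time,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self._fetch = fetch
        self._max_attempts = max_attempts
        self._retry_delay = retry_delay
        self._clock = clock
        self._sleep = sleep
        self._value: Optional[object] = None
        self._fetched_at = 0.0

    def get(self) -> object:
        now = self._clock()
        if self._value is not None and now - self._fetched_at < SECONDS_PER_DAY:
            return self._value  # still fresh, costs no API credits
        last_error: Optional[Exception] = None
        for attempt in range(self._max_attempts):
            try:
                self._value = self._fetch()
                self._fetched_at = now
                return self._value
            except RuntimeError as err:  # e.g. the API rejected the request
                last_error = err
                if attempt < self._max_attempts - 1:
                    self._sleep(self._retry_delay)
        if self._value is not None:
            return self._value  # a stale list beats an empty screen
        assert last_error is not None
        raise last_error
```

Injecting the clock and sleep functions is only there to make the sketch easy to test without waiting a day.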

And that was about it for the app. I never got any official graphics or other assets from CBS, just that single PDF with some media URLs and requirements. So I just ripped off anything I could find from the CBS website, whipped up some backgrounds and anything else needed. We worked on the app throughout January, testing during the playoffs and CBS finally approved the app just before Super Bowl weekend. 

Part 3: The Big Game

Amazon App Store entry for CBS Sports

Once it was clear that the app would be ready, the marketing department went into full swing and started promoting Super Bowl 50 on Fire TVs, and CBS started including it in their online promotions. Though as you can see from the article above, even by the 3rd of February, only the original three set top boxes were mentioned by CBS. Interestingly, the PS4 and Android TV were also added at the last minute. So one wonders if there was a scramble inside Google and Sony during those weeks as well. 

On Friday before the game, I uploaded the CBS Sports app to the Amazon Store, and the Fire TV Home Screen was updated to encourage users to download it. 

We had a discussion amongst ourselves about how many actual Fire TV users would download the app given how late we were at launching it. I guessed around 15,000 downloads, max. Given the numbers I had seen from other apps at the time, it seemed like a realistic number. Though we had millions of Fire TVs in the market at the time, there were only about 48 hours until the big game and I couldn’t imagine that many people out there would bother. It takes a while for people to get around to installing new apps, especially on TVs. They usually just go with the default installs and many never install anything else.

It turns out I was off by an order of magnitude.

I went home that weekend and set up my own little monitoring station to keep track of the installs and watch the metrics to see how things were going. I had several Fire TVs set up with the app running just in case there was some memory leak or something, and I had one logging debug messages to my computer. I also had a custom Lambda script set up to update a web page with current stats that would refresh automatically. Remember, the UI side of the app is just a web page which is loaded every time the app starts, so if there was a last-minute issue, I could tweak that side of things - though if the Android wrapper had issues, it would require an App Store update. Happily, that didn't happen. 

At the end of Saturday there were already over 35,000 installs and some moderate usage. People would see the ad, download the app, peruse the YouTube videos, try the live stream, get the message to check back on game day and then close it. Things seemed to be fine.

Broncos vs. Panthers

On Sunday, February 7th, 2016 the Broncos played the Panthers just down the road from me at Levi’s Stadium. To this day, I have no idea what happened during the game even though it was playing on the TV in front of me. Looking it up now, I only just now realized who was playing and who won. To say I was occupied during the game is an understatement.

The game was to be played midday, and preceded by a live pre-game show which was also streamed. The downloads picked up all morning: in just a few hours another 90,000 installs happened, and by the time the game started, over 110,000 Fire TVs were simultaneously running my app. Not millions, to be sure, but given the late start we had, and my low expectations - I was astounded.

The problems started soon after the live stream began during the pre-game show. At first things worked fine, but as more people started streaming, the trouble began. Every time an advertisement appeared, all of the Fire TVs which had tuned in early simultaneously fired off a web request to the metrics servers, both Amazon Analytics and my own Lambda/DynamoDB server. At about 50,000 simultaneous users, shit hit the fan. I was monitoring the AWS console logs and could see it was starting to buckle under the load, rejecting thousands of requests. Fuuuuuuuuck!

First, I was worried the rejection might cause issues with the playback, but happily I had added a try-catch around my code’s request and the browser engine was blithely ignoring the error, so the stream was unaffected for now. Phew!! But what was going on? My code was set up to have a 30 second timeout and the server settings were at 5,000 requests per second. The WebView engine should submit the request and then wait until it got a connection, get a response and disconnect. It should have easily been able to handle the traffic, but the console was showing a bright red line at 3,000 requests per second and dropping everything else.

Things were working, but for how long? Those rejected requests were being handled by the browser, sure, but they might cause issues in the long run, causing memory leaks or networking issues and possibly lock up the app, or maybe the whole Fire TV. 

Meanwhile, the Fire TV’s debug log started flipping out. It was scrolling so fast I couldn’t read the messages without pausing the output. Something was seriously wrong. The stream was still playing fine, but the metrics were completely hosed. Stopping and starting playback didn't help. Even restarting the app didn't help. I checked my second and third Fire TVs and they were all doing the same thing, so it wasn’t just an issue with the one box. 

I had no idea what was happening and at this point the game was going to be starting in less than an hour.

Another member of my team was monitoring an internal CBS support group chat in case they gave the word to fire the kill switch or any other issues came up. We also had a chat open for just Amazon people as well, which I was on. I started swearing like a sailor into the chat. “Fuck. Fuck. Fuck. Fuck. The fucking logs are going nuts. I have no fucking idea what the fuck is going on!!” Amusingly, he was at home and one of his children came over to see what Daddy was doing at that moment and wanted to see the chat. He had to shoo him away before he was subjected to my furious expletive-filled rantings. It was conveyed to me that I wasn’t the only one with issues - other device makers were also struggling and losing it in the CBS chat (which I thankfully wasn’t in). Misery loves company I guess.

Finally, I realized the settings I was entering into the server console to increase the request limit on my script were being ignored for some reason, so I found the AWS support page and clicked the “call me” button.

What I hadn’t realized until that moment is that internal AWS projects have top tier support. Clicking that button got essentially the same level of response as a client like Netflix. To my surprise, my phone immediately rang and a support person asked what she could do to help. (OMG. Thank all that is holy.) I explained that the gateway was barfing at 3,000 requests per second and she asked me to hold on while she took a look.

Finally relaxing a little bit, I took the opportunity to figure out what was going on with the debug log. The requests to my service should have generated a single error message per failure, not this crazy chaos. 

Failing over and over and over again...

The problem was with Amazon Mobile Analytics servers. Once I took a close look, it wasn’t my code which was vomiting all over the logs, but the API library. Something was wrong and it was filling the screen with errors. Oof. Their servers were just rejecting requests or timing out. I decided to come back to that after.

The support person came back on the phone and told me she found the issue. There was a hard limit to the number of requests per second an end user could set on the gateway. You could put any number you wanted in, but it would only serve 3,000. She could, however, enter a number from her end. What number would I like? “8,000! Now!” I nearly yelled, and in a second or two she said “done.” I pulled up the traffic graph and the red line had disappeared. All the requests were being handled with lots of capacity to spare. I checked the status page I had created and could see all the metrics being saved. The stream was still playing on all my Fire TVs. With just an hour or so left until gametime.

Now, about the Analytics errors. I hadn’t fully debugged it (did I somehow mess up an endpoint?), but I figured while I was on the phone with support I would ask. So I say to her, “OK, I’m also having issues with Amazon Mobile Analytics. Their servers seem to be down. Is that something you can help with?”

She replies, “Oh, they seem to be having an outage of some sort. Apparently, there’s a distributed bot attack on their servers. They’re on a group call right now, should I patch you in?” 

I immediately knew it wasn't a coincidence. Oh, shit. What did I do??

I said, “Ummm, OK.” and waited while she contacted someone and before I knew it, I was connected into the middle of an ongoing discussion… I could hear raised voices of multiple people talking at once. Which then immediately stopped as soon as I connected. 

“Who just called in?” someone asked into the ominous silence. 

I’ll let Douglas Adams describe the exact feeling I had at that moment. 

“Vogons!” snapped Ford. “We’re under attack!” 

Arthur gibbered. 

“Well, what are you doing? Let’s get out of here!” 

“Can’t. Computer’s jammed.” 

“Jammed?” 

“It says all its circuits are occupied. There’s no power anywhere in the ship.” 

Ford moved away from the computer terminal, wiped a sleeve across his forehead and slumped back against the wall. “Nothing we can do,” he said. He glared at nothing and bit his lip.

When Arthur had been a boy at school, long before the Earth had been demolished, he had used to play football. He had not been at all good at it, and his particular speciality had been scoring own goals in important matches. Whenever this happened he used to experience a peculiar tingling round the back of his neck that would slowly creep up across his cheeks and heat his brow. The image of mud and grass and lots of little jeering boys flinging it at him suddenly came vividly to his mind at this moment. 

A peculiar tingling sensation at the back of his neck was creeping up across his cheeks and heating his brow. He started to speak, and stopped. He started to speak again and stopped again. Finally he managed to speak. 

“Er,” he said. He cleared his throat. 

“Tell me,” he continued, and said it so nervously that the others all turned to stare at him. He glanced at the approaching yellow blob on the vision screen.

“Tell me,” he said again, “did the computer say what was occupying it? I just ask out of interest.…” 

Their eyes were riveted on him. 

“And, er … well, that’s it really, just asking.”

I spoke up in a small, timid voice. “Umm, hi? I’m Russ - I work in the Fire TV group. I think I’m the one attacking your servers.”

From there, the conversation proceeded as you might expect. 

Apparently, what had happened was that 100,000 Fire TVs had suddenly, and without warning or provocation, simultaneously started hitting the Analytics servers every time an advertisement was shown on the stream. As one might expect, Amazon has a robust and extensive anti-bot system which immediately reacted and clamped down hard on the traffic. Every time they tried to lift the barrier, it would get flooded again and slam back down. 

The first questions for me, of course, were who the hell am I, what the hell am I talking about, what the hell did I do to cause this and most importantly, after I explained the situation completely, did I modify their code?

Oh shit, did I?

I didn’t think I did. As the conversation continued, I frantically pulled up the code I had used for the Analytics service, then grabbed the latest SDK from the public server and compared. The code was identical. I hadn’t done anything to it and was using it exactly as intended. Dodged that bullet. I emailed off a copy of the code, and a link to the actual running website to show my code wasn’t the culprit.

Then, while still in the call, several of us started examining the Analytics JavaScript library and found the problem almost immediately. There was a bug in the caching code. The networking backoff logic that should have run if there was a connection issue didn’t know how to respond if it didn’t get an answer from the server. The DDoS defense mechanism Amazon uses just black-holes requests. It doesn’t respond at all. The library’s logic didn’t account for this, so when the request went into a black hole, the library looped immediately and fired the request again. And again. And again. As fast as it could, which, it turns out, is quite fast.
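For contrast, here is a minimal sketch of retry logic that treats "no answer at all" the same as any other failure. It is my own Python illustration of capped exponential backoff with jitter, not the Analytics library's code or the fix they eventually shipped. The function name, the five attempts and the delay values are assumptions.

```python
import random
from typing import Callable


def send_with_backoff(send: Callable[[], bool],
                      sleep: Callable[[float], None],
                      max_attempts: int = 5,
                      base: float = 1.0,
                      cap: float = 60.0,
                      rng: Callable[[], float] = random.random) -> bool:
    """Try to send, waiting longer after each failure, then give up.

    `send` returns True on success, False on an error response, and
    raises TimeoutError when the server never answers (a black hole).
    """
    for attempt in range(max_attempts):
        try:
            if send():
                return True
        except TimeoutError:
            pass  # silence counts as a failure, not as "try again now"
        if attempt < max_attempts - 1:
            # Random jitter keeps 100,000 clients from retrying in lockstep.
            sleep(rng() * min(cap, base * 2 ** attempt))
    return False
```

The two properties that matter are that every failure path reaches the sleep, and that the loop has an upper bound on attempts.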

By the time the game started there were 110,000+ Fire TVs hammering the servers multiple times a second. Millions of requests per minute. Hundreds of millions. Over and over and over. And every time an advertisement played, it cached the event and the payload of the requests got larger and larger. 
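A quick back-of-the-envelope check of those numbers, in Python. The device count comes from the install figures above. The two requests per second per device is purely an assumed value standing in for "multiple times a second", since I never measured the real rate.

```python
DEVICES = 110_000
# Assumption: a conservative stand-in for "multiple times a second".
ASSUMED_REQUESTS_PER_DEVICE_PER_SECOND = 2


def requests_per_minute(devices: int, per_device_per_second: int) -> int:
    """Total requests all devices generate in one minute."""
    return devices * per_device_per_second * 60


per_minute = requests_per_minute(DEVICES, ASSUMED_REQUESTS_PER_DEVICE_PER_SECOND)
per_hour = per_minute * 60

print(f"{per_minute:,} requests per minute")  # 13,200,000
print(f"{per_hour:,} requests per hour")      # 792,000,000
```

Even at that modest assumed rate, the total passes "hundreds of millions" within the first hour.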

Every time the Analytics group tried to lift the defenses and respond, the network would get overwhelmed and slam the door shut again. Still on the call, they brainstormed ideas to mitigate the issue, and eventually asked if it would be OK if they “blacklisted” the account, which would cause the client code to shut off and stop logging events. This didn't sound wonderful considering CBS was expecting that data, but in the spirit of cooperation, I agreed. But then they tried it and it didn’t help. At all. The problem was the blacklist command would never get to the clients as they were in an endless loop trying to post their backlog of metrics events. 

Absurd amount of traffic...

The amount of traffic the TVs were generating was truly absurd. I saw a graph in the weeks afterwards. It had a line barely above the bottom axis (normal traffic), and then an almost vertical line to the top of the graph, where it flattened out for hours before finally coming back down again. It was pretty impressive.

The call ended and I tried to think of a way to update the apps to stop the requests, but there was nothing I could do. I patched the code on the server so any apps starting from then on wouldn’t join the frenzy, but any code that was running would keep looping until it connected to the backend or the app was closed. The Analytics service was out of commission for good.

So now, the only thing keeping a record of the ad views and other metrics for the freaking Super Bowl was my slapped-together Lambda script and a DynamoDB table. Joy. Also, my initial worry in the first half hour or so, that my own system's requests would cause memory or network issues and bring down the stream, was now scaled up accordingly.

To the credit of our browser engine, Fire OS and media app teams, the app remained rock solid and the stream played without a hitch, despite the fact that in the background a rogue JavaScript library was doing its level best to take out some of AWS's servers.

Epilogue

I monitored my Fire TVs for the rest of the game, pulling up the memory overlay to see if anything horrible was going to happen, but it looked fine. The stream survived, and after the game I pulled the DynamoDB data down from the server and prepped the after-game report for CBS. I had lost a half hour or so of ads during the pregame show, but the rest were all recorded perfectly. This isn’t surprising, as 110,000 clients isn’t really all that much in the scheme of modern-day web development, so the traffic of that many requests every 30 seconds or so - even if made nearly simultaneously - wasn’t an issue. I could see in the data where Lambda had to throttle a few times, but the number of errors was tiny and the gaps could be filled in based on the start/stop times of each client.

I loaded the data into a SQLite database and wrote up some scripts to clean up the data and export it to Excel. I don't think anyone at Amazon would care almost 6 years later, so here are the inexact numbers from the Friday launch until midnight on game day:

While I was working on the stats to send to CBS after the game, I got a call at about 10 that night. It was one of the Analytics guys. The traffic was still hammering the servers, hours after the game had ended. He wanted to know when the app would stop. I was initially surprised, then I thought about it. Apparently, many people had just turned off their TV after the game and gone off to do other things, leaving the Fire TV and the app still running. The stream had stopped, but the browser engine was still alive, with the script looping endlessly in the background. I told him that until the users went back to the home page or opened another app, the app would just keep running forever - especially if the media element was still in the foreground. Media apps are specifically built so that they don’t time out while media is playing; no user interaction needed. He thanked me and hung up. Poor guy.

Besides a congratulations email which listed my name among a bunch of others and some extra RSUs at the end of the year (those were nice), the only thing I ever really heard about this event afterwards were some emails from the postmortem meeting, where I saw the insane traffic numbers our inadvertent bot storm had caused.

Then, a month later, I happened to get on the elevator with a Lab126 director. Just as he was about to get off on his floor, he turned to me.

“Super Bowl?” he said.

Startled, I turned and said, “Uh… Oh! Yes?”

“Nice job.” 

Hey, I'll take it.