SREcon19 Americas
Wednesday, March 27 • 10:10am - 10:40am
Learning from Learnings: Anatomy of Three Incidents

The best response to a system outage is not "What did you do?" but "What did we learn?" This session will walk through three system-wide outages, at Google, at Stitch Fix, and at WeWork: the incidents, their aftermaths, and the recoveries. In all cases, many things went right and a few went wrong. Also in all cases, because of blameless cultures, we buckled down, learned a lot, and made substantial improvements to the systems for the future. With the benefit of 20/20 hindsight, all of these incidents were seminal events that changed the focus and trajectory of engineering at each organization. You will leave with a set of actionable suggestions for dealing with customers, engineering teams, and upper management. You will also enjoy a few war stories from the trenches, none of which has previously been told fully in public.

Randy Shoup

Over the past several decades, Randy Shoup has led high-performing engineering teams at eBay, Google, Stitch Fix, and WeWork. A long-time advocate of DevOps practices, Randy specializes in scaling engineering organizations, company cultures, and technology infrastructures. He is equally...

Wednesday March 27, 2019 10:10am - 10:40am
Grand Ballroom D