SREcon19 Americas has ended
Monday, March 25 • 5:00pm - 5:30pm
Operating within Normal Parameters: Monitoring Kubernetes

After Kubernetes takes over your data centers, how can you be sure that it's operating within normal parameters? What does "normal" even mean? By formalizing your expected quality of service, you can measure and compare against known targets with open source tools like Prometheus. In this talk, we'll use Kubernetes as a case study for introducing service level objectives (SLOs) to guide monitoring efforts. Come learn the how and why of metric selection for monitoring Kubernetes quality of service, what gaps exist in the open source Kubernetes monitoring ecosystem, how to use Prometheus and its exporters to establish predictability and "normal" baselines, and how to use this telemetry to debug service degradations in a Kubernetes cluster.


Elana Hashman

Two Sigma
Elana Hashman currently works as a Reliability Engineer at Two Sigma, wrangling Kubernetes clusters and automating operations. She is currently a member of the Kubernetes Instrumentation SIG, where she focuses on benchmarking and metrics usability. In the wider FOSS community, she...

Monday March 25, 2019 5:00pm - 5:30pm EDT
Grand Ballroom ABC