SRE as She Is Spoke

Date Tuesday, 25 October, 2022 - 09:45–10:30
Presenter Andrew Clay Shafer (@littleidea)
URL https://www.usenix.org/conference/srecon22emea/presentation/shafer

Abstract

Some things get lost in translation. Words have meaning but all language is alive and ever changing. In order to better understand what SRE ‘could be’ in an imagined future, what tools do we have to understand what SRE ‘is’? Now and forever? The co-evolution of SRE language and practice suggests there are already obvious points of divergence in the understanding and application of both. A naive analysis of the evolution of similar movements suggests some divergence may be inevitable, but at what point do we lose the ‘essence of SRE’? In every possible SRE ‘could’ is there ever an SRE ‘should’? Are we able to move SRE beyond eternal ‘it depends’? If we are, who counts as part of ‘we’? Is my SRE your SRE? Would either of us benefit if this was the case?

Notes

  • The talk opens by noting that the early Agile software development movement didn’t include servers: at the time, software mostly shipped on CDs, so shipping often wasn’t part of any value chain. Delivery was more a matter of waterfall planning to get software onto CDs.
  • DevOps was not about making dev do ops, but rather about making ops go away through dev automation. Importantly, this still included product feedback throughout the process.
  • People learn by necessity. They’ll also stop learning/investing further once their needs are met.
  • Dev vs Ops is like Innovation vs Efficiency.
  • “You build it, you run it” does not mean dev runs infra.
  • The goal of SRE is to drive toil out of the system.
  • An organization that doesn’t need SRE may do superficial or unnatural things, but call it Site Reliability Engineering anyway (because they want to keep up with the buzzwords and hype).
  • Organizations are bad at stopping things. Often they only add and add and add more things.