Presentation in here
What is latency?
What is an SLI?
What is an SLA?
What is an SLO?
Methods to estimate SLOs:
Considering the task: “count all requests over $period served faster than $threshold”
Methods:
log data:
counter metrics
HDR histogram metrics
See the figure below
With histograms you can aggregate latency histograms over nodes, endpoints and time. Get the total latency distribution over the SLO timeframe (weeks, months)
Count samples in bins below the threshold to compute SLOs. Counts from bins with the same boundaries can be summed to get a final overall number.
Full flexibility in choosing thresholds, aggregation intervals and aggregation levels
Cost-effective (300b/histogram value).
Need HDR histogram instruments and metrics store
Tools: Circonus, IronDB + Graphite / Grafana
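A minimal sketch of the histogram method, in Python. It is not tied to Circonus or IronDB: the bin edges in BIN_EDGES_MS and the helper names (record, merge, count_faster_than, slo_ratio) are made up for illustration, and real HDR histograms use a much finer log-linear bin layout. It only shows the idea from the notes: merge per-node histograms by summing matching bins, then count the samples in bins at or below the threshold.

```python
from collections import Counter

# Hypothetical bin upper edges in milliseconds (real HDR layouts are finer).
BIN_EDGES_MS = [1, 2, 5, 10, 20, 50, 100, 200, 500, 1000]


def bin_for(latency_ms):
    """Return the upper edge of the bin that holds latency_ms."""
    for edge in BIN_EDGES_MS:
        if latency_ms <= edge:
            return edge
    return float("inf")  # overflow bin for anything above the last edge


def record(samples_ms):
    """Build a histogram (bin upper edge -> sample count) from raw latencies."""
    hist = Counter()
    for sample in samples_ms:
        hist[bin_for(sample)] += 1
    return hist


def merge(histograms):
    """Aggregate histograms over nodes, endpoints or time by summing matching bins."""
    total = Counter()
    for hist in histograms:
        total.update(hist)
    return total


def count_faster_than(hist, threshold_ms):
    """Count samples in bins whose upper edge is at or below the threshold.

    The count is exact only when the threshold coincides with a bin edge.
    """
    return sum(count for edge, count in hist.items() if edge <= threshold_ms)


def slo_ratio(hist, threshold_ms):
    """Fraction of all recorded requests at or below the threshold."""
    total = sum(hist.values())
    return count_faster_than(hist, threshold_ms) / total if total else 0.0
```

Usage would look like merge([record(node_a_samples), record(node_b_samples)]) followed by slo_ratio(merged, 100). Because the merged histogram keeps the full distribution, the threshold can be changed afterwards without re-collecting data, as long as it lands on a bin edge.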
Conclusions: