cotalks.dev

Engineering for Reliability

Channel: Google Cloud Tech

Videos (24)

1 — Alerting on error budget burn rate
2 — Getting started with SLOs
3 — Defining SLIs with custom metrics
4 — Distributed tracing with OpenTelemetry and Cloud Trace
5 — Best practices for Cloud Logging
6 — Engineering for Reliability is here!
7 — How to set up Prometheus monitoring for your services
8 — Getting started with Managed Service for Prometheus: Ingestion
9 — How to troubleshoot the Ops Agent
10 — How to monitor quotas in Google Cloud
11 — How to find cloud logs and manage logging costs
12 — Best Practices for Cloud Monitoring
13 — Best practices for Cloud Operations in the enterprise
14 — How to use metrics scopes in Cloud Monitoring
15 — Observing container environments with Cloud Operations
16 — Maintaining reliable services with advanced Cloud Logging features
17 — Monitoring compute infrastructure with the Cloud Ops Agent
18 — Understand your services with Cloud Logging
19 — Manage GKE services with Cloud Operations
20 — Defining SLIs with platform metrics
21 — Managing GKE infrastructure at scale
22 — Creating custom metrics with OpenTelemetry
23 — Migrating to the managed service for Prometheus
24 — Automatic instrumentation with OpenTelemetry

© 2026 cotalks.dev