PROFESSIONAL-CLOUD-DEVOPS-ENGINEER · Question #53
The correct answer is B: Identify, measure, and eliminate toil by automating repetitive tasks.
Question
You work for a global organization and run a service with an availability target of 99% with limited engineering resources. For the current calendar month, you noticed that the service has 99.5% availability. You must ensure that your service meets the defined availability goals and can react to business changes, including the upcoming launch of new features. You also need to reduce technical debt while minimizing operational costs. You want to follow Google-recommended practices. What should you do?
Options
- A. Add N+1 redundancy to your service by adding additional compute resources to the service.
- B. Identify, measure, and eliminate toil by automating repetitive tasks.
- C. Define an error budget for your service level availability and minimize the remaining error budget.
- D. Allocate available engineers to the feature backlog while you ensure that the service remains
Explanation
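The service is running at 99.5% availability against a 99% target, so it already meets its goal and has error budget to spare. More reliability work is not what it needs. With limited engineering resources, the Google-recommended SRE practice is to identify, measure, and eliminate toil: automating repetitive manual work lowers operational cost, reduces technical debt, and frees engineers for the upcoming feature launches and other higher-value work.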
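To make "identify, measure, and eliminate" concrete, here is a minimal Python sketch of how a team might measure toil as a share of engineering time and rank automation candidates. The `Task` type, the task names, and the hours are all hypothetical illustration data, not part of the question. The 50% threshold reflects the guideline in Google's SRE book that toil should stay below half of each SRE's time.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """One recurring piece of work (hypothetical structure for illustration)."""
    name: str
    hours_per_week: float
    is_toil: bool  # manual, repetitive, automatable, no lasting value


def toil_fraction(tasks: list[Task]) -> float:
    """Share of total weekly hours that is spent on toil."""
    total = sum(t.hours_per_week for t in tasks)
    toil = sum(t.hours_per_week for t in tasks if t.is_toil)
    return toil / total if total else 0.0


def automation_candidates(tasks: list[Task]) -> list[Task]:
    """Toil tasks ordered by weekly cost, most expensive first."""
    return sorted(
        (t for t in tasks if t.is_toil),
        key=lambda t: t.hours_per_week,
        reverse=True,
    )


# Made-up weekly time log for a small team.
tasks = [
    Task("manual deploys", 12, True),
    Task("ticket-driven restarts", 6, True),
    Task("hand-run certificate rotation", 4, True),
    Task("feature work", 14, False),
    Task("design reviews", 4, False),
]

TOIL_CAP = 0.5  # guideline from the SRE book, used here as an assumption

fraction = toil_fraction(tasks)
print(f"toil share: {fraction:.0%} (cap {TOIL_CAP:.0%})")
for task in automation_candidates(tasks):
    print(f"automate next: {task.name} ({task.hours_per_week} h/week)")
```

With this sample data, toil accounts for 22 of 40 weekly hours, which is above the cap, and manual deploys are the first automation target. The point of the sketch is the order of operations: measure first, then automate the most expensive toil.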
Common mistakes.
- A. Adding N+1 redundancy would increase operational costs and complexity for a service that is already exceeding its availability target, making it an inefficient use of resources.
- C. Defining an error budget is sound SRE practice, but minimizing the remaining budget is not a goal in itself. The budget is an allowance for taking risk, such as shipping new features, and deliberately draining it leaves no margin for the upcoming launch and does nothing to reduce technical debt or operational cost.
- D. Allocating all available engineers to the feature backlog without addressing underlying operational inefficiencies or technical debt is a reactive approach that doesn't ensure long-term stability or cost minimization, potentially leading to increased toil and future incidents.
Concept tested. SRE principles: managing toil and error budgets
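The error budget arithmetic behind the question can be sketched in a few lines of Python. This assumes time-based availability and a 30-day month. The function name `error_budget_report` and the dictionary keys are illustrative choices, not taken from any Google tool.

```python
def error_budget_report(slo: float, measured: float, period_minutes: float) -> dict:
    """Allowed, consumed, and remaining downtime (minutes) for one period."""
    allowed = (1.0 - slo) * period_minutes        # downtime the SLO permits
    consumed = (1.0 - measured) * period_minutes  # downtime actually observed
    remaining = allowed - consumed
    return {
        "allowed_minutes": allowed,
        "consumed_minutes": consumed,
        "remaining_minutes": remaining,
        "remaining_fraction": remaining / allowed if allowed > 0 else 0.0,
    }


# Assumption: a 30-day month measured in minutes.
PERIOD_MINUTES = 30 * 24 * 60

report = error_budget_report(slo=0.99, measured=0.995, period_minutes=PERIOD_MINUTES)
for key, value in report.items():
    print(f"{key}: {value:.2f}")
```

Under these assumptions, a 99% target allows about 432 minutes (7.2 hours) of downtime per month, and 99.5% measured availability means about 216 minutes have been used. Roughly half the budget remains, which is why extra redundancy (option A) is unnecessary and why the spare capacity is better spent on reducing toil.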