nerdexam

AIP-C01 Question #19: Real Exam Question with Answer & Explanation

The correct answer is B.

Deployment, Operations, and Optimization

Question

A financial technology company is using Amazon Bedrock to build an assessment system for the company's customer service AI assistant. The AI assistant must provide financial recommendations that are factually accurate, compliant with financial regulations, and conversationally appropriate. The company needs to combine automated quality evaluations at scale with targeted human reviews of critical interactions. What solution will meet these requirements?

Options

  • A. Configure a pipeline in which financial experts manually score all responses for accuracy,
  • B. Configure Amazon Bedrock evaluations that use Anthropic Claude Sonnet as a judge model to
  • C. Create an Amazon Lex bot to manage customer service interactions. Configure AWS Lambda
  • D. Configure Amazon CloudWatch to monitor response patterns from the AI assistant. Configure

Explanation

Option B meets the requirement to combine scalable automated evaluation with targeted human oversight using managed AWS generative AI capabilities. Amazon Bedrock evaluations enable systematic, repeatable quality assessment across large volumes of interactions. An LLM-as-a-judge approach with a strong evaluator model such as Anthropic Claude Sonnet lets the company automatically score outputs on dimensions such as factual accuracy, conversational appropriateness, and policy alignment. This directly supports "automated quality evaluations at scale" without building custom scoring models.

Financial recommendations carry higher risk, however, because regulatory compliance requires enforcement beyond general quality scoring. Amazon Bedrock Guardrails provide a dedicated policy enforcement layer that can block or intervene when responses violate compliance constraints. Guardrails are particularly important for preventing disallowed financial guidance patterns and ensuring consistent behavior across deployments.

The requirement also calls for "targeted human reviews of critical interactions." Amazon Augmented AI (A2I) is a managed human review service that supports routing specific items to human reviewers based on rules or confidence thresholds. In this design, the system automatically sends only high-risk or policy-flagged interactions to qualified financial experts for review, keeping human effort focused where it matters most while maintaining scale.
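
The routing idea in the explanation (score everything automatically, send only risky items to people) can be illustrated with a minimal Python sketch. It does not call any AWS API. The `JudgedInteraction` record, the score fields, and the threshold values are all hypothetical names and numbers chosen for illustration; in practice the scores would come from the judge model's evaluation output and the intervention flag from the guardrail result.

```python
from dataclasses import dataclass
from typing import Iterable, List

# Hypothetical thresholds (assumptions for illustration only); real values
# would be set by the company's own risk and compliance policy.
MIN_ACCURACY = 0.8
MIN_COMPLIANCE = 0.9
MIN_APPROPRIATENESS = 0.6


@dataclass
class JudgedInteraction:
    """One assistant response plus its automated evaluation results (hypothetical shape)."""
    interaction_id: str
    accuracy: float            # judge score in [0, 1]
    compliance: float          # judge score in [0, 1]
    appropriateness: float     # judge score in [0, 1]
    guardrail_intervened: bool  # True if a guardrail blocked or altered the response


def needs_human_review(item: JudgedInteraction) -> bool:
    """Flag an interaction when a guardrail fired or any judge score is below its threshold."""
    return (
        item.guardrail_intervened
        or item.accuracy < MIN_ACCURACY
        or item.compliance < MIN_COMPLIANCE
        or item.appropriateness < MIN_APPROPRIATENESS
    )


def select_for_review(items: Iterable[JudgedInteraction]) -> List[str]:
    """Return the IDs of interactions that should go to human reviewers."""
    return [item.interaction_id for item in items if needs_human_review(item)]


if __name__ == "__main__":
    batch = [
        JudgedInteraction("a1", 0.95, 0.97, 0.90, False),  # passes every check
        JudgedInteraction("a2", 0.70, 0.95, 0.90, False),  # low accuracy score
        JudgedInteraction("a3", 0.95, 0.95, 0.90, True),   # guardrail intervened
    ]
    print(select_for_review(batch))
```

Only the flagged IDs would then be handed to a human review workflow such as Amazon A2I; that integration is deliberately left out of this sketch.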

Topics

#Amazon Bedrock  #Model Evaluation  #Judge Models  #Generative AI Quality
