

Foundation Model Management and Operationalization

Question

A Generative AI Engineer who was prototyping an LLM system accidentally ran thousands of inference queries against a Foundation Model endpoint over the weekend. They want to take action to prevent this from unintentionally happening again in the future. What action should they take?

Options

A. Use prompt engineering to instruct the LLM endpoints to refuse too many subsequent queries.
B. Require that all development code which interfaces with a Foundation Model endpoint must be
C. Build a pyfunc model which proxies to the Foundation Model endpoint and add throttling within
D. Configure rate limiting on the Foundation Model endpoints.
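
Several of the options mention throttling or rate limiting. As a rough illustration of what a per-window call limit does, here is a minimal sketch of a fixed-window limiter. The class name `FixedWindowRateLimiter`, its parameters, and the injectable `clock` are hypothetical and chosen only for this example; this is not the API of any serving platform, and managed endpoints typically enforce such limits on the server side.

```python
import time


class FixedWindowRateLimiter:
    """Hypothetical limiter: allow at most `max_calls` per `window_seconds`."""

    def __init__(self, max_calls, window_seconds=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._clock = clock              # injectable so the example is testable
        self._window_start = clock()
        self._count = 0

    def allow(self):
        """Return True if a call is permitted in the current window."""
        now = self._clock()
        # Start a fresh window once the current one has fully elapsed.
        if now - self._window_start >= self.window_seconds:
            self._window_start = now
            self._count = 0
        if self._count < self.max_calls:
            self._count += 1
            return True
        return False


if __name__ == "__main__":
    limiter = FixedWindowRateLimiter(max_calls=3, window_seconds=60.0)
    # Simulate a runaway loop: only the first 3 calls in the window pass.
    results = [limiter.allow() for _ in range(10)]
    print(results.count(True), "allowed,", results.count(False), "rejected")
```

Under this sketch, a script that accidentally loops over thousands of requests would have all but the first few calls in each window rejected, which caps how much inference a mistake can trigger.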


Topics

#Rate Limiting #Foundation Models #API Management #Cost Optimization
