nerdexam
Databricks

GENERATIVE-AI-ENGINEER-ASSOCIATE Question #98: Real Exam Question with Answer & Explanation

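The correct answer is B: Deploy the model using pay-per-token throughput as it comes with cost guarantees. Pay-per-token bills only for the tokens actually processed, so a low-volume application avoids paying for reserved capacity that would sit idle on a dedicated provisioned throughput endpoint.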

LLM Deployment and Cost Management

Question

A Generative AI Engineer developed an LLM application using the provisioned throughput Foundation Model API. Now that the application is ready to be deployed, they realize their volume of requests is not high enough to justify creating their own provisioned throughput endpoint. They want to choose a strategy that ensures the best cost-effectiveness for their application. What strategy should the Generative AI Engineer use?

Options

  • A. Switch to using External Models instead
  • B. Deploy the model using pay-per-token throughput as it comes with cost guarantees
  • C. Change to a model with fewer parameters in order to reduce hardware constraints
  • D. Throttle the incoming batch of requests manually to avoid rate limiting issues
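
The cost trade-off behind the question can be sketched roughly as follows. This is a minimal illustration, not Databricks' billing logic: it assumes pay-per-token cost grows linearly with tokens processed, and that a provisioned throughput endpoint is billed at a flat hourly rate whether or not it receives traffic. All function names and prices here are hypothetical placeholders, not real list prices.

```python
# Illustrative break-even sketch. All prices passed in are hypothetical
# placeholders chosen by the caller, not actual Databricks rates.

def pay_per_token_cost(tokens_per_month: int, price_per_million_tokens: float) -> float:
    """Usage-based cost: scales linearly with tokens, zero when idle."""
    return tokens_per_month / 1_000_000 * price_per_million_tokens


def provisioned_cost(hourly_rate: float, hours_per_month: float = 730.0) -> float:
    """Capacity-based cost: fixed by reserved hours, independent of traffic."""
    return hourly_rate * hours_per_month


def break_even_tokens(price_per_million_tokens: float,
                      hourly_rate: float,
                      hours_per_month: float = 730.0) -> float:
    """Monthly token volume at which both billing models cost the same."""
    fixed = provisioned_cost(hourly_rate, hours_per_month)
    return fixed / price_per_million_tokens * 1_000_000


def cheaper_option(tokens_per_month: int,
                   price_per_million_tokens: float,
                   hourly_rate: float,
                   hours_per_month: float = 730.0) -> str:
    """Return the label of the cheaper billing model for a given volume."""
    usage = pay_per_token_cost(tokens_per_month, price_per_million_tokens)
    fixed = provisioned_cost(hourly_rate, hours_per_month)
    return "pay-per-token" if usage <= fixed else "provisioned throughput"
```

Under these assumptions, any monthly volume below `break_even_tokens(...)` favors pay-per-token, which matches the scenario in the question where request volume is too low to justify a dedicated endpoint.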

Topics

#LLM Deployment #Cost Optimization #Foundation Models #API Throughput
