CERTIFIED-MACHINE-LEARNING-PROFESSIONAL Question #3: Real Exam Question with Answer & Explanation
Question
A data scientist has developed a scikit-learn random forest model, stored in the variable model, but they have not yet logged it with MLflow. They want to obtain the input schema and the output schema of the model so they can document what type of data is expected as input. Which of the following MLflow operations can be used to perform this task?
Options
- A. mlflow.models.schema.infer_schema
- B. mlflow.models.signature.infer_signature
- C. mlflow.models.Model.get_input_schema
- D. mlflow.models.Model.signature
- E. There is no way to obtain the input schema and the output schema of an unlogged model.
Explanation
Option B (mlflow.models.signature.infer_signature) is correct because it works on plain data and needs no logged model. You pass it a sample of the model's input (for example, a pandas DataFrame or numpy array) and the corresponding output of model.predict, and it returns a ModelSignature object whose inputs and outputs attributes hold the input schema and the output schema. The same object can later be passed as the signature= argument to log_model, but nothing requires logging first.
Option A (mlflow.models.schema.infer_schema) is wrong because it is not a documented MLflow operation. Schema objects are defined in mlflow.types, and the supported way to infer them from data is infer_signature.
Options C and D are wrong because Model.get_input_schema() and Model.signature are members of the mlflow.models.Model class, which holds the metadata of a model that has already been saved or logged. They read back a signature that was stored with the model; they do not infer one, and the question explicitly says the model has not been logged.
Option E is wrong because inferring a signature requires only input data and predictions, not a logged model.
Memory tip: Think "infer before logging, read after logging." infer_signature builds the schemas from data; Model.get_input_schema() and Model.signature only read them back from a stored model.