DP-100 · Question #473
DP-100 Question #473: Real Exam Question with Answer & Explanation
This question requires identifying Azure Machine Learning compute targets that support autoscaling and enable the use of on-premises compute resources for both training and inference activities.
Question
Drag and Drop Question You are designing an Azure Machine Learning solution by using the Python SDK v2. You must train and deploy the solution by using a compute target. The compute target must meet the following requirements: - Enable the use of on-premises compute resources. - Support autoscaling. You need to configure a compute target for training and inference. Which compute targets should you configure? To answer, select the appropriate options in the answer area. NOTE: Each correct selection is worth one point. Answer:
Explanation
This question requires identifying Azure Machine Learning compute targets that support autoscaling and enable the use of on-premises compute resources for both training and inference activities.
Approach. The correct approach involves understanding the capabilities of each compute target against the specified requirements: 'Enable the use of on-premises compute resources' and 'Support autoscaling'.
-
For Training: Apache Spark pools
- Autoscaling: Azure Synapse Spark pools (which are integrated as Apache Spark pools in Azure ML) inherently support autoscaling, allowing them to scale resources up or down based on workload demand, which is crucial for large-scale training tasks.
- Enable the use of on-premises compute resources: While the Spark cluster itself runs in Azure, Apache Spark is widely used for processing large datasets. In enterprise scenarios, these datasets often originate from on-premises sources. By integrating Azure Synapse (and its Spark pools) with on-premises data stores (e.g., via Virtual Networks, ExpressRoute, or Self-hosted Integration Runtimes), you enable the scalable processing of these on-premises data resources. This interpretation fulfills the requirement in the context of data-intensive training.
-
For Inference: Azure Machine Learning Kubernetes
- Autoscaling: Kubernetes is an industry-standard for container orchestration, providing robust autoscaling capabilities for deploying and managing services. This is ideal for inference endpoints that need to scale dynamically based on incoming request load.
- Enable the use of on-premises compute resources: This is a direct and strong fit. Azure Arc-enabled Kubernetes allows you to connect any Kubernetes cluster (including those running on-premises or on other clouds) to Azure Machine Learning. This enables you to deploy models for inference directly onto your on-premises Kubernetes infrastructure, fulfilling the requirement to use on-premises compute resources for model serving.
Common mistakes.
- common_mistake. Selecting 'Local computer' for either training or inference is a common mistake because while it does represent an on-premises resource, it fundamentally lacks the capability to autoscale, which is a mandatory requirement for both activities. Choosing 'Apache Spark pools' for inference, while technically possible for batch inference and supporting autoscaling, is less optimized for real-time, low-latency serving compared to Kubernetes, and its 'on-premises compute' aspect is less direct than Arc-enabled Kubernetes. Conversely, choosing 'Azure Machine Learning Kubernetes' for training is perfectly valid (as it also meets both requirements), but the question implies selecting distinct, optimal choices for each activity, and Spark is often highlighted for its big data processing strengths in training scenarios.
Concept tested. Understanding Azure Machine Learning compute targets, their specific features (autoscaling, hybrid capabilities via Azure Arc), and their suitability for different machine learning lifecycle stages (training vs. inference) in cloud and hybrid environments.
Topics
Community Discussion
No community discussion yet for this question.