MLS-C01 Exam Questions
388 real MLS-C01 exam questions with expert-verified answers and explanations. Page 1 of 8.
- Question #1 Modeling
An interactive online dictionary wants to add a widget that displays words used in similar contexts. A Machine Learning Specialist is asked to provide word features for the downstr...
Natural Language Processing, Word Embeddings, Feature Engineering, Similarity Search
- Question #2 ML Implementation and Operations
A company is using Amazon Polly to translate plaintext documents to speech for automated company announcements. However, company acronyms are being mispronounced in the current doc...
Amazon Polly, Text-to-Speech, Pronunciation Lexicon, Custom Pronunciation
- Question #3 Modeling
An insurance company is developing a new device for vehicles that uses a camera to observe drivers' behavior and alert them when they appear distracted. The company created approxi...
Overfitting, Regularization, Data Augmentation, Model Training
- Question #4 ML Implementation and Operations
When submitting Amazon SageMaker training jobs using one of the built-in algorithms, which common parameters MUST be specified? (Choose three.)
SageMaker Training, Training Job Configuration, IAM Permissions, S3 Storage
- Question #5 Data Engineering
A monitoring service generates 1 TB of scale metrics record data every minute. A Research team performs queries on this data using Amazon Athena. The queries run slowly due to the...
Amazon Athena, Data Storage Formats, Query Performance Optimization, Columnar Data Formats
- Question #6 Modeling
A Machine Learning Specialist is working with a media company to perform classification on popular articles from the company's website. The company is using random forests to classif...
Data Preprocessing, Categorical Encoding, One-Hot Encoding
- Question #7 Modeling
A gaming company has launched an online game where people can start playing for free, but they need to pay if they choose to use certain features. The company needs to build an aut...
Imbalanced Data, Cost-sensitive Learning, Oversampling, Overfitting
- Question #8 Data Engineering
A Data Scientist is developing a machine learning model to predict future patient outcomes based on information collected about each patient and their treatment plans. The model sh...
Data Cleaning, Missing Data Imputation, Data Preprocessing
- Question #9 Data Engineering
A Data Science team is designing a dataset repository where it will store a large amount of training data commonly used in its machine learning models. As Data Scientists may creat...
AWS S3, Data Lake, Scalability, Cost Optimization
- Question #10 Machine Learning Implementation and Operations
A Machine Learning Specialist deployed a model that provides product recommendations on a company's website. Initially, the model was performing very well and resulted in customers...
Model Drift, Data Drift, Model Retraining, MLOps
- Question #11 Data Engineering
A Machine Learning Specialist working for an online fashion company wants to build a data ingestion solution for the company's Amazon S3-based data lake. The Specialist wants to cr...
Data Lake, Data Ingestion, Real-time Analytics, AWS Glue
- Question #12 Machine Learning Implementation and Operations
A company is observing low accuracy while training on the default built-in image classification algorithm in Amazon SageMaker. The Data Science team wants to use an Inception neura...
SageMaker Customization, Bring Your Own Model (BYOM), Deep Learning Architectures, Custom Training Jobs
- Question #13 Modeling
A Machine Learning Specialist built an image classification deep learning model. However, the Specialist ran into an overfitting problem in which the training and testing accuracie...
Overfitting, Dropout, Neural Networks, Regularization
- Question #14 Machine Learning Implementation and Operations
A Machine Learning team uses Amazon SageMaker to train an Apache MXNet handwritten digit classifier model using a research dataset. The team wants to receive a notification when th...
SageMaker Monitoring, CloudTrail Auditing, CloudWatch Alarms, Overfitting Notification
- Question #15 Modeling
A Machine Learning Specialist is building a prediction model for a large number of features using linear models, such as linear regression and logistic regression. During explorato...
Dimensionality Reduction, Principal Component Analysis (PCA), Multicollinearity, Feature Engineering
- Question #16 Modeling
A Machine Learning Specialist is implementing a full Bayesian network on a dataset that describes public transit in New York City. One of the random variables is discrete, and repr...
Probability Distributions, Bayesian Networks, Discrete Variables, Poisson Distribution
- Question #17 Machine Learning Implementation and Operations
A Data Science team within a large company uses Amazon SageMaker notebooks to access data stored in Amazon S3 buckets. The IT Security team is concerned that internet-enabled noteb...
SageMaker Networking, VPC Endpoints, Private Subnet, AWS Security
- Question #18 Modeling
A Machine Learning Specialist has created a deep learning neural network model that performs well on the training data but performs poorly on the test data. Which of the following...
Overfitting, Regularization, Dropout, Deep Learning Models
- Question #19 Data Engineering
A Data Scientist needs to create a serverless ingestion and analytics solution for high-velocity, real-time streaming data. The ingestion process must buffer and convert incoming r...
Streaming Data, Serverless Architecture, Data Lake, ETL
- Question #20 Modeling
An online reseller has a large, multi-column dataset with one column missing 30% of its data. A Machine Learning Specialist believes that certain columns in the dataset could be us...
Missing Data Imputation, Data Preprocessing, Dataset Integrity, Multiple Imputation
- Question #21 ML Implementation and Operations
A company is setting up an Amazon SageMaker environment. The corporate data security policy does not allow communication over the internet. How can the company enable the Amazon Sa...
VPC Endpoints, AWS PrivateLink, Network Security, SageMaker Networking
- Question #22 Modeling
A Machine Learning Specialist is training a model to identify the make and model of vehicles in images. The Specialist wants to use transfer learning and an existing model trained on...
Transfer Learning, Model Initialization, Computer Vision, Fine-tuning
- Question #23 Data Engineering
An office security agency conducted a successful pilot using 100 cameras installed at key locations within the main office. Images from the cameras were uploaded to Amazon S3 and t...
Real-time video processing, Kinesis Video Streams, Rekognition Video, Edge data ingestion
- Question #24 Modeling
A Marketing Manager at a pet insurance company plans to launch a targeted marketing campaign on social media to acquire new customers. Currently, the company has the following data...
Clustering, Market Segmentation, Unsupervised Learning, Customer Acquisition
- Question #25 Modeling
A manufacturing company has a large set of labeled historical sales data. The manufacturer would like to predict how many units of a particular part should be produced each quarter...
Regression, Supervised Learning, Algorithm Selection, Predictive Modeling
- Question #26 Data Engineering
A financial services company is building a robust serverless data lake on Amazon S3. The data lake should be flexible and meet the following requirements: - Support querying old an...
Data Lake Architecture, Serverless ETL, AWS Glue, Metadata Management
- Question #27 Machine Learning Implementation and Operations
A company's Machine Learning Specialist needs to improve the training speed of a time-series forecasting model using TensorFlow. The training is currently implemented on a single-G...
Distributed Training, TensorFlow, AWS SageMaker, Horovod
- Question #28 Modeling
Which of the following metrics should a Machine Learning Specialist generally use to compare/evaluate machine learning classification models against each other?
Model evaluation, Classification metrics, AUC, Model comparison
- Question #29 Machine Learning Implementation and Operations
A Machine Learning Specialist is working with a large cybersecurity company that manages security events in real time for companies around the world. The cybersecurity company want...
Real-time ML, Streaming Data Processing, Anomaly Detection, AWS Kinesis
- Question #30 Data Engineering
A Data Scientist wants to gain real-time insights into a data stream of GZIP files. Which solution would allow the use of SQL to query the stream with the LEAST latency?
Streaming Data, Real-time Analytics, Kinesis Data Analytics, Data Transformation
- Question #31 Modeling
A retail company intends to use machine learning to categorize new products. A labeled dataset of current products was provided to the Data Science team. The dataset includes 1,200...
XGBoost, Multi-class Classification, Model Selection, Tabular Data
- Question #32 Modeling
A Data Scientist is working on an application that performs sentiment analysis. The validation accuracy is poor, and the Data Scientist thinks that the cause may be a rich vocabula...
Natural Language Processing (NLP), Word Embeddings, Amazon SageMaker, BlazingText
- Question #33 Data Engineering
A Data Scientist needs to migrate an existing on-premises ETL process to the cloud. The current process runs at regular time intervals and uses PySpark to combine and format multip...
AWS Glue, ETL, PySpark, Serverless Data Processing
- Question #34 Machine Learning Implementation and Operations
A Machine Learning team has several large CSV datasets in Amazon S3. Historically, models built with the Amazon SageMaker Linear Learner algorithm have taken hours to train on simi...
Amazon SageMaker, Training Optimization, Pipe Mode, Data Ingestion
- Question #35 Modeling
A term frequency-inverse document frequency (tf-idf) matrix using both unigrams and bigrams is built from a text corpus consisting of the following two sentences: 1. Please call th...
TF-IDF, Text Representation, Feature Engineering, NLP
- Question #36 ML Implementation and Operations
A large mobile network operating company is building a machine learning model to predict customers who are likely to unsubscribe from the service. The company plans to offer an inc...
Confusion Matrix, False Positives (FP), False Negatives (FN), Business Value of ML
- Question #37 Modeling
A Machine Learning Specialist is designing a system for improving sales for a company. The objective is to use the large amount of information the company has on users' behavior an...
Recommendation Systems, Collaborative Filtering, Spark ML, Amazon EMR
- Question #38 Data Engineering
A Mobile Network Operator is building an analytics platform to analyze and optimize a company's operations using Amazon Athena and Amazon S3. The source systems send data in .CSV f...
Real-time data processing, Data format conversion, Serverless ETL, Data lake architecture
- Question #39 Modeling
A city wants to monitor its air quality to address the consequences of air pollution. A Machine Learning Specialist needs to forecast the air quality in parts per million of contam...
Time Series Forecasting, Amazon SageMaker, Linear Learner Algorithm, Model Selection
- Question #40 Data Engineering
A Data Engineer needs to build a model using a dataset containing customer credit card information. How can the Data Engineer ensure the data remains encrypted and the credit card...
Data Encryption, AWS KMS, Sensitive Data Redaction, SageMaker Data Security
- Question #41 Machine Learning Implementation and Operations
A Machine Learning Specialist is using an Amazon SageMaker notebook instance in a private subnet of a corporate VPC. The ML Specialist has important data stored on the Amazon SageM...
SageMaker Notebooks, AWS Service Architecture, VPC Networking, Resource Visibility
- Question #42 Machine Learning Implementation and Operations
A Machine Learning Specialist is building a model that will perform time series forecasting using Amazon SageMaker. The Specialist has finished training the model and is now planni...
SageMaker Monitoring, CloudWatch Dashboards, Endpoint Performance, Load Testing
- Question #43 Data Engineering
A manufacturing company has structured and unstructured data stored in an Amazon S3 bucket. A Machine Learning Specialist wants to use SQL to run queries on this data. Which soluti...
AWS Glue, Amazon Athena, Data Catalog, SQL on S3
- Question #44 Modeling
A Machine Learning Specialist is developing a custom video recommendation model for an application. The dataset used to train this model is very large with millions of data points...
SageMaker, Model Training, Large Datasets, S3
- Question #45 Data Engineering
A company is setting up a system to manage all of the datasets it stores in Amazon S3. The company would like to automate running transformation jobs on the data and maintaining a...
AWS Glue, Data Catalog, ETL, Serverless Data Processing
- Question #46 Modeling
A Data Scientist is working on optimizing a model during the training process by varying multiple parameters. The Data Scientist observes that, during multiple runs with identical...
Learning Rate, Batch Size, Model Optimization, Loss Convergence
- Question #47 Machine Learning Implementation and Operations
A Machine Learning Specialist is configuring Amazon SageMaker so multiple Data Scientists can access notebooks, train models, and deploy endpoints. To ensure the best operational p...
SageMaker monitoring, CloudWatch, CloudTrail, MLOps
- Question #48 Data Engineering
A retail chain has been ingesting purchasing records from its network of 20,000 stores to Amazon S3 using Amazon Kinesis Data Firehose. To support training an improved machine lear...
Data Transformation, Kinesis Data Analytics, Streaming Data, Serverless Computing
- Question #49 Modeling
A Machine Learning Specialist is building a convolutional neural network (CNN) that will classify 10 types of animals. The Specialist has built a series of layers in a neural netwo...
Softmax activation, Output layer, Multi-class classification, Probability distribution
- Question #50 Modeling
A Machine Learning Specialist trained a regression model, but the first iteration needs optimizing. The Specialist needs to understand whether the model is more frequently overesti...
Regression model evaluation, Residual analysis, Model diagnostics, Bias detection