Amazon AWS Certified Machine Learning Engineer - Associate - MLA-C01 FREE EXAM DUMPS QUESTIONS & ANSWERS

Question 1

An ML engineer develops a neural network model to predict whether customers will continue to subscribe to a service. The model performs well on training data. However, the accuracy of the model decreases significantly on evaluation data.
The ML engineer must resolve the model performance issue.
Which solution will meet this requirement?

A. Remove dropout layers from the neural network.
B. Train the model for longer by increasing the number of epochs.
C. Penalize large weights by using L1 or L2 regularization.
D. Capture complex patterns by increasing the number of layers.

Correct Answer: C
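Option C works because a penalty on weight magnitude discourages the network from fitting noise in the training data. The following is a minimal sketch of how such a penalty enters the loss; the function names and the `lam` strength parameter are illustrative assumptions, not part of the question or of any specific framework.

```python
def l1_penalty(weights, lam):
    """L1 penalty: lam times the sum of absolute weight values."""
    return lam * sum(abs(w) for w in weights)


def l2_penalty(weights, lam):
    """L2 penalty: lam times the sum of squared weight values."""
    return lam * sum(w * w for w in weights)


def regularized_loss(data_loss, weights, lam, kind="l2"):
    """Add the chosen penalty to the loss measured on the training data."""
    if kind == "l1":
        return data_loss + l1_penalty(weights, lam)
    if kind == "l2":
        return data_loss + l2_penalty(weights, lam)
    raise ValueError("kind must be 'l1' or 'l2'")


# Two hypothetical weight vectors with the same data loss:
small_weights = [0.1, -0.2]
large_weights = [3.0, -4.0]
print(regularized_loss(1.0, small_weights, lam=0.01))
print(regularized_loss(1.0, large_weights, lam=0.01))
```

With equal data loss, the model with larger weights gets the higher total loss, so the optimizer is pushed toward smaller weights.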

Question 2

An ML engineer wants to use, prepare, and load data from Amazon S3 for analytics. The ML engineer must run an extract, transform, and load (ETL) job to discover the schema of the data and to store the metadata.
Which solution will meet these requirements with the LEAST manual effort?

A. Create an ETL pipeline by using Amazon Athena integrated with AWS Step Functions. Use the pipeline to run the ETL job to discover the schema and to store the associated metadata in an S3 bucket.
B. Use AWS Glue to run the ETL job. Use the job to discover the schema and to store the associated metadata in the AWS Glue Data Catalog.
C. Launch an Amazon EC2 instance that includes the scikit-learn library to run the ETL job. Use the job to discover the schema and to store the associated metadata in Amazon Redshift.
D. Create an Amazon SageMaker Data Wrangler flow to run the ETL job. Use the job to discover the schema and to store the associated metadata in an S3 bucket.

Correct Answer: B
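In AWS Glue, schema discovery is commonly done with a crawler that writes table metadata to the Data Catalog. The sketch below only builds the request for the boto3 `create_crawler` call; the crawler name, role ARN, database name, and S3 path are hypothetical placeholders, and the actual API calls are left as comments so the example runs without AWS credentials.

```python
def build_crawler_request(name, role_arn, database, s3_path):
    """Build keyword arguments for the AWS Glue create_crawler API call."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,  # Data Catalog database that receives the tables
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


# All values below are hypothetical placeholders.
request = build_crawler_request(
    name="analytics-crawler",
    role_arn="arn:aws:iam::123456789012:role/GlueCrawlerRole",
    database="analytics_db",
    s3_path="s3://example-bucket/raw/",
)
print(request)

# With credentials configured, the request could be submitted like this:
#   import boto3
#   glue = boto3.client("glue")
#   glue.create_crawler(**request)
#   glue.start_crawler(Name=request["Name"])
```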

Question 3

A company runs an ML model on Amazon SageMaker AI. The company uses an automatic process that makes API calls to create training jobs for the model. The company has new compliance rules that prohibit the collection of aggregated metadata from training jobs.
Which solution will prevent SageMaker AI from collecting metadata from the training jobs?

A. Encrypt the training data with an AWS Key Management Service (AWS KMS) customer managed key.
B. Ensure that training jobs are running in a private subnet in a custom VPC.
C. Opt out of metadata tracking for any training job that is submitted.
D. Reconfigure the training jobs to use only AWS Nitro instances.

Correct Answer: C

Question 4

A company uses a batching solution to process data analytics each day. The company wants to build an analytics platform to provide near real-time updates. The company wants to use open source technology and does not want to manage or scale the infrastructure.
Which solution will meet these requirements?

A. Create data streams in Amazon Kinesis Data Streams. Use AWS Application Auto Scaling to scale the infrastructure.
B. Create Amazon Managed Streaming for Apache Kafka (Amazon MSK) Provisioned clusters. Configure the clusters based on data volume.
C. Create self-hosted Apache Flink applications on Amazon EC2. Run the applications as containers.
D. Create Amazon Managed Streaming for Apache Kafka (Amazon MSK) Serverless clusters to process the data.

Correct Answer: D

Question 5

An ML engineer has trained an ML model by using Amazon SageMaker AI. The ML engineer determines that the model is overfitting and that the training data contains unnecessary features. The ML engineer must reduce the overfitting and the impact of the unnecessary features.
Which solution will meet these requirements?

A. Decrease the number of training iterations. Retrain the model.
B. Use SageMaker Debugger to apply L1 regularization to the running model.
C. Apply L1 regularization to the training data. Retrain the model.
D. Increase the number of training iterations. Retrain the model.

Correct Answer: C

Question 6

An airline company deploys ML models to one dozen Amazon SageMaker AI inference endpoints. The inference endpoints must be able to handle different types of workloads in a cost-effective way.
Select the correct inference option from the following list to handle each type of workload. Select each inference option one time. (Select FOUR.)
* Asynchronous inference
* Batch inference
* Real-time inference
* Serverless inference

Correct Answer:

* Provide flight departure, arrival, and delay information, and provide updates for low-latency workloads: Real-time inference
* Advertise holiday travel promotional deals to millions of users in multiple markets before holiday seasons for spiky workloads: Serverless inference
* Generate quarterly and annual flight reports and insights for trend analysis of large datasets: Batch inference
* Generate online image and audio stories for passengers to watch or listen to while waiting at an airport: Asynchronous inference

Explanation:

The correct mapping depends on the latency requirement, traffic pattern, payload size, processing duration, and whether the workload needs a persistent endpoint.

* Real-time inference is the right choice for flight departure, arrival, and delay updates because this is an online user-facing workload that requires low latency. AWS states that SageMaker real-time inference is ideal for online inference workloads with low-latency or high-throughput requirements and uses a persistent fully managed endpoint. That fits flight status information because passengers and airline systems expect immediate responses.
* Serverless inference is the best choice for holiday promotional deals because this traffic is spiky, seasonal, and unpredictable. AWS describes SageMaker Serverless Inference as suitable for intermittent or unpredictable traffic patterns. It is cost-effective because SageMaker manages the infrastructure and scales down when there are no requests, so the company does not pay for idle endpoint capacity.
* Batch inference is correct for quarterly and annual flight reports because this workload analyzes large datasets offline and does not need an always-running endpoint. AWS says SageMaker batch transform is used to get inferences from large datasets and when a persistent endpoint is not required. Reports and trend analysis are scheduled, non-real-time analytics workloads, so batch inference is the most cost-effective option.
* Asynchronous inference is the right choice for generating online image and audio stories. These requests can have larger payloads and longer processing times than normal low-latency API calls. AWS states that SageMaker Asynchronous Inference queues incoming requests and is ideal for large payloads, long processing times, and near-real-time latency requirements. Image and audio generation can take seconds or minutes, so asynchronous inference is more appropriate than real-time inference.
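The selection criteria above can be sketched as a small decision function. This is an illustrative simplification, not an AWS API: the `Workload` fields and the order of the checks are assumptions chosen to reproduce the four mappings in this question.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of an inference workload."""
    offline_large_dataset: bool = False
    large_payload_or_long_processing: bool = False
    spiky_or_intermittent_traffic: bool = False


def choose_inference_option(workload):
    """Map workload traits to a SageMaker inference option (simplified)."""
    if workload.offline_large_dataset:
        return "Batch inference"  # no persistent endpoint needed
    if workload.large_payload_or_long_processing:
        return "Asynchronous inference"  # requests are queued
    if workload.spiky_or_intermittent_traffic:
        return "Serverless inference"  # no cost for idle capacity
    return "Real-time inference"  # steady, low-latency traffic


flight_status = Workload()
holiday_promotions = Workload(spiky_or_intermittent_traffic=True)
quarterly_reports = Workload(offline_large_dataset=True)
media_stories = Workload(large_payload_or_long_processing=True)

for name, workload in [
    ("flight status", flight_status),
    ("holiday promotions", holiday_promotions),
    ("quarterly reports", quarterly_reports),
    ("media stories", media_stories),
]:
    print(name, "->", choose_inference_option(workload))
```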

Question 7

A company is developing an application that reads animal descriptions from user prompts and generates images based on the information in the prompts. The application reads a message from an Amazon Simple Queue Service (Amazon SQS) queue. Then the application uses Amazon Titan Image Generator on Amazon Bedrock to generate an image based on the information in the message. Finally, the application removes the message from the SQS queue.
Which IAM permissions should the company assign to the application's IAM role? (Select TWO.)

A. Allow the sagemaker:PutRecord* action for the Amazon Titan Image Generator resource.
B. Allow the bedrock:Get* action for the Amazon Titan Image Generator resource.
C. Allow the sqs:ReceiveMessage action and the sqs:DeleteMessage action for the SQS queue resource.
D. Allow the sqs:GetQueueAttributes action and the sqs:DeleteMessage action for the SQS queue resource.
E. Allow the bedrock:InvokeModel action for the Amazon Titan Image Generator resource.

Correct Answer: C, E
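Options C and E together could be expressed as an identity-based policy like the one sketched below. The account ID, Region, queue name, and model identifier in the ARNs are assumptions for illustration; substitute the real queue ARN and the model ARN that the application invokes.

```python
import json

# Hypothetical resource ARNs, used only for illustration.
QUEUE_ARN = "arn:aws:sqs:us-east-1:123456789012:animal-prompts"
MODEL_ARN = (
    "arn:aws:bedrock:us-east-1::foundation-model/"
    "amazon.titan-image-generator-v1"  # assumed model ID
)


def build_policy(queue_arn, model_arn):
    """Build a least-privilege policy for the read, generate, delete flow."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadAndDeleteQueueMessages",
                "Effect": "Allow",
                "Action": ["sqs:ReceiveMessage", "sqs:DeleteMessage"],
                "Resource": queue_arn,
            },
            {
                "Sid": "InvokeImageModel",
                "Effect": "Allow",
                "Action": "bedrock:InvokeModel",
                "Resource": model_arn,
            },
        ],
    }


policy = build_policy(QUEUE_ARN, MODEL_ARN)
print(json.dumps(policy, indent=2))
```

The policy grants only the three actions that the workflow performs: receive a message, invoke the model, and delete the message.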

Question 8

A company has a Retrieval Augmented Generation (RAG) application that uses a vector database to store embeddings of documents. The company must migrate the application to AWS and must implement a solution that provides semantic search of text files. The company has already migrated the text repository to an Amazon S3 bucket.
Which solution will meet these requirements?

A. Use an AWS Batch job to process the files and generate embeddings. Use AWS Glue to store the embeddings. Use SQL queries to perform the semantic searches.
B. Use a custom Amazon SageMaker AI notebook to run a custom script to generate embeddings. Use SageMaker Feature Store to store the embeddings. Use SQL queries to perform the semantic searches.
C. Use the Amazon Kendra S3 connector to ingest the documents from the S3 bucket into Amazon Kendra. Query Amazon Kendra to perform the semantic searches.
D. Use an Amazon Textract asynchronous job to ingest the documents from the S3 bucket. Query Amazon Textract to perform the semantic searches.

Correct Answer: C

Question 9

A company is planning to create several ML prediction models. The training data is stored in Amazon S3. The entire dataset is more than 5 TB in size and consists of CSV, JSON, Apache Parquet, and simple text files.
The data must be processed in several consecutive steps. The steps include complex manipulations that can take hours to finish running. Some of the processing involves natural language processing (NLP) transformations. The entire process must be automated.
Which solution will meet these requirements?

A. Process data at each step by using Amazon SageMaker Data Wrangler. Automate the process by using Data Wrangler jobs.
B. Use Amazon SageMaker notebooks for each data processing step. Automate the process by using Amazon EventBridge.
C. Use Amazon SageMaker Pipelines to create a pipeline of data processing steps. Automate the pipeline by using Amazon EventBridge.
D. Process data at each step by using AWS Lambda functions. Automate the process by using AWS Step Functions and Amazon EventBridge.

Correct Answer: C

Question 10

A company uses a batching solution to process daily analytics. The company wants to provide near real-time updates, use open-source technology, and avoid managing or scaling infrastructure.
Which solution will meet these requirements?

A. Create Amazon Kinesis Data Streams with Application Auto Scaling.
B. Create Amazon MSK Provisioned clusters.
C. Create Amazon Managed Streaming for Apache Kafka (Amazon MSK) Serverless clusters.
D. Create self-hosted Apache Flink applications on Amazon EC2.

Correct Answer: C

Question 11

An ML engineer needs to use AWS services to identify and extract meaningful unique keywords from documents.
Which solution will meet these requirements with the LEAST operational overhead?

A. Store the documents in an Amazon S3 bucket. Create AWS Lambda functions to process the documents and to run Python scripts for stemming and removal of stop words. Use bigram and trigram techniques to identify and extract relevant keywords.
B. Use the Natural Language Toolkit (NLTK) library on Amazon EC2 instances for text pre-processing. Use the Latent Dirichlet Allocation (LDA) algorithm to identify and extract relevant keywords.
C. Use Amazon SageMaker and the BlazingText algorithm. Apply custom pre-processing steps for stemming and removal of stop words. Calculate term frequency-inverse document frequency (TF-IDF) scores to identify and extract relevant keywords.
D. Use Amazon Comprehend custom entity recognition and key phrase extraction to identify and extract relevant keywords.

Correct Answer: D
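As a rough sketch of the key phrase part of option D, the function below calls the Amazon Comprehend `detect_key_phrases` operation and keeps unique phrases above a confidence threshold. The `min_score` threshold, the de-duplication rule, and the `FakeComprehend` stand-in (used so the example runs without AWS credentials) are assumptions; in practice a `boto3.client("comprehend")` object would be passed instead.

```python
def unique_key_phrases(text, comprehend_client, min_score=0.9, language_code="en"):
    """Return unique key phrases whose confidence score meets min_score."""
    response = comprehend_client.detect_key_phrases(
        Text=text, LanguageCode=language_code
    )
    seen = set()
    phrases = []
    for item in response["KeyPhrases"]:
        phrase = item["Text"].strip()
        key = phrase.lower()  # case-insensitive de-duplication
        if item["Score"] >= min_score and key not in seen:
            seen.add(key)
            phrases.append(phrase)
    return phrases


class FakeComprehend:
    """Stand-in client that returns a canned response for the demo."""

    def detect_key_phrases(self, Text, LanguageCode):
        return {
            "KeyPhrases": [
                {"Text": "flight delays", "Score": 0.99},
                {"Text": "Flight delays", "Score": 0.97},
                {"Text": "the airport", "Score": 0.95},
                {"Text": "some time", "Score": 0.40},
            ]
        }


result = unique_key_phrases("sample document text", FakeComprehend())
print(result)
```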

Question 12

A company plans to use Amazon SageMaker AI to build image classification models. The company has 6 TB of training data stored on Amazon FSx for NetApp ONTAP. The file system is in the same VPC as SageMaker AI.
An ML engineer must make the training data accessible to SageMaker AI training jobs.
Which solution will meet these requirements?

A. Create a catalog connection from SageMaker Data Wrangler to the FSx for ONTAP file system.
B. Mount the FSx for ONTAP file system as a volume to the SageMaker AI instance.
C. Create an Amazon S3 bucket and use Mountpoint for Amazon S3 to link the bucket to FSx for ONTAP.
D. Create a direct connection from SageMaker Data Wrangler to the FSx for ONTAP file system.

Correct Answer: B

Question 13

A company wants to host an ML model on Amazon SageMaker. An ML engineer is configuring a continuous integration and continuous delivery (CI/CD) pipeline in AWS CodePipeline to deploy the model. The pipeline must run automatically when new training data for the model is uploaded to an Amazon S3 bucket.
Select and order the pipeline's correct steps from the following list. Each step should be selected one time or not at all. (Select and order three.)
* An S3 event notification invokes the pipeline when new data is uploaded.
* An S3 Lifecycle rule invokes the pipeline when new data is uploaded.
* SageMaker retrains the model by using the data in the S3 bucket.
* The pipeline deploys the model to a SageMaker endpoint.
* The pipeline deploys the model to SageMaker Model Registry.

Correct Answer:

Step 1: An S3 event notification invokes the pipeline when new data is uploaded.
Step 2: SageMaker retrains the model by using the data in the S3 bucket.
Step 3: The pipeline deploys the model to a SageMaker endpoint.

Explanation:

Step 1: An S3 event notification invokes the pipeline when new data is uploaded.
Why? The CI/CD pipeline should be triggered automatically whenever new training data is uploaded to Amazon S3. S3 event notifications can be configured to send events to AWS services like Lambda, which can then invoke AWS CodePipeline.
How? Configure the S3 bucket to send event notifications (e.g., s3:ObjectCreated:*) to AWS Lambda, which in turn triggers the CodePipeline.

Step 2: SageMaker retrains the model by using the data in the S3 bucket.
Why? The uploaded data is used to retrain the ML model to incorporate new information and maintain performance. This step is critical to updating the model with fresh data.
How? Define a SageMaker training step in the CI/CD pipeline, which reads the training data from the S3 bucket and retrains the model.

Step 3: The pipeline deploys the model to a SageMaker endpoint.
Why? Once retrained, the updated model must be deployed to a SageMaker endpoint to make it available for real-time inference.
How? Add a deployment step in the CI/CD pipeline, which automates the creation or update of the SageMaker endpoint with the retrained model.

This configuration gives an automated CI/CD pipeline for continuous retraining and deployment of the ML model in Amazon SageMaker.
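The S3-to-Lambda-to-CodePipeline trigger from Step 1 can be sketched as a Lambda handler. The pipeline name and the `FakeCodePipeline` stand-in are hypothetical and exist only so the example runs locally; in a deployed function the client would be `boto3.client("codepipeline")`, whose `start_pipeline_execution` operation takes the pipeline `name`.

```python
def make_handler(codepipeline_client, pipeline_name):
    """Create a Lambda-style handler that starts the pipeline on S3 uploads."""

    def handler(event, context):
        # Collect object keys from ObjectCreated records in the S3 event.
        keys = [
            record["s3"]["object"]["key"]
            for record in event.get("Records", [])
            if record.get("eventName", "").startswith("ObjectCreated")
        ]
        if not keys:
            return {"started": False, "keys": []}
        response = codepipeline_client.start_pipeline_execution(name=pipeline_name)
        return {
            "started": True,
            "keys": keys,
            "executionId": response["pipelineExecutionId"],
        }

    return handler


class FakeCodePipeline:
    """Stand-in client that records which pipelines were started."""

    def __init__(self):
        self.started = []

    def start_pipeline_execution(self, name):
        self.started.append(name)
        return {"pipelineExecutionId": "demo-execution-id"}


client = FakeCodePipeline()
handler = make_handler(client, "model-retrain-pipeline")  # hypothetical name
upload_event = {
    "Records": [
        {"eventName": "ObjectCreated:Put", "s3": {"object": {"key": "train/new.csv"}}}
    ]
}
outcome = handler(upload_event, None)
print(outcome)
```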

Question 14

A hospital wants to predict patient outcomes for the coming year. An ML engineer must improve several existing ML models that currently perform poorly.
Select the correct regularization method from the following list to improve each model. Select each regularization method one time, more than one time, or not at all. (Select THREE.)
* L1 regularization
* L2 regularization
* Early stopping

Correct Answer:

* Linear regression model whose coefficients should shrink but not become zero: L2 regularization
* Polynomial regression model with irrelevant polynomial terms that should be eliminated: L1 regularization
* Logistic regression model that has highly correlated features to eliminate highly redundant predictors: L1 regularization

Explanation:

* Linear regression model whose coefficients should shrink but not become zero: AWS says L2 produces smaller overall weight values and is the right fit when coefficients should be reduced without being forced to zero.
* Polynomial regression model with irrelevant polynomial terms that should be eliminated: AWS says L1 reduces the number of features used by pushing small weights to zero, which matches elimination of irrelevant terms.
* Logistic regression model that has highly correlated features to eliminate highly redundant predictors: This is the nuanced one. AWS says L2 stabilizes weights when there is high correlation between features, but because the question explicitly says eliminate highly redundant predictors, L1 is the better match since it creates sparsity and removes predictors by zeroing coefficients.
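The difference between shrinking and zeroing can be seen in a simplified closed-form case. Assuming orthonormal features, the L2-penalized solution (objective 0.5 * squared error + 0.5 * lam * w^2) scales each unregularized coefficient by 1 / (1 + lam), while the L1-penalized solution (objective 0.5 * squared error + lam * |w|) applies soft thresholding. The coefficient values and `lam` below are made up for illustration.

```python
def ridge_shrink(w, lam):
    """L2 solution under orthonormal features: scale toward zero, never reach it."""
    return w / (1.0 + lam)


def lasso_soft_threshold(w, lam):
    """L1 solution under orthonormal features: move toward zero, clip at zero."""
    if w > lam:
        return w - lam
    if w < -lam:
        return w + lam
    return 0.0


# Hypothetical unregularized coefficients: one strong, two weak.
coefficients = [2.0, 0.3, -0.1]
lam = 0.5

ridge = [ridge_shrink(w, lam) for w in coefficients]
lasso = [lasso_soft_threshold(w, lam) for w in coefficients]

print("L2:", ridge)  # all coefficients smaller, none exactly zero
print("L1:", lasso)  # weak coefficients become exactly zero
```

Under these assumptions, L2 keeps every predictor with a reduced weight, whereas L1 removes the two weak predictors entirely.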