Get Jan-2024 Download Latest & Valid Questions For Amazon MLS-C01 exam
Ensure Success With Updated Verified MLS-C01 Exam Dumps
NEW QUESTION # 67
A Data Scientist wants to gain real-time insights into a data stream of GZIP files. Which solution would allow the use of SQL to query the stream with the LEAST latency?
- A. Amazon Kinesis Data Analytics with an AWS Lambda function to transform the data.
- B. AWS Glue with a custom ETL script to transform the data.
- C. Amazon Kinesis Data Firehose to transform the data and put it into an Amazon S3 bucket.
- D. An Amazon Kinesis Client Library to transform the data and save it to an Amazon ES cluster.
Answer: A
NEW QUESTION # 68
A Machine Learning Specialist is preparing data for training on Amazon SageMaker. The Specialist is using one of the SageMaker built-in algorithms for the training. The dataset is stored in .CSV format and is transformed into a numpy.array, which appears to be negatively affecting the speed of the training.
What should the Specialist do to optimize the data for training on SageMaker?
- A. Use AWS Glue to compress the data into the Apache Parquet format.
- B. Transform the dataset into the RecordIO protobuf format.
- C. Use the SageMaker hyperparameter optimization feature to automatically optimize the data.
- D. Use the SageMaker batch transform feature to transform the training data into a DataFrame.
Answer: B
NEW QUESTION # 69
A Marketing Manager at a pet insurance company plans to launch a targeted marketing campaign on social media to acquire new customers. Currently, the company has the following data in Amazon Aurora:
- Profiles for all past and existing customers
- Profiles for all past and existing insured pets
- Policy-level information
- Premiums received
- Claims paid
What steps should be taken to implement a machine learning model to identify potential new customers on social media?
- A. Use a recommendation engine on customer profile data to understand key characteristics of consumer segments. Find similar profiles on social media.
- B. Use regression on customer profile data to understand key characteristics of consumer segments. Find similar profiles on social media.
- C. Use clustering on customer profile data to understand key characteristics of consumer segments. Find similar profiles on social media.
- D. Use a decision tree classifier engine on customer profile data to understand key characteristics of consumer segments. Find similar profiles on social media.
Answer: C
Explanation:
https://docs.aws.amazon.com/sagemaker/latest/dg/algos.html
https://docs.aws.amazon.com/sagemaker/latest/dg/algo-kmeans-tech-notes.html
NEW QUESTION # 70
An interactive online dictionary wants to add a widget that displays words used in similar contexts. A Machine Learning Specialist is asked to provide word features for the downstream nearest neighbor model powering the widget.
What should the Specialist do to meet these requirements?
- A. Produce a set of synonyms for every word using Amazon Mechanical Turk.
- B. Create word embedding vectors that store edit distance with every other word.
- C. Create one-hot word encoding vectors.
- D. Download word embeddings pre-trained on a large corpus.
Answer: D
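To make the "downstream nearest neighbor model" concrete, here is a minimal Python sketch of a cosine-similarity lookup over word vectors. The four 3-dimensional vectors are made-up stand-ins for real pre-trained embeddings, and the names `EMBEDDINGS`, `cosine_similarity`, and `nearest_words` are illustrative, not part of any AWS service.

```python
import math

# Hypothetical toy vectors; real pre-trained embeddings have hundreds of dimensions.
EMBEDDINGS = {
    "happy": [0.9, 0.1, 0.0],
    "glad": [0.8, 0.2, 0.1],
    "river": [0.0, 0.9, 0.4],
    "stream": [0.1, 0.8, 0.5],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_words(word, k=1):
    """Return the k words whose vectors are most similar to the query word."""
    query = EMBEDDINGS[word]
    scored = [
        (cosine_similarity(query, vec), other)
        for other, vec in EMBEDDINGS.items()
        if other != word
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [other for _, other in scored[:k]]


print(nearest_words("happy"))
print(nearest_words("river"))
```

One-hot vectors would not work in this lookup: every pair of distinct words has cosine similarity 0, so no word is closer than any other.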
NEW QUESTION # 71
An insurance company is developing a new device for vehicles that uses a camera to observe drivers' behavior and alert them when they appear distracted. The company created approximately 10,000 training images in a controlled environment that a Machine Learning Specialist will use to train and evaluate machine learning models.
During the model evaluation, the Specialist notices that the training error rate diminishes faster as the number of epochs increases and the model is not accurately inferring on the unseen test images.
Which of the following should be used to resolve this issue? (Choose two.)
- A. Use gradient checking in the model.
- B. Add vanishing gradient to the model.
- C. Make the neural network architecture complex.
- D. Add L2 regularization to the model.
- E. Perform data augmentation on the training data.
Answer: D,E
Explanation:
The model must have been overfitted. Regularization helps to solve the overfitting problem in machine learning (as well as data augmentation).
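As a rough illustration of the two remedies, the Python sketch below shows an L2 penalty added to a loss, the matching weight-shrinking term in a gradient step, and a horizontal flip as one simple augmentation. The function names and the choice of flip are assumptions for illustration; whether flipping preserves the label for driver-facing camera images would need to be checked on the real data.

```python
def l2_penalized_loss(data_loss, weights, lam):
    """Data loss plus lam times the sum of squared weights."""
    return data_loss + lam * sum(w * w for w in weights)


def sgd_step(weights, grads, lr, lam):
    """One gradient step on the penalized loss.

    The penalty lam * w**2 contributes 2 * lam * w to the gradient,
    which pulls every weight toward zero.
    """
    return [w - lr * (g + 2.0 * lam * w) for w, g in zip(weights, grads)]


def horizontal_flip(image):
    """Mirror an image stored as a list of pixel rows."""
    return [row[::-1] for row in image]


print(l2_penalized_loss(1.0, [1.0, 2.0], lam=0.1))
print(sgd_step([1.0], [0.0], lr=0.1, lam=0.5))
print(horizontal_flip([[1, 2, 3], [4, 5, 6]]))
```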
NEW QUESTION # 72
A company is interested in building a fraud detection model. Currently, the Data Scientist does not have a sufficient amount of information due to the low number of fraud cases.
Which method is MOST likely to detect the GREATEST number of valid fraud cases?
- A. Class weight adjustment
- B. Oversampling using SMOTE
- C. Undersampling
- D. Oversampling using bootstrapping
Answer: B
Explanation:
With datasets that are not fully populated, the Synthetic Minority Over-sampling Technique (SMOTE) adds new information by adding synthetic data points to the minority class. This technique would be the most effective in this scenario. Refer to Section 4.2 at this link for supporting information.
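The core idea of SMOTE can be sketched in a few lines of Python: pick a minority-class point, pick one of its nearest minority neighbours, and create a new point somewhere on the line segment between them. This is a simplified illustration, not the reference algorithm or any library's implementation; `smote_sketch` and its parameters are hypothetical names, and it assumes at least two minority points.

```python
import random


def smote_sketch(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest other minority points, by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


fraud_cases = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
print(smote_sketch(fraud_cases, n_new=5, seed=1))
```

Unlike bootstrapping, which only repeats existing rows, the interpolated points are new feature combinations, which is the "new information" the explanation refers to.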
NEW QUESTION # 73
An interactive online dictionary wants to add a widget that displays words used in similar contexts. A Machine Learning Specialist is asked to provide word features for the downstream nearest neighbor model powering the widget.
What should the Specialist do to meet these requirements?
- A. Create word embedding vectors that store edit distance with every other word.
- B. Produce a set of synonyms for every word using Amazon Mechanical Turk.
- C. Download word embeddings pre-trained on a large corpus.
- D. Create one-hot word encoding vectors.
Answer: C
Explanation:
Because this is an interactive online dictionary that must surface words used in similar contexts, pre-trained word embeddings are the right features, so the answer is C.
In addition, nothing in the question suggests that the dictionary's vocabulary is unusual enough that pre-trained word embeddings would not cover it.
NEW QUESTION # 74
A data scientist wants to use Amazon Forecast to build a forecasting model for inventory demand for a retail company. The company has provided a dataset of historic inventory demand for its products as a .csv file stored in an Amazon S3 bucket. The table below shows a sample of the dataset.
How should the data scientist transform the data?
- A. Use ETL jobs in AWS Glue to separate the dataset into a target time series dataset and an item metadata dataset. Upload both datasets as .csv files to Amazon S3.
- B. Use a Jupyter notebook in Amazon SageMaker to separate the dataset into a related time series dataset and an item metadata dataset. Upload both datasets as tables in Amazon Aurora.
- C. Use a Jupyter notebook in Amazon SageMaker to transform the data into the optimized protobuf recordIO format. Upload the dataset in this format to Amazon S3.
- D. Use AWS Batch jobs to separate the dataset into a target time series dataset, a related time series dataset, and an item metadata dataset. Upload them directly to Forecast from a local machine.
Answer: A
Explanation:
https://docs.aws.amazon.com/forecast/latest/dg/dataset-import-guidelines-troubleshooting.html
NEW QUESTION # 75
When submitting Amazon SageMaker training jobs using one of the built-in algorithms, which common parameters MUST be specified? (Select THREE.)
- A. The output path specifying where on an Amazon S3 bucket the trained model will persist.
- B. Hyperparameters in a JSON array as documented for the algorithm used.
- C. The training channel identifying the location of training data on an Amazon S3 bucket.
- D. The Amazon EC2 instance class specifying whether training will be run using CPU or GPU.
- E. The validation channel identifying the location of validation data on an Amazon S3 bucket.
- F. The IAM role that Amazon SageMaker can assume to perform tasks on behalf of the users.
Answer: A,D,F
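As a sketch of how the three answer items appear in practice, the Python dictionary below lays out a training job request. Every value is a placeholder; the field names are my understanding of the SageMaker `CreateTrainingJob` request and should be checked against the API reference. The helper `missing_fields` and the tuple `ANSWER_KEY_FIELDS` are hypothetical and cover only the three items from the answer key, not the full set of fields the API requires.

```python
# Placeholder values throughout; nothing here is sent to AWS.
training_job_request = {
    "TrainingJobName": "example-job",
    "AlgorithmSpecification": {
        "TrainingImage": "<account>.dkr.ecr.<region>.amazonaws.com/<image>:1",
        "TrainingInputMode": "File",
    },
    # IAM role that SageMaker assumes on the user's behalf.
    "RoleArn": "arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    # Where the trained model artifact is written.
    "OutputDataConfig": {"S3OutputPath": "s3://example-bucket/output/"},
    # Instance class: a CPU type here; a GPU type would be e.g. an ml.p3 instance.
    "ResourceConfig": {
        "InstanceType": "ml.m5.xlarge",
        "InstanceCount": 1,
        "VolumeSizeInGB": 10,
    },
    "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
}

# The three items named in the answer key (assumed mapping to request fields).
ANSWER_KEY_FIELDS = ("OutputDataConfig", "ResourceConfig", "RoleArn")


def missing_fields(request):
    """List the answer-key fields that are absent from a request dict."""
    return [name for name in ANSWER_KEY_FIELDS if name not in request]


print(missing_fields(training_job_request))
```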
NEW QUESTION # 76
A Machine Learning Specialist has created a deep learning neural network model that performs well on the training data but performs poorly on the test data.
Which of the following methods should the Specialist consider using to correct this? (Select THREE.)
- A. Decrease dropout.
- B. Increase feature combinations.
- C. Decrease regularization.
- D. Increase regularization.
- E. Increase dropout.
- F. Decrease feature combinations.
Answer: D,E,F
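Dropout is one of the regularizers the options mention. A minimal Python sketch of "inverted" dropout, under the assumption that activations are a plain list of floats, looks like this; `dropout` is an illustrative function, not a framework API. Raising `rate` zeroes more units during training, which makes it harder for the network to memorize the training set.

```python
import random


def dropout(activations, rate, rng):
    """Zero each activation with probability `rate`; rescale the survivors.

    Dividing kept values by (1 - rate) keeps the expected activation
    unchanged, so nothing has to be rescaled at inference time.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(0)
print(dropout([1.0, 2.0, 3.0, 4.0], rate=0.5, rng=rng))
```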
NEW QUESTION # 77
When submitting Amazon SageMaker training jobs using one of the built-in algorithms, which common parameters MUST be specified? (Select THREE.)
- A. Hyperparameters in a JSON array as documented for the algorithm used.
- B. The Amazon EC2 instance class specifying whether training will be run using CPU or GPU.
- C. The training channel identifying the location of training data on an Amazon S3 bucket.
- D. The output path specifying where on an Amazon S3 bucket the trained model will persist.
- E. The validation channel identifying the location of validation data on an Amazon S3 bucket.
- F. The IAM role that Amazon SageMaker can assume to perform tasks on behalf of the users.
Answer: B,D,F
NEW QUESTION # 78
Which of the following metrics should a Machine Learning Specialist generally use to compare/evaluate machine learning classification models against each other?
- A. Recall
- B. Mean absolute percentage error (MAPE)
- C. Misclassification rate
- D. Area Under the ROC Curve (AUC)
Answer: D
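AUC can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The Python sketch below computes it directly from that definition by comparing every positive/negative pair (ties count as half). It is a small illustration with made-up labels and scores, and `roc_auc` is a hypothetical helper rather than a library function.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Because it depends only on how the scores rank the examples, AUC does not require choosing a classification threshold, which is what makes it convenient for comparing models against each other.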
NEW QUESTION # 79
A Machine Learning Specialist is working with a large cybersecurity company that manages security events in real time for companies around the world. The cybersecurity company wants to design a solution that will allow it to use machine learning to score malicious events as anomalies on the data as it is being ingested. The company also wants to be able to save the results in its data lake for later processing and analysis.
What is the MOST efficient way to accomplish these tasks?
- A. Ingest the data and store it in Amazon S3. Have an AWS Glue job that is triggered on demand transform the new data. Then use the built-in Random Cut Forest (RCF) model within Amazon SageMaker to detect anomalies in the data.
- B. Ingest the data using Amazon Kinesis Data Firehose, and use Amazon Kinesis Data Analytics Random Cut Forest (RCF) for anomaly detection. Then use Kinesis Data Firehose to stream the results to Amazon S3.
- C. Ingest the data into Apache Spark Streaming using Amazon EMR, and use Spark MLlib with k-means to perform anomaly detection. Then store the results in an Apache Hadoop Distributed File System (HDFS) using Amazon EMR with a replication factor of three as the data lake.
- D. Ingest the data and store it in Amazon S3. Use AWS Batch along with the AWS Deep Learning AMIs to train a k-means model using TensorFlow on the data in Amazon S3.
Answer: B
NEW QUESTION # 80
A Machine Learning Specialist is packaging a custom ResNet model into a Docker container so the company can leverage Amazon SageMaker for training. The Specialist is using Amazon EC2 P3 instances to train the model and needs to properly configure the Docker container to leverage the NVIDIA GPUs.
What does the Specialist need to do?
- A. Bundle the NVIDIA drivers with the Docker image.
- B. Organize the Docker container's file structure to execute on GPU instances.
- C. Build the Docker container to be NVIDIA-Docker compatible.
- D. Set the GPU flag in the Amazon SageMaker CreateTrainingJob request body.
Answer: C
NEW QUESTION # 81
A Marketing Manager at a pet insurance company plans to launch a targeted marketing campaign on social media to acquire new customers. Currently, the company has the following data in Amazon Aurora:
- Profiles for all past and existing customers
- Profiles for all past and existing insured pets
- Policy-level information
- Premiums received
- Claims paid
What steps should be taken to implement a machine learning model to identify potential new customers on social media?
- A. Use a decision tree classifier engine on customer profile data to understand key characteristics of consumer segments. Find similar profiles on social media.
- B. Use a recommendation engine on customer profile data to understand key characteristics of consumer segments. Find similar profiles on social media.
- C. Use regression on customer profile data to understand key characteristics of consumer segments. Find similar profiles on social media.
- D. Use clustering on customer profile data to understand key characteristics of consumer segments. Find similar profiles on social media.
Answer: D
NEW QUESTION # 82
A data scientist has developed a machine learning translation model for English to Japanese by using Amazon SageMaker's built-in seq2seq algorithm with 500,000 aligned sentence pairs. While testing with sample sentences, the data scientist finds that the translation quality is reasonable for an example as short as five words. However, the quality becomes unacceptable if the sentence is 100 words long.
Which action will resolve the problem?
- A. Adjust hyperparameters related to the attention mechanism.
- B. Change preprocessing to use n-grams.
- C. Choose a different weight initialization type.
- D. Add more nodes to the recurrent neural network (RNN) than the largest sentence's word count.
Answer: A
NEW QUESTION # 83
A retail company is selling products through a global online marketplace. The company wants to use machine learning (ML) to analyze customer feedback and identify specific areas for improvement. A developer has built a tool that collects customer reviews from the online marketplace and stores them in an Amazon S3 bucket. This process yields a dataset of 40 reviews. A data scientist building the ML models must identify additional sources of data to increase the size of the dataset.
Which data sources should the data scientist use to augment the dataset of reviews? (Choose three.)
- A. Instruction manuals for the company's products
- B. A publicly available collection of news articles
- C. Product sales revenue figures for the company
- D. Social media posts containing the name of the company or its products
- E. Emails exchanged by customers and the company's customer service agents
- F. A publicly available collection of customer reviews
Answer: D,E,F
NEW QUESTION # 84
A Machine Learning Specialist is planning to create a long-running Amazon EMR cluster. The EMR cluster will have 1 master node, 10 core nodes, and 20 task nodes. To save on costs, the Specialist will use Spot Instances in the EMR cluster.
Which nodes should the Specialist launch on Spot Instances?
- A. Master node
- B. Any of the core nodes
- C. Any of the task nodes
- D. Both core and task nodes
Answer: C
NEW QUESTION # 85
A Mobile Network Operator is building an analytics platform to analyze and optimize a company's operations using Amazon Athena and Amazon S3.
The source systems send data in .CSV format in real time. The Data Engineering team wants to transform the data to the Apache Parquet format before storing it on Amazon S3.
Which solution takes the LEAST effort to implement?
- A. Ingest .CSV data from Amazon Kinesis Data Streams and use Amazon Kinesis Data Firehose to convert data into Parquet.
- B. Ingest .CSV data using Apache Spark Structured Streaming in an Amazon EMR cluster and use Apache Spark to convert data into Parquet.
- C. Ingest .CSV data from Amazon Kinesis Data Streams and use AWS Glue to convert data into Parquet.
- D. Ingest .CSV data using Apache Kafka Streams on Amazon EC2 instances and use Kafka Connect S3 to serialize data as Parquet.
Answer: A
NEW QUESTION # 86
A large mobile network operating company is building a machine learning model to predict customers who are likely to unsubscribe from the service. The company plans to offer an incentive for these customers as the cost of churn is far greater than the cost of the incentive.
The model produces the following confusion matrix after evaluating on a test dataset of 100 customers:
Based on the model evaluation results, why is this a viable model for production?
- A. The model is 86% accurate and the cost incurred by the company as a result of false positives is less than the false negatives.
- B. The precision of the model is 86%, which is less than the accuracy of the model.
- C. The model is 86% accurate and the cost incurred by the company as a result of false negatives is less than the false positives.
- D. The precision of the model is 86%, which is greater than the accuracy of the model.
Answer: A
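The confusion matrix itself is not reproduced above, so the Python sketch below uses hypothetical counts, chosen only so that accuracy comes out at 86% on 100 customers. It shows how accuracy, precision, and recall are derived from the four cells, and how the total error cost can be compared when a missed churner (false negative) costs more than an unnecessary incentive (false positive), as the question states. The cost figures and function names are illustrative assumptions.

```python
def evaluate(tp, fn, fp, tn):
    """Standard metrics from the four confusion-matrix cells."""
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp),
        "recall": tp / (tp + fn),
    }


def error_cost(fn, fp, cost_fn, cost_fp):
    """Total cost of mistakes: missed churners plus wasted incentives."""
    return fn * cost_fn + fp * cost_fp


# Hypothetical counts for 100 customers (not taken from the original matrix).
counts = {"tp": 10, "fn": 4, "fp": 10, "tn": 76}
metrics = evaluate(**counts)
print(metrics)

# Hypothetical costs: losing a customer is assumed 10x the cost of an incentive.
print(error_cost(counts["fn"], counts["fp"], cost_fn=100.0, cost_fp=10.0))
```

With these made-up numbers, precision (50%) is well below accuracy (86%), yet the model can still be worthwhile because the errors it makes most often are the cheap kind.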
NEW QUESTION # 87
......