Databricks Certified Machine Learning Associate - Databricks-Machine-Learning-Associate FREE EXAM DUMPS QUESTIONS & ANSWERS
A data scientist has written a data cleaning notebook that utilizes the pandas library, but their colleague has suggested that they refactor their notebook to scale with big data.
Which of the following approaches can the data scientist take to spend the least amount of time refactoring their notebook to scale with big data?
Correct Answer: A
A data scientist has been given an incomplete notebook from the data engineering team. The notebook uses a Spark DataFrame spark_df on which the data scientist needs to perform further feature engineering. Unfortunately, the data scientist has not yet learned the PySpark DataFrame API.
Which of the following blocks of code can the data scientist run to be able to use the pandas API on Spark?
Correct Answer: A
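To make the scenario concrete, here is a minimal sketch of exposing an existing Spark DataFrame through the pandas API on Spark. It assumes PySpark 3.2 or later, where `DataFrame.pandas_api()` is available; the helper name `as_pandas_on_spark` and the column names in the usage comment are hypothetical and do not come from the question.

```python
def as_pandas_on_spark(spark_df):
    """Return a pandas-on-Spark DataFrame backed by the given Spark DataFrame.

    Hypothetical helper; assumes PySpark >= 3.2.
    """
    # pandas_api() wraps the Spark DataFrame so pandas-style syntax can be
    # used while the data stays distributed.
    return spark_df.pandas_api()


# Usage on a cluster (column names are illustrative assumptions):
# psdf = as_pandas_on_spark(spark_df)
# psdf["total"] = psdf["price"] * psdf["quantity"]
```

An equivalent route, under the same version assumption, is `import pyspark.pandas as ps` followed by `ps.DataFrame(spark_df)`.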
The implementation of linear regression in Spark ML first attempts to solve the linear regression problem using matrix decomposition, but this method does not scale well to large datasets with a large number of variables.
Which of the following approaches does Spark ML use to distribute the training of a linear regression model for large data?
Correct Answer: A
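The contrast with matrix decomposition can be illustrated with an iterative solver whose per-step work splits across data partitions. The sketch below is a pure-Python stand-in using plain gradient descent, not Spark ML's actual optimizer; the function names, learning rate, step count, and toy data are all assumptions made for illustration.

```python
def partial_gradient(partition, w, b):
    """Sum the squared-error gradient terms over one partition of (x, y) pairs."""
    gw = gb = 0.0
    for x, y in partition:
        err = (w * x + b) - y
        gw += err * x
        gb += err
    return gw, gb, len(partition)


def fit_linear(partitions, lr=0.1, steps=1000):
    """Fit y = w * x + b by gradient descent over partitioned data."""
    w = b = 0.0
    for _ in range(steps):
        # "Map": each partition computes its contribution independently.
        parts = [partial_gradient(p, w, b) for p in partitions]
        # "Reduce": contributions are summed into one global gradient.
        gw = sum(p[0] for p in parts)
        gb = sum(p[1] for p in parts)
        n = sum(p[2] for p in parts)
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b


# Toy data drawn from y = 2x + 1, split into two partitions.
partitions = [[(0.0, 1.0), (1.0, 3.0)], [(2.0, 5.0), (3.0, 7.0)]]
w, b = fit_linear(partitions)
```

Only the small gradient sums travel between the partitions and the driver in each step, which is why this style of solver scales to datasets where a full decomposition would not.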
Which of the following describes the relationship between native Spark DataFrames and pandas API on Spark DataFrames?
Correct Answer: C
A machine learning engineering team has a Job with three successive tasks. Each task runs a single notebook. The team has been alerted that the Job has failed in its latest run.
Which of the following approaches can the team use to identify which task is the cause of the failure?
Correct Answer: D
A machine learning engineer is using the following code block to scale the inference of a single-node model on a Spark DataFrame with one million records:

Assuming the default Spark configuration is in place, which of the following is a benefit of using an Iterator?
Correct Answer: A
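The iterator pattern the question refers to can be sketched in plain Python. This is a stand-in for illustration only, not the code block from the question: `load_model`, the toy model, and the batch contents are assumptions. In PySpark the analogous shape is a pandas UDF typed as `Iterator[pd.Series] -> Iterator[pd.Series]`.

```python
from typing import Iterator, List

load_count = 0  # counts how often the (hypothetical) expensive load runs


def load_model():
    """Stand-in for an expensive model load, e.g. reading weights from storage."""
    global load_count
    load_count += 1
    return lambda x: 2 * x  # toy "model"


def predict(batches: Iterator[List[float]]) -> Iterator[List[float]]:
    # The load runs once for the whole iterator, not once per batch.
    model = load_model()
    for batch in batches:
        yield [model(x) for x in batch]


batches = iter([[1.0, 2.0], [3.0], [4.0, 5.0]])
results = list(predict(batches))
```

Three batches are scored here, yet `load_count` ends at 1, because the setup cost sits outside the per-batch loop.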
A data scientist is developing a single-node machine learning model. They have a large number of model configurations to test as a part of their experiment. As a result, the model tuning process takes too long to complete.
Which of the following approaches can be used to speed up the model tuning process?
Correct Answer: D
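Because each configuration is an independent trial, the trials can be evaluated concurrently. The following is a local sketch of that idea using a thread pool as a stand-in for distributed workers; the `evaluate` objective, the grid values, and the worker count are hypothetical. On a Spark cluster the same pattern would typically be delegated to a distributed tuning tool such as Hyperopt with `SparkTrials`.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def evaluate(config):
    """Hypothetical objective: return a validation loss for one configuration."""
    depth, rate = config
    return (depth - 4) ** 2 + (rate - 0.1) ** 2


# Every combination is an independent trial, so trials can run concurrently.
grid = list(product([2, 4, 8], [0.01, 0.1, 1.0]))

with ThreadPoolExecutor(max_workers=4) as pool:
    losses = list(pool.map(evaluate, grid))

# Pick the configuration with the lowest loss.
best_loss, best_config = min(zip(losses, grid))
```

Threads are used only to keep the sketch self-contained; for CPU-bound training the real speedup comes from spreading trials across separate processes or cluster nodes.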
A data scientist has produced two models for a single machine learning problem. One of the models performs well when one of the features has a value of less than 5, and the other model performs well when the value of that feature is greater than or equal to 5. The data scientist decides to combine the two models into a single machine learning solution.
Which of the following terms is used to describe this combination of models?
Correct Answer: B
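The combination described in the scenario can be sketched as a simple router that sends each record to the model suited to its feature range. Both models, the feature name `x`, and the returned values are hypothetical placeholders; only the threshold of 5 comes from the question.

```python
def model_low(features):
    """Hypothetical model that performs well when the feature is < 5."""
    return 10.0 * features["x"]


def model_high(features):
    """Hypothetical model that performs well when the feature is >= 5."""
    return 50.0 + features["x"]


def combined_predict(features, threshold=5):
    # Route each record to the model suited to its region of the feature space.
    if features["x"] < threshold:
        return model_low(features)
    return model_high(features)


predictions = [combined_predict({"x": v}) for v in (2, 5, 9)]
```

Callers see a single prediction function, while the routing rule decides which underlying model handles each record.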