Exam Databricks-Machine-Learning-Associate Topic 2 Question 27 Discussion

Actual exam question for Databricks's Databricks-Machine-Learning-Associate exam
Question #: 27
Topic #: 2

A data scientist has defined a Pandas UDF function predict to parallelize the inference process for a single-node model:

They have written the following incomplete code block to use predict to score each record of Spark DataFrame spark_df:

Which of the following lines of code can be used to complete the code block to successfully complete the task?

A. predict(*spark_df.columns) B. mapInPandas(predict) C. predict(Iterator(spark_df)) D. mapInPandas(predict(spark_df.columns)) E. predict(spark_df.columns)

Suggested Answer: B Vote an answer

To apply the Pandas UDF predict to each record of a Spark DataFrame, you use the mapInPandas method. This method allows the Pandas UDF to operate on partitions of the DataFrame as pandas DataFrames, applying the specified function (predict in this case) to each partition. The correct code completion to execute this is simply mapInPandas(predict), which specifies the UDF to use without additional arguments or incorrect function calls.
Reference:
PySpark DataFrame documentation (Using mapInPandas with UDFs).

by Cornelius at Jun 21, 2026, 05:28 AM

Limited Time Offer

15%

Off

Get Premium Databricks-Machine-Learning-Associate Questions as Interactive Self Test Engine or PDF

Comments

0 Happy Clients

0 Shares

0 Demo Downloads

10 Years in Business