Exam Databricks-Generative-AI-Engineer-Associate Topic 5 Question 46 Discussion

Actual exam question for Databricks's Databricks-Generative-AI-Engineer-Associate exam
Question #: 46
Topic #: 5
A Generative AI Engineer is building a RAG application that will rely on context retrieved from source documents that are currently in PDF format. These PDFs can contain both text and images. They want to develop a solution using the least amount of lines of code.
Which Python package should be used to extract the text from the source documents?

Suggested Answer: B Vote an answer

* Problem Context: The engineer needs to extract text from PDF documents, which may contain both text and images. The goal is to find a Python package that simplifies this task using the least amount of code.
* Explanation of Options:
* Option A: flask: Flask is a web framework for Python, not suitable for processing or extracting content from PDFs.
* Option B: beautifulsoup: Beautiful Soup is designed for parsing HTML and XML documents, not PDFs.
* Option C: unstructured: This Python package is specifically designed to work with unstructured data, including extracting text from PDFs. It provides functionalities to handle various types of content in documents with minimal coding, making it ideal for the task.
* Option D: numpy: Numpy is a powerful library for numerical computing in Python and does not provide any tools for text extraction from PDFs.
Given the requirement,Option C(unstructured) is the most appropriate as it directly addresses the need to efficiently extract text from PDF documents with minimal code.

by Pag at Mar 27, 2025, 09:29 AM

Comments

Chosen Answer:
This is a voting comment (?) , you can switch to a simple comment.
Switch to a voting comment New
Nick name: Submit Cancel
CMDev
2025-04-17 14:02:28
Selected Answer: C
https://python.langchain.com/docs/integrations/providers/unstructured/
upvoted 1 times
...
reiii
2025-04-04 02:52:04
Correct answer should be unstructured
upvoted 3 times
...
A voting comment increases the vote count for the chosen answer by one.

Upvoting a comment with a selected answer will also increase the vote count towards that answer by one. So if you see a comment that you already agree with, you can upvote it instead of posting a new comment.

0
0
0
10