Exam Databricks-Generative-AI-Engineer-Associate Topic 5 Question 46 Discussion

Actual exam question for Databricks's Databricks-Generative-AI-Engineer-Associate exam
Question #: 46
Topic #: 5

A Generative AI Engineer is building a RAG application that will rely on context retrieved from source documents that are currently in PDF format. These PDFs can contain both text and images. They want to develop a solution using the fewest lines of code.
Which Python package should be used to extract the text from the source documents?

A. flask
B. beautifulsoup
C. unstructured
D. numpy

Suggested Answer: C

* Problem Context: The engineer needs to extract text from PDF documents, which may contain both text and images. The goal is to find a Python package that simplifies this task using the least amount of code.
* Explanation of Options:
* Option A: flask: Flask is a web framework for Python, not suitable for processing or extracting content from PDFs.
* Option B: beautifulsoup: Beautiful Soup is designed for parsing HTML and XML documents, not PDFs.
* Option C: unstructured: This Python package is specifically designed to work with unstructured data, including extracting text from PDFs. It provides functionalities to handle various types of content in documents with minimal coding, making it ideal for the task.
* Option D: numpy: NumPy is a library for numerical computing in Python and does not provide any tools for text extraction from PDFs.
Given the requirement, Option C (unstructured) is the most appropriate, as it directly addresses the need to extract text from PDF documents with minimal code.
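
To make the "minimal code" point concrete, here is a rough sketch of what extraction with unstructured could look like. It assumes the package is installed with its PDF extras (for example `pip install "unstructured[pdf]"`); the helper names `elements_to_text` and `extract_pdf_text` are made up for illustration and are not part of the library.

```python
from typing import Any, Iterable


def elements_to_text(elements: Iterable[Any]) -> str:
    """Join the non-empty .text of each element, separated by blank lines."""
    parts = []
    for element in elements:
        # Elements without text (or with only whitespace) are skipped.
        text = (getattr(element, "text", "") or "").strip()
        if text:
            parts.append(text)
    return "\n\n".join(parts)


def extract_pdf_text(pdf_path: str, strategy: str = "auto") -> str:
    """Partition a PDF with unstructured and return its text as one string."""
    # Imported lazily so elements_to_text works even without unstructured installed.
    from unstructured.partition.pdf import partition_pdf

    # partition_pdf returns a list of document elements (titles, paragraphs, ...).
    elements = partition_pdf(filename=pdf_path, strategy=strategy)
    return elements_to_text(elements)
```

Something like `extract_pdf_text("source_document.pdf")` would then return the text ready for chunking, where the file name is only a placeholder. For PDFs where the text lives inside images, passing `strategy="hi_res"` or `strategy="ocr_only"` is the usual route, though that needs additional OCR dependencies installed.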

by Pag at Mar 27, 2025, 09:29 AM

Comments

CMDev

2025-04-17 14:02:28

Selected Answer: C

https://python.langchain.com/docs/integrations/providers/unstructured/

upvoted 1 time

...

reiii

2025-04-04 02:52:04

Correct answer should be unstructured

upvoted 3 times

...
