Databricks
GENERATIVE-AI-ENGINEER-ASSOCIATE · Question #92
GENERATIVE-AI-ENGINEER-ASSOCIATE Question #92: Real Exam Question with Answer & Explanation
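The correct answer is C: unstructured. Of the four options, it is the only package designed for parsing documents such as PDFs: flask is a web framework, beautifulsoup parses HTML and XML, and numpy is a numerical array library.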
Data Ingestion for RAG
Question
A Generative AI Engineer is building a RAG application that will rely on context retrieved from source documents that are currently in PDF format. These PDFs can contain both text and images. The engineer wants to develop a solution using the fewest lines of code. Which Python package should be used to extract the text from the source documents?
Options
- A. flask
- B. beautifulsoup
- C. unstructured
- D. numpy
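To make the "fewest lines of code" point concrete, here is a minimal sketch of PDF text extraction with the unstructured package. It assumes the package is installed with its PDF extras. The helper names `elements_to_text` and `extract_pdf_text` are hypothetical, introduced only for illustration. Text embedded in images generally needs an OCR-capable partitioning strategy and extra dependencies, which this sketch does not configure.

```python
from typing import Iterable


def elements_to_text(elements: Iterable) -> str:
    """Join the non-empty .text of each parsed element into one string.

    Hypothetical helper: works on any objects that expose a .text attribute.
    """
    parts = [str(el.text).strip() for el in elements if getattr(el, "text", None)]
    return "\n\n".join(p for p in parts if p)


def extract_pdf_text(path: str) -> str:
    """Parse a PDF with unstructured and return its text as one string.

    The import is done lazily so elements_to_text can be used on its own.
    Assumes: pip install "unstructured[pdf]"
    """
    from unstructured.partition.pdf import partition_pdf

    # partition_pdf returns a list of elements (titles, narrative text, ...)
    elements = partition_pdf(filename=path)
    return elements_to_text(elements)
```

A call such as `extract_pdf_text("docs/report.pdf")` (the path is a placeholder) would return the document text, ready to be chunked and embedded for retrieval.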
Topics
#RAG Applications #Document Parsing #Text Extraction #Python Libraries