nerdexam
Databricks

GENERATIVE-AI-ENGINEER-ASSOCIATE · Question #82


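The correct answer is C: pytesseract. It is a Python wrapper around the Tesseract OCR engine, so it can read text directly from image files such as .jpeg or .png in a few lines of code. The other options (beautifulsoup, scrapy, pyquery) parse or crawl HTML and XML and cannot recognize text in images.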

Data Ingestion and Preprocessing for Generative AI

Question

A Generative AI Engineer is building a RAG application that will rely on context retrieved from source documents that have been scanned and saved as image files in formats like .jpeg or .png. They want to develop a solution using the fewest lines of code. Which Python package should be used to extract the text from the source documents?

Options

  • A. beautifulsoup
  • B. scrapy
  • C. pytesseract
  • D. pyquery
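
As a rough illustration of the scenario in the question, the sketch below shows how pytesseract might be used to pull text out of a folder of scanned images before indexing it for retrieval. It assumes the pytesseract and Pillow packages are installed and that the Tesseract OCR binary is available on the system. The function names `ocr_image` and `extract_texts`, and the `ocr` parameter, are hypothetical helpers introduced here, not part of any library.

```python
from pathlib import Path


def ocr_image(path):
    """Return the text Tesseract recognizes in one image file."""
    # Imported lazily so this module still loads where the OCR stack is absent.
    # Assumes Pillow, pytesseract, and the Tesseract binary are installed.
    from PIL import Image
    import pytesseract

    with Image.open(path) as img:
        return pytesseract.image_to_string(img)


def extract_texts(folder, ocr=ocr_image, suffixes=(".png", ".jpg", ".jpeg")):
    """Map each scanned image file name in `folder` to its extracted text.

    `ocr` is a hypothetical hook: any callable that takes a path and
    returns a string, which makes the OCR step easy to swap out.
    """
    folder = Path(folder)
    return {
        p.name: ocr(p).strip()
        for p in sorted(folder.iterdir())
        if p.suffix.lower() in suffixes
    }
```

Calling `extract_texts("scans")`, where `scans` is a placeholder folder name, would return a dictionary of file names to raw text. The core of the work is the single `pytesseract.image_to_string` call, which is why this package fits the "fewest lines of code" requirement.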

Topics

#OCR · #Python Libraries · #RAG Data Sources · #Image Text Extraction
