GENERATIVE-AI-ENGINEER-ASSOCIATE · Question #11
The correct answer is D: beautifulsoup.
Question
A Generative AI Engineer is building a RAG application that will rely on context retrieved from source documents that are currently in HTML format. They want to develop a solution using the fewest lines of code. Which Python package should be used to extract the text from the source documents?
Options
- A. pytesseract
- B. numpy
- C. pypdf2
- D. beautifulsoup
Explanation
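BeautifulSoup is a Python package designed for parsing HTML and XML documents. It provides simple methods to extract and manipulate text from HTML files, making it the most suitable choice for extracting text from HTML source documents with minimal lines of code. The other options target different inputs: pytesseract is a wrapper around the Tesseract OCR engine for reading text from images, numpy is a numerical array library with no HTML parsing, and PyPDF2 reads PDF files rather than HTML.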
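As a rough illustration of how little code the extraction takes, here is a minimal sketch. It assumes the `beautifulsoup4` package is installed and uses Python's built-in `html.parser` backend. The `html_to_text` helper and the sample HTML are hypothetical, made up for this example, and are not part of the exam question.

```python
from bs4 import BeautifulSoup


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML document as one string (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    # Drop script and style elements so their contents do not leak into the text.
    for tag in soup(["script", "style"]):
        tag.decompose()
    # Join the remaining text fragments with spaces, trimming each fragment.
    return soup.get_text(separator=" ", strip=True)


# Made-up sample document for illustration only.
sample = (
    "<html><body><h1>Title</h1><p>Hello world.</p>"
    "<script>var x = 1;</script></body></html>"
)
text = html_to_text(sample)
print(text)
```

In a RAG pipeline, the returned string would typically be chunked and embedded afterwards. Removing `script` and `style` elements first is a design choice made here to keep non-content text out of the retrieved context.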