70-475 · Question #15
70-475 Question #15: Real Exam Question with Answer & Explanation
The correct answer is C. STORED AS ORC.
Question
Options
- A. STORED AS RCFILE
- B. STORED AS GZIP
- C. STORED AS ORC
- D. STORED AS TEXTFILE
Explanation
The Optimized Row Columnar (ORC) file format provides a highly efficient way to store Hive data. It was designed to overcome limitations of the other Hive file formats. Using ORC files improves performance when Hive is reading, writing, and processing data.
Compared with the RCFile format, for example, the ORC file format has many advantages, such as:
- a single file as the output of each task, which reduces the NameNode's load
- Hive type support including datetime, decimal, and the complex types (struct, list, map, and union)
- light-weight indexes stored within the file, which make it possible to skip row groups that don't pass predicate filtering and to seek to a given row
- block-mode compression based on data type, with run-length encoding for integer columns and dictionary encoding for string columns
- concurrent reads of the same file using separate RecordReaders
- ability to split files without scanning for markers
- a bound on the amount of memory needed for reading or writing
- metadata stored using Protocol Buffers, which allows addition and removal of fields
https://cwiki.apache.org/confluence/display/Hive/LanguageManual+ORC#LanguageManualORC-
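As a minimal sketch of how the STORED AS ORC clause is used in practice, the HiveQL below creates an ORC-backed table and loads it from an existing table. The table names (`page_views_orc`, `page_views_text`) and the columns are hypothetical, chosen only for illustration. The `orc.compress` table property is optional and is shown here with ZLIB as an example value.

```sql
-- Hypothetical table and column names, for illustration only.
CREATE TABLE page_views_orc (
  user_id   BIGINT,
  page_url  STRING,
  view_time TIMESTAMP,
  amount    DECIMAL(10,2)
)
STORED AS ORC
TBLPROPERTIES ("orc.compress" = "ZLIB");

-- Rewrite the rows of an existing table (assumed here to be a
-- text-format table named page_views_text) into the ORC table.
INSERT OVERWRITE TABLE page_views_orc
SELECT user_id, page_url, view_time, amount
FROM page_views_text;
```

Because ORC keeps light-weight indexes inside the file, a query on this table with a filter such as `WHERE user_id = 42` may be able to skip row groups that cannot match, which is one of the advantages listed above.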