Hive Key Features

Familiar SQL-like Interface:

Use existing SQL skills to run batch queries on data stored in Hadoop. Queries are written in HiveQL, a SQL-like language, and are executed through either MapReduce or Apache Spark™, making it simple for more users to process and analyze very large data sets.
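
As a minimal sketch of what such a batch query can look like, the Python snippet below assembles a HiveQL aggregation as a plain string. The table `web_logs` and its columns `event_date` and `user_id` are hypothetical, and the snippet only builds the statement text; submitting it to Hive (for example through a command-line client or a client library) is left out.

```python
def daily_active_users_query(table: str = "web_logs") -> str:
    """Return a HiveQL statement counting distinct users per day.

    The table and column names are hypothetical placeholders.
    """
    return (
        "SELECT event_date, COUNT(DISTINCT user_id) AS active_users\n"
        f"FROM {table}\n"
        "GROUP BY event_date\n"
        "ORDER BY event_date"
    )


if __name__ == "__main__":
    # Print the statement so it can be pasted into a Hive session.
    print(daily_active_users_query())
```

The statement reads like ordinary SQL, which is the point: the same skills carry over, while Hive turns the query into batch jobs on the cluster.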

Shared Data Structures:

HCatalog, a table and storage management layer for Hadoop, exposes Hive metadata to other data processing tools, including Pig and MapReduce, and through a REST API. Users can read and write data without worrying about where it is stored or what format it is in, and without redefining the structure for each tool.
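
To make the REST access concrete, here is a hedged sketch that builds the URL for describing a table through WebHCat, the REST interface to HCatalog. It assumes an unsecured cluster that identifies callers with the `user.name` query parameter and a server listening on WebHCat's default port 50111; the host, database, table, and user names are hypothetical. The function only constructs the URL and does not send a request.

```python
from urllib.parse import quote, urlencode


def describe_table_url(host: str, database: str, table: str, user: str,
                       port: int = 50111) -> str:
    """Build a WebHCat URL that describes one table's metadata.

    Assumes the standard /templeton/v1 path prefix and simple
    user.name authentication; adjust both for a secured cluster.
    """
    path = f"/templeton/v1/ddl/database/{quote(database)}/table/{quote(table)}"
    query = urlencode({"user.name": user})
    return f"http://{host}:{port}{path}?{query}"


if __name__ == "__main__":
    # Hypothetical host, table, and user for illustration only.
    print(describe_table_url("hcat.example.com", "default", "web_logs", "alice"))
```

A GET request to a URL of this shape returns the table description as JSON, so any tool that can speak HTTP can look up the structure without its own copy of the schema.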

Faster Batch Processing:

Hive-on-Spark brings the next generation of batch processing to Hive. With queries executed through Apache Spark™, a powerful data processing engine, users can see substantial performance improvements compared to MapReduce.
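
Hive chooses its execution engine per session through the `hive.execution.engine` property. The sketch below, with a hypothetical helper name, prefixes a query with the matching `SET` statement. Whether `spark` is actually available depends on the Hive version and on how the cluster is configured, so treat this as an illustration of the switch and not as a guaranteed setup.

```python
# Engine names accepted by the hive.execution.engine property.
VALID_ENGINES = ("mr", "tez", "spark")


def with_engine(query: str, engine: str = "spark") -> str:
    """Return a two-statement script that selects the engine, then runs query."""
    if engine not in VALID_ENGINES:
        raise ValueError(f"unknown execution engine: {engine!r}")
    # Normalize the query so the script always ends with exactly one semicolon.
    body = query.rstrip().rstrip(";")
    return f"SET hive.execution.engine={engine};\n{body};"


if __name__ == "__main__":
    print(with_engine("SELECT COUNT(*) FROM web_logs"))  # web_logs is hypothetical
```

Because the setting is scoped to the session, the same query can be run once with `mr` and once with `spark` to compare the two engines on a given workload.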

Common Use Cases

With its familiar interface, Hive is the tool of choice for a variety of batch processing workloads, including:

  • Data preparation
  • ETL
  • Data mining
  • Ad optimization

Integrated across the platform

Because Hive is an integrated part of Cloudera’s platform, users can run batch processing workloads with Apache Hive while also analyzing the same data for interactive SQL or machine-learning workloads using tools like Impala or Apache Spark™, all within a single platform.

Hive also benefits from unified resource management (through YARN), simple deployment and administration (through Cloudera Manager), and shared compliance-ready security and governance (through Apache Sentry and Cloudera Navigator), all critical for running in production.

The shift to Hive-on-Spark

Apache Spark™ is a powerful data processing engine that has quickly emerged as an open standard for Hadoop due to its added speed and greater flexibility. Together with the community, Cloudera has been working to evolve the tools currently built on MapReduce, including Hive and Pig, and migrate them to the Spark execution engine for faster processing.

Expert support for Hive

Cloudera’s Hive experts, trained by Hive’s creators, are available across the globe to deliver world-class support 24/7. With more experience across more production customers and more use cases, Cloudera is the leader in Hive support, so you can focus on results.
