With the rise of Hadoop and big data infrastructure, a challenge appeared – how to efficiently move existing data from traditional infrastructure to Hadoop and how to leverage the existing (SQL) knowledge in the new Hadoop technologies.
Hive provides an SQL dialect, called Hive Query Language (HiveQL). By defining tables in Hive, one can access Hadoop's data that is situated either in flat files or in HBase. It is a Hadoop's warehouse for high-latency batch processing of data found in different formats, all accessed through structurally defined Hive tables. Hive is appropriate for data warehouse applications without updates, deletes and realtime access. Thus, one should resort to NoSQL databases (HBase, Cassandra) for realtime access to big data.