Think about the data you work with, as a developer or data analyst. What part of the data pipeline are you usually involved with? How do you think you might use Hive in your work?
Hive is our primary datastore. Data coming from different sources is stored in the Landing Zone, which has restricted access. A landing database is created in Hive to query this raw data, and only specialists (data scientists) have the privileges to access it.
After the data transformation process (cleaning, standardization, etc.), the data is stored in the Processed Zone, and the main Hive databases and tables are created to point to the data in the Processed Zone.
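To make the two-zone layout a bit more concrete, here is a rough sketch of how the Hive tables could be pointed at each zone. It only builds the HiveQL text for a `CREATE EXTERNAL TABLE ... LOCATION` statement; it does not connect to Hive. The helper `external_table_ddl`, the database and table names, the columns, the file formats and the `/data/...` paths are all made up for illustration and are not our actual setup.

```python
def external_table_ddl(database, table, columns, location, stored_as="PARQUET"):
    """Build HiveQL for an external table over files that already sit in a zone.

    Hypothetical helper for illustration only: it returns the statement as a
    string and leaves running it (e.g. through beeline) to the reader.
    """
    column_sql = ",\n  ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS `{database}`.`{table}` (\n"
        f"  {column_sql}\n"
        f")\n"
        f"STORED AS {stored_as}\n"
        f"LOCATION '{location}'"
    )


# Landing zone: raw data, queried only by the restricted group (assumed names/paths).
landing_ddl = external_table_ddl(
    "landing",
    "orders_raw",
    [("payload", "STRING")],
    "/data/landing/orders",
    stored_as="TEXTFILE",
)

# Processed zone: cleaned and standardized data behind the main tables (assumed names/paths).
processed_ddl = external_table_ddl(
    "sales",
    "orders",
    [("order_id", "BIGINT"), ("order_ts", "TIMESTAMP"), ("amount", "DECIMAL(12,2)")],
    "/data/processed/orders",
)

print(landing_ddl + ";\n")
print(processed_ddl + ";")
```

Because the tables are external, dropping one in Hive removes only the table definition and leaves the files in the zone untouched, which is why this style suits data that other tools also read and write.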
ETL tools such as StreamSets can write to Hive directly, which simplifies a lot of processes.
I have put together a generic tutorial to help developers: "MapR Hive Part 1" (YouTube).