An Introduction to Apache Hive

Description:

What is Apache Hive in terms of big data and Hadoop? How does it relate to business intelligence and management reporting? Can it be used with Business Objects?

Transcript and Presenter's Notes


1
Apache Hadoop Hive
  • What is it?
  • Architecture
  • Related Projects
  • Hive DDL
  • Hive DML
  • HiveQL Examples
  • Business Intelligence

2
Hive - What is it?
  • A data warehouse for Hadoop
  • Open source, written in Java
  • Holds metadata in a relational database
  • Allows SQL-like queries
  • Supports big data sets
  • Offers built-in and user-defined functions
  • Has indexing (in releases before Hive 3.0, which removed it)

3
Hive Architecture
  • Where does Hive sit in the Hadoop architecture?

4
Hive Architecture
  • Given an existing HDFS and Hadoop cluster
  • Then add Hive and the metadata structure
  • Use Flume and Sqoop to move data
  • Use the Hive LOAD DATA command to load from flat files
  • Use ODBC for connectivity to your BI layer

5
Hive Related Projects
  • Apache Flume - moves large data sets to Hadoop
  • Apache Sqoop - command line tool, moves RDBMS data to Hadoop
  • Apache HBase - non-relational database
  • Apache Pig - analyses large data sets
  • Apache Oozie - workflow scheduler
  • Apache Mahout - machine learning and data mining
  • Apache Hue - Hadoop user interface
  • Apache ZooKeeper - configuration and coordination service

6
Hive - DDL
  • Create table
  • hive> CREATE TABLE customer (age INT, address STRING);
  • Partitions
  • hive> CREATE TABLE customer (age INT, address STRING) PARTITIONED BY (sdate STRING);
  • Show tables
  • hive> SHOW TABLES;
  • Describe table
  • hive> DESCRIBE customer;
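
The partitioned table on this slide can be sketched as a small script. This is a minimal sketch under stated assumptions: the comma delimiter and TEXTFILE storage are illustrative choices that the slides do not specify, and IF NOT EXISTS is used because the two CREATE TABLE statements above share a table name and cannot both run as written.

```sql
-- Sketch only: the delimiter and storage format are assumptions, not from the slides.
CREATE TABLE IF NOT EXISTS customer (
  age     INT,
  address STRING
)
PARTITIONED BY (sdate STRING)   -- the partition value lives in the directory name, not in the data files
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY ','
STORED AS TEXTFILE;

-- Register an empty partition for one date, then list the partitions that exist.
ALTER TABLE customer ADD IF NOT EXISTS PARTITION (sdate = '2008-08-15');
SHOW PARTITIONS customer;

-- DESCRIBE lists the regular columns and the partition column.
DESCRIBE customer;
```

Partitioning by a date string keeps each day's files in its own directory, so a query that filters on sdate only has to read the matching directories.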

7
Hive - DDL
  • Alter table
  • hive> ALTER TABLE customer ADD COLUMNS (age INT);
  • Drop table
  • hive> DROP TABLE customer;
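
As a hedged illustration of the two statements on this slide: a column added with ADD COLUMNS needs a name the table does not already use, so the sketch below adds a hypothetical `email` column (not part of the original slides) rather than repeating `age`.

```sql
-- 'email' is a hypothetical column name used for illustration only.
ALTER TABLE customer ADD COLUMNS (email STRING COMMENT 'contact address');

-- Data loaded before the change has no value for the new column, so it reads as NULL.
SELECT a.age, a.email FROM customer a LIMIT 5;

-- IF EXISTS keeps the script from failing when the table has already been dropped.
DROP TABLE IF EXISTS customer;
```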

8
Hive - DML
  • Loading flat files into Hive
  • hive> LOAD DATA LOCAL INPATH './data/home/x1a.txt' OVERWRITE INTO TABLE customer;
  • No verification of incoming data
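
Because the load step does not verify the data, problems only surface when the table is read. The sketch below assumes the partitioned version of the customer table from the DDL slide, which means the load has to name a target partition; the alias `unreadable_age_rows` is a hypothetical name chosen for this example.

```sql
-- The file is copied into the partition directory as-is; it is not parsed at load time.
LOAD DATA LOCAL INPATH './data/home/x1a.txt'
OVERWRITE INTO TABLE customer
PARTITION (sdate = '2008-08-15');

-- With the default text format, a field that cannot be read as INT comes back as NULL,
-- so counting NULLs after the load is a cheap sanity check.
SELECT COUNT(*) AS unreadable_age_rows
FROM customer a
WHERE a.sdate = '2008-08-15'
  AND a.age IS NULL;
```

A NULL count above zero does not say which lines were malformed, only that some were missing or unreadable, so it is a first check rather than full validation.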

9
HiveQL Examples
  • HiveQL, an SQL-like language
  • hive> SELECT a.age FROM customer a WHERE a.sdate = '2008-08-15';
  • Selects the age column for one partition but doesn't store the result
  • hive> INSERT OVERWRITE DIRECTORY '/data/hdfs_file' SELECT a.* FROM customer a WHERE a.sdate = '2008-08-15';
  • Writes all columns of the customer table for that partition to an HDFS directory
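
The built-in functions mentioned earlier in the deck combine with the same partition filter. The following is a minimal sketch against the customer table; the column aliases and the start date are illustrative choices, not taken from the slides.

```sql
-- Per-partition summary using built-in aggregate functions.
SELECT a.sdate,
       COUNT(*)   AS customers,
       AVG(a.age) AS avg_age,
       MAX(a.age) AS max_age
FROM customer a
WHERE a.sdate >= '2008-08-01'   -- string comparison; works because the dates are in YYYY-MM-DD form
GROUP BY a.sdate
ORDER BY a.sdate;
```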

10
Hive Business Intelligence
  • Use ODBC to connect Hive to your BI layer
  • Now you can use BI tools like Business Objects
  • Create a universe over the Hive instance
  • Create reports against the universe
  • Create ad hoc queries against the universe

11
Contact Us
  • Feel free to contact us at
  • www.semtech-solutions.co.nz
  • info_at_semtech-solutions.co.nz
  • We offer IT project consultancy
  • We are happy to hear about your problems
  • You can just pay for those hours that you need to solve your problems