Apache Kylin

Description:

This presentation gives an overview of the Apache Kylin project. It explains the Kylin architecture in relation to Hadoop, HBase, Hive and Druid, and provides links for further information and for connecting with the author.

Slides: 13
Provided by: semtechs

Transcript and Presenter's Notes

1
What Is Apache Kylin?
  • An analytics data warehouse
  • For big data / Apache 2.0 license
  • Open source / written in Java
  • Kylin is an OLAP engine with a SQL interface
  • For huge tables (e.g., > 100 million rows)
  • Returns query results within seconds on TB- to
    PB-scale data

2
How Does Kylin Work?
  • Kylin runs on a Hadoop cluster
  • It needs these services: HDFS, YARN, MapReduce,
    Hive, HBase, ZooKeeper
  • State information is stored in HBase
  • Historical data / star schema is stored in Hive
  • Access Kylin at http://<hostname>:7070/kylin
  • Uses a Lambda architecture for real-time streaming
  • Layers: batch, speed and serving
  • Supports batch and near-real-time processing
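
Besides the web UI at the address above, Kylin can be queried over HTTP. The sketch below builds, but does not send, a request for Kylin's REST query endpoint. The `/kylin/api/query` path, the `sql` and `project` JSON fields and HTTP Basic authentication follow the Kylin REST documentation as I understand it; the host name is a placeholder, and the credentials, project and table are assumptions based on Kylin's sample setup, not values from these slides.

```python
import base64
import json
import urllib.request


def build_kylin_query(host, sql, project, user, password, port=7070):
    """Build an HTTP POST request for Kylin's query endpoint (not sent here)."""
    url = f"http://{host}:{port}/kylin/api/query"
    # HTTP Basic authentication: base64 of "user:password".
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    body = json.dumps({"sql": sql, "project": project}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )


request = build_kylin_query(
    host="kylin-host.example.com",           # placeholder host (assumption)
    sql="select count(*) from KYLIN_SALES",  # sample table name (assumption)
    project="learn_kylin",                   # sample project name (assumption)
    user="ADMIN",                            # assumed default account
    password="KYLIN",                        # assumed default password
)
```

Passing `request` to `urllib.request.urlopen` would send the query to a running Kylin instance; that step is left out so the sketch runs without a cluster.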

3
Kylin Software Requirements
  • Requirements as of release v3.0.1
  • Hadoop 2.7, 3.1 (since v2.5)
  • Hive 0.13 - 1.2.1
  • HBase 1.1, 2.0 (since v2.5)
  • Spark (optional) 2.3.0
  • Kafka (optional) 1.0.0 (since v2.5)
  • JDK 1.8 (since v2.5)
  • OS: Linux only, CentOS 6.5 or Ubuntu 16.04

4
Kylin In Cluster Mode

5
Kylin Real Time Streaming Architecture

6
Kylin Real Time Streaming Architecture
  • Streaming Receiver: ingests data from streaming
    data sources
  • Streaming Coordinator: coordinates workloads
  • Metadata Store: stores streaming-related metadata
  • Query Engine: queries real-time data from the
    streaming receiver
  • Build Engine: builds cubes from the real-time data
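
The interplay of receiver, build engine and query engine can be sketched as a toy, in-memory model. This is only an illustration of the idea, not Kylin code: every class and method name is hypothetical, the coordinator and metadata store are left out, and the assumption that a query combines already built segments with data still held by the receiver is mine rather than something stated on the slide.

```python
from collections import Counter


class StreamingReceiver:
    """Holds events that have arrived but are not yet built into a cube."""

    def __init__(self):
        self.events = []

    def ingest(self, event):
        self.events.append(event)

    def aggregate(self):
        # Sum the sales amount per country over the buffered events.
        totals = Counter()
        for event in self.events:
            totals[event["country"]] += event["amount"]
        return totals


class BuildEngine:
    """Turns the receiver's buffered events into an immutable segment."""

    def build_segment(self, receiver):
        segment = receiver.aggregate()
        receiver.events.clear()  # the events now live in the segment
        return segment


class QueryEngine:
    """Merges built segments with data still held by the receiver."""

    def __init__(self, receiver):
        self.receiver = receiver
        self.segments = []

    def sales_by_country(self):
        totals = Counter()
        for segment in self.segments:
            totals.update(segment)              # historical part
        totals.update(self.receiver.aggregate())  # real-time part
        return dict(totals)


receiver = StreamingReceiver()
engine = QueryEngine(receiver)

receiver.ingest({"country": "US", "amount": 10})
receiver.ingest({"country": "DE", "amount": 5})
engine.segments.append(BuildEngine().build_segment(receiver))

receiver.ingest({"country": "US", "amount": 7})  # not yet built
result = engine.sales_by_country()
```

Here `result` covers both the built segment and the one event that is still only in the receiver, which is the point of serving batch and real-time data through one query path.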

7
Kylin Vs Druid
  • Druid is more suitable for real-time analysis.
    Kylin is more focused on the OLAP case.
  • Druid has good integration with Kafka for
    real-time streaming analysis. The real-time
    capability of Kylin (v3) is for real-time OLAP.
  • Druid uses bitmap indexes for internal data
    structures. Kylin uses bitmap indexes for
    real-time data and MOLAP cubes for historical
    data.
  • Kylin provides ANSI SQL; Druid provides its own
    query language.
  • Druid has limitations on table joins; Kylin
    supports star schemas.
  • Kylin has good integration with BI tools, such as
    Tableau or Excel. Druid has limited integration
    with existing BI tools.
  • Since Kylin supports MOLAP cubes, it has very
    good performance for complex queries on data sets
    with billions of rows.
  • Since Druid needs to scan the full index,
    performance may suffer if the data set and query
    range are too large.
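
To make the MOLAP point concrete, here is a minimal sketch of cube pre-aggregation next to a scan at query time. It is a toy model, not Kylin's storage format: the function names and the sample rows are invented for illustration, and it builds every combination of dimensions, whereas a real deployment would normally limit which combinations are built.

```python
from collections import defaultdict
from itertools import combinations


def build_cube(rows, dimensions, measure):
    """Pre-aggregate the measure for every combination of dimensions."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            cuboid = defaultdict(int)
            for row in rows:
                cuboid[tuple(row[d] for d in dims)] += row[measure]
            cube[dims] = dict(cuboid)
    return cube


def query_cube(cube, group_by):
    """Answer a GROUP BY with a lookup; no fact rows are read.

    group_by must list dimensions in the order used to build the cube.
    """
    return cube[tuple(group_by)]


def query_scan(rows, group_by, measure):
    """Answer the same GROUP BY by scanning every fact row at query time."""
    totals = defaultdict(int)
    for row in rows:
        totals[tuple(row[d] for d in group_by)] += row[measure]
    return dict(totals)


# Invented sample fact rows.
facts = [
    {"year": 2019, "country": "US", "sales": 100},
    {"year": 2019, "country": "DE", "sales": 40},
    {"year": 2020, "country": "US", "sales": 60},
]

cube = build_cube(facts, ("year", "country"), "sales")
by_year = query_cube(cube, ("year",))
```

Both query functions return the same totals. The difference is where the work happens: the cube pays the aggregation cost once at build time, while the scan pays it on every query and grows with the size of the data set and the query range.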

8
Some Kylin Users

9
Kylin Ecosystem

10
Kylin Ecosystem
  • Kylin Core: the fundamental framework of the
    Kylin OLAP engine, comprising the Metadata
    Engine, Query Engine, Job Engine and Storage
    Engine that run the entire stack. It also
    includes a REST server to service client requests
  • Extensions: plugins to support additional
    functions and features
  • Integration: lifecycle management support to
    integrate with job scheduler, ETL, monitoring and
    alerting systems
  • User Interface: allows third parties to build a
    customized user interface atop the Kylin core
  • Drivers: ODBC and JDBC drivers to support
    different tools and products, such as Tableau

11
Available Books
  • See Big Data Made Easy (Apress, Jan 2015)
  • See Mastering Apache Spark (Packt, Oct 2015)
  • See Complete Guide to Open Source Big Data
    Stack (Apress, Jan 2018)
  • Find the author on Amazon:
    www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
  • Connect on LinkedIn:
    www.linkedin.com/in/mike-frampton-38563020

12
Connect
  • Feel free to connect on LinkedIn:
    www.linkedin.com/in/mike-frampton-38563020
  • See my open source blog at
    open-source-systems.blogspot.com/
  • I am always interested in new technology,
    opportunities, technology-based issues and big
    data integration