Apache Kafka - PowerPoint PPT Presentation

About This Presentation
Title:

Apache Kafka

Description:

This presentation gives an overview of the Apache Kafka project. It covers areas such as producers, consumers, topics, partitions, APIs, architecture and usage. It includes links for further information and connecting. Music: "Little Planet", composed and performed by Bensound.

Number of Views:2186
Slides: 11
Provided by: semtechs


Transcript and Presenter's Notes

Title: Apache Kafka


1
What Is Apache Kafka?
  • A stream processing platform
  • Open source / Apache 2.0 license
  • Written in Java and Scala
  • A publish/subscribe system for record streams
  • Scalable / fault tolerant
  • Topic-based, partitioned FIFO queues

2
How Does Kafka Work?
  • Kafka runs as a cluster of servers
  • Stores records in topics
  • Topics are partitioned into queues
  • Partitions are stored across cluster
  • Consumers organised into groups
  • Stream processors transform records
  • Reusable connectors process queues
  • For instance, database connectors

3
Kafka APIs
  • Producer API
  • Allows applications to publish to topics
  • Consumer API
  • Applications subscribe to topics / process data
    streams
  • Streams API
  • Applications act as stream processors,
    transforming streams
  • Connect API
  • Build reusable producers / consumers
  • e.g. RDBMS connectors / producers / consumers
  • Admin API
  • For topic and broker management
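
The producer, consumer and stream-processor roles listed above can be illustrated with a small in-memory sketch. This is not the Kafka client API: topics are plain Python lists, and the names produce, consume and stream_processor are hypothetical helpers chosen only for this illustration.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (key, value); an assumed record shape for this sketch


def produce(topic: List[Record], key: str, value: str) -> None:
    """Producer role: publish (append) one record to a topic."""
    topic.append((key, value))


def consume(topic: Iterable[Record]) -> Iterator[Record]:
    """Consumer role: read the records of a topic in order."""
    yield from topic


def stream_processor(source: List[Record], sink: List[Record],
                     transform: Callable[[str], str]) -> None:
    """Stream-processor role: consume, transform each value, publish the result."""
    for key, value in consume(source):
        produce(sink, key, transform(value))


# An input topic and an output topic, both just lists here.
orders: List[Record] = []
produce(orders, "user-1", "apple")
produce(orders, "user-2", "pear")

shouted: List[Record] = []
stream_processor(orders, shouted, str.upper)
print(shouted)
```

In a real deployment each role would use the corresponding Kafka client library and talk to a broker; the sketch only shows how the three roles relate to one another.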

4
Kafka Logical Architecture

5
Kafka Topic Queue Offsets

6
Kafka Topic Queue Offsets
  • Records published to Topics
  • Topics are multi subscriber
  • Topics contain partition queues
  • A partition queue contains a sequence of records
  • Each record has a queue offset (position)
  • Consumers use the offset to read records
  • Queue record retention is configurable
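
The offset and retention points above can be sketched with a toy append-only log standing in for one partition. This is an assumption-laden illustration, not Kafka's storage code: the PartitionLog class and its max_records limit are hypothetical, and retention here is counted in records purely to keep the example short.

```python
from typing import List, Tuple


class PartitionLog:
    """Toy append-only log standing in for a single partition."""

    def __init__(self, max_records: int) -> None:
        self.max_records = max_records  # hypothetical retention limit
        self.base_offset = 0            # offset of the oldest retained record
        self.records: List[str] = []

    def append(self, value: str) -> int:
        """Store a record and return the offset assigned to it."""
        offset = self.base_offset + len(self.records)
        self.records.append(value)
        # Retention: drop the oldest records once the limit is exceeded.
        overflow = len(self.records) - self.max_records
        if overflow > 0:
            del self.records[:overflow]
            self.base_offset += overflow
        return offset

    def read(self, offset: int, limit: int = 10) -> List[Tuple[int, str]]:
        """Return up to `limit` (offset, value) pairs starting at `offset`."""
        start = max(offset, self.base_offset) - self.base_offset
        chunk = self.records[start:start + limit]
        return [(self.base_offset + start + i, v) for i, v in enumerate(chunk)]


log = PartitionLog(max_records=3)
offsets = [log.append(v) for v in ["a", "b", "c", "d"]]

# Two independent subscribers read the same log from their own offsets.
reader_a = log.read(2)
reader_b = log.read(0)  # offset 0 has expired, so reading starts at offset 1
print(offsets, reader_a, reader_b)
```

Because each reader supplies its own offset, several subscribers can read the same partition without affecting one another, and a record that has passed out of retention is simply no longer returned.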

7
Kafka Producer Consumer

8
Kafka Producer Consumer
  • Producers write to partitions, e.g. Producer1 → P0
  • Producers responsible for record → partition
    mapping
  • Kafka only guarantees order within a partition
  • Kafka cluster contains <n> servers
  • Partitions mapped to servers
  • Consumers are members of consumer groups
  • Each consumer must maintain its partition read
    offset
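
The record → partition mapping, per-partition ordering and consumer-group points above can be sketched as follows. This is a minimal in-memory model under stated assumptions: partition_for and assign_partitions are hypothetical helpers, zlib.crc32 is only a stand-in hash rather than the one a Kafka client uses, and the round-robin assignment is one simple strategy chosen for illustration.

```python
import zlib
from typing import Dict, List


def partition_for(key: str, num_partitions: int) -> int:
    """Map a record key to a partition index (same key, same partition)."""
    # crc32 is only a stand-in hash for this sketch.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_partitions(num_partitions: int, consumers: List[str]) -> Dict[str, List[int]]:
    """Spread partitions over the members of one consumer group, round-robin."""
    assignment: Dict[str, List[int]] = {name: [] for name in consumers}
    for p in range(num_partitions):
        assignment[consumers[p % len(consumers)]].append(p)
    return assignment


NUM_PARTITIONS = 3
partitions: List[list] = [[] for _ in range(NUM_PARTITIONS)]

# Producer side: the producer decides which partition each record goes to.
events = [("user-1", "login"), ("user-2", "login"),
          ("user-1", "view"), ("user-1", "logout")]
for key, value in events:
    partitions[partition_for(key, NUM_PARTITIONS)].append((key, value))

# Consumer side: each partition is owned by one group member, and a read
# offset is tracked per partition.
assignment = assign_partitions(NUM_PARTITIONS, ["consumer-a", "consumer-b"])
read_offsets = {p: 0 for p in range(NUM_PARTITIONS)}
seen = []
for name, owned in assignment.items():
    for p in owned:
        for record in partitions[p][read_offsets[p]:]:
            seen.append(record)
            read_offsets[p] += 1

# Records sharing a key landed in one partition, so their order is preserved.
user1 = [value for key, value in seen if key == "user-1"]
print(assignment)
print(user1)
```

Keying records by user keeps each user's events in a single partition, which is why their relative order survives even though no ordering is guaranteed across partitions.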

9
Kafka's Stack Role
  • A low latency messaging system
  • Records load balanced across partitions
  • As a storage system
  • Using local file system storage
  • Scales horizontally in terms of performance
  • As a stream processing system
  • Using the Streams API to transform data
  • Data replication provides fault tolerance

10
Available Books
  • See Big Data Made Easy
  • Apress Jan 2015
  • See Mastering Apache Spark
  • Packt Oct 2015
  • See Complete Guide to Open Source Big Data
    Stack
  • Apress Jan 2018
  • Find the author on Amazon
  • www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
  • Connect on LinkedIn
  • www.linkedin.com/in/mike-frampton-38563020

11
Connect
  • Feel free to connect on LinkedIn
  • www.linkedin.com/in/mike-frampton-38563020
  • See my open source blog at
  • open-source-systems.blogspot.com/
  • I am always interested in
  • New technology
  • Opportunities
  • Technology based issues
  • Big data integration