Apache Kafka PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Apache Kafka

1
What Is Apache Kafka?

A stream processing platform
Open source / Apache 2.0 license
Written in Java and Scala
A publish/subscribe system for record streams
Scalable / fault tolerant
Topic-based, partitioned FIFO queues

2
How Does Kafka Work?

Kafka runs as a cluster of servers
Stores records in topics
Topics are partitioned into queues
Partitions are stored across cluster
Consumers organised into groups
Stream processors transform records
Reusable connectors process queues
For instance database connectors
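
As a rough illustration of "partitions are stored across cluster", the sketch below spreads the partitions of one topic over a set of servers. It is a toy model, not Kafka code: the function name assign_partitions, the broker names, and the round-robin policy are all assumptions made for this example, and real Kafka placement also accounts for replicas and partition leaders.

```python
# Toy model only: names and the round-robin policy are illustrative assumptions.

def assign_partitions(num_partitions, brokers):
    """Spread partition ids over brokers in round-robin order."""
    placement = {broker: [] for broker in brokers}
    for partition in range(num_partitions):
        broker = brokers[partition % len(brokers)]
        placement[broker].append(partition)
    return placement


# A hypothetical topic with 6 partitions on a 3-server cluster.
placement = assign_partitions(6, ["broker-0", "broker-1", "broker-2"])
for broker, partitions in placement.items():
    print(broker, partitions)
```

Each server ends up holding two of the six partitions, which is what lets reads and writes for a single topic be shared across the cluster.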

3
Kafka APIs

Producer API
Allows applications to publish to topics
Consumer API
Applications subscribe to topics / process data streams
Streams API
Applications act as stream processors, transforming streams
Connector API
Build reusable producers / consumers
E.g. RDBMS connectors / producers / consumers
Admin API
For topic and broker management
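
The stream processor role listed above (read records from an input topic, transform them, write the results to an output topic) might be pictured as follows. This is a plain Python sketch and does not use the Kafka Streams API; the names stream_processor, input_topic, and output_topic are hypothetical, and topics are modelled as simple lists.

```python
# Toy model only: topics are plain lists, not Kafka topics.

def stream_processor(records, transform):
    """Lazily apply a transform to each record, preserving record order."""
    for record in records:
        yield transform(record)


input_topic = ["click:home", "click:cart", "click:pay"]

# The processor consumes the input stream and produces a transformed stream.
output_topic = list(stream_processor(input_topic, str.upper))
print(output_topic)
```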

4
Kafka Logical Architecture
5
Kafka Topic Queue Offsets
6
Kafka Topic Queue Offsets

Records published to Topics
Topics are multi subscriber
Topics contain partition queues
A partition queue contains a sequence of records
Each record has a queue offset (position)
Consumers use the offset to read records
Queue record retention is configurable
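
The offset and retention points above can be sketched with a small append-only log. This is a toy model under stated assumptions: the class PartitionLog and its count-based max_records limit are invented for illustration, whereas Kafka itself applies retention by time or size. The point to notice is that offsets keep increasing after old records expire, and that each reader chooses its own starting offset.

```python
# Toy model only: PartitionLog and max_records are illustrative assumptions.

class PartitionLog:
    """Append-only record log where every record has a fixed offset."""

    def __init__(self, max_records):
        self.max_records = max_records  # hypothetical count-based retention
        self.base_offset = 0            # offset of the oldest retained record
        self.records = []

    def append(self, value):
        """Store a record and return the offset it was given."""
        offset = self.base_offset + len(self.records)
        self.records.append(value)
        if len(self.records) > self.max_records:
            self.records.pop(0)         # retention: drop the oldest record
            self.base_offset += 1
        return offset

    def read(self, offset, limit=10):
        """Return (offset, value) pairs starting at the requested offset."""
        start = max(offset, self.base_offset) - self.base_offset
        chunk = self.records[start:start + limit]
        return [(self.base_offset + i, v) for i, v in enumerate(chunk, start)]


log = PartitionLog(max_records=3)
offsets = [log.append(v) for v in ["a", "b", "c", "d"]]

# Two independent subscribers read the same partition from different offsets.
slow_reader = log.read(0)   # record "a" has expired, so reading starts at "b"
fast_reader = log.read(3)
print(offsets, slow_reader, fast_reader)
```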

7
Kafka Producer Consumer
8
Kafka Producer Consumer

Producers write to partitions, e.g. Producer1 -> P0
Producers are responsible for record -> partition mapping
Kafka only guarantees order within a partition
Kafka cluster contains <n> servers
Partitions mapped to servers
Consumers are members of consumer groups
Each consumer must maintain its partition read offset
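
A minimal sketch of the two mappings on this slide: producers mapping records to partitions, and a consumer group sharing partitions among its members. It is a toy model, not the Kafka client. The helper names partition_for and assign_to_group are hypothetical, CRC32 stands in for whatever hash a real partitioner uses, and the round-robin group assignment is just one possible policy.

```python
import zlib

# Toy model only: helper names, CRC32 hashing, and round-robin assignment
# are illustrative assumptions, not Kafka's own implementation.

def partition_for(key, num_partitions):
    """Map a record key to a partition using a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign_to_group(partitions, consumers):
    """Give each partition to exactly one consumer in the group."""
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(partition)
    return assignment


NUM_PARTITIONS = 3
partitions = {p: [] for p in range(NUM_PARTITIONS)}

events = [("user-1", "login"), ("user-2", "login"),
          ("user-1", "view"), ("user-1", "logout")]
for key, value in events:
    partitions[partition_for(key, NUM_PARTITIONS)].append((key, value))

# Records with the same key share a partition, so their order is preserved.
user1_partition = partition_for("user-1", NUM_PARTITIONS)
user1_events = [v for k, v in partitions[user1_partition] if k == "user-1"]
print(user1_events)

group = assign_to_group(list(partitions), ["consumer-a", "consumer-b"])
print(group)
```

Because ordering is only guaranteed within a partition, keying related records the same way is what keeps them in sequence for whichever consumer owns that partition.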

9
Kafka's Stack Role

A low latency messaging system
Records load balanced across partitions
As a storage system
Using local file system storage
Scales horizontally in terms of performance
As a stream processing system
Using the Streams API to transform data
Data replication provides fault tolerance
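
To illustrate why replication gives fault tolerance, the sketch below keeps a copy of one partition on several servers and promotes a surviving copy when the leader fails. It is a deliberately simplified model: the class ReplicatedPartition, the synchronous copy to every replica, and the "first survivor becomes leader" rule are assumptions for this example, not a description of Kafka's actual replication protocol.

```python
# Toy model only: class name, synchronous copying, and the election rule
# are illustrative assumptions.

class ReplicatedPartition:
    """One partition whose records are copied to several brokers."""

    def __init__(self, brokers):
        self.replicas = {broker: [] for broker in brokers}
        self.leader = brokers[0]

    def append(self, value):
        # Simplification: write to every live replica before returning.
        for log in self.replicas.values():
            log.append(value)

    def fail(self, broker):
        """Remove a broker; promote a surviving replica if it was the leader."""
        del self.replicas[broker]
        if broker == self.leader:
            if not self.replicas:
                raise RuntimeError("no replicas left for this partition")
            self.leader = next(iter(self.replicas))

    def read_all(self):
        """Read every record from the current leader's copy."""
        return list(self.replicas[self.leader])


partition = ReplicatedPartition(["broker-0", "broker-1", "broker-2"])
partition.append("r0")
partition.append("r1")

partition.fail("broker-0")      # the leader is lost
partition.append("r2")          # writes continue on the survivors

print(partition.leader, partition.read_all())
```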

10
Available Books

See Big Data Made Easy
Apress Jan 2015
See Mastering Apache Spark
Packt Oct 2015
See Complete Guide to Open Source Big Data Stack
Apress Jan 2018
Find the author on Amazon
www.amazon.com/Michael-Frampton/e/B00NIQDOOM/
Connect on LinkedIn
www.linkedin.com/in/mike-frampton-38563020

11
Connect

Feel free to connect on LinkedIn
www.linkedin.com/in/mike-frampton-38563020
See my open source blog at
open-source-systems.blogspot.com/
I am always interested in
New technology
Opportunities
Technology based issues
Big data integration
