Beaconstac Analytics - PowerPoint PPT Presentation

About This Presentation
Title:

Beaconstac Analytics

Description:

Beaconstac is a proximity marketing and analytics platform for beacons. Several beacon-specific events are defined to aid proximity marketing. The events include camp-on, beacon exit, region enter, region exit, etc.

Slides: 12
Provided by: DeZyre


Transcript and Presenter's Notes


1
Big Data and Internet of Things (IoT)
2
Project Morpheus (Beaconstac Analytics)
May 2015
Garima Batra, Core Platform Engineer, MobStac
3
A quick intro about Beaconstac
  • Beaconstac is a proximity marketing and analytics
    platform for beacons
  • Several beacon-specific events are defined to aid
    proximity marketing
  • The events include camp-on, beacon exit,
    region enter, region exit, etc.
  • Beaconstac analytics platform makes it easy for
    managers/marketers/developers to analyze event
    data
  • Components include the Beaconstac iOS/Android SDKs
    and the Beaconstac portal


4
Why Hadoop?
  • Collect event logs generated from Beaconstac SDK
    usage
  • Needed a system to answer queries like:
      • Heat map of beacons by the number of visits
        received in a specified time interval
      • Heat map of beacons by the amount of time spent
        in a specified time interval
      • Average time spent by users near different
        beacons
      • Last seen per user
      • Last seen per beacon
      • Analyzing data with custom attribute filters
      • Path traversed in an area by individual users
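
Two of the queries above ("last seen per user" and "average time spent near a beacon") can be sketched in plain Python on a tiny in-memory sample. This is only an illustration: the record layout (user, beacon, event type, timestamp) and the event names `camp_on` and `exit` are assumptions, not the actual Beaconstac log schema.

```python
from collections import defaultdict

# Hypothetical event records: (user_id, beacon_id, event_type, unix_timestamp).
EVENTS = [
    ("u1", "b1", "camp_on", 100),
    ("u1", "b1", "exit", 160),
    ("u2", "b1", "camp_on", 200),
    ("u2", "b1", "exit", 220),
    ("u1", "b2", "camp_on", 300),
    ("u1", "b2", "exit", 330),
]


def last_seen_per_user(events):
    """Latest timestamp at which each user produced any event."""
    last = {}
    for user, _beacon, _etype, ts in events:
        last[user] = max(ts, last.get(user, ts))
    return last


def average_dwell_per_beacon(events):
    """Mean seconds between a user's camp_on and the following exit, per beacon."""
    open_visits = {}               # (user, beacon) -> camp_on timestamp
    durations = defaultdict(list)  # beacon -> list of visit durations
    for user, beacon, etype, ts in sorted(events, key=lambda e: e[3]):
        key = (user, beacon)
        if etype == "camp_on":
            open_visits[key] = ts
        elif etype == "exit" and key in open_visits:
            durations[beacon].append(ts - open_visits.pop(key))
    return {b: sum(d) / len(d) for b, d in durations.items()}
```

On real log volumes the same grouping logic would run as a distributed job rather than in a single process, which is the motivation for Hadoop on this slide.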


5
Leveraging Amazon's EMR for Beaconstac Analytics
  • Amazon's Streaming API for writing mapper and
    reducer functions in Python
  • Input: copy programs to Amazon S3
  • Output: copy the processed/output data to S3
  • Initial tests were run using Amazon's EMR
    console, where you can define the following:
      • Cluster configuration: name, termination
        protection, logging, log location on S3, etc.
      • Software configuration: Hadoop AMI version,
        applications to be installed on startup, etc.
      • Hardware configuration: types of nodes (master,
        core and task)
      • Security: keys, allowed users
      • Bootstrap actions: configure Hadoop, custom
        actions, etc.
      • Steps: streaming program, Hive program, Pig
        program
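
A streaming mapper and reducer in Python, as mentioned on this slide, might look roughly like the sketch below, which counts visits per beacon. The comma-separated input layout (`user,beacon,event,timestamp`) and the `camp_on` event name are assumptions made for illustration; the deck does not show the actual scripts.

```python
from itertools import groupby


def mapper(lines):
    """Emit 'beacon_id<TAB>1' for every camp_on event line."""
    for line in lines:
        fields = line.rstrip("\n").split(",")
        if len(fields) == 4 and fields[2] == "camp_on":
            yield f"{fields[1]}\t1"


def reducer(lines):
    """Sum counts per key; the framework delivers mapper output sorted by key."""
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for beacon, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{beacon}\t{sum(int(count) for _, count in group)}"


# Local simulation of the map -> sort -> reduce flow on a few sample lines.
sample = [
    "u1,b1,camp_on,100",
    "u2,b1,camp_on,200",
    "u1,b1,exit,160",
    "u1,b2,camp_on,300",
]
counts = list(reducer(sorted(mapper(sample))))
```

In a real streaming step, the mapper and reducer would each be a separate script reading `sys.stdin` and printing to standard output; the functions here take any iterable of lines so the flow can be checked locally before the scripts are copied to S3.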


6
Integrating EMR in production

7
Batch processing for Morpheus
AWS Data Pipeline

8
Deep dive into EMR startup and job submission

9
How Does AWS Data Pipeline Work?
  • Pipeline definition - specifies the business
    logic of your data management
  • AWS Data Pipeline web service - interprets the
    pipeline definition and assigns tasks to workers
    to move and transform data.
  • Task runner - polls the AWS Data Pipeline web
    service for tasks and then performs those tasks.
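
The relationship between the three parts above can be illustrated with a toy model in plain Python. This is not the AWS Data Pipeline API: `ToyPipelineService`, `task_runner`, and the activity names are all hypothetical stand-ins that only mimic the "definition, service, polling runner" division of labour.

```python
from collections import deque


class ToyPipelineService:
    """Stands in for the web service: turns a definition into a queue of tasks."""

    def __init__(self, definition):
        # The definition here is just an ordered list of activity names.
        self._pending = deque(definition["activities"])
        self.completed = []

    def poll_for_task(self):
        """Hand out the next task, or None when nothing is left."""
        return self._pending.popleft() if self._pending else None

    def report_done(self, task):
        self.completed.append(task)


def task_runner(service, handlers):
    """Poll the service until no tasks remain, running each task's handler."""
    while (task := service.poll_for_task()) is not None:
        handlers[task]()
        service.report_done(task)


# Hypothetical definition and handlers; each handler just records its name.
definition = {"activities": ["copy_logs", "run_emr_jobs", "load_output"]}
log = []
handlers = {name: (lambda n=name: log.append(n)) for name in definition["activities"]}
service = ToyPipelineService(definition)
task_runner(service, handlers)
```

The point of the polling design is that the service never pushes work: the runner asks for a task, performs it, and reports back, so workers can live wherever the data is.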


10
Morpheus version of Data Pipeline
Copy logs from Kafka to S3
  • Runs every hour
  • Requires a Kafka consumer script
Run EMR jobs
  • Runs once every day
  • Processes each job and produces output
  • Each job comprises mapper and reducer scripts
Copy the output to Elasticsearch
  • Runs once every day
  • Inserts output into Elasticsearch
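
For the final stage, one plausible way to insert job output is to convert tab-separated reducer lines into the newline-delimited JSON body that the Elasticsearch bulk API accepts. The sketch below only builds that body; the index name `beacon_visits`, the field names, and the `beacon<TAB>visits` input layout are assumptions, and older Elasticsearch versions also expect a `_type` in the action line.

```python
import json


def to_bulk_payload(reducer_lines, index_name):
    """Turn 'key<TAB>value' reducer output into an Elasticsearch bulk request body."""
    rows = []
    for line in reducer_lines:
        beacon, visits = line.rstrip("\n").split("\t", 1)
        # Action line: index the document under the beacon id.
        rows.append(json.dumps({"index": {"_index": index_name, "_id": beacon}}))
        # Source line: the document itself.
        rows.append(json.dumps({"beacon": beacon, "visits": int(visits)}))
    # The bulk API expects newline-delimited JSON, each line ending in a newline.
    return "".join(row + "\n" for row in rows)


payload = to_bulk_payload(["b1\t2", "b2\t1"], "beacon_visits")
```

The resulting string would be sent as the body of a request to the cluster's `_bulk` endpoint; using the beacon id as `_id` means a rerun of the daily job overwrites the earlier document instead of duplicating it.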


11
Settings file in each job
12
Questions?