An introduction to Apache Spark MLlib - PowerPoint PPT Presentation

About This Presentation
Title:

An introduction to Apache Spark MLlib

Description:

An introduction to Apache Spark MLlib: what is it, how does it work, and what can it do?

Slides: 8
Provided by: semtechs

Transcript and Presenter's Notes


1
Apache Spark MLlib
  • What is Apache Spark?
  • What is MLlib?
  • Functionality
  • Dependencies
  • Books
  • Ecosystem

www.semtech-solutions.co.nz info_at_semtech-solutions.co.nz
2
Spark: What is it?
  • Alternative to MapReduce for certain applications
  • A low-latency cluster computing system
  • For very large data sets
  • Can be up to 100 times faster than MapReduce for in-memory workloads
  • Used with Hadoop / HDFS
  • Uses in-memory cluster computing
  • Memory access is faster than disk access
  • Has APIs for Scala / Java / Python
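
The in-memory reuse described on this slide can be sketched in PySpark roughly as follows. This is a minimal illustration, not part of the original deck: the sample data, the helper name to_pair, and the local[2] master setting are all assumptions, and the Spark imports are done lazily so the parsing helper can be tried without a Spark install.

```python
def to_pair(line):
    """Split a 'key,value' text line into a (key, float) pair."""
    key, value = line.split(",")
    return key.strip(), float(value)


def run_cache_demo():
    """Parse a small RDD once, cache it in memory, and reuse it for two actions."""
    # Imported lazily so to_pair() can be used without Spark installed.
    from pyspark import SparkContext

    sc = SparkContext("local[2]", "cache-demo")  # assumed: local mode, 2 threads
    try:
        lines = sc.parallelize(["a,1.0", "b,2.5", "a,3.0"])  # made-up sample data
        pairs = lines.map(to_pair).cache()  # ask Spark to keep parsed pairs in memory

        # Both actions below can reuse the cached pairs instead of re-parsing.
        total = pairs.values().sum()
        per_key = pairs.reduceByKey(lambda x, y: x + y).collectAsMap()
        return total, per_key
    finally:
        sc.stop()
```

Calling run_cache_demo() on a machine with PySpark installed runs the job locally; on a real cluster the master setting would point at the cluster manager instead.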

3
Spark MLlib: What is it?
  • Spark Machine Learning Library
  • Provided with the Spark install
  • Code in Scala / Java / Python
  • Contains two packages: spark.mllib and spark.ml (from V1.2)
  • Provides common functionality: classification, regression, clustering,
    collaborative filtering, dimensionality reduction
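
As one example of the clustering functionality listed above, here is a hedged sketch using the RDD-based spark.mllib package from Python. The four sample points, the choice of k = 2, and the function names are assumptions made for illustration; only KMeans.train, clusterCenters, and predict come from the library.

```python
def parse_point(line):
    """Turn a whitespace-separated line of numbers into a list of floats."""
    return [float(x) for x in line.split()]


def run_kmeans_demo():
    """Cluster four made-up 2-D points into two groups with spark.mllib."""
    # Imported lazily so parse_point() can be used without Spark installed.
    from pyspark import SparkContext
    from pyspark.mllib.clustering import KMeans

    sc = SparkContext("local[2]", "kmeans-demo")  # assumed: local mode
    try:
        raw = sc.parallelize(["0.0 0.0", "0.1 0.2", "9.0 9.1", "9.2 8.9"])
        points = raw.map(parse_point).cache()  # training makes several passes

        model = KMeans.train(points, 2, maxIterations=10)  # k = 2 clusters

        centers = [c.tolist() for c in model.clusterCenters]
        # Predict on the driver for the (tiny) collected sample.
        labels = [model.predict(p) for p in points.collect()]
        return centers, labels
    finally:
        sc.stop()
```

The training data is cached because K-means is iterative, which is the kind of workload where in-memory reuse tends to pay off.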

4
Spark MLlib Functionality
  • Basic Stats
  • Classification and regression
  • Collaborative Filtering
  • Clustering
  • Dimensionality reduction
  • Feature extraction and transformation
  • Optimization
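
To make the "Basic Stats" item concrete, the sketch below computes per-column mean and variance with Statistics.colStats from spark.mllib. The sample rows and the plain-Python column_means helper (included only as a cross-check) are assumptions for illustration, not something taken from the slides.

```python
SAMPLE_ROWS = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]  # made-up data


def column_means(rows):
    """Plain-Python reference: the mean of each column of a list of rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def run_colstats_demo():
    """Compute column means and variances for SAMPLE_ROWS with spark.mllib."""
    # Imported lazily so column_means() can be used without Spark installed.
    from pyspark import SparkContext
    from pyspark.mllib.linalg import Vectors
    from pyspark.mllib.stat import Statistics

    sc = SparkContext("local[2]", "colstats-demo")  # assumed: local mode
    try:
        rows = sc.parallelize([Vectors.dense(r) for r in SAMPLE_ROWS])
        summary = Statistics.colStats(rows)  # column-wise summary statistics
        return summary.mean().tolist(), summary.variance().tolist()
    finally:
        sc.stop()
```

On a machine with PySpark, the means returned by run_colstats_demo() should agree with column_means(SAMPLE_ROWS).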

5
Spark MLlib Dependencies
  • NumPy for Python
  • Breeze ( linear algebra )
  • netlib-java
  • jblas
  • gfortran runtime library
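
For the Python side of these dependencies, a quick check like the one below can show whether NumPy (and PySpark itself) is importable in the current environment. It is only a sketch using the standard library: the function names are hypothetical, and it does not cover the JVM or native libraries on the list.

```python
import importlib.util


def has_module(name):
    """Return True if the named top-level module can be found, else False."""
    return importlib.util.find_spec(name) is not None


def check_python_deps(names=("numpy", "pyspark")):
    """Map each module name to whether it is importable here."""
    return {name: has_module(name) for name in names}


if __name__ == "__main__":
    for name, found in check_python_deps().items():
        print(name, "found" if found else "MISSING")
```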

6
Available Books
  • See our Hadoop book from Apress / Springer: Big Data Made Easy
  • Look out for our Apache Spark based book from Packt in 2015

7
Spark Ecosystem
8
Contact Us
  • Feel free to contact us at www.semtech-solutions.co.nz or
    info_at_semtech-solutions.co.nz
  • We offer IT project consultancy
  • We are happy to hear about your problems
  • You pay only for the hours you need to solve your problems