An introduction to Apache Mahout - PowerPoint PPT Presentation

About This Presentation
Title:

An introduction to Apache Mahout

Description:

An introduction to Apache Mahout: what is it and how does it work? What is machine intelligence? How can Mahout be installed and tested on Hadoop? – PowerPoint PPT presentation

Number of Views:1342
Slides: 10
Provided by: semtechs
Tags: big_data | ai | apache | hadoop | mahout


Transcript and Presenter's Notes

Title: An introduction to Apache Mahout


1
Apache Mahout
  • What is it?
  • How does it work?
  • Machine Learning
  • Algorithms
  • Install

www.semtech-solutions.co.nz info_at_semtech-solutions.co.nz
2
Mahout - What is it?
  • Machine learning
  • For large data
  • Based on Hadoop
  • But can work on a non-Hadoop cluster
  • Scalable
  • Released under the Apache License

3
Mahout - How does it work?
  • Uses Hadoop MapReduce
  • Has many supplied algorithms
  • Supports four use cases
    • Recommendation mining
    • Clustering
    • Classification
    • Frequent itemset mining
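
The MapReduce pattern named on this slide can be sketched in a few lines of plain Python. This is only an illustration of the map, shuffle and reduce phases on one machine, not Hadoop or Mahout code; the function names and the sample baskets are made up for the example.

```python
from collections import defaultdict


def map_phase(basket):
    """Mapper: emit an (item, 1) pair for every item in one basket."""
    return [(item, 1) for item in basket]


def shuffle(pairs):
    """Shuffle: group emitted values by key, as the framework would do."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reducer: sum the counts collected for one key."""
    return key, sum(values)


def count_items(baskets):
    """Run map, shuffle and reduce in sequence over all baskets."""
    emitted = [pair for basket in baskets for pair in map_phase(basket)]
    return dict(reduce_phase(k, v) for k, v in shuffle(emitted).items())


# Hypothetical shopping baskets used as input records.
baskets = [["milk", "bread"], ["milk", "eggs"], ["bread", "milk"]]
item_counts = count_items(baskets)
```

On a real cluster the mappers and reducers run in parallel on different nodes, which is what lets this style of job scale to large data.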

4
Mahout - Machine Learning
  • Machine learning: what does it mean?
  • A branch of artificial intelligence
  • Systems that learn from data
  • Classify data after learning
  • Learn on training data sets
  • Generalisation: the ability to classify unseen
    data sets after learning
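
A minimal sketch of "learn, then classify unseen data", assuming a nearest-centroid classifier written in plain Python. The classifier, the labels and the sample points are illustrative choices for this example and are not taken from Mahout.

```python
def train(samples):
    """Learn one centroid (mean point) per label from labelled training data."""
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, x in enumerate(features):
            acc[i] += x
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def classify(model, features):
    """Return the label whose centroid is closest to the given point."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))


# Hypothetical labelled training set: (features, label) pairs.
training_set = [
    ([1.0, 1.0], "small"),
    ([1.5, 2.0], "small"),
    ([8.0, 9.0], "large"),
    ([9.0, 8.5], "large"),
]
model = train(training_set)

# Generalisation: classify a point that was not in the training set.
prediction = classify(model, [2.0, 1.5])
```

The model never saw the point [2.0, 1.5] during training; it is labelled purely from what was learned, which is the generalisation step described above.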

5
Mahout - Algorithms
  • Some of the available algorithms (among many
    others):
    • Collaborative filtering
      • Narrow sense: make predictions about user
        interests by collecting preferences
      • General sense: multi-agent collaboration for
        information filtering
    • Mean shift clustering
      • Mode seeking, used for visual tracking
    • Parallel frequent pattern mining
      • Find unique features
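
Collaborative filtering in the narrow sense can be sketched as follows. This is a toy user-based recommender in plain Python, assuming cosine similarity over co-rated items and a similarity-weighted average; it is not Mahout's recommender API, and the users and ratings are invented.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity computed over the items both users have rated."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)


def predict(ratings, user, item):
    """Similarity-weighted average of other users' ratings for one item."""
    weighted, total = 0.0, 0.0
    for other, prefs in ratings.items():
        if other == user or item not in prefs:
            continue
        sim = cosine_similarity(ratings[user], prefs)
        weighted += sim * prefs[item]
        total += sim
    return weighted / total if total else None


# Hypothetical preference data: user -> {item: rating on a 1-5 scale}.
ratings = {
    "alice": {"a": 5, "b": 1},
    "bob": {"a": 5, "b": 1, "c": 4},
    "carol": {"a": 1, "b": 5, "c": 2},
}
predicted = predict(ratings, "alice", "c")
```

Because alice's tastes match bob's far more closely than carol's, the predicted rating for item "c" lands nearer to bob's rating than to carol's.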

6
Mahout - Install
  • So how do we install Mahout and test it?
  • Install Maven
    • sudo apt-get install maven3
  • Install Apache Mahout
    • You will need Subversion installed
    • svn co http://svn.apache.org/repos/asf/mahout/trunk
    • Go to the directory containing the pom.xml file
    • Run mvn install in ./trunk
  • Full details available in the Mahout install
    guide on our web site shop

7
Mahout - Test Install
  • So let us run a test
    • cd $MAHOUT_HOME/examples/bin
    • ./build-reuters.sh
    • Choose option 1, kmeans clustering
  • Should finish with the output shown on the next slide
  • Full details available in the Mahout install
    guide on our web site shop

8
Mahout - Test Install
  • cd $MAHOUT_HOME/examples/bin
  • ./build-reuters.sh
  • Please call cluster-reuters.sh directly next
    time. This file is going away.
  • Please select a number to choose the
    corresponding clustering algorithm
  • 1. kmeans clustering
  • 2. fuzzykmeans clustering
  • 3. lda clustering
  • Enter your choice: 1
  • ok. You chose 1 and we'll use kmeans Clustering
  • .................................
  • Inter-Cluster Density: NaN
  • Intra-Cluster Density: 0.0
  • CDbw Inter-Cluster Density: NaN
  • CDbw Intra-Cluster Density: NaN
  • CDbw Separation: NaN
  • Full details available in the Mahout install
    guide on our web site shop
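
For readers unfamiliar with option 1, the idea behind k-means clustering can be sketched in plain Python. This is a single-machine toy on 2-D points, not Mahout's distributed implementation; seeding the centroids with the first k points is an assumption made here to keep the example deterministic.

```python
def kmeans(points, k, iterations=10):
    """Lloyd's algorithm; the first k points seed the centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    assignments = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return centroids, assignments


# Hypothetical data: two loose groups of points.
points = [(1.0, 1.0), (9.0, 9.0), (1.0, 2.0), (8.0, 9.0), (2.0, 1.0), (9.0, 8.0)]
centroids, assignments = kmeans(points, k=2)
```

Each point ends up assigned to the group whose centre it is closest to, and the two centroids settle near the middle of each group.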

9
Contact Us
  • Feel free to contact us at
    • www.semtech-solutions.co.nz
    • info_at_semtech-solutions.co.nz
  • We offer IT project consultancy
  • We are happy to hear about your problems
  • You can just pay for those hours that you need
    to solve your problems