Mapreduce In Hadoop PPT by Ravi Namboori Cisco Evangelist - PowerPoint PPT Presentation

About This Presentation
Description:

Ravi Namboori presents how the MapReduce process works in Hadoop, using a flow diagram that traces a job from submission through initialization, task assignment (with the heartbeat method), and task execution.


Transcript and Presenter's Notes


1
How Hadoop Runs a MapReduce Job
Presentation by Ravi Namboori. Visit us:
http://ravinamboori.net
2
Topics Covered
  • Flow diagram
  • Job Submission Process
  • Job Initialization
  • Task Assignment and Heartbeat
  • Task Execution
  • Task Runner

3
Flow Diagram
Image source: Computaholics
4
Job Submission Process
  • The job submission process is implemented by
    JobClient's submitJob() method, which does the
    following:
  • Asks the jobtracker for a new job ID (step 2).
  • Checks the output specification of the job.
  • Computes the input splits for the job.
  • If the splits cannot be computed (because the
    input paths don't exist, for example), the job
    is not submitted and an error is thrown to the
    MapReduce program.
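
The submission checks above can be sketched as a toy model in Python. This is not Hadoop's JobClient API: `submit_job`, `compute_splits`, `JobSubmissionError`, and the 64-byte split size are hypothetical names and values chosen for illustration, and the rule that the output directory must not already exist is an assumed reading of "checks the output specification".

```python
import os
from itertools import count

_job_ids = count(1)  # stand-in for the jobtracker handing out job IDs


class JobSubmissionError(Exception):
    """Raised when a job cannot be submitted (hypothetical error type)."""


def compute_splits(input_paths, split_size):
    """Return (path, offset, length) splits; fail if an input path is missing."""
    splits = []
    for path in input_paths:
        if not os.path.exists(path):
            raise JobSubmissionError(f"Input path does not exist: {path}")
        size = os.path.getsize(path)
        for offset in range(0, size, split_size):
            splits.append((path, offset, min(split_size, size - offset)))
    return splits


def submit_job(input_paths, output_dir, split_size=64):
    """Toy version of the submission steps listed on the slide."""
    # Ask for a new job ID.
    job_id = f"job_{next(_job_ids):04d}"
    # Check the output specification (assumed rule: output must not exist yet).
    if os.path.exists(output_dir):
        raise JobSubmissionError(f"Output directory already exists: {output_dir}")
    # Compute the input splits; a missing input path aborts the submission.
    splits = compute_splits(input_paths, split_size)
    return job_id, splits
```

In this sketch a failed check raises before anything is returned, which mirrors the slide's point that the job is never submitted when the splits cannot be computed.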

5
Job Initialization
  • When the JobTracker receives a call to its
    submitJob() method, it puts the job into an
    internal queue, from which the job scheduler
    picks it up and initializes it.
  • Initialization involves creating an object to
    represent the job being run, which encapsulates
    its tasks.
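
A minimal sketch of the queue-then-initialize pattern, again as a toy Python model and not Hadoop code. `Job`, `QueueingJobTracker`, `initialize_next`, and the task ID format are hypothetical; creating one map task per input split is an assumption that the slide does not state.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Job:
    """Object representing a running job; it encapsulates the job's tasks."""
    job_id: str
    map_tasks: list = field(default_factory=list)
    reduce_tasks: list = field(default_factory=list)


class QueueingJobTracker:
    def __init__(self):
        self.queue = deque()  # internal queue of submitted, uninitialized jobs

    def submit_job(self, job_id, splits, num_reducers):
        # Submission only enqueues; no tasks are created yet.
        self.queue.append((job_id, splits, num_reducers))

    def initialize_next(self):
        """Scheduler step: take the oldest submission and build its Job object."""
        if not self.queue:
            return None
        job_id, splits, num_reducers = self.queue.popleft()
        job = Job(job_id)
        # Assumption: one map task per input split.
        job.map_tasks = [f"{job_id}_m_{i}" for i in range(len(splits))]
        job.reduce_tasks = [f"{job_id}_r_{i}" for i in range(num_reducers)]
        return job
```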

6
Task Assignment and Heartbeat
  • Tasktrackers run a simple loop that periodically
    sends heartbeat method calls to the jobtracker.
  • Heartbeats tell the jobtracker that a tasktracker
    is alive.
  • Tasktrackers have a fixed number of slots for map
    tasks and for reduce tasks.
  • The default scheduler fills empty map task slots
    before reduce task slots.
  • If the tasktracker has at least one empty map
    task slot, the jobtracker selects a map task;
    otherwise, it selects a reduce task.
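
The map-before-reduce rule can be illustrated with a small Python sketch. `Heartbeat` and `HeartbeatJobTracker` are hypothetical names, not Hadoop classes, and the sketch adds one assumption beyond the slide: when a map slot is free but no map tasks are pending, it falls through to a reduce task.

```python
from dataclasses import dataclass


@dataclass
class Heartbeat:
    """What a tasktracker reports on each heartbeat (toy model)."""
    tracker: str
    free_map_slots: int
    free_reduce_slots: int


class HeartbeatJobTracker:
    def __init__(self, map_tasks, reduce_tasks):
        self.pending_maps = list(map_tasks)
        self.pending_reduces = list(reduce_tasks)
        self.alive = set()  # trackers that have reported in

    def heartbeat(self, hb):
        """Record that the tracker is alive and return a task for it, or None."""
        self.alive.add(hb.tracker)
        # Empty map slots are filled before reduce slots.
        if hb.free_map_slots > 0 and self.pending_maps:
            return self.pending_maps.pop(0)
        # Assumed fallthrough: try a reduce task if no map task was assigned.
        if hb.free_reduce_slots > 0 and self.pending_reduces:
            return self.pending_reduces.pop(0)
        return None
```

A tracker reporting one free slot of each kind receives a map task first; once its map slots are full, the same call hands it a reduce task.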

7
Task Execution
  • Once the tasktracker has been assigned a task,
    the next step is for it to run the task.
  • It first localizes the job JAR by copying it
    from the shared filesystem to the tasktracker's
    filesystem.
  • It also copies any files the application needs
    from the distributed cache to the local disk
    (step 8).
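
Localization amounts to copying the job's files onto the node before the task starts. A rough Python sketch follows, using ordinary directories as stand-ins for the shared filesystem and the tasktracker's local disk; `localize_task` and its parameters are hypothetical and do not correspond to a Hadoop API.

```python
import shutil
from pathlib import Path


def localize_task(shared_dir, local_dir, jar_name, cache_files):
    """Copy the job JAR and any cache files from shared storage to local disk."""
    shared, local = Path(shared_dir), Path(local_dir)
    local.mkdir(parents=True, exist_ok=True)
    # The JAR is copied first, then each file the application requested.
    copied = []
    for name in [jar_name, *cache_files]:
        destination = local / name
        shutil.copy2(shared / name, destination)
        copied.append(destination)
    return copied
```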

8
Task Runner
  • TaskRunner launches a new Java Virtual Machine
    (step 9) to run each task in (step 10).
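
The idea of one fresh process per task can be sketched in Python by launching a separate interpreter for each task, as an analogy for the per-task JVM. `run_task_in_child` is a hypothetical helper, and the task code is passed as a string purely to keep the example short.

```python
import subprocess
import sys


def run_task_in_child(task_id, task_code):
    """Run task_code in a new interpreter process; return (exit code, stdout)."""
    result = subprocess.run(
        [sys.executable, "-c", task_code, task_id],  # task_id becomes sys.argv[1]
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout.strip()


# A well-behaved task and a failing one, both as illustrative snippets.
TASK_OK = "import sys; print('done', sys.argv[1])"
TASK_FAILS = "import sys; sys.exit(3)"
```

Because each task runs in its own process, a task that exits abnormally only shows up as a nonzero exit code in the parent; the launching process keeps running.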

9
THANKS
Presentation by Ravi Namboori. Visit us:
http://ravinamboori.in