Introduction And Components Of Hadoop Architecture - PowerPoint PPT Presentation


1
Introduction and Components of Hadoop Architecture
2
Hadoop
  • Hadoop is a batch processing system for a cluster
    of nodes that provides the foundation for
    large-scale data analytics, because it bundles the
    two sets of functionality most needed to deal with
    huge unstructured datasets: a distributed file
    system and MapReduce processing.
  • It is a project of the Apache Software Foundation,
    written in Java, to support data-intensive
    distributed applications.
  • Hadoop allows applications to operate across
    thousands of nodes and petabytes of data.
  • The design was inspired by Google's MapReduce and
    Google File System papers.
  • Hadoop's biggest contributor has been the search
    giant Yahoo, where it is widely used across the
    business platform.

3
Map Reduce
  • Hadoop MapReduce is a programming model and
    software framework for writing applications that
    rapidly process large amounts of data in parallel
    on big clusters of compute nodes. MapReduce uses
    HDFS to access input file segments and to store
    the reduced results, as in the sketch below.
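
  • As an illustration (not part of the original
    slides), below is a minimal WordCount sketch using
    Hadoop's Java MapReduce API: the Map phase emits
    (word, 1) pairs and the Reduce phase sums them.
    Class names and the input/output arguments are
    placeholders.

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

    public class WordCount {

      // Map phase: emit (word, 1) for every word in the input split.
      public static class TokenizerMapper
          extends Mapper<Object, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
          StringTokenizer itr = new StringTokenizer(value.toString());
          while (itr.hasMoreTokens()) {
            word.set(itr.nextToken());
            context.write(word, ONE);
          }
        }
      }

      // Reduce phase: sum the counts emitted for each word.
      public static class IntSumReducer
          extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values,
            Context context) throws IOException, InterruptedException {
          int sum = 0;
          for (IntWritable val : values) {
            sum += val.get();
          }
          result.set(sum);
          context.write(key, result);
        }
      }

      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "word count");
        job.setJarByClass(WordCount.class);
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        // Input and output paths live on HDFS; args are placeholders.
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
      }
    }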

HDFS
  • The Hadoop Distributed File System (HDFS) is the
    primary storage system used by Hadoop
    applications. HDFS is, as its name implies, a
    distributed file system that provides
    high-throughput access to application data by
    creating multiple copies of data blocks and
    distributing them across the compute nodes of a
    cluster to enable reliable and fast computation.
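
  • As a small illustration (not from the original
    slides), the sketch below writes and then reads a
    file on HDFS through Hadoop's Java FileSystem API;
    the NameNode address and the file path are
    placeholder assumptions.

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.nio.charset.StandardCharsets;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsExample {
      public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Point the client at the NameNode; host and port are placeholders.
        conf.set("fs.defaultFS", "hdfs://namenode:9000");
        FileSystem fs = FileSystem.get(conf);

        // Write a small file; HDFS replicates its blocks across
        // DataNodes according to the configured replication factor.
        Path file = new Path("/user/demo/hello.txt");
        try (FSDataOutputStream out = fs.create(file, true)) {
          out.write("Hello, HDFS!\n".getBytes(StandardCharsets.UTF_8));
        }

        // Read the file back; the client streams each block from one
        // of the DataNodes that holds a replica of it.
        try (BufferedReader reader = new BufferedReader(
            new InputStreamReader(fs.open(file), StandardCharsets.UTF_8))) {
          String line;
          while ((line = reader.readLine()) != null) {
            System.out.println(line);
          }
        }
      }
    }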

4
Architecture of Hadoop
  1. Hadoop is a Map/Reduce framework that works on
    top of HDFS or HBase.
  2. The central idea is to decompose a job into many
    identical tasks that can be executed close to the
    data.
  3. All of these tasks are executed in parallel in
    the Map phase. The intermediate results are then
    combined into a single result in the Reduce phase.
  4. In Hadoop, the JobTracker is responsible for
    coordinating the job, managing the Map/Reduce
    phases, and retrying tasks in case of failure
    (see the sketch after this list).
  5. The TaskTrackers (Java processes) run on the
    different DataNodes. Each TaskTracker performs the
    tasks of the job on locally stored data.
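
  • As a small illustration (not from the original
    slides), the sketch below uses the classic
    JobClient API to ask the JobTracker how many
    TaskTrackers have registered and how many
    map/reduce task slots they offer; the JobTracker
    address is a placeholder assumption.

    import org.apache.hadoop.mapred.ClusterStatus;
    import org.apache.hadoop.mapred.JobClient;
    import org.apache.hadoop.mapred.JobConf;

    public class ClusterInfo {
      public static void main(String[] args) throws Exception {
        // JobConf normally picks up mapred-site.xml from the classpath;
        // the explicit JobTracker address here is a placeholder.
        JobConf conf = new JobConf();
        conf.set("mapred.job.tracker", "jobtracker:9001");

        JobClient client = new JobClient(conf);
        ClusterStatus status = client.getClusterStatus();

        // The JobTracker reports the registered TaskTrackers and the
        // map/reduce task slots they provide to the cluster.
        System.out.println("TaskTrackers:    " + status.getTaskTrackers());
        System.out.println("Map slots:       " + status.getMaxMapTasks());
        System.out.println("Reduce slots:    " + status.getMaxReduceTasks());
        System.out.println("Running maps:    " + status.getMapTasks());
        System.out.println("Running reduces: " + status.getReduceTasks());
      }
    }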

5
Free Online Bigdata With Hadoop Fundamentals 
  • StudySection offers the following Bigdata With
    Hadoop online certifications:
  • Bigdata With Hadoop Fundamentals Certification
    Exam (Foundation)
  • Bigdata With Hadoop Fundamentals Certification
    Exam (Advanced)
  • Bigdata With Hadoop Fundamentals Certification
    Exam (Expert)

6
About Study Section
  • Welcome to StudySection - the most loved online
    platform for eCertification in several subjects
    including, but not limited to, Software
    Development, Quality Assurance, Business
    Administration, Project Management, English,
    Aptitude and more. Students from more than 70
    countries are StudySection certified. If you are
    not yet StudySection certified, it's not too late.
    You can start right now.
  • Being StudySection certified helps you take your
    education level a few notches up and gives you an
    edge over other candidates when you need it the
    most. Globally, our students are employed in
    different organizations and are utilizing the
    benefit of being certified with us.