Hadoop online training - PowerPoint PPT Presentation

About This Presentation
Title:

Hadoop online training

Description:

Hadoop online training demos and demo videos are provided on demand, and you can schedule a class anytime on weekdays or weekends, since a new batch starts every weekend.

Updated: 10 January 2017
Slides: 12
Provided by: john0322jp


Transcript and Presenter's Notes

Title: Hadoop online training


1
Hadoop Interview Questions
For a Free Demo: Ph +1 (646) 880-9474 | For Details:
Folkstrain.com
2
  • 1. What is Hadoop MapReduce?
  • The Hadoop MapReduce framework is used to process very large data sets in parallel across a Hadoop cluster. Data analysis uses a two-step Map and Reduce process.
  • 2. How does Hadoop MapReduce work?
  • In MapReduce, the classic example is word counting: during the map phase the words in each document are counted, while in the reduce phase the counts are aggregated per word across the entire collection. During the map phase the input data is divided into splits that are analysed by map tasks running in parallel across the Hadoop cluster (a minimal sketch follows below).
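To make the map/reduce split concrete, here is a minimal word-count sketch using the org.apache.hadoop.mapreduce API. It follows the standard Apache WordCount example; the job name and the input/output paths passed on the command line are assumptions.

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Map phase: emit (word, 1) for every word in the input split.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reduce phase: sum the counts for each word across the whole collection.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      context.write(key, new IntWritable(sum));
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class);
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));   // input directory
    FileOutputFormat.setOutputPath(job, new Path(args[1])); // output directory
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}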

 
3
  • 3. Differentiate between structured and unstructured data?
  • Data that can be stored in traditional database systems in the form of rows and columns, for example online purchase transactions, is referred to as structured data. Data that can only be partially stored in traditional database systems is semi-structured data, while data with no fixed form, such as free text, images, audio, or video, is unstructured data.
  • 4. Explain what shuffling is in MapReduce?
  • The process by which the system sorts the map outputs and transfers them to the reducers as input is known as the shuffle (see the partitioner sketch below).
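One way to see how the shuffle decides which reducer receives a given key is to look at the partitioner. The following sketch mirrors the behaviour of Hadoop's default HashPartitioner; the class name WordPartitioner is an illustrative assumption.

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// During the shuffle, every (key, value) pair emitted by a mapper is routed to
// exactly one reducer. The partitioner picks that reducer; this sketch mirrors
// the default hash-based behaviour.
public class WordPartitioner extends Partitioner<Text, IntWritable> {
  @Override
  public int getPartition(Text key, IntWritable value, int numReduceTasks) {
    // Mask off the sign bit so the result is always a valid partition index.
    return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
  }
}

A custom partitioner would be registered on the job with job.setPartitionerClass(WordPartitioner.class); the sort and merge of each reducer's input is then handled by the framework.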

4
5. On what concepts does the Hadoop framework work?
1) HDFS - The Hadoop Distributed File System is the Java-based file system for scalable and reliable storage of large datasets. Data in HDFS is stored in the form of blocks and it operates on a master-slave architecture (a short client-side sketch follows below).
2) Hadoop MapReduce - A Java-based programming paradigm of the Hadoop framework that provides scalability across various Hadoop clusters. MapReduce distributes the workload into multiple tasks that can run in parallel.
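Because HDFS hides its blocks behind a single file-system abstraction, client code simply reads and writes paths through the org.apache.hadoop.fs.FileSystem API. The following is a minimal sketch, assuming fs.defaultFS is configured on the classpath; the path /user/demo/sample.txt is a hypothetical example.

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsReadWrite {
  public static void main(String[] args) throws Exception {
    // Picks up fs.defaultFS (the NameNode address) from core-site.xml on the classpath.
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    Path path = new Path("/user/demo/sample.txt"); // hypothetical path

    // Write: the client streams bytes; HDFS splits them into blocks and
    // replicates each block across DataNodes under the NameNode's direction.
    try (FSDataOutputStream out = fs.create(path, true)) {
      out.write("hello hdfs\n".getBytes(StandardCharsets.UTF_8));
    }

    // Read the file back as a stream.
    try (BufferedReader in = new BufferedReader(
        new InputStreamReader(fs.open(path), StandardCharsets.UTF_8))) {
      System.out.println(in.readLine());
    }
  }
}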
5
  • 6.What are the main components of a Hadoop
    Application? 
  • The core components of a Hadoop application are:
  • Hadoop Common
  • HDFS
  • Hadoop MapReduce
  • YARN
  • 7. Explain what the Distributed Cache is in the MapReduce framework?
  • The Distributed Cache is an important feature provided by the MapReduce framework. When you want to share files across all nodes in a Hadoop cluster, the Distributed Cache is used. The files can be executable JAR files or simple properties files (see the sketch below).
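A minimal sketch of how a job might ship a small side file to every node via the distributed cache, using the current Job API (the older DistributedCache class is deprecated). The file /user/demo/stopwords.txt and the local alias "stopwords" are assumptions for illustration.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.URI;
import java.util.HashSet;
import java.util.Set;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;

public class CacheExample {

  public static class FilteringMapper
      extends Mapper<LongWritable, Text, Text, IntWritable> {
    private final Set<String> stopWords = new HashSet<>();

    @Override
    protected void setup(Context context) throws IOException {
      // The framework has already copied the cached file to this node and
      // created a local symlink named "stopwords" in the task's working directory.
      try (BufferedReader reader = new BufferedReader(new FileReader("stopwords"))) {
        String line;
        while ((line = reader.readLine()) != null) {
          stopWords.add(line.trim());
        }
      }
    }
    // map() would then consult stopWords for every input record.
  }

  static void configure(Job job) throws Exception {
    // Ship an HDFS file to every node; "#stopwords" sets the local symlink name.
    job.addCacheFile(new URI("/user/demo/stopwords.txt#stopwords")); // hypothetical path
  }
}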

6
8. What is Hadoop Streaming?
The Hadoop distribution includes a generic application programming interface for writing Map and Reduce jobs in any desired programming language, such as Perl, Ruby, or Python. This is referred to as Hadoop Streaming. Users can create and run jobs with any kind of shell script or executable as the Mapper or Reducer.
9. Explain what the NameNode is in Hadoop?
The NameNode is the master node of HDFS. It stores the file system metadata - the directory tree and the mapping of each file to its blocks - while the actual data is stored on the DataNodes. Clients read and write block data directly from the DataNodes; the data itself never flows through the NameNode.
7
10. What is the best hardware configuration to run Hadoop?
A good configuration for executing Hadoop jobs is dual-core machines or dual processors with 4 GB or 8 GB of RAM that use ECC (error-correcting code) memory, rather than low-end hardware. ECC memory is recommended for running Hadoop because many Hadoop users have experienced various checksum errors when using non-ECC memory.
8
  • 11. What are the most commonly defined input formats in Hadoop?
  • The three most common input formats defined in Hadoop are:
  • Text Input Format - the default input format defined in Hadoop.
  • Key Value Input Format - this input format is used for plain text files where the files are broken into lines, and each line is split into a key and a value at the first tab character.
  • Sequence File Input Format - this input format is used for reading files in sequence (see the configuration sketch below).
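The input format is set on the job object; below is a brief sketch of switching between the three formats mentioned above. The input path is a hypothetical example.

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.KeyValueTextInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class InputFormatConfig {
  static void configure(Job job) throws Exception {
    FileInputFormat.addInputPath(job, new Path("/user/demo/input")); // hypothetical path

    // Default: each record is one line; key = byte offset, value = line text.
    job.setInputFormatClass(TextInputFormat.class);

    // Plain text where each line is split into key and value at the first tab.
    // job.setInputFormatClass(KeyValueTextInputFormat.class);

    // Binary key/value pairs stored in Hadoop SequenceFiles, read in sequence.
    // job.setInputFormatClass(SequenceFileInputFormat.class);
  }
}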

9
12. What do the four Vs of Big Data denote?
A simple explanation of the four critical features of big data:
a) Volume - scale of data
b) Velocity - analysis of streaming data
c) Variety - different forms of data
d) Veracity - uncertainty of data
13. Explain what a heartbeat is in HDFS?
A heartbeat is a signal sent between a DataNode and the NameNode, and between a TaskTracker and the JobTracker. If the NameNode or JobTracker does not respond to the signal, it is considered that there is some issue with the DataNode or TaskTracker.
10
14. What is WebDAV in Hadoop?
WebDAV is a set of extensions to HTTP that supports editing and updating files. On most operating systems WebDAV shares can be mounted as file systems, so it is possible to access HDFS as a standard file system by exposing HDFS over WebDAV.
15. What is Sqoop in Hadoop?
Sqoop is a tool used to transfer data between Hadoop HDFS and relational database management systems (RDBMS). Using Sqoop, data can be imported from an RDBMS such as MySQL or Oracle into HDFS, and data can likewise be exported from HDFS back into an RDBMS.
11
Thank You
For a Free Demo: Ph +1 (646) 880-9474 | For Details:
Folkstrain.com