Administering Apache Hadoop

COURSE OUTLINE:

Description The Administering Apache Hadoop training course is designed for administrators who are interested in learning how to deploy and manage a Hadoop cluster. The course consists an effective mix of interactive lecture and extensive hands-on lab exercises.

After successfully completing this training course each student will receive one free voucher for the Hadoop Certified Administrator exam.

Audience
Administrators who are interested in learning how to deploy and manage a Hadoop cluster.
We recommend students have previous experience with UNIX

Learning Objectives

  • Utilize best practices for deploying Hadoop clusters
  • Determine hardware needs
  • Monitor Hadoop clusters
  • Recover from NameNode failure
  • Handle DataNode failures
  • Manage hardware upgrade processes including node removal, configuration changes, node installation and rebalancing clusters
  • Manage log files
  • Install, configure, deploy verify and maintain Hadoop clusters including:
    o MapReduce
    o HDFS
    o Pig
    o Hive (and MySQL)
    o HBase (and ZooKeeper)
    o HCatalog
    o Oozie
    o Mahout

Day 1

  • Overview of Hadoop
  • Cluster Hardware and Installation of HDFS and MapReduce
  • Rack Topology
  • Setting up a Multi-user Environment
  • Using Schedulers
  • Hadoop Security with Kerberos
  • Logs and Log Rotation
  • Monitor, Maintain and Troubleshoot HDFS and MapReduce
  • NameNode Failure and Recovery
  • JobTracker Restarting

Day 2

  • Upgrade of Hardware Process
  • Rebalancing
  • Data Management
  • Install Configure, Deploy and Verify Pig
  • Install Configure, Deploy and Verify Hive
  • Install Configure, Deploy and Verify MySQL
  • Install Configure, Deploy and Verify HBase and ZooKeeper
  • Install Configure, Deploy and Verify Other Hadoop Ecosystem (HCatalog, Oozie, Mahout)
  • Install Configure, Deploy and Verify Nagios and Ganglia