Developing Solutions using Apache Hadoop


Description

This four-day course gives Java programmers the training they need to create enterprise solutions using Apache Hadoop. It combines interactive lectures with extensive hands-on lab exercises.

After successfully completing this course, each student will receive one free voucher for the Hadoop Certified Developer exam.

This course is intended for Java programmers who want to better understand how to create Apache Hadoop solutions.

Learning Objectives

  • Write a MapReduce program using the Hadoop API
  • Use HDFS to load and process data effectively through the CLI and the API
  • Understand best practices for building, debugging, and optimizing Hadoop solutions
  • Use Pig, Hive, HBase, and HCatalog effectively

Day 1

  • Overview
  • MapReduce Code
  • HDFS
  • MapReduce - JobTracker, TaskTracker and Running Jobs

Day 2

  • MapReduce Combiner
  • MapReduce Partitioner
  • MapReduce Distributed Cache
  • MapReduce Streaming
  • MapReduce Data Handling

Day 3
  • Pig Intro
  • Pig Data Model
  • Pig Scripting Language
  • Hive - Part 1

Day 4

  • Hive - Part 2
  • HCatalog
  • HBase
  • Enterprise Integration
  • Future of Hadoop

Extensive hands-on lab experience

Students will work through the following exercises using the Hortonworks Data Platform:

  • Running a Hadoop Solution
  • MapReduce Programming
  • HDFS
  • MapReduce in Operation
  • MapReduce with Combiner
  • MapReduce with Partitioner
  • MapReduce with a Secondary Sort and a Custom Comparator
  • MapReduce with Distributed Cache
  • MapReduce with Data Handling
  • MapReduce with Streaming
  • Using Pig for a Join
  • Using Pig for a Clustering Algorithm
  • Using Pig with User-Defined Functions
  • Basic Hive
  • Using Hive for a Join
  • HBase Basics
  • HCatalog Basics
  • MapReduce, Pig, and Hive in a Combined Solution
  • Solving a Problem from Conception to Completion