Developing Solutions using Apache Hadoop
COURSE OUTLINE:
After successfully completing this course, each student will receive one free voucher for the Hadoop Certified Developer exam.
Audience
Java programmers who want to better understand how to create Apache Hadoop solutions
Learning Objectives
- Write a MapReduce program using the Hadoop API.
- Use HDFS to load and process data effectively through the CLI and the API.
- Understand best practices for building, debugging and optimizing Hadoop solutions.
- Use Pig, Hive, HBase and HCatalog effectively.
Day 1
- Overview
- MapReduce Code
- HDFS
- MapReduce - JobTracker, TaskTracker and Running Jobs
- MapReduce Combiner
- MapReduce Partitioner
- MapReduce Distributed Cache
- MapReduce Streaming
- MapReduce Data Handling
- Pig Intro
- Pig Data Model
- Pig Scripting Language
- Hive - Part 1
Day 4
- Hive - Part 2
- HCatalog
- HBase
- Enterprise Integration
- Future of Hadoop
Extensive hands-on lab experience
Students will work through the following exercises using the Hortonworks Data Platform:
- Running a Hadoop Solution
- MapReduce Programming
- HDFS
- MapReduce in Operation
- MapReduce with Combiner
- MapReduce with Partitioner
- MapReduce with a Secondary Sort and a Custom Comparator
- MapReduce with Distributed Cache
- MapReduce with Data Handling
- MapReduce with Streaming
- Using Pig for a Join
- Using Pig for a Clustering Algorithm
- Using Pig with User-Defined Functions
- Basic Hive
- Using Hive for a Join
- HBase Basics
- HCatalog Basics
- MapReduce, Pig, and Hive in a Combined Solution
- Solving a Problem from Conception to Completion