Flume and Sqoop for Ingesting Big Data

Flume and Sqoop move data from sources such as local file systems, HTTP, MySQL, and Twitter into data stores like HDFS, HBase, and Hive. Both tools work out of the box and shield users from the complexity of transferring data between these systems.

Price: $99.00


More Information:

  • Learning Style: On Demand Course
  • Difficulty: Intermediate
  • Course Duration: 2 Hours
  • Course Info: Download PDF
  • Certificate: See Sample

Need Training for 5 or More People?

Customized to your team's needs:

  • Annual Subscriptions
  • Private Training
  • Flexible Pricing
  • Enterprise LMS
  • Dedicated Customer Success Manager

Course Information

About this course:

Import data: Flume and Sqoop play a crucial part in the Hadoop ecosystem. They transfer data from sources such as local file systems, HTTP, MySQL, and Twitter, which hold or produce it, into data stores like HDFS, HBase, and Hive. Both tools work out of the box and abstract away the complexity of moving data between these systems.

Flume: Flume agents transfer data produced by streaming applications to data stores such as HDFS and HBase.
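
A rough illustration of how this is wired together: the minimal agent sketch below (the agent, source, channel, and sink names, the spool directory, and the HDFS URL are all assumed examples, not taken from the course) reads files from a spooling directory through a memory channel into HDFS, and the date pattern in the sink path shows how event bucketing works.

    # Minimal Flume agent sketch: spooling directory -> memory channel -> HDFS
    # agent1, src1, ch1, sink1, the directory, and the HDFS path are assumed examples
    agent1.sources  = src1
    agent1.channels = ch1
    agent1.sinks    = sink1

    agent1.sources.src1.type     = spooldir
    agent1.sources.src1.spoolDir = /var/log/incoming
    agent1.sources.src1.channels = ch1

    agent1.channels.ch1.type     = memory
    agent1.channels.ch1.capacity = 1000

    agent1.sinks.sink1.type                   = hdfs
    agent1.sinks.sink1.hdfs.path              = hdfs://localhost:9000/flume/events/%Y-%m-%d
    agent1.sinks.sink1.hdfs.fileType          = DataStream
    agent1.sinks.sink1.hdfs.useLocalTimeStamp = true
    agent1.sinks.sink1.channel                = ch1

An agent like this is started with flume-ng agent, pointing --conf-file at the file above and --name at agent1.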

Sqoop: Sqoop bulk-imports data from a traditional RDBMS into Hadoop storage such as HDFS or Hive.
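
As a rough sketch of what such an import looks like (the salesdb database, orders table, user, and target directory are assumed examples, not taken from the course), a single command pulls a MySQL table into HDFS in parallel:

    # Bulk import of a MySQL table into HDFS; connection details and paths are assumed examples
    sqoop import \
      --connect jdbc:mysql://localhost:3306/salesdb \
      --username sqoop_user -P \
      --table orders \
      --target-dir /user/hadoop/orders \
      --num-mappers 4

Adding --hive-import would land the same data in a Hive table instead of a plain HDFS directory.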

Learning Objectives:

Practical application of the following sources and data stores:

  • Sources: Twitter, MySQL, Spooling Directory, HTTP
  • Data stores: HDFS, HBase, Hive

Flume components:

  • Flume Agents
  • Flume Events
  • Event bucketing
  • Channel selectors
  • Interceptors (see the configuration sketch after this list)
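
To give a flavour of the last two items, the sketch below extends the hypothetical agent1/src1 configuration shown earlier with a second channel, a multiplexing channel selector that routes events by a header value, and two interceptors (the header name and channel names are assumed examples):

    # Add a second channel (assumed name ch2) for the selector to route to
    agent1.channels              = ch1 ch2
    agent1.channels.ch2.type     = memory
    agent1.sources.src1.channels = ch1 ch2

    # Multiplexing channel selector: route by the value of the "datacenter" event header
    agent1.sources.src1.selector.type            = multiplexing
    agent1.sources.src1.selector.header          = datacenter
    agent1.sources.src1.selector.mapping.us-east = ch1
    agent1.sources.src1.selector.mapping.eu-west = ch2
    agent1.sources.src1.selector.default         = ch1

    # Interceptors: stamp every event with a timestamp and the agent's hostname
    agent1.sources.src1.interceptors           = ts host
    agent1.sources.src1.interceptors.ts.type   = timestamp
    agent1.sources.src1.interceptors.host.type = host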

Sqoop components:

  • Sqoop import from MySQL
  • Incremental imports using Sqoop Jobs (see the sketch after this list)
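
As a sketch of the second item (the database, table, and check column are assumed examples, not taken from the course), an incremental import can be saved as a named Sqoop job so that repeated runs fetch only new rows:

    # Save an incremental append import as a reusable job; connection details are assumed examples
    sqoop job --create orders_incremental -- import \
      --connect jdbc:mysql://localhost:3306/salesdb \
      --username sqoop_user -P \
      --table orders \
      --target-dir /user/hadoop/orders \
      --incremental append \
      --check-column order_id \
      --last-value 0

    # Each run resumes from the stored last-value, importing only newly added rows
    sqoop job --exec orders_incremental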

Audience:

This course will be highly useful for engineers responsible for designing applications that use HDFS, HBase, or Hive as the data store. It is also suitable for engineers who intend to port data from legacy data stores to HDFS.

Requirements:

Knowledge of HDFS is a mandatory requirement for this course. You should also have a basic understanding of the HBase and Hive shells, since the HBase and Hive examples depend on them. Additionally, you need a working HDFS installation to run the majority of the examples.
