Flume and Sqoop for Ingesting Big Data

Flume and Sqoop transfer data from the sources that hold or produce it into data stores such as HDFS, HBase and Hive. Both tools come with built-in functionality and shield users from the complexity of moving data between these systems.


Course Information

About this course:

Import data: Flume and Sqoop play a crucial part in the Hadoop ecosystem. They transfer data from the sources that hold or produce it, such as local file systems, HTTP, MySQL and Twitter, to data stores such as HDFS, HBase and Hive. Both tools come with built-in functionality and shield users from the complexity of moving data between these systems.

Flume: Flume Agents transfer data produced by a streaming application to data stores such as HDFS and HBase.
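As a rough illustration of what a Flume Agent looks like in practice, the sketch below uses Python to render a minimal agent configuration that wires one Spooling Directory source to one HDFS sink through an in-memory channel. The agent name, component names (`src1`, `ch1`, `sink1`) and both paths are placeholders chosen for this example, not values taken from the course.

```python
def render_agent_config(agent: str, spool_dir: str, hdfs_path: str) -> str:
    """Render a Flume properties file for one source, one channel and one sink.

    All component names below (src1, ch1, sink1) are hypothetical.
    """
    props = {
        # Declare the components that make up the agent.
        f"{agent}.sources": "src1",
        f"{agent}.channels": "ch1",
        f"{agent}.sinks": "sink1",
        # Spooling Directory source: picks up files dropped into spool_dir.
        f"{agent}.sources.src1.type": "spooldir",
        f"{agent}.sources.src1.spoolDir": spool_dir,
        f"{agent}.sources.src1.channels": "ch1",  # a source may feed several channels
        # Memory channel: buffers events between the source and the sink.
        f"{agent}.channels.ch1.type": "memory",
        # HDFS sink: drains the channel and writes events to HDFS.
        f"{agent}.sinks.sink1.type": "hdfs",
        f"{agent}.sinks.sink1.hdfs.path": hdfs_path,
        f"{agent}.sinks.sink1.channel": "ch1",  # a sink reads from exactly one channel
    }
    return "\n".join(f"{key} = {value}" for key, value in props.items())


# Placeholder agent name and paths, for illustration only.
config_text = render_agent_config(
    "agent1", "/tmp/spool", "hdfs://namenode/flume/events"
)
print(config_text)
```

Saved to a file, a configuration like this is typically passed to `flume-ng agent` together with the agent name; the exact invocation depends on the local installation.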

Sqoop: Sqoop bulk-imports data from traditional RDBMSs into Hadoop storage structures such as HDFS or Hive.
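A bulk import of this kind is normally launched from the command line. The sketch below builds such a `sqoop import` command in Python so that it can be inspected or handed to `subprocess.run`. The JDBC URL, user name, table and target directory are assumptions made up for the example.

```python
import shlex


def build_sqoop_import(jdbc_url, username, table, target_dir, mappers=1):
    """Return the argument list for a basic Sqoop import into HDFS."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,            # JDBC connection string of the source database
        "--username", username,
        "-P",                             # prompt for the password on the console
        "--table", table,                 # source table to import
        "--target-dir", target_dir,       # HDFS directory that receives the data
        "--num-mappers", str(mappers),    # number of parallel map tasks
    ]


# Hypothetical database, user and paths, for illustration only.
cmd = build_sqoop_import(
    "jdbc:mysql://localhost/shop", "etl_user", "orders", "/data/orders"
)
print(shlex.join(cmd))
```

To load the table into Hive instead of a plain HDFS directory, Sqoop offers the `--hive-import` option, which can be appended to the same argument list.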

Learning Objectives:

Practical application for the various sources and data stores:

  • Sources: Twitter, MySQL, Spooling Directory, HTTP
  • Data stores: HDFS, HBase, Hive

Flume components:

  • Flume Agents
  • Flume Events
  • Event bucketing
  • Channel selectors
  • Interceptors
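The components listed above can be pictured with a small toy model. The sketch below is plain Python and is not Flume's Java API: it only mimics the ideas of an event (headers plus a byte body), a timestamp interceptor that stamps a header, a multiplexing channel selector that routes on a header value, and event bucketing, where the sink path is derived from the event's timestamp. All function names, header names and channel names are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Event:
    """Toy stand-in for a Flume event: a byte payload plus string headers."""
    body: bytes
    headers: dict = field(default_factory=dict)


def timestamp_interceptor(event: Event, now_ms: int) -> Event:
    """Mimic a timestamp interceptor: store the time in millis as a header."""
    event.headers["timestamp"] = str(now_ms)
    return event


def select_channel(event: Event, header: str, mapping: dict, default: str) -> str:
    """Mimic a multiplexing selector: route by header value, else use the default."""
    return mapping.get(event.headers.get(header), default)


def bucket_path(event: Event, pattern: str) -> str:
    """Mimic event bucketing: expand date escapes from the timestamp header.

    UTC is used here to keep the example deterministic; a real sink may
    apply a different time zone.
    """
    seconds = int(event.headers["timestamp"]) / 1000
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime(pattern)


# One event flowing through interceptor, selector and bucketing.
event = Event(body=b"GET /index.html", headers={"source": "web"})
event = timestamp_interceptor(event, now_ms=1_700_000_000_000)
channel = select_channel(
    event, "source", {"web": "web_channel", "app": "app_channel"}, "default_channel"
)
path = bucket_path(event, "/flume/events/%Y/%m/%d")
print(channel, path)
```

In a real agent these behaviours are configured declaratively in the agent's properties file rather than written as code; the model only shows how the pieces relate to one another.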

Sqoop components:

  • Sqoop import from MySQL
  • Incremental imports using Sqoop Jobs
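An incremental import is usually wrapped in a saved Sqoop job so that only rows added since the previous run are fetched each time. The sketch below builds the two commands involved, one to create the job and one to execute it. The job name, JDBC URL, table and check column are assumptions for the example, and password handling is left out for brevity.

```python
import shlex


def build_incremental_job(job_name, jdbc_url, username, table,
                          target_dir, check_column, last_value=0):
    """Return the argument list that creates a saved incremental-import job."""
    return [
        "sqoop", "job", "--create", job_name,
        "--", "import",                      # everything after "--" defines the import
        "--connect", jdbc_url,
        "--username", username,
        "--table", table,
        "--target-dir", target_dir,
        "--incremental", "append",           # import only newly appended rows
        "--check-column", check_column,      # column used to detect new rows
        "--last-value", str(last_value),     # starting point for the first run
    ]


def build_job_exec(job_name):
    """Return the argument list that runs a previously saved job."""
    return ["sqoop", "job", "--exec", job_name]


# Hypothetical job, database and column names, for illustration only.
create_cmd = build_incremental_job(
    "orders_incremental", "jdbc:mysql://localhost/shop", "etl_user",
    "orders", "/data/orders", check_column="order_id",
)
exec_cmd = build_job_exec("orders_incremental")
print(shlex.join(create_cmd))
print(shlex.join(exec_cmd))
```

Running the import as a saved job rather than as a one-off command lets Sqoop keep track of the last imported value between runs, so the job can be executed repeatedly without editing `--last-value` by hand.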

Audience:

This course is aimed at engineers who are responsible for designing an application that uses HDFS, HBase or Hive as its data store. It is also suitable for engineers who intend to port data from legacy data stores to HDFS.

Requirements:

Knowledge of HDFS is mandatory for this course. You should also have a basic understanding of the HBase and Hive shells, because the HBase and Hive examples rely on them. In addition, you need a working installation of HDFS, because most of the examples require it to run.

More Information

  • Subjects: Big Data
  • Lab Access: No
  • Learning Style: Self-Paced Learning
  • Learning Type: Course
  • Difficulty: Intermediate
  • Course Duration: 2 Hours
  • Language: English


Course Expert:

Author

Tom Robertson
(Data Science Enthusiast)

Tom is an innovator first, and then a Data Scientist and Software Architect. He combines expertise in business, product, technology and management. Tom has been involved in creating category-defining new products in AI and big data for different industries, which together generated more than a hundred million in revenue and served more than 10 million users.
As a Data Scientist and Software Architect, Tom has extensive experience in data science, engineering, architecture and software development. To date, he has accumulated over a decade of experience in R, Python and Linux shell programming.

Tom has expertise in Python, SQL and Spark. He has worked with several libraries, including but not limited to scikit-learn, pandas, NumPy, Matplotlib, Seaborn, SciPy, NLTK, Keras and TensorFlow.
