HDP Analyst Data Science (HW HDP DS)

  • Learning Style: Virtual Classroom
  • Difficulty: Intermediate
  • Course Duration: 3 Days
Pricing
About Individual Course:
  • Individual course plan gives you access to this course
$2,295.00 / Seat

About Course

This course provides instruction on the processes and practice of data science, including machine learning and natural language processing. It covers tools and programming languages (Python, IPython, Mahout, Pig, NumPy, pandas, SciPy, scikit-learn), the Natural Language Toolkit (NLTK), and Spark MLlib.

Course Objectives:

  • Recognize use cases for data science on Hadoop
  • Describe the Hadoop and YARN architecture
  • Describe supervised and unsupervised learning differences
  • Use Mahout to run a machine learning algorithm on Hadoop
  • Describe the data science life cycle
  • Use Pig to transform and prepare data on Hadoop
  • Write a Python script
  • Describe options for running Python code on a Hadoop cluster
  • Write a Pig User-Defined Function in Python
  • Use Pig streaming on Hadoop with a Python script
  • Use machine learning algorithms
  • Describe use cases for Natural Language Processing (NLP)
  • Use the Natural Language Toolkit (NLTK)
  • Describe the components of a Spark application
  • Write a Spark application in Python
  • Run machine learning algorithms using Spark MLlib
  • Take data science into production

Audience:

  • Architects, software developers, analysts and data scientists who need to apply data science and machine learning on Hadoop.

Prerequisite:

  • Students must have experience with at least one programming or scripting language, knowledge of statistics and/or mathematics, and a basic understanding of big data and Hadoop principles. Students new to Hadoop are encouraged to attend the HDP Overview: Apache Hadoop Essentials course.
More Information
  • Lab Access: No
  • Technology: Hadoop
  • Topics: Big Data
  • Learning Style: Virtual Classroom
  • Difficulty: Intermediate
  • Course Duration: 3 Days
  • Language: English
  • VPA Eligible
