Hadoop Management with Hive, Pig and SAS

Course Code:
SAS-DIHPS

Duration:
3 days
9:00am to 5:00pm
Course Fees:
S$2,800 (excluding GST)
2020 Course Dates
None of the published dates work for you? Speak to our training consultants about a private tuition arrangement or closed-door training.

Course Overview

In this course, you use processing methods to prepare structured and unstructured big data for analysis. You learn to organize this data into structured tabular form using Apache Hive and Apache Pig. You also learn the SAS software technology and techniques that integrate with Hive and Pig, and how to leverage these open source capabilities by programming with Base SAS, SAS/ACCESS Interface to Hadoop, and SAS Data Integration Studio.
Both the Extended Learning page and the e-learning format of this course include the option to purchase Virtual Lab time to practice.

Program Objectives

At course completion, you will be able to:

• Move data into the Hadoop ecosystem.

• Use Hive to design a data warehouse in Hadoop.

• Perform data analysis using Hive Query Language.

• Join data sources.

• Perform extract, load, and transform (ELT) operations.

• Organize data in Hadoop by usage.

• Perform analysis on unstructured data using Apache Pig.

• Join massive data sets using Pig.

• Use user-defined functions (UDFs).

• Analyze big data in Hadoop using Hive and Pig.

• Use SAS programming to submit Hive and Pig programs that execute in Hadoop and store results in Hadoop or return results to SAS.

• Use SAS programming to move data between the SAS server and the Hadoop Distributed File System (HDFS).

• Construct SAS Data Integration Studio jobs that integrate with Hive and Pig processes and the HDFS.

Course Outline

Module 1: The Apache Hadoop Project

Module 2: Hive and HiveQL

Module 3: Pig and Pig Latin

Module 4: SAS and Hadoop
