HDP Developer: Spark 2.x Developer

Course Code:
DEV-343
Duration:
4 days

9:00am to 5:00pm
Location:
80 Jurong East Street 21 #04-04

Devan Nair Institute
Singapore 609607
Course Fees (USD):
$2,800 (excl. GST)
2019 Course Dates
Dates available on request
None of the published dates work for you? Speak to our training consultants about arranging private tuition or a closed-door training session.

Course Overview

This course introduces the Apache Spark distributed computing engine and is suitable for developers, data analysts, architects, technical managers, and anyone who needs to work with Spark hands-on. It is based on the Spark 2.x release. The course provides a solid technical introduction to the Spark architecture and how Spark works. It covers the basic building blocks of Spark (e.g. RDDs and the distributed compute engine), as well as higher-level constructs that provide a simpler and more capable interface. It includes in-depth coverage of Spark SQL, DataFrames, and DataSets, which are now the preferred programming API, including possible performance issues and strategies for optimization. The course also covers more advanced capabilities, such as using Spark Streaming to process streaming data and integrating with Kafka.

Course Outline

Module 1: Scala Ramp Up, Introduction to Spark

Module 2: RDDs and Spark Architecture, Spark SQL, DataFrames and DataSets

Module 3: Shuffling, Transformations and Performance, Performance Tuning

Module 4: Creating Standalone Applications and Spark Streaming

