Apache Spark Development

Course Overview:

Apache Spark is a fast, general-purpose cluster computing system for Big Data. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools, including Spark SQL for SQL and DataFrames, MLlib for machine learning, GraphX for graph data processing, and Spark Streaming for live data stream processing. With Spark running on Apache Hadoop YARN, developers can create applications that derive actionable insights from a single, shared dataset in Hadoop.
This training course teaches you how to solve Big Data problems using the Apache Spark framework. It covers a wide range of Big Data use cases, such as ETL, data warehousing (DWH), data virtualization, streaming, graph data structures, and machine learning. It also demonstrates how Spark integrates with other well-established Hadoop ecosystem products. You will learn the curriculum through theory lectures, live demonstrations, and lab exercises. The course is taught in the Python programming language.
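
To give a feel for the high-level Python API mentioned above, here is a minimal word-count sketch using the DataFrame API. It is an illustration only, not course material: the sample lines and the function names `word_count_plain` and `word_count_spark` are hypothetical, and the Spark version assumes a local PySpark installation with a working Java runtime.

```python
from collections import Counter

# Hypothetical sample data, invented for this illustration.
SAMPLE_LINES = ["Spark is fast", "Spark runs on YARN"]


def word_count_plain(lines):
    """Reference result computed in plain Python, without Spark."""
    return dict(Counter(word for line in lines for word in line.lower().split()))


def word_count_spark(lines):
    """Same computation with the PySpark DataFrame API (assumes PySpark is installed)."""
    # Imported lazily so the plain-Python part works without PySpark.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = (
        SparkSession.builder
        .master("local[*]")            # run locally on all available cores
        .appName("wordcount-sketch")   # hypothetical application name
        .getOrCreate()
    )
    try:
        df = spark.createDataFrame([(line,) for line in lines], ["line"])
        counts = (
            df.select(F.explode(F.split(F.lower(F.col("line")), r"\s+")).alias("word"))
            .groupBy("word")
            .count()
        )
        return {row["word"]: row["count"] for row in counts.collect()}
    finally:
        spark.stop()
```

Where PySpark is available, `word_count_spark(SAMPLE_LINES)` should produce the same dictionary as `word_count_plain(SAMPLE_LINES)`; the difference is that the Spark version can be distributed across a cluster.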

At Course Completion

  • Solve Big Data problems using the Apache Spark framework
  • Apply Spark to use cases such as ETL, data warehousing, data virtualization, streaming, graph data, and machine learning
  • Integrate Spark with other well-established Hadoop ecosystem products
  • Develop Spark applications in Python

Prerequisite

The following are the prerequisites for this course:

  • Programming knowledge of Python is required
  • Basic knowledge of Big Data use cases
  • Basic knowledge of databases, OLAP/OLTP use cases, and SQL
  • Knowledge of the Java stack (JVM) is helpful

Course Outline

  • What is Apache Spark: the story of its evolution from Hadoop
  • Advantages of Spark over Hadoop MapReduce
  • Lambda architecture for enterprise data and analytics services
  • Deployment modes: YARN, Standalone, Mesos
  • Developing on Spark using the REPL, Zeppelin, or an IDE
  • Data sources for Spark applications
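
As a rough illustration of the deployment modes listed above, the following sketch builds a `spark-submit` command line for each cluster manager. The `--master` and `--deploy-mode` options and the master URL formats are standard Spark ones; the host names, the ports, the script name `etl_job.py`, and the helper `build_submit_command` are assumptions made up for this example.

```python
import shlex

# Master URL formats accepted by spark-submit's --master option.
# Host names and ports are placeholders, not values from the course.
MASTER_URLS = {
    "local": "local[*]",                      # single machine, all cores
    "standalone": "spark://master-host:7077", # Spark standalone cluster
    "yarn": "yarn",                           # cluster located via Hadoop config
    "mesos": "mesos://mesos-host:5050",       # Apache Mesos cluster
}


def build_submit_command(mode, app_path, deploy_mode="client"):
    """Return the spark-submit argument list for the given cluster manager."""
    if mode not in MASTER_URLS:
        raise ValueError(f"unknown deployment mode: {mode!r}")
    return [
        "spark-submit",
        "--master", MASTER_URLS[mode],
        "--deploy-mode", deploy_mode,
        app_path,
    ]


# Hypothetical application script submitted to YARN in cluster mode.
print(shlex.join(build_submit_command("yarn", "etl_job.py", "cluster")))
# prints: spark-submit --master yarn --deploy-mode cluster etl_job.py
```

Only the master URL changes between cluster managers; the application code itself can stay the same.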

Lab Exercise

  • Install and get started with the VM
  • Launch the Spark REPL and Zeppelin

Course Detail

Calendar 2025

20-24 Jan

14-18 Apr

4-8 Jul

27-31 Oct

Have Any Questions?

If you need further information about this course, please contact:

Registration Course under HRDC

Registration Course with PERKESO EIS

Registration Course