Apache Hadoop, Spark, Scala

Hadoop is a great Big Data solution, but it's not the only one

Course Details

Course Title: Apache Hadoop, Spark, Scala
Course Duration: 4 Weeks
Faculty: Sravya


Apache Hadoop is a free, open source, Java-based programming framework. Its core components are MapReduce and the Hadoop Distributed File System (HDFS). The Hadoop ecosystem includes Pig, Hive, HBase, ZooKeeper, Oozie, Sqoop, and Flume. Hadoop is a great Big Data solution, but it is not the only one. Big Data is characterized by Volume, Velocity, Variety, and Veracity. Spark is a technology that is revolutionizing the analytics and big data world: an open source processing engine built around speed, ease of use, and analytics. If you have large amounts of data that require low-latency processing that a typical MapReduce program cannot provide, Spark is the way to go.
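As a rough illustration of the MapReduce model named above, the following minimal sketch runs the map, shuffle, and reduce phases of a word count in memory, in plain Python. It does not use Hadoop or Spark, and the function names (map_phase, shuffle, reduce_phase, word_count) and the sample lines are hypothetical, chosen only for this example.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values collected for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical sample input; a real job would read from a distributed file system.
sample = ["big data big ideas", "data at scale"]
counts = word_count(sample)
print(counts)
```

In a real cluster the map and reduce steps run in parallel on many machines and the shuffle moves data across the network; this sketch only shows how the three phases fit together.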

Topics Covered

Introduction to Apache Hadoop
  • Covers Introduction to Hadoop
  • Hadoop Architecture
  • Administration
  • Key Components
Introduction to Spark
  • Covers Introduction to Spark
  • Resilient Distributed Dataset and Data Frames
  • Spark Application Programming
  • Spark Configuration
  • Monitoring & Tuning
Introduction to Scala
  • Covers Introduction to Scala
  • Creating Scala Doc
  • Project
  • REPL
  • Case Objects and Classes
  • Collections
  • Idiomatic Scala

Covers details on how to become:

  • Certified Hadoop Professional
  • Certified Hadoop Administrator
  • Certified Spark Professional
  • Certified Scala Professional