Apache Hadoop, Spark, Scala

Hadoop is a great Big Data solution, but it is not the only one

Course Details

Course Title: Apache Hadoop, Spark, Scala
Course Duration: 4 Weeks
Faculty: Sravya

Brief

Apache Hadoop is a free, open source, Java-based programming framework. Hadoop’s core components include MapReduce and the Hadoop Distributed File System (HDFS). The Hadoop ecosystem includes Pig, Hive, HBase, ZooKeeper, Oozie, Sqoop, and Flume. Hadoop is a great Big Data solution, but it is not the only one. The characteristics of Big Data are Volume, Velocity, Variety, and Veracity. Spark is a technology that is revolutionizing the analytics and big data world. It is an open source processing engine built around speed, ease of use, and analytics. If you have large amounts of data that require low-latency processing that a typical MapReduce program cannot provide, Spark is the way to go.
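To give a feel for the MapReduce model mentioned above, here is a minimal sketch of a word count written in plain Python. It does not use Hadoop or Spark; the function names (map_phase, shuffle, reduce_phase, word_count) and the sample input are assumptions made for illustration only, and a real job would run these phases in parallel across a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    # Hypothetical sample input, not taken from the course material.
    sample = ["big data big clusters", "data at rest"]
    print(word_count(sample))
```

The sketch keeps every step in memory on one machine. In Hadoop, intermediate results are typically written to disk between phases, which is one reason Spark's in-memory processing can offer lower latency.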

Topics Covered

  • Introduction to Hadoop
  • Hadoop Architecture
  • Administration
  • Key Components
  • Introduction to Spark
  • Resilient Distributed Datasets (RDDs) and DataFrames
  • Spark Application Programming
  • Spark Configuration
  • Monitoring & Tuning
  • Introduction to Scala
  • Creating Scaladoc
  • Project
  • REPL
  • Case Objects and Classes
  • Collections
  • Idiomatic Scala

Covers details on how to become:

  • Certified Hadoop Professional
  • Certified Hadoop Administrator
  • Certified Spark Professional
  • Certified Scala Professional