Course

Designing Big Data Applications - Foundations


Formerly Hadoop: Distributed Processing of Big Data

Big Data platforms are distributed systems that process large amounts of data across clusters of servers. They are used across industries, from internet startups to established enterprises. In this comprehensive introductory course, you will get up to speed on current Big Data platforms and gain insight into cloud-based Big Data architectures. We will cover Hadoop, Spark, and SQL-based Big Data platforms such as Hive.

The first half of the course gives an overview of the MapReduce and Spark frameworks. You will learn how to write MapReduce and Spark jobs and how to optimize data processing applications. The second half of the course covers SQL-based tools for Big Data, using Hive to build ETL jobs. The course also introduces the fundamentals of HBase, the Hadoop NoSQL database, and of Apache Kafka.

The course consists of interactive lectures, hands-on labs in class, and take-home practice exercises. Upon completion of this course, you will have a strong understanding of the tools used to build Big Data applications with MapReduce, Spark, and Hive.

Topics Include:

  • Big Data applications architecture
  • Understanding the Hadoop Distributed File System (HDFS)
  • How the MapReduce framework works
  • Introduction to HBase (Hadoop NoSQL database)
  • Introduction to Apache Kafka
  • Developing MapReduce applications
  • Introduction to Spark and SparkSQL
  • Developing Spark/SparkSQL applications
  • Managing tables and query development in Hive
  • Introduction to data pipelines

Note(s): This course uses Cloudera Hadoop. Students are required to bring a laptop with a 64-bit CPU and a minimum of 8 GB of memory to class.

Skills Needed: Basic SQL skills and the ability to write simple programs in a modern programming language are required. An understanding of databases and of parallel or distributed computing is helpful.

Sections Open for Enrollment:

Open Sections and Schedule
Start / End Date          Units  Location   Cost  Instructor
10-05-2019 to 12-14-2019  3.0    CLASSROOM  $960  Marilson Campos

Schedule

Date             Start Time  End Time    Meeting Type                     Location
Sat, 10-05-2019  9:00 a.m.   12:00 p.m.  Classroom with Online Materials  SANTA CLARA
Sat, 10-12-2019  9:00 a.m.   12:00 p.m.  Classroom with Online Materials  SANTA CLARA
Sat, 10-19-2019  9:00 a.m.   12:00 p.m.  Classroom with Online Materials  SANTA CLARA
Sat, 10-26-2019  9:00 a.m.   12:00 p.m.  Classroom with Online Materials  SANTA CLARA
Sat, 11-02-2019  9:00 a.m.   12:00 p.m.  Classroom with Online Materials  SANTA CLARA
Sat, 11-09-2019  9:00 a.m.   12:00 p.m.  Classroom with Online Materials  SANTA CLARA
Sat, 11-16-2019  9:00 a.m.   12:00 p.m.  Classroom with Online Materials  SANTA CLARA
Sat, 11-23-2019  9:00 a.m.   12:00 p.m.  Classroom with Online Materials  SANTA CLARA
Sat, 12-07-2019  9:00 a.m.   12:00 p.m.  Classroom with Online Materials  SANTA CLARA
Sat, 12-14-2019  9:00 a.m.   12:00 p.m.  Classroom with Online Materials  SANTA CLARA