Certificate in Big Data Engineering

Duration

4 Weeks

India's first PG program of its kind!

Comprehensive curriculum created by BITS & Industry Experts
5 practical industry projects, sponsored by Saavn
Industry mentors, mock interviews and career support
Offline workshops with industry, peer and faculty interactions

Program Syllabus

The curriculum has been developed by BITS faculty and leading Big Data companies. Most courses include an independent, industry-sourced project that you will deploy on AWS Cloud. The syllabus teaches end-to-end skills: a thorough understanding of fundamental concepts and the ability to think beyond tools.

Preparatory Sessions

If you don’t have previous experience in programming or databases (SQL), don't worry! By enrolling in the program, you get access to completely free pre-program preparatory sessions that will strengthen your skills in fundamental Computer Science concepts.

Topics Covered:

Object Oriented Programming (OOP) using JAVA
Data Structures
Design and Analysis of Algorithms
Relational Database Management Systems (SQL)

Prep Sessions will be available to students upon enrolment.


Foundations of Big Data Systems

Duration : 8 weeks

In this course you will be given an introduction to Big Data and its common industry applications. You will also develop important foundations in data structures and algorithms that form the basis of the Big Data Systems used in the industry.

Topics Covered:

Introduction to Big Data and its Applications
Data Abstraction
Linear data structures like Hashtables, Hashmaps, Bloom Filters
Non-linear data structures like Binary Search Trees, KD Trees
Distributed Algorithm Design
Algorithm Design using MapReduce

Course Outcomes:

You will be able to select and implement appropriate data structures to solve big data problems, and write Map and Reduce code for distributed processing of data.

Programming Language Used: Java
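
As an illustration of the "Map and Reduce" style mentioned in the course outcomes, here is a minimal word-count sketch in plain Java. It is not taken from the course material and does not use Hadoop; the class name WordCountSketch and its methods are hypothetical, chosen only to show the three phases (map, group by key, reduce) on a single machine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical, single-machine illustration of the map -> group -> reduce pattern.
public class WordCountSketch {

    // Map step: emit a (word, 1) pair for every word in one line of input.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String word : line.toLowerCase().split("\\W+")) {
            if (!word.isEmpty()) {
                pairs.add(Map.entry(word, 1));
            }
        }
        return pairs;
    }

    // Reduce step: sum all the counts that were emitted for a single word.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    // Driver: map every line, group the pairs by key, then reduce each group.
    static Map<String, Integer> wordCount(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>())
                       .add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = List.of("big data is big", "data is everywhere");
        System.out.println(wordCount(lines));
    }
}
```

In a real distributed framework the grouping step is performed by the platform across many machines; here it is simulated with an in-memory map so the sketch stays self-contained.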

Processing Big Data - ETL & Batch Processing

Duration : 7 weeks

Learn to collect and process structured and unstructured data by performing ETL operations, and use workflow management tools to automate task flows.

Topics Covered:

Performing ETL Operations
Concepts in Data Warehousing and its Relevance for Big Data
Ingesting data into Big Data Platforms using Apache Sqoop & Flume
Workflow management for Hadoop using OOZIE
Batch Processing on Cloud

Course Outcomes:

You will learn to choose and use tools to ingest structured and unstructured data into big data processing systems and use Hive to perform data transformations. You will also be able to process Big Data on Cloud using Amazon EMR and use OOZIE for managing your workflow.

Tools & Technologies Used: Sqoop, Apache Flume, Apache Hive, HBase, Amazon EMR

Processing of Real Time Data & Streaming Data

Duration : 4 weeks

Ever wondered how you receive a notification based on your location? The answer lies in exploiting Real Time & Streaming Data. This course will expose you to the exciting world of processing real time data.

Topics Covered:

Applications of Streaming Data in Industry
Sourcing Streaming data using Apache Flume
Building real-time data pipeline using Apache Storm
Streaming on Apache Spark

Course Outcomes:

You will be able to build real-time data processing systems using Apache Storm and Apache Spark.

Tools & Technologies Used: Apache Storm, Apache Flume, Apache Spark

Big Data Analytics

Duration : 5 weeks

In this course you will be introduced to the field of Big Data Analytics and learn about the Apache Spark libraries used to perform Regression, Classification and Clustering on Big Data.

Topics Covered:

Regression, Clustering & Classification using Spark MLLib
Building visualizations using Big Data
Case Studies on applications of Big Data Analytics

Course Outcomes:

You will be able to perform analytics on big data using Spark MLLib and become familiar with tools for visualizing the results.
Interested students will also have an opportunity to learn the basics of functional programming in Scala*.

Tools & Technologies Used: Spark (MLLib) and Scala*
