Introduction
Welcome to the Course Video
The Fundamentals
Data VS Information
Data Storage and Processing
Data Sources
Big Data Introduction
Fundamentals Assessment
Live Class on March 6 2023
The Foundations of Big Data
2.1 Emergence of Big Data
Emergence of Big Data
Basic Terminologies
Foundations Assessment 1
2.2 Central Theme of Big Data
Central Theme of Big Data
Requirements of Programming Model
Understand Distributed Processing through a Story
Foundations Assessment 2
LiveClassMarch82023
LiveClassMarch152023
Environment and Installations
OracleVMInstallation
1Oracle_VM_Installation_1
Google Cloud Platform Setup
How to install Ubuntu operating system on Virtual box
How to install PySpark on Ubuntu with Java and Python_3
How to configure Pyspark with Pycharm_with_Installation
Google Cloud Platform Setup
Hadoop Ecosystem
3.1 Introduction to Hadoop Ecosystem
Introduction to Hadoop Ecosystem
Hadoop Ecosystem Assessment 1
3.2 Hadoop Distributed File System
What is HDFS?
Nodes in HDFS
LiveClassMarch172023
HDFS Assessment1
Storing File in HDFS
Reading File from HDFS
HDFS Assessment2
Challenges in Distributed Systems
LiveclassMarch232023
Managing the Data Node Failure
LiveClassMarch242023
HDFS Assessment3
Managing Name Node Failure
LiveclassMarch272023
HDFS Commands Part 1
HDFS Commands Part2
LiveclassMarch282023
HDFS Assessment 4
3.3 Map Reduce
Introduction to Map Reduce
Map Reduce Flow Example 1
Map Reduce Implementation
LiveClassMarch292023
Map Reduce Example 2 - User View Count
Map Reduce Mappers and Reducers
MapReduce Assessment 1
Shuffle-Sort-Partitions
LiveClassMarch302023
Map Reduce Combiners
Combiner with Caution
Map Reduce Wrap Up
MapReduce Assessment 2
3.4 Hive
Transactional and Analytical Processing
What is Data warehouse?
Introducing Hive
LiveclassMarch312023
Hive Assessment 1
Hive Hands-on1
Hive Hands-on2
Hive Hands-On Assessment 1
Hive vs RDBMS
LiveClassApril32023
Hive Architecture
LiveClassApril042023
Hive Metastore
Hive Assessment 2
Hive Hands-On 3
Hive Hands_on Assessment 2
Primitive Datatypes in Hive
How storage works in Hive
Different types of Tables in Hive
LiveApril52023
Hive Assessment 3
Hive Hands-on4
Hive Hands-on5
Hive Hands-on Assessment3
Inserting the Data into Hive Tables
Hive Hands-on6-Inserting data into Tables
LiveclassApril62023
LiveClassApril102023
Hive Complex Datatypes
LiveclassApril112023
Hive User Defined Functions
LiveclassApril122023
Hive Assessment 4
Hive Hands-on7 _Complex Datatype
Hive Hands-on Retrieving Elements from Complex data types columns and Explode
Hive Hands_on Assessment 4
Denormalized Storage in Hive
Hive Optimization Of The Queries Theory
Hive Partition & Bucketing Theorypart1
LiveClassApril132023
Hive Partition & Bucketing Theory Part2
Hive Assessment 5
Hive Hands-on9- Partitioning Part1
LiveClassApril142023
Hive Hands on10 Partitioning Part 2
Hive Hands on11-Bucketing
LiveClassApril172023
Hive Hands Assessment-5
Python for PySpark
Introduction to Programming
Introduction to Programming
LiveClassApril182023IntrotoProgramming
Python Programming
Introduction to Python
Environment for Python
Executing Python Code
LiveclassApril192023PythonIntro1
Python Assessment 1
Syntax, Indentation and Comments
Syntax, Indentation and Comments - Practical
LiveClassApril202023PythonIndentationVariables
Variables
Variable Practicals
Python Datatypes
LiveClassApril212023PythonDataTypes
Python Datatypes Practicals
Python Assessment 2
Python Operator Concepts
Python Operator Practicals
LiveClassApril252023PythonOperators
Control Flows in Python
LiveClassApril262023PythonControlFlowIntro
Control Flows - IF ELSE Concepts
If Else Practical
Loops Theory
Loops Practical
LiveclassApril272023PythonControlFlow
Python Assessment 3
Python Function Concepts
Python Function Hands-on
Apache Spark
Introduction to Spark
Why Spark?
Advantages of Spark
LiveClassMay42023WhySparkandAdvantagesofSpark
What is Spark?
Components of Spark
LiveclassMay52023WhatisSpark
History of Spark
Introduction to Spark Assessment1
Overview of the Spark
Architecture of Spark
LiveClassMay82023ArchitectureofSpark
Spark Session
Spark Session Terminal & Jupyter notebook Hands-On
Spark Language API
Overview of the Spark Assessment1
Dataframes and Partitions
How to Create Dataframe in Terminal and in Jupyter Notebook?
Spark Transformations
Spark Actions
Overview of the Spark Assessment2
Structured API Overview
Structured APIs - Dataframes and Datasets
Schema Definition
Spark Types
Structured API Execution
Structured API Overview Assessment1
Operations on Dataframes
Dataframe Columns
Columns as Expression
Dataframe Rows
Operations on Dataframe Assessment1
Ways of Creating Dataframe
Methods to Manipulate Columns
DataFrame Transformations
Operations on Dataframe Assessment2
Dataframe Transformation - Columns
Dataframe Transformations - Rows Part1
Dataframe Transformation - Rows Part2
Operations on Dataframe Assessment3
Working with Different Types of Data
Introduction to working with Different Types of Data
Working with Booleans
Working with Numbers
Working with Strings
Working with Strings Practical1
Working with String Practical2
Introduction to working with Different Types of Data Assessment 1
Working with Date and Time Stamps
Working with Null Concepts
Working with Nulls Practicals
Working with Complex Types
Working with Complex types practical
User Defined Functions - Concepts
User Defined Functions - Practicals
Introduction to working with Different Types of Data Assessment 2
Creating Dataframes from different sources
Data Sources Introduction
Read-API- Data Sources
Read-API-Practical
Write-API-Data Sources
Write-API-Practical
Creating Dataframes from different sources Assessment 1
Reading from CSV Files
Writing into CSV Files
Reading from JSON Files and Writing into JSON
Reading from Parquet and writing into Parquet
Reading from ORC and writing into ORC
Unstructured Data - Text File - Reading and Writing
Introduction to reading data from structured sources
Reading data from structured sources - Database - Concepts
Reading data from structured sources - Database - Practicals
Query Pushdown Concepts
Query Pushdown Practicals
Writing into structured sources - Database - Concepts
Writing into structured sources - Database - Practicals
Creating Dataframes from different sources Assessment 2
Aggregations
Introduction to Aggregations
Aggregation Concepts - Count
Aggregation_Practical-1-Count
Aggregation Concepts - First, Sum and Average
Aggregation - Practical - 2FirstLastAverage
Aggregation Assessment 1
Aggregation concepts - Statistical Functions
Aggregation-Practical-3-StatisticalFunctions
Aggregation Concepts - Grouping
Aggregation-Practical-4-GroupBy
Aggregation Concepts - Window Functions
Aggregation-Practical-5-WindowFunctions
Aggregation Concepts - RollUp and Cube
Aggregation-Practical-6-RollupandCube
Aggregation Assessment 2
Spark Joins
Spark Joins Theory-1-Introduction
Spark Joins Theory-2-How Joins Work
Spark Joins-Theory-3-Inner Joins
Spark Joins -Practical -1-Innerjoins
Spark Joins - Theory-4 - Outer Joins
Spark Joins -Practical - OuterJoins
Spark Joins -Theory - 5-Left Semi & Anti Joins
Spark Joins - Practical - LeftSemiAntiJoins
Spark Joins -Theory -6-CrossJoin
Spark Joins - Practical- CrossJoins
Spark Joins -Theory -7-ChallengesInJoins
Spark Joins-5-Practical-TacklingtheChallengesinJoins
Spark Joins -Theory -8-CommunicationStrategies
Joins Assessment
Resilient Distributed Datasets-RDDs
What is an RDD?
Introduction to Low Level APIs
Properties Of RDD
When to use RDDs
Creating RDDs
RDD Practical-1-Creating RDDs
RDD Assessment 1
RDD Lineage
RDD Transformations
RDD - Transformations Practical
RDD Actions
RDD Actions - Practical
RDD Saving To File
RDD Saving to a File - Practical
RDD Assessment 2
Distributed Variables
Distributed Variables - Introduction
Broadcast Variables
Broadcast Variables - Practical
Accumulators
Accumulators - Practical
Distributed Variables Assessment
How Spark runs on a Cluster
Introduction
How Spark runs on a Cluster - ClusterManager
How Spark runs on a Cluster - ExecutionModes
Life Cycle of a Spark Application - Outside Spark
Life Cycle of a Spark Application - Inside Spark
How Spark runs on a Cluster Assessment
LiveclassMay92023PySparkSparkSession
LiveclassMay112023PysparkTransformations
LiveclassMay122023PySparkActions
LiveclassMay152023SparkStructuredAPIDatatypes
LiveclassMay162023PySparkLogicalPhysicalCatalystOptimizer
LiveclassMay182023PySparkColumnsandRows
LiveclassMay192023PysparkCreatingDataframes
LiveclassMay222023PySparkColumnManipulation
LiveClassMay242023PySparkRowTransformationsSort
LiveclassMay252023PySparkBooleans
LiveclassMay262023PySparkNumbers&Spaces
LiveclassMay292023PySparkStringManipulationDate
LiveclassMay30PySparkNullpracticals&ComplexDataTypes
LiveclassJune12023PySparkComplexTypesPracticalUDFTheory
LiveclassJune22023PySparkUDFPracticals
LiveclassJune52023PySparkDataSources1
LiveclassJune072023PySparkWritemode
LiveclassJune082023PySparkCSVJSONParquet
LiveclassJune092023DataSourceTextFileandplans
LiveclassJune132023PySparkReadingfromDatabase
LiveClassJune142023PySparkWritingtoDBandAggregationINtro
LiveClassJune162023PySparkAggregationsGroupBY
LiveclassJune192023PySparkWindowRollUPCube
LiveclassJune202023PySparkJOins1
LiveClassJune212023PysparkJoins2
LiveClassJune272023PySparkCommunicationStrategies
LiveClassJune282023PySparkJoinStrategiesHandson
LiveclassJuly032023PySparkJoinStrategyHints
LiveClassJuly42023PySparkRDD1
LiveClassJuly52023PySparkRDD2Transformation
LiveClassJuly062023PySparkRDDActions
LiveClassJuly72023PySparkDistributedVariables
SparkExecutionModes
Feb 5 2024 Batch Live Videos
IntrotoDataFeb52024
Feb7WhatisBigData
3rdClassFebBigDataTerminologies
CentralThemeofBigDataFeb122024
PastaStoryHadoopEcosystemIntro
HDFS1NameNodeDataNode
TacklingtheChallengesofDataNodeandNamenodeFailure
MapReducePart1
MapReducePart2Reducers&Combiners
TransactionalVsAnalyticalDatawarehouseintrotohive
IntroductiontoProgrammingLanguage
PythonIntroductionVariables
PythonIntrotoDatatypes
PythonDataTypesPracticalOperatorsConcepts
Python-ControlFlows
PythonFunction
SparkIntroduction
SparkArchitecture
SparkSession&Dataframecreation
SparkTransformationandActions
SparkTypesandSchema
StructuredAPIExecutionandLogicalandPhysicalPlan
ColumnsRowsandCreatingtheDataframes
WorkingwithDataframes
ColumnandRowManipulations
RowsSortandUnion
DifferentTypesofData-BooleanandNumericals
SparkStringManipulations
SparkDates
SparkHandlingNull
SparkHandlingComplexDataTypes
SparkUDfs
SparkPythonUDFProcessandDataFramereader
video1972650244
SparkReadingCSVRepartition&coalesce
SparkJSONParquetORCandtextfiles
SparkReadingDataRDBMS
SparkPushedDownQueryandWritingDataintoRDBMS
SparkAggregation1
SparkAggregations2
SparkAggregationsGroupBy
SparkAggregationsWindowFunctions
SparkAggregationsRollUpandCube
SparkJoins1
Sparkjoins2
SparkChallengesinJoins
SparkJoinsCommunicationStrategies
SparkRDD1
Spark-RDDManipulations
SparkRDDTransformationsandActions
SparkRDDsWritingintoafile
Spark-DistributedVariables
SparkExecutionModes
SparkLifeCycleoutsideandinside
SparkPerformancetuning-Caching
SparkTuningCachingPersistenceHands-on
SparkPTJoinsHints
SparkTuningCoalesceHints
IntrotoPandaNumpy&Matplot
SparkPerformanceTuning2
PerformanceTuning_Hands-on1
PerformanceTuningHandsOnColumnPrunRowfilter
PerformanceTuningSparkPartitioning
SparkBucketingPerformanceTuning
AQE-Intro
SparkPerformanceTuningAQEConcepts
SparkPerformanceTuningAQEHands-on
PASS-CustomBatch
PassbatchNamenodefailure
APSSRequirementsofprogrammingmodel
APSSMapReduceParallelism
APSS
PASSclassDatawarehouse
PASSDataWarehouseSCDs
PASSHive
PASSWhySparkandWhatisSpark
PASS-SparkArchitectureSessionTransformation
PASSSparkActions-StructuredAPIs
PASS-SparkSchemaLogicalandPhysicalPlan
PASS-SparkTypeSafeExplainmodesColsRows
PASSSparkColsManipulation
PASSSparkSelectcolexpr
PASSSparkColManipulations2
PASS-SparkROwManipulation1
PASSSparkRowManipulations
PASSSparkUnion
PASSDoubtsClearing
PASSBoolean
PASSSparkNumbers
PASS-SPARKWOrkingwithStrings
Preview - Big Data - Hadoop & PySpark