Apache Spark, a significant component of the Hadoop ecosystem, is a cluster computing engine for Big Data workloads. Building on Hadoop YARN and HDFS, it offers order-of-magnitude faster processing than Map/Reduce for many in-memory computing tasks.
Unlimited Duration
March 4, 2021
This course provides hands-on instruction in the practical use of leading-edge data science technologies, focused on Spark and related tools. In this course you will learn about:
· The essentials of Spark architecture and applications
· How to execute Spark Programs
· How to create and manipulate both RDDs (Resilient Distributed Datasets) and Spark 2 unified DataFrames
· How to persist and restore data frames
· Essential NoSQL access
· How to integrate machine learning into Spark applications
· How to use Spark Streaming and Kafka to create streaming applications
Course Curriculum
-
- Hadoop Ecosystem
- Hadoop YARN vs. Mesos
- Spark vs. Map/Reduce
- Spark with Map/Reduce: Lambda Architecture
- Spark in the Enterprise Data Science Architecture
-
- Spark Shell
- RDDs: Resilient Distributed Datasets
- Data Frames
- Spark 2 Unified DataFrames
- Spark Sessions
- Functional Programming
- Spark SQL
- MLlib
- Structured Streaming
- Spark R
- Spark and Python
- Coding with RDDs
- Transformations
- Actions
- Lazy Evaluation and Optimization
- RDDs in Map/Reduce
- Spark Sessions
- Running Applications
- Logging
- Spark Streaming
- Map/Reduce and Lambda Integration
- Camel Integration
- Drools and Spark
- Spark Packages
- Spark SQL
- SQL and DataFrames
- Spark SQL and Hive
- Spark SQL and JDBC
- Using Web Notebooks (Zeppelin, Jupyter)
- R on Spark
- Python on Spark
- Scala on Spark
- Monitoring Spark Performance
- Tuning Memory
- Tuning CPU
- Tuning Data Locality Troubleshooting