  1. Lesson 1. 00:05:50
    Course Objective
  2. Lesson 2. 00:03:10
    Downloading the code & Setting up Infrastructure
  3. Lesson 3. 00:05:39
    Setting up data and how to use the notebooks for this course
  4. Lesson 4. 00:25:24
    [Spark IO] Spark can read data from and write data to most systems and formats
  5. Lesson 5. 00:22:17
    [DataFrame API] is the Pythonic equivalent of Spark SQL
  6. Lesson 6. 00:08:42
    [Spark application] is made up of one driver and one or more executors
  7. Lesson 7. 00:25:09
    [Distributed data transformations] are of two types: Narrow & Wide
  8. Lesson 8. 00:21:40
    [Query plan] is how Spark plans to execute your logic
  9. Lesson 9. 00:21:58
    [Spark UI] to see statistics of how your data was processed
  10. Lesson 10. 00:23:05
    [Columnar format] is critical for large-scale data warehousing
  11. Lesson 11. 00:21:06
    [Partitioning] Splitting data into folders based on commonly filtered column(s)
  12. Lesson 12. 00:17:05
    [Bucketing] is partitioning for high-cardinality columns
  13. Lesson 13. 00:15:51
    [Coding Techniques] for Optimal Data Processing in Apache Spark
  14. Lesson 14. 00:17:14
    [Spark Configurations] for optimal data processing
  15. Lesson 15. 00:34:01
    [End-to-end data pipeline] for 50 GB Stack Overflow Data Analysis