  • Lesson 1. 00:01:35
    The Data Engineering Bootcamp: Zero to Mastery
  • Lesson 2. 00:04:17
    Introduction to Data Engineering
  • Lesson 3. 00:04:43
    Who Are Data Engineers?
  • Lesson 4. 00:03:19
    Prerequisites
  • Lesson 5. 00:01:19
    Source Code for This Bootcamp
  • Lesson 6. 00:04:38
    Plan for This Bootcamp
  • Lesson 7. 00:06:37
    [Optional] What Is a Virtualenv?
  • Lesson 8. 00:11:03
    [Optional] What Is Docker?
  • Lesson 9. 00:04:08
    Introduction
  • Lesson 10. 00:03:44
    Apache Spark
  • Lesson 11. 00:04:24
    How Spark Works
  • Lesson 12. 00:07:41
    Spark Application
  • Lesson 13. 00:06:43
    DataFrames
  • Lesson 14. 00:05:51
    Installing Spark
  • Lesson 15. 00:07:02
    Inside Airbnb Data
  • Lesson 16. 00:07:05
    Writing Your First Spark Job
  • Lesson 17. 00:02:16
    Lazy Processing
  • Lesson 18. 00:01:29
    [Exercise] Basic Functions
  • Lesson 19. 00:06:41
    [Exercise] Basic Functions - Solution
  • Lesson 20. 00:04:00
    Aggregating Data
  • Lesson 21. 00:04:40
    Joining Data
  • Lesson 22. 00:06:10
    Aggregations and Joins with Spark
  • Lesson 23. 00:05:09
    Complex Data Types
  • Lesson 24. 00:00:50
    [Exercise] Aggregate Functions
  • Lesson 25. 00:05:54
    [Exercise] Aggregate Functions - Solution
  • Lesson 26. 00:03:25
    User Defined Functions
  • Lesson 27. 00:06:14
    Data Shuffle
  • Lesson 28. 00:03:42
    Data Accumulators
  • Lesson 29. 00:07:39
    Optimizing Spark Jobs
  • Lesson 30. 00:04:29
    Submitting Spark Jobs
  • Lesson 31. 00:05:16
    Other Spark APIs
  • Lesson 32. 00:04:33
    Spark SQL
  • Lesson 33. 00:02:10
    [Exercise] Advanced Spark
  • Lesson 34. 00:05:26
    [Exercise] Advanced Spark - Solution
  • Lesson 35. 00:03:08
    Summary
  • Lesson 36. 00:04:26
    Introduction
  • Lesson 37. 00:09:08
    What Is a Data Lake?
  • Lesson 38. 00:07:47
    Amazon Web Services (AWS)
  • Lesson 39. 00:05:45
    Simple Storage Service (S3)
  • Lesson 40. 00:09:29
    Setting Up an AWS Account
  • Lesson 41. 00:03:24
    Data Partitioning
  • Lesson 42. 00:07:49
    Using S3
  • Lesson 43. 00:02:59
    EMR Serverless
  • Lesson 44. 00:02:52
    IAM Roles
  • Lesson 45. 00:08:49
    Running a Spark Job
  • Lesson 46. 00:07:41
    Parquet Data Format
  • Lesson 47. 00:05:32
    Implementing a Data Catalog
  • Lesson 48. 00:06:42
    Data Catalog Demo
  • Lesson 49. 00:04:00
    Querying a Data Lake
  • Lesson 50. 00:03:39
    Summary
  • Lesson 51. 00:05:53
    Introduction
  • Lesson 52. 00:05:19
    What Is Apache Airflow?
  • Lesson 53. 00:03:15
    Airflow’s Architecture
  • Lesson 54. 00:06:33
    Installing Airflow
  • Lesson 55. 00:08:03
    Defining an Airflow DAG
  • Lesson 56. 00:03:38
    Errors Handling
  • Lesson 57. 00:04:54
    Idempotent Tasks
  • Lesson 58. 00:04:58
    Creating a DAG - Part 1
  • Lesson 59. 00:04:42
    Creating a DAG - Part 2
  • Lesson 60. 00:04:09
    Handling Failed Tasks
  • Lesson 61. 00:04:31
    [Exercise] Data Validation
  • Lesson 62. 00:03:27
    [Exercise] Data Validation - Solution
  • Lesson 63. 00:03:02
    Spark with Airflow
  • Lesson 64. 00:07:39
    Using Spark with Airflow - Part 1
  • Lesson 65. 00:05:52
    Using Spark with Airflow - Part 2
  • Lesson 66. 00:04:46
    Sensors In Airflow
  • Lesson 67. 00:04:08
    Using File Sensors
  • Lesson 68. 00:05:50
    Data Ingestion
  • Lesson 69. 00:06:03
    Reading Data From Postgres - Part 1
  • Lesson 70. 00:05:40
    Reading Data from Postgres - Part 2
  • Lesson 71. 00:03:53
    [Exercise] Average Customer Review
  • Lesson 72. 00:04:33
    [Exercise] Average Customer Review - Solution
  • Lesson 73. 00:04:26
    Advanced DAGs
  • Lesson 74. 00:02:27
    Summary
  • Lesson 75. 00:05:28
    Introduction
  • Lesson 76. 00:06:06
    What Is Machine Learning
  • Lesson 77. 00:05:38
    Regression Algorithms
  • Lesson 78. 00:05:04
    Building a Regression Model
  • Lesson 79. 00:09:46
    Training a Model
  • Lesson 80. 00:07:26
    Model Evaluation
  • Lesson 81. 00:03:57
    Testing a Regression Model
  • Lesson 82. 00:02:12
    Model Lifecycle
  • Lesson 83. 00:08:44
    Feature Engineering
  • Lesson 84. 00:07:34
    Improving a Regression Model
  • Lesson 85. 00:03:56
    Machine Learning Pipelines
  • Lesson 86. 00:02:41
    Creating a Pipeline
  • Lesson 87. 00:01:59
    [Exercise] House Price Estimation
  • Lesson 88. 00:03:12
    [Exercise] House Price Estimation - Solution
  • Lesson 89. 00:02:57
    [Exercise] Imposter Syndrome
  • Lesson 90. 00:07:37
    Classification
  • Lesson 91. 00:04:27
    Classifiers Evaluation
  • Lesson 92. 00:08:31
    Training a Classifier
  • Lesson 93. 00:08:06
    Hyperparameters
  • Lesson 94. 00:03:02
    Optimizing a Model
  • Lesson 95. 00:02:34
    [Exercise] Loan Approval
  • Lesson 96. 00:02:33
    [Exercise] Loan Approval - Solution
  • Lesson 97. 00:06:56
    Deep Learning
  • Lesson 98. 00:03:23
    Summary
  • Lesson 99. 00:05:07
    Introduction
  • Lesson 100. 00:06:11
    Natural Language Processing (NLP) before LLMs
  • Lesson 101. 00:06:21
    Transformers
  • Lesson 102. 00:07:40
    Types of LLMs
  • Lesson 103. 00:02:19
    Hugging Face
  • Lesson 104. 00:10:38
    Databricks Set Up
  • Lesson 105. 00:07:36
    Using an LLM
  • Lesson 106. 00:03:42
    Structured Output
  • Lesson 107. 00:05:10
    Producing JSON Output
  • Lesson 108. 00:05:20
    LLMs With Apache Spark
  • Lesson 109. 00:02:48
    Summary
  • Lesson 110. 00:06:06
    Introduction
  • Lesson 111. 00:07:00
    What Is Apache Kafka?
  • Lesson 112. 00:08:56
    Partitioning Data
  • Lesson 113. 00:07:42
    Kafka API
  • Lesson 114. 00:03:15
    Kafka Architecture
  • Lesson 115. 00:05:53
    Set Up Kafka
  • Lesson 116. 00:06:07
    Writing to Kafka
  • Lesson 117. 00:07:37
    Reading from Kafka
  • Lesson 118. 00:06:39
    Data Durability
  • Lesson 119. 00:02:11
    Kafka vs Queues
  • Lesson 120. 00:03:44
    [Exercise] Processing Records
  • Lesson 121. 00:02:59
    [Exercise] Processing Records - Solution
  • Lesson 122. 00:05:53
    Delivery Semantics
  • Lesson 123. 00:04:34
    Kafka Transactions
  • Lesson 124. 00:03:23
    Log Compaction
  • Lesson 125. 00:06:59
    Kafka Connect
  • Lesson 126. 00:09:44
    Using Kafka Connect
  • Lesson 127. 00:04:31
    Outbox Pattern
  • Lesson 128. 00:08:01
    Schema Registry
  • Lesson 129. 00:08:10
    Using Schema Registry
  • Lesson 130. 00:03:28
    Tiered Storage
  • Lesson 131. 00:04:27
    [Exercise] Track Order Status Changes
  • Lesson 132. 00:05:06
    [Exercise] Track Order Status Changes - Solution
  • Lesson 133. 00:04:41
    Summary
  • Lesson 134. 00:05:40
    Introduction
  • Lesson 135. 00:05:24
    What Is Apache Flink?
  • Lesson 136. 00:08:11
    Kafka Application
  • Lesson 137. 00:03:11
    Multiple Streams
  • Lesson 138. 00:05:46
    Installing Apache Flink
  • Lesson 139. 00:07:22
    Processing Individual Records
  • Lesson 140. 00:04:02
    [Exercise] Stream Processing
  • Lesson 141. 00:02:40
    [Exercise] Stream Processing - Solution
  • Lesson 142. 00:06:49
    Time Windows
  • Lesson 143. 00:02:40
    Keyed Windows
  • Lesson 144. 00:05:18
    Using Time Windows
  • Lesson 145. 00:10:06
    Watermarks
  • Lesson 146. 00:06:17
    Advanced Window Operations
  • Lesson 147. 00:07:50
    Stateful Stream Processing
  • Lesson 148. 00:04:42
    Using Local State
  • Lesson 149. 00:04:35
    [Exercise] Anomalies Detection
  • Lesson 150. 00:03:34
    [Exercise] Anomalies Detection - Solution
  • Lesson 151. 00:05:50
    Joining Streams
  • Lesson 152. 00:03:10
    Summary
  • Lesson 153. 00:01:18
    Thank You!