The Lecture Series on Practical Data Engineering will feature presentations on data engineering by distinguished speakers from industry and academia. The talks cover the application of data engineering in industrial settings as well as system-level advances in data engineering research.

Introduction | 00:55:03 | |
---|---|---|
Introduction | 00:23:54 | |
Course Logistics | 00:31:09 |

Recommending Tourist Activities - Data Science Challenges and the Needs for Data Pipelines | 01:22:17 | |
---|---|---|
Introduction | 00:08:49 | |
What's a Typical Data Science Challenge? | 00:26:29 | |
What does this have to do with Data Pipelines? | 00:18:43 | |
Data Pipeline Examples | 00:13:30 | |
Summary, Q&A | 00:14:46 |

Towards Interactive Data Analytics | 01:08:16 | |
---|---|---|
Introduction | 00:09:13 | |
Visual Analysis & Interactive Data Exploration Accelerator | 00:14:54 | |
NL Analysis | 00:16:08 | |
CPU, GPU, FPGA & Main Memory, NVM | 00:25:20 | |
Summary | 00:02:41 |

File Metadata Management in Snowflake | 00:44:19 | |
---|---|---|
Introduction | 00:10:35 | |
DMLs | 00:05:28 | |
Select and Pruning, Time Travel, Cloning | 00:08:24 | |
Automatic Clustering | 00:19:52 |

Apache Flink - An Introduction and Outlook into the Future | 01:23:05 | |
---|---|---|
Overview | 00:11:50 | |
Use Case & Users | 00:13:30 | |
Stream Processing | 00:29:45 | |
Event-Time Processing | 00:17:21 | |
Ecosystem | 00:10:39 |

Deep Earth Query - Advances in Remote Sensing Image Characterization and Indexing from Massive Archives | 01:02:20 | |
---|---|---|
Introduction | 00:17:10 | |
Kernel-based Hashing Methods | 00:10:54 | |
Multi-Code Hashing | 00:16:57 | |
BigEarthNet | 00:17:19 |

Scale-In, Then Scale-Out - MPP Postgres Database with FPGA Acceleration | 01:03:46 | |
---|---|---|
Introduction | 00:05:46 | |
HW Accelerators Introduction | 00:08:55 | |
HW Accelerators and DB Processing | 00:06:23 | |
Scale-In Before You Scale-Out | 00:26:29 | |
Database Systems with Swarm64 Data Accelerator | 00:16:13 |

Dimensions of Hardware Parallelism & Exploiting Them for Data-Intensive Systems | 00:57:45 | |
---|---|---|
Types of Hardware Parallelism | 00:10:35 | |
Implicit Parallelism | 00:20:51 | |
Explicit Parallelism | 00:19:31 | |
Adding Heterogeneity | 00:06:48 |

SAP HANA - Software Development Process | 00:46:30 | |
---|---|---|
Introduction | 00:04:51 | |
Development Process | 00:33:17 | |
Team Setup | 00:08:22 |

A PL & Compiler View on Data Management and ML Systems | 01:24:37 | |
---|---|---|
Introduction | 00:15:44 | |
How do Query Engines Work? | 00:26:53 | |
Does it Scale? | 00:19:12 | |
TensorFlow -> TensorFlare | 00:22:48 |

Data Cleaning | 00:52:39 | |
---|---|---|
Introduction | 00:04:13 | |
Error Detection Algorithms | 00:17:08 | |
Performance of Semi-supervised Aggregation | 00:17:31 | |
Holistic Data Cleaning | 00:13:47 |