Continuous service provisioning is a key feature of modern hardware and software server systems. These systems achieve their level of user-perceived availability through a set of formal and technical approaches, commonly summarized under the term dependability.
Dependability is defined as the trustworthiness of hardware and software systems, such that reliance can be placed on the service they provide. The most widely accepted dependability attributes are availability, reliability, safety, and security.
The Dependable Systems course introduces the theoretical foundations, common building blocks, and example implementations of dependable IT components and systems. The focus is on the reliability and availability aspects of dependable systems, such as reliability analysis, fault tolerance, fault models, and failure prediction. Among other things, the following topics are covered:
- Dependability definitions and metrics
- Design patterns for fault tolerance
- Analytical evaluation of system dependability
- Hardware dependability approaches
- Software dependability approaches
- Latest research topics

Introduction | 01:08:14 | |
---|---|---|
Dependability | 00:19:21 | |
Examples | 00:11:04 | |
More Examples | 00:13:33 | |
Importance of Dependability for Business | 00:14:31 | |
Course Topics | 00:09:45 |

Dependability Threats & Faults | 01:17:28 | |
---|---|---|
Dependability | 00:12:53 | |
System Integration Levels | 00:14:50 | |
Dependability Tree | 00:10:39 | |
Chain of Dependability Threats | 00:17:32 | |
Faults | 00:09:18 | |
Observations on Faults | 00:12:16 |

Faults & Errors | 01:11:16 | |
---|---|---|
Recap | 00:21:04 | |
Fault Characterization | 00:11:44 | |
Fault Model for Semiconductor Memories | 00:15:57 | |
Errors | 00:15:48 | |
Software Error Models | 00:06:43 |

Failures | 01:19:57 | |
---|---|---|
Recap | 00:13:20 | |
Hazard | 00:16:17 | |
Failures | 00:12:43 | |
Failure Severity | 00:14:40 | |
Observations on Failures | 00:12:42 | |
Fault Prevention | 00:10:15 |

Fault Tolerance | 01:12:47 | |
---|---|---|
Means for Dependability | 00:06:10 | |
Fault Tolerance | 00:20:51 | |
Phases of Fault Tolerance | 00:09:31 | |
Fault Tolerance - Error Detection | 00:16:51 | |
Fault Tolerance - Damage Confinement | 00:11:18 | |
Fault Tolerance - Fault Treatment | 00:08:06 |

Architectural Patterns | 01:14:33 | |
---|---|---|
Phases of Fault Tolerance | 00:12:15 | |
Units of Mitigation | 00:14:44 | |
Correcting Audits | 00:15:56 | |
Example: PostgreSQL 9 Redundancy Options | 00:06:33 | |
Humans | 00:08:52 | |
Maintenance Interface | 00:16:13 |

Detection Patterns | 01:12:23 | |
---|---|---|
Recap | 00:17:37 | |
Fault Correlation | 00:11:52 | |
Realistic Threshold | 00:17:41 | |
Voting | 00:17:59 | |
Maintenance and Exercises | 00:07:14 |

Error Recovery Patterns | 01:13:47 | |
---|---|---|
Recap | 00:03:19 | |
Error Recovery | 00:09:34 | |
Checkpoint | 00:17:27 | |
Remote Storage | 00:16:37 | |
Redundancy Configurations for Failover | 00:15:13 | |
Examples | 00:11:37 |

Dependability Attributes | 01:17:18 | |
---|---|---|
Attributes of Dependability | 00:13:37 | |
Observations on Dependability Attributes | 00:09:20 | |
Probability of Events | 00:17:18 | |
Failure Rate | 00:12:24 | |
Variable Failure Rate in Real World | 00:10:12 | |
Steady-State Availability | 00:14:27 |

Steady-State Availability | 00:30:40 | |
---|---|---|
Recap | 00:15:10 | |
Steady State Availability Example | 00:15:30 |

Dependability Modeling | 00:47:57 | |
---|---|---|
Dependability Modeling | 00:16:39 | |
Inductive or Deductive Methods | 00:16:10 | |
Parallel Case | 00:15:08 |

Fault Trees | 01:23:31 | |
---|---|---|
Recap | 00:12:06 | |
Reliability Block Diagrams | 00:16:07 | |
Deductive Analysis - Fault Trees | 00:16:00 | |
Static Fault Trees | 00:15:03 | |
Cut Sets | 00:16:38 | |
Fixing Cut Sets | 00:07:37 |

State-Based Dependability Modeling | 01:24:58 | |
---|---|---|
Recap | 00:15:53 | |
Hypothetical Example Computer System | 00:15:29 | |
Event Tree Analysis | 00:06:38 | |
State-Based Dependability Modeling | 00:08:00 | |
Markov Chains | 00:16:16 | |
Petri Nets | 00:18:10 | |
Reliability Tools | 00:04:32 |

State-Based Dependability Modeling Recap | 00:37:47 | |
---|---|---|
Markov Chains | 00:20:38 | |
Petri Nets | 00:17:09 |

FMEA | 00:36:38 | |
---|---|---|
Qualitative Dependability Investigation | 00:06:02 | |
FMEA | 00:06:21 | |
FMEA Types | 00:14:05 | |
FMEA Worksheet | 00:10:10 |

FMEA, FMECA & HAZOPS | 01:11:45 | |
---|---|---|
Recap | 00:18:55 | |
Example: System FMEA of ATM | 00:12:05 | |
Example: Severity Ranking in Automotive Industry | 00:13:02 | |
Software FMEA | 00:13:33 | |
FMECA & HAZOPS | 00:14:10 |

CMMI, Six Sigma & ITIL | 01:10:57 | |
---|---|---|
About the Assignment & Recap | 00:08:10 | |
Root Cause Analysis | 00:10:13 | |
RCA: Ishikawa/Fishbone Diagram | 00:10:06 | |
CMMI | 00:17:07 | |
Six Sigma | 00:09:42 | |
ITIL | 00:15:39 |

Reliability Prediction | 01:25:53 | |
---|---|---|
Predicting System Reliability | 00:07:06 | |
Reliability Data | 00:17:22 | |
Failure Probability Sources | 00:25:30 | |
Software Reliability Assessment | 00:11:38 | |
Jelinski-Moranda Model | 00:11:49 | |
Halstead Metric | 00:12:28 |

Distributed Systems Theory | 01:31:06 | |
---|---|---|
Motivation | 00:16:28 | |
Divide and Conquer | 00:12:21 | |
Timing Model | 00:18:07 | |
Fault Model | 00:20:21 | |
Consensus | 00:08:13 | |
Paxos | 00:15:36 |

Fault Tolerant Distributed Systems | 01:19:38 | |
---|---|---|
Recap | 00:10:31 | |
Consistency Models | 00:14:29 | |
Client-Centric Consistency Models | 00:16:30 | |
Trade-Offs | 00:13:36 | |
PACELC | 00:12:45 | |
Replication | 00:11:47 |

Dependable Distributed Applications | 00:37:34 | |
---|---|---|
Recap | 00:04:15 | |
FT-CORBA | 00:05:46 | |
Fault Tolerant Coordination Services | 00:14:31 | |
Google Spanner | 00:10:10 | |
The Reality of Distributed Failures... | 00:02:52 |

Hardware Dependability - Diagnosis & Redundancy | 01:10:58 | |
---|---|---|
Introduction | 00:11:28 | |
Recap | 00:17:46 | |
Hardware Dependability - Redundancy | 00:11:36 | |
Static Redundancy | 00:15:35 | |
Imperfect Voters | 00:14:33 |

Hardware Redundancy | 01:18:32 | |
---|---|---|
Recap | 00:11:15 | |
Graceful Degradation | 00:23:48 | |
The Real World of Hardware Redundancy | 00:09:37 | |
Memory Redundancy | 00:09:45 | |
Parity Codes | 00:12:46 | |
Checksumming | 00:11:21 |

Software Dependability | 01:15:00 | |
---|---|---|
Software Dependability | 00:07:54 | |
Fault Dependencies in the System Stack | 00:20:18 | |
Test Oracles | 00:11:48 | |
Fault Tolerant Software | 00:09:35 | |
Single-Version Approaches | 00:13:21 | |
Process Pairs | 00:12:04 |

Software Dependability & Summary | 01:24:31 | |
---|---|---|
Assignment 3 Feedback | 00:11:38 | |
Software Dependability | 00:14:56 | |
Control Flow Errors | 00:19:06 | |
NVP | 00:13:14 | |
Simplex Approach | 00:07:50 | |
Summary | 00:17:47 |