Continuous service provisioning is a key feature of modern hardware and software server systems. These systems achieve their level of user-perceived availability through a set of formal and technical approaches, commonly summarized under the term dependability.
Dependability is defined as the trustworthiness of hardware and software systems, so that reliance can be placed on the service they provide. The main dependability attributes commonly known and accepted are availability, reliability, safety, and security.
The Dependable Systems course gives an introduction to the theoretical foundations, common building blocks, and example implementations of dependable IT components and systems. The focus is on reliability and availability aspects of dependable systems, such as reliability analysis, fault tolerance, fault models, and failure prediction.
Introduction to the Course | 01:28:57 | |
---|---|---|
Introduction | 00:08:48 | |
Aspects of Dependability | 00:15:11 | |
Dependability Examples | 00:04:40 | |
Computers in Safety-Critical Systems | 00:10:09 | |
Famous Examples | 00:28:05 | |
Tradeoffs | 00:22:04 |
Basic Definitions and Faults | 01:19:08 | |
---|---|---|
Dependability | 00:13:49 | |
Dependability Stakeholders | 00:17:08 | |
Chain of Dependability Threats | 00:10:55 | |
Observations on Faults | 00:15:51 | |
Physical Faults | 00:16:03 | |
Fault Hierarchy in Distributed Systems | 00:05:22 |
Basic Definitions and Errors | 01:06:52 | |
---|---|---|
Errors | 00:09:38 | |
Hardware Error Models | 00:11:50 | |
Error Propagation | 00:14:05 | |
Failures | 00:18:20 | |
Swiss Cheese Model | 00:12:59 |
Fault Tolerance | 01:10:49 | |
---|---|---|
Means for Dependability | 00:13:18 | |
Fault Prevention | 00:14:12 | |
Fault Tolerance | 00:18:22 | |
Phases of Fault Tolerance | 00:13:34 | |
Error Processing Through Recovery | 00:11:23 |
Reliability and Availability | 01:27:04 | |
---|---|---|
Attributes of Dependability | 00:13:43 | |
Reliability and Availability | 00:17:00 | |
Reliability Definition | 00:11:56 | |
Failure Rate | 00:16:20 | |
Software Failure Rate | 00:10:23 | |
Failure Rate Example | 00:17:42 |
Analytical Evaluation | 00:55:15 | |
---|---|---|
Repetition | 00:13:08 | |
Dependability Modeling | 00:16:45 | |
Serial and Parallel Case | 00:14:09 | |
K-of-N Systems | 00:11:07 | |
Fault Tree Analysis | 00:00:06 |
Fault Trees | 01:11:23 | |
---|---|---|
New Cloud Computing Case Studies | 00:22:21 | |
Fault Tree Analysis | 00:07:30 | |
FTA Cutsets | 00:14:45 | |
Dynamic Fault Trees | 00:18:45 | |
Reliability Block Diagrams | 00:08:02 |
State Based Modeling | 01:27:24 | |
---|---|---|
Boolean Reduction Example | 00:17:14 | |
State Transition Diagrams | 00:22:05 | |
Markov Chains | 00:18:49 | |
Dependability Modeling with CTMCs | 00:17:24 | |
Stochastic Petri Nets | 00:11:52 |
Modelling Systems for Reliability Analysis | 00:57:43 | |
---|---|---|
Errors in Dependability Evaluation | 00:17:06 | |
Reliability Prediction | 00:22:47 | |
Mechanical Parts | 00:14:21 | |
Other Sources | 00:03:29 |
Reliability of Software | 01:17:45 | |
---|---|---|
Software - A Different Story | 00:16:31 | |
Dimensions of Black Box Models | 00:20:49 | |
Basic Execution Time Model | 00:09:55 | |
White Box Approach | 00:10:43 | |
Halstead Metric | 00:19:47 |
System Dependability Evaluation | 01:14:48 | |
---|---|---|
Qualitative Dependability Investigation | 00:09:58 | |
Root Cause Analysis | 00:18:20 | |
FMEA | 00:16:18 | |
Example: NASA Spacecraft FMEA | 00:16:18 | |
Software Development Evaluation | 00:13:54 |
Six Sigma, ITIL and CoBIT | 01:23:56 | |
---|---|---|
Review | 00:13:54 | |
Capability Maturity Model Integration | 00:13:19 | |
Reliability Models for IT Infrastructure | 00:20:29 | |
DMAIC Methodology | 00:14:47 | |
ITIL V3 Service Lifecycle | 00:14:12 | |
CoBIT | 00:07:15 |
Fault Tolerance Patterns | 01:13:31 | |
---|---|---|
Reviewing Last Lecture | 00:10:25 | |
Architectural Patterns | 00:26:19 | |
Spatial Redundancy through Replication | 00:17:12 | |
Maintenance Interface | 00:10:19 | |
Detection Patterns | 00:09:16 |
Detection Patterns | 01:30:09 | |
---|---|---|
Design Pattern | 00:08:00 | |
Realistic Threshold | 00:12:18 | |
Voting | 00:11:52 | |
Maintenance and Exercises | 00:14:36 | |
Error Recovery Patterns | 00:18:23 | |
Return to Reference Point / Limit Retries | 00:25:00 |
Error Recovery Patterns | 01:25:51 | |
---|---|---|
Assignment 1 | 00:23:30 | |
Error Recovery Patterns | 00:18:15 | |
Weighted Majority with Quorum Device | 00:25:19 | |
Checkpoint | 00:18:49 |
Diagnosis | 01:01:13 | |
---|---|---|
Fault Diagnosis | 00:19:11 | |
Coding Checks | 00:15:14 | |
Parity Codes | 00:17:12 | |
Hamming Code | 00:09:36 |
Redundancy | 01:23:39 | |
---|---|---|
Reiteration Redundancy | 00:21:35 | |
N-Modular Redundancy | 00:11:20 | |
Dynamic Redundancy | 00:15:39 | |
Hybrid Approaches | 00:14:22 | |
Imperfect Voters | 00:08:30 | |
Memory Redundancy | 00:12:13 |
Testing | 01:25:33 | |
---|---|---|
Disk Redundancy | 00:19:51 | |
RAID 2/3 | 00:18:54 | |
Disk Redundancy: Google | 00:07:31 | |
Hardware Dependability - Testing | 00:18:34 | |
RAM Testing | 00:20:43 |
Software Dependability | 01:12:53 | |
---|---|---|
Introduction | 00:11:38 | |
Fault Dependencies in the System Stack | 00:18:44 | |
Testing Approaches | 00:15:25 | |
Software Fault Injection | 00:10:16 | |
Fault-tolerant Software | 00:16:50 |
Fault-tolerant Software | 01:13:33 | |
---|---|---|
Fault-tolerant Software | 00:15:47 | |
Single Version Approaches - Wrapper | 00:22:47 | |
Selective Instruction Duplication | 00:18:09 | |
CFE Approaches | 00:16:50 |
N-Version Programming - NVP | 01:32:45 | |
---|---|---|
Review | 00:12:49 | |
Recovery Blocks | 00:19:19 | |
Conversation Scheme | 00:13:53 | |
NVP - Design Process | 00:16:11 | |
NVP Independence Evaluation | 00:13:58 | |
Simplex Approach | 00:16:35 |
Coordination and Consensus | 01:12:55 | |
---|---|---|
Coordination Problems | 00:20:48 | |
Mutual Exclusion | 00:18:01 | |
Election Algorithms | 00:15:51 | |
Leader Election in Practice | 00:18:15 |
Reliability of Distributed Systems | 01:21:47 | |
---|---|---|
Consensus Problems | 00:08:05 | |
System-Level Diagnosis | 00:25:21 | |
Diagnosis with the PMC-Model | 00:14:28 | |
Byzantine Generals | 00:21:51 | |
Consensus Problem Families | 00:12:02 |