Towards decentralized management of graceful degradation in distributed embedded systems

Document ID: 5085474856975786899
Author: Rawashdeh O
Publication year: 2008
Publication venue: Proceedings of IEEE Dependable Systems and Networks Conference (DSN’08)

Snippet

Graceful degradation entails a proportional loss of functionality or the reduction in the quality of services a system provides in response to faults. Compared to traditional techniques, graceful degradation is a promising approach to achieving fault tolerance at reduced cost …

Classifications

- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/202—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
- G06F11/2023—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant details of failing over
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/2097—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements maintaining the standby controller/processing unit updated
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/16—Error detection or correction of the data by redundancy in hardware
- G06F11/20—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements
- G06F11/202—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant
- G06F11/2038—Error detection or correction of the data by redundancy in hardware using active fault-masking, e.g. by switching out faulty elements or by switching in spare elements where processing functionality is redundant with a single idle spare processing component
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Programme initiating; Programme switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogramme communication; Intertask communication
- G06F9/546—Message passing systems or structures, e.g. queues
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/07—Error detection; Error correction; Monitoring responding to the occurrence of a fault, e.g. fault tolerance
- G06F11/14—Error detection or correction of the data by redundancy in operation
- G06F11/1402—Saving, restoring, recovering or retrying
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F1/00—Details of data-processing equipment not covered by groups G06F3/00 - G06F13/00, e.g. cooling, packaging or power supply specially adapted for computer application
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a programme unit and a register, e.g. for a simultaneous processing of several programmes
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F2201/00—Indexing scheme relating to error detection, to error correction, and to monitoring
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06Q—DATA PROCESSING SYSTEMS OR METHODS, SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL, SUPERVISORY OR FORECASTING PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management

Similar Documents

Publication	Publication Date	Title
Sari et al.	2015	Fault tolerance mechanisms in distributed systems
Engelmann et al.	2011	Redundant execution of HPC applications with MR-MPI
Chatterjee et al.	2018	Task mapping and scheduling for network-on-chip based multi-core platform with transient faults
Thekkilakattil et al.	2014	Mixed criticality scheduling in fault-tolerant distributed real-time systems
Hudson et al.	2018	Fault control using triple modular redundancy (TMR)
Kumar et al.	2011	Fault tolerance in real time distributed system
US7840852B2 (en)	2010-11-23	Method and system for environmentally adaptive fault tolerant computing
CN116860463A (en)	2023-10-10	A distributed adaptive spaceborne middleware system
Taskeen Zaidi	2016	Modeling for fault tolerance in cloud computing environment
Zheng et al.	2009	On the design of communication-aware fault-tolerant scheduling algorithms for precedence constrained tasks in grid computing systems with dedicated communication devices
Engelmann et al.	2006	Symmetric Active/Active High Availability for High-Performance Computing System Services
Korovin et al.	2016	A recovery method for the robotic decentralized control system with performance redundancy
Kumar et al.	2015	Real-time fault tolerance task scheduling algorithm with minimum energy consumption
Lau et al.	2008	Designing fault tolerant web services using BPEL
Brightwell et al.	2010	Transparent redundant computing with MPI
Rawashdeh	2008	Towards decentralized management of graceful degradation in distributed embedded systems
Rawashdeh et al.	2005	A technique for specifying dynamically reconfigurable embedded systems
Girault et al.	2004	A scheduling heuristics for distributed real-time embedded systems tolerant to processor and communication media failures
Resch et al.	2013	Software composability and mixed criticality for triple modular redundant architectures
Agirre et al.	2012	Fault tolerant component management platform over Data Distribution Service
Subramaniyan et al.	2006	FEMPI: A Lightweight Fault-tolerant MPI for Embedded Cluster Systems
Engelmann et al.	2006	Active/active replication for highly available HPC system services
te Hofsté et al.	2024	Towards the online reconfiguration of a dependable distributed on-board computer
Obermaisser et al.	2009	Model-based development of MPSoCs with support for early validation
Dhawan et al.	2022	A System Model of Fault Tolerance Technique in the Distributed and Scalable System: A Review