Rathore, Neeraj: Fault Tolerance Mechanism

By: Rathore, Neeraj.
Publisher: Tamil Nadu: i-manager's, 2017
Edition: Vol.4(1), Jan-June
Description: 28-34p.
Subject(s): Computer Engineering
In: i-manager's journal on cloud computing (JCC)
Item type: Articles Abstract Database
Current location: Articles Abstract Database, School of Engineering & Technology, Archival Section
Status: Not for loan
Barcode: 2019898

Checkpointing is a technique for adding fault tolerance to computing systems. It consists of storing a snapshot of the current application state and using that snapshot to restart execution after a failure. The program state is saved, usually to stable storage, so that it can be reconstructed later. Checkpointing provides the backbone for rollback recovery (fault tolerance), playback debugging, process migration, and job swapping. The article focuses mainly on fault tolerance, process migration, and the performance of checkpointing on computational platforms ranging from uniprocessors to supercomputers.

Checkpoint and restart has been one of the most widely used techniques for fault tolerance in large parallel applications. When application state is saved periodically to permanent storage (disk or tape), execution can be restarted from the last checkpoint if a system fault occurs. It is an effective approach to tolerating both hardware and software faults. For example, a user who is writing a long program at a terminal can save the input buffer occasionally to minimize the rewriting caused by failures that affect the buffer.
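
The periodic save and restart cycle described above can be illustrated with a minimal Python sketch. It is not taken from the article: the function names (`save_checkpoint`, `load_checkpoint`, `run_sum`), the pickle file format, the checkpoint interval, and the simulated fault are all assumptions chosen for illustration. The state here is just a loop counter and a running total; a real application would snapshot whatever it needs to resume.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Save `state` to stable storage.

    The snapshot is written to a temporary file and then renamed over the
    old checkpoint, so a crash mid-write never leaves a torn checkpoint.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
        f.flush()
        os.fsync(f.fileno())  # push the bytes to disk before the rename
    os.replace(tmp_path, path)


def load_checkpoint(path, default):
    """Return the last saved state, or `default` if no checkpoint exists."""
    if not os.path.exists(path):
        return default
    with open(path, "rb") as f:
        return pickle.load(f)


def run_sum(n, path, interval=100, fail_at=None):
    """Sum the integers 0..n-1, checkpointing every `interval` iterations.

    `fail_at` is a hypothetical hook that simulates a fault by raising
    RuntimeError when the loop counter reaches that value.
    """
    state = load_checkpoint(path, {"i": 0, "total": 0})
    i, total = state["i"], state["total"]
    while i < n:
        if fail_at is not None and i == fail_at:
            raise RuntimeError("simulated fault")
        total += i
        i += 1
        if i % interval == 0:
            save_checkpoint(path, {"i": i, "total": total})
    return total
```

If a run fails partway through, calling `run_sum` again with the same checkpoint path resumes from the last saved snapshot instead of from zero, so only the work done since that snapshot is repeated. A shorter interval loses less work per fault but spends more time writing checkpoints.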
