
JPH09204340A - Multi-volume audit trail for fault tolerant computer

Info

Publication number: JPH09204340A
Application number: JP8008775A
Authority: JP
Inventors: J Sukaaperosu Michael; ジェイスカーペロスマイケル; Van Del Linden Robert; ファンデルリンデンロバート; J Curley William; ジェイカーレイウィリアム; A Ryan James; エイライアンジェームズ; C Mclean Matthew; シーマックリーンマシュー
Original assignee: Tandem Computers Inc
Current assignee: Tandem Computers Inc
Priority date: 1996-01-23
Filing date: 1996-01-23
Publication date: 1997-08-05

Abstract

PROBLEM TO BE SOLVED: To make stored audit trail files available for online recovery by having an audit trail configuration management process control the creation, renaming, and deletion of audit trail files on audit trail disk storage devices. SOLUTION: Both a data disk process 200 and an audit trail configuration management process 220 generate audit records so that a consistent state can be restored in the event of a failure. Primary and backup audit trail disk processes (ADPs) control access to the disk volumes that store the audit trail files. The primary ADP normally has control of disk access, but in the event of a failure the backup takes over. To permit this takeover, the primary ADP sends the backup periodic checkpoints containing all of the disk state information needed for a smooth takeover. These processes interact by exchanging messages through a messaging system 222.

Description

Detailed Description of the Invention

[0001]

[Field of Industrial Application] The present invention relates to fault tolerant computer systems, and more particularly to techniques for recording changes to a database so that the database can be recovered consistently in the event of a failure.

[0002]

[Prior Art] Fundamental to the design of fault tolerant computer systems is a program construct called a transaction. A transaction is a clearly delimited operation, or set of related operations, that changes the contents of a database from one consistent state to another. The database operations within a transaction are treated as a single unit: either all of the changes made by the transaction are made permanent (the transaction is committed) or none of them are (the transaction is aborted). If a failure occurs while a transaction is executing, any partial changes made to the database are automatically backed out, leaving the database in a consistent state. Before a transaction permanently commits its changes to the database, information about the database rows or records affected by the transaction is written to what is commonly called an audit trail. Conceptually, the audit trail can be thought of as a history of the changes made to the database. The audit trail consists of a series of files whose records describe changes to the database. An audit trail record generally consists of the pre-update and post-update images of the changed database record (or physical page). With the pre-update images, the database system can undo the incomplete changes that result when an application program aborts or fails to complete because of a system failure. With the post-update images, the database system can recover from a media failure by restoring an old (and possibly inconsistent) copy of the database file and redoing the earlier changes. Other terms for an audit trail containing this information include audit log and journal.
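The undo and redo roles of the two images can be illustrated with a minimal Python sketch. It is not taken from the patent: the `AuditRecord` layout, the helper names, and the account keys are all hypothetical, and the "database" is just a dictionary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AuditRecord:
    """Hypothetical audit record: both images of one changed database record."""
    txn_id: int
    key: str
    before: Optional[str]  # pre-update image; None means the record did not exist
    after: Optional[str]   # post-update image; None means the record was deleted


def apply_image(db, key, image):
    """Install an image for `key`, deleting the key when the image is None."""
    if image is None:
        db.pop(key, None)
    else:
        db[key] = image


def undo(db, trail, txn_id):
    """Back out one transaction by applying its pre-update images, newest first."""
    for rec in reversed(trail):
        if rec.txn_id == txn_id:
            apply_image(db, rec.key, rec.before)


def redo(db, trail, committed):
    """Roll an old copy forward by reapplying post-update images of committed work."""
    for rec in trail:
        if rec.txn_id in committed:
            apply_image(db, rec.key, rec.after)


def demo():
    trail = [
        AuditRecord(1, "acct-A", "100", "70"),
        AuditRecord(1, "acct-B", "50", "80"),
        AuditRecord(2, "acct-A", "70", "0"),  # transaction 2 never commits
    ]
    # Undo: the live database still holds transaction 2's partial change.
    live = {"acct-A": "0", "acct-B": "80"}
    undo(live, trail, txn_id=2)
    # Redo: an old backup copy is restored after a media failure.
    restored = {"acct-A": "100", "acct-B": "50"}
    redo(restored, trail, committed={1})
    return live, restored
```

In this sketch both paths end in the same consistent state, which is the point of keeping both images in every record.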

[0003] Generally, the series of files that make up an audit trail is physically stored on a single disk volume. As successive audit trail files on the disk volume fill up, an archiving process moves them to tape, and the files then become available for storing newly generated records.

[0004]

[Problems to Be Solved by the Invention] This approach to the physical storage of audit trail files has a number of disadvantages. The process storing newly generated audit records must compete for disk access with the archiving of previously filled audit files. This contention can effectively limit the allowable rate of audit generation and, ultimately, transaction processing speed. Although the availability of tape for archiving old audit records removes any limit on the total amount of storage available, archived audit trail files cannot be used for online recovery. Online recovery is limited to the audit records stored on the single disk volume. One partial solution to the disk contention problem is described in J. Gray et al., “Transaction Processing: Concepts and Techniques”, Morgan Kaufmann, 1993, the contents of which are incorporated herein by reference. The technique of Gray et al., presented in section 9.6.4 of that reference, alleviates the disk contention problem. Unfortunately, online recovery nevertheless remains limited to a single disk volume.

[0005] An object of the present invention is to provide a multi-volume audit trail for fault tolerant computers that solves the above problems of the prior art.

[0006]

[Means for Solving the Problems] The above object of the present invention is achieved by a fault tolerant computer system comprising: an audit generator that generates audit records; a plurality of audit trail disk storage devices that store the audit records in audit files; a plurality of audit trail storage processes that receive the audit records from the audit generator and direct them to the audit trail disk storage devices; and an audit trail configuration process, coupled to the plurality of audit trail storage processes, that selects, when a previously assigned audit file becomes full, a new audit file to be the currently assigned audit file and a new audit trail storage process to be the currently responsible audit trail storage process; wherein each audit record describes a change to a database accessed by the audit generator; each audit trail storage process has access to at least one audit trail disk storage device; the audit records are directed from the audit generator through the currently responsible audit trail storage process to the currently assigned audit file; each new audit file is located on a different audit trail disk storage device from the audit file immediately preceding it; and each new audit trail storage process receives a rollover message from the audit trail storage process immediately preceding it to initiate the handoff of responsibility.

[0007] The above object of the present invention is also achieved by a method, in a fault tolerant computer system comprising an audit generator, first and second primary audit trail storage processes, and first and second backup audit trail storage processes serving as backups for the first and second primary audit trail storage processes, the audit trail storage processes storing audit records generated by the audit generator in audit files accessible to the audit trail storage processes, of switching the storage of currently generated audit records from the first primary audit trail storage process to the second primary audit trail storage process, the method comprising the steps of: receiving, at the first primary audit trail storage process, a buffer of audit records from the audit generator for storage in a first audit file accessible to the first primary audit trail storage process and the first backup audit trail storage process; determining at the first primary audit trail storage process, upon receiving the buffer of audit records, that the first audit file is full and cannot accept the buffer of records; sending a rollover request message from the first primary audit trail storage process to the second primary audit trail storage process, the rollover request message including a unique sequence number identifying a second audit file accessible to the second primary audit trail storage process and the second backup audit trail storage process; sending a rollover notification message from the first primary audit trail storage process to the audit generator, the rollover notification message including an indication that the first audit file is full and information identifying the second primary audit trail storage process as the correct destination for currently generated audit records; upon receipt of the rollover request message at the second primary audit trail storage process, sending a first checkpoint message from the second primary audit trail storage process to the second backup audit trail storage process, the first checkpoint message including locator information identifying the position within the second audit file at which storage of audit records is to begin; sending, in response to the first checkpoint message, a first checkpoint acknowledgment message from the second backup audit trail storage process to the second primary audit trail storage process; sending, in response to the rollover request message, a rollover acknowledgment message from the second primary audit trail storage process to the first primary audit trail storage process; sending a second checkpoint message from the first primary audit trail storage process to the first backup audit trail storage process, the second checkpoint message including an indication that the first audit file is full; and sending, in response to the second checkpoint message, a second checkpoint acknowledgment message from the first backup audit trail storage process to the first primary audit trail storage process.
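The message exchange listed above can be replayed with a small Python simulation. This is only a sketch under stated assumptions: the process names `P1`, `B1`, `P2`, `B2`, and `GEN`, the message kinds, the first-in first-out delivery queue, the choice to send the rollover acknowledgment only after the checkpoint acknowledgment arrives, and the starting write position of 0 are all illustrative and are not specified by the text.

```python
from collections import deque


def simulate_rollover(full_file_seq):
    """Replay one rollover; return (messages sent, generator target, backup states)."""
    queue = deque()   # undelivered messages, delivered first-in first-out
    sent = []         # (sender, receiver, kind) in the order messages were sent
    generator_target = "P1"
    backup_state = {"B1": {}, "B2": {}}

    def send(sender, receiver, kind, **payload):
        sent.append((sender, receiver, kind))
        queue.append((sender, receiver, kind, payload))

    # P1 receives a buffer, finds file `full_file_seq` full, and starts the rollover.
    next_seq = full_file_seq + 1  # assumption: sequence numbers increase by one
    send("P1", "P2", "rollover-request", file_seq=next_seq)
    send("P1", "GEN", "rollover-notification",
         full_file_seq=full_file_seq, new_target="P2")

    while queue:
        sender, receiver, kind, payload = queue.popleft()
        if kind == "rollover-request":
            # P2 tells its backup where writing will start (assumed position 0).
            send("P2", "B2", "checkpoint", file_seq=payload["file_seq"], write_pos=0)
        elif kind == "rollover-notification":
            generator_target = payload["new_target"]
        elif kind == "checkpoint":
            backup_state[receiver].update(payload)
            send(receiver, sender, "checkpoint-ack")
        elif kind == "checkpoint-ack" and receiver == "P2":
            send("P2", "P1", "rollover-ack")
        elif kind == "rollover-ack":
            # P1 records in its own backup that the old file is full.
            send("P1", "B1", "checkpoint", file_seq=full_file_seq, full=True)
        # A checkpoint-ack delivered to P1 ends the exchange.

    return sent, generator_target, backup_state
```

Calling `simulate_rollover(7)` yields seven messages in the same order as the steps of the method, and leaves each backup holding the state it would need to take over from its primary.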

[0008] Further, the above object of the present invention is also achieved by a method, in a fault tolerant computer system comprising an audit generator, a protocol management process, and a plurality of audit trail storage processes, the audit trail storage processes serving to store audit records generated by the audit generator in audit files accessible to the audit trail storage processes, of rotating responsibility for storing currently generated audit records among the audit trail storage processes, the method comprising the steps of: a) using the protocol management process to assign a selected audit trail storage process to be the currently assigned audit trail storage process, and a selected audit file accessible to the selected audit trail storage process to be the currently assigned audit file; b) transmitting a buffer of audit records from the audit generator to the currently assigned audit trail storage process for storage in the currently assigned audit file; c) writing the buffer of audit records received from the audit generator from the currently assigned audit trail storage process to the currently assigned audit file; d) monitoring, at the currently assigned audit trail storage process, the growth of the currently assigned audit file as successive buffers are written; e) comparing, at the currently assigned audit trail storage process, the size of the currently assigned audit trail with a first threshold; and f) upon a determination in step e) that the size exceeds the first predetermined threshold, sending a first threshold warning message from the currently assigned audit trail storage process to the protocol management process.
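Steps c) through f) amount to a size check on every buffer write. The following Python sketch shows one way that check could look. The class name, the `notify` callback standing in for the message to the protocol management process, the fractional threshold, and the decision to warn only once per file are assumptions made for illustration.

```python
class AuditFileWriter:
    """Hypothetical writer that tracks the size of the current audit file."""

    def __init__(self, capacity_bytes, warn_fraction, notify):
        self.capacity = capacity_bytes
        self.threshold = int(capacity_bytes * warn_fraction)
        self.size = 0
        self.warned = False    # assumption: the warning is sent once per file
        self.notify = notify   # stands in for a message to the protocol manager

    def write_buffer(self, buffer):
        """Append one buffer; return False when the file cannot accept it."""
        if self.size + len(buffer) > self.capacity:
            return False                    # file is full, a rollover is needed
        self.size += len(buffer)            # steps c) and d): write and track growth
        if self.size > self.threshold and not self.warned:   # step e)
            self.warned = True
            self.notify("first-threshold-warning", self.size)  # step f)
        return True
```

Warning before the file is actually full gives the managing process time to prepare the next file, so that the later rollover does not have to wait for file creation.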

[0009] The above object of the present invention is also achieved by a fault tolerant method of processing a rollover message received at a first audit trail storage process, in a fault tolerant computer system comprising an audit generator and a plurality of audit trail storage processes, wherein: the audit trail storage processes serve to store audit records generated by the audit generator in audit files accessible to the audit trail storage processes; as successive audit files become full, current responsibility for storing the audit records generated by the audit generator is transferred by sending a rollover message from the previously responsible audit trail storage process to the newly responsible audit trail storage process; successively used audit files are assigned unique sequence numbers in order; each audit trail storage process stores a sequence number identifying the last known audit file used by one of the audit trail storage processes to store audit records; and the first audit trail storage process operates as if it were already the responsible audit trail storage process; the method comprising the steps of: a) receiving, at the first audit trail storage process, a rollover message from a second audit trail storage process, the rollover message including the audit file sequence number of the next audit file to receive audit records; b) extracting, at the first audit trail storage process, the audit file sequence number from the rollover message; c) comparing the received audit file sequence number with the last known audit file sequence number stored in the first audit trail storage process; d) upon a determination in step c) that the received audit file sequence number is greater than the stored last known audit file sequence number, proceeding to step f); e) otherwise proceeding to step h); f) closing the audit file identified by the audit file sequence number stored in the first audit trail storage process; g) opening a new audit file, identified by the sequence number contained in the rollover message, to receive audit records from the first audit trail storage process; and h) ending processing of the rollover message.
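The sequence-number comparison in steps a) through h) makes rollover handling safe against duplicate or stale messages. A minimal Python sketch of that idea follows. The class and message layout are hypothetical, files are represented only by their sequence numbers, and updating the stored sequence number after opening the new file is an assumption the text does not spell out.

```python
class RolloverReceiver:
    """Hypothetical audit trail storage process that handles rollover messages."""

    def __init__(self, last_known_seq):
        self.last_known_seq = last_known_seq
        self.events = []  # record of file operations, for illustration only

    def on_rollover(self, message):
        """Process one rollover message; return True if a new file was opened."""
        seq = message["file_seq"]                      # step b): extract
        if seq > self.last_known_seq:                  # steps c) and d): compare
            self.events.append(("close", self.last_known_seq))  # step f)
            self.events.append(("open", seq))                   # step g)
            self.last_known_seq = seq  # assumption: remember the new file
            return True
        return False                                   # step e): nothing to do
```

Because a repeated message carries a sequence number that is no longer greater than the stored one, delivering the same rollover message twice, for example after a takeover by a backup, has no further effect.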

[0010]

[Operation] According to the present invention, a fault tolerant computer system distributes an audit trail containing audit records across any number of disk volumes. After one audit trail file is filled, audit records are directed to the next audit trail file, which is stored on a different disk volume. Storage of newly generated audit trail records thus rotates among the available disk volumes. The contents of filled audit trail files are eventually archived, and their space becomes available again for storing newly generated audit records. The amount of audit available for online recovery after a failure is not limited to the storage capacity of any single disk volume. Furthermore, there is no contention for disk access between the archiving of filled audit trail files and the storage of newly generated audit records. In one embodiment of the present invention, the fault tolerant computing system includes a plurality of processing units and disk storage devices or volumes. At least one processing unit executes a process that generates audit records describing changes to a database or to system state. Some of the disk storage devices are selected to receive audit records. Each disk storage device so designated has an associated primary audit trail disk process (ADP), running on one of the processing units, that controls disk access, and a backup audit trail disk process (ADP), running on a different processing unit, that takes control of disk access should the primary fail.
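The rotation of audit trail files among volumes can be pictured with a short Python sketch. The function name and the volume names are hypothetical, and simple round-robin order is an assumption; the text only requires that each file reside on a different volume from the one before it.

```python
from itertools import cycle


def plan_audit_files(volumes, file_count, first_seq=1):
    """Pair each audit file sequence number with the volume that will hold it."""
    if len(volumes) < 2:
        # With one volume, consecutive files could not land on different disks.
        raise ValueError("a multi-volume audit trail needs at least two volumes")
    rotation = cycle(volumes)
    return [(first_seq + i, next(rotation)) for i in range(file_count)]
```

With three volumes, the file being written and the file most recently filled always sit on different disks, which is why archiving the filled file does not compete with new writes.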

[0011] Another process running on one of the processing units is known as the audit trail configuration management process. This process controls the creation, renaming, and deletion of audit trail files on the audit trail disk storage devices. In response to system operator input, the audit trail configuration management process configures the number of disk storage devices to be used to receive audit and the number of audit trail files stored on each designated storage device. An audit generator directs its records to a primary audit trail file known as the current audit trail file. The current primary audit trail storage process stores the records while monitoring the growth of the current audit trail file. When the size of the current audit trail file reaches a threshold, the current audit trail storage process instructs the audit trail configuration management process to prepare a new audit trail file. The audit trail configuration management process prepares the new audit trail file and informs the current audit trail storage process of its name. When the current audit trail file becomes full, the current primary audit trail storage process sends the audit generator the name of the audit trail storage process that has access to the new audit trail file. The current audit trail storage process also sends a rollover message to the new audit trail storage process. The present invention provides a special rollover message protocol to ensure that audit record storage is not disrupted by faults that occur during rollover.

[0012] The present invention also allows disk volumes to be designated as overflow audit trail storage. Overflow space is used in extreme circumstances, such as when no operator is available to mount tapes for audit dumps, or when a sudden burst of audit generation fills the primary audit trail before the oldest file becomes eligible for renaming. Audit trail records are transferred to the overflow volumes, thereby freeing space for newly generated audit. The system operator can also specify local disk volumes to be used to hold audit trail files restored from audit dumps as part of a recovery procedure. Various audit trail configuration parameters, such as the number of active audit trail disk volumes and the number of files per volume, can be adjusted online. New audit generators can be added to an existing audit trail. A graphical user interface provides the operator with a display depicting the state of the audit trail. For example, a bar graph shows the operator how much of the audit trail is currently in use. The system operator can see at a glance whether the current transaction workload is pushing audit trail usage toward or beyond the overflow threshold, or toward the point at which audit generation must be suspended.

[0013] The present invention may be better understood by reference to the following detailed description taken in conjunction with the accompanying drawings.

[0014]

[Embodiments] The present invention relates to the storage of audit records in a fault tolerant computer system in which multiple processes run concurrently and exchange messages. The term “process” denotes a flow of activity defined by an ordered set of machine instructions that defines the actions the process is to take and the set of data values that it can read, write, and manipulate. Multiple processes can run concurrently and asynchronously within the fault tolerant computer system. The term “message” denotes a unit of information transmitted from one process to another. A message can be either “waited” or “no-wait”. A “waited message” is a message sent from one process to another in which the sender does not proceed until it receives a response. A “no-wait message” is a message sent from one process to another in which the sender proceeds without waiting for a response; the sender accepts the response asynchronously. Certain special types of processes are relevant to the present invention. A “backup process” and a “primary process” form a “process pair” in which the backup process takes over from the primary process if the primary process fails. Together, the process pair is regarded as a single logical entity. The primary process sends the backup process periodic “checkpoints”, which are messages containing the state information needed to enable takeover in the event of a failure.
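The relationship between a primary process, its backup, and the checkpoints that pass between them can be sketched in a few lines of Python. Everything here is illustrative: the `ProcessPair` class, the two state fields, and the in-memory copy that stands in for a checkpoint message are assumptions, not details from the text.

```python
import copy


class ProcessPair:
    """Hypothetical process pair; the backup only knows checkpointed state."""

    def __init__(self):
        self.primary_state = {"file_seq": 1, "write_pos": 0}
        self.backup_state = copy.deepcopy(self.primary_state)

    def write(self, nbytes):
        """The primary does work that changes its own state."""
        self.primary_state["write_pos"] += nbytes

    def checkpoint(self):
        """The primary sends its current state to the backup."""
        self.backup_state = copy.deepcopy(self.primary_state)

    def takeover(self):
        """The primary has failed; the backup resumes from the last checkpoint."""
        self.primary_state = copy.deepcopy(self.backup_state)
        return self.primary_state
```

In this sketch any work done after the most recent checkpoint is unknown to the backup, which illustrates why the checkpoint must carry all of the state needed for takeover.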

[0015] A “disk process” is a process that manages a physical disk volume. A “data volume” or “data disk process” is a process that manages database files. An “audit trail record” describes a change to the database or to system state. An audit trail record may include a “post-update image” and a “pre-update image”. A “post-update image” is a copy of a database record or physical page after a change has been made to it. A “pre-update image” is a copy of a database record or physical page before a change is made to it. A “data volume” generates audit records relating to updates to database files. An “audit trail file” is a file of audit trail records. An “audit trail” is an ordered sequence of audit trail files. An “audit trail disk process (ADP)” is a disk process that receives records and writes them to an audit trail file. An “audit generator” is any process that sends audit records to an ADP. Examples of audit generators include data disk processes and the audit trail configuration management process.

[0016] The present invention provides a “multi-volume audit trail”, which is an audit trail whose successive files reside on different disk volumes managed by different ADPs. An audit generator sends the audit it currently generates to the ADP that manages the “current audit trail file” belonging to the audit trail assigned to that audit generator. In the context of the present invention, the “audit trail configuration management process” prepares the next current audit trail file in the audit trail's sequence of files. “Rollover” is the transition from using a current audit trail file that has become full to using the next audit trail file in the audit trail's sequence. FIG. 1 shows a fault tolerant computer system 100 according to the present invention. The fault tolerant computer system 100 includes multiple processing units 102, 104, 106, 108, device controllers 110, 112, 114, 116, disk storage devices or disk volumes 118, 120, 122, and a tape storage device 124. The fault tolerant computer system 100 also includes a system terminal 126. A system bus 128 interconnects the processing units 102, 104, 106, 108 and the system terminal 126. Device controller 110 provides system access to disk volume 118, device controller 112 provides access to disk volume 120, device controller 114 provides access to disk volume 122, and device controller 116 provides access to the tape storage device. The number, type, and arrangement of the hardware components shown are illustrative of elements that may be used to practice the present invention.

[0017] FIG. 2 is a process diagram showing the various processes that run on the fault tolerant computer system 100 according to the present invention. A data disk process, or data volume, 200 is shown. The data disk process 200 operates to modify database records stored on one of the disk volumes and generates audit records. The data disk process 200 may in practice represent a plurality of processes that modify disk volumes. Each of the audit trail disk process (ADP) pairs 202, 204, 206 includes a primary process and a backup process. The audit trail configuration management process 220 is responsible for controlling the creation, renaming, and deletion of audit trail files on the audit trail disk storage devices. A backup audit trail configuration management process (not shown) is also provided to take over these functions in the event of a failure. Herein, any process that generates audit records for storage is called an audit generator. Referring to FIG. 2, both the data disk process 200 and the audit trail configuration management process 220 generate audit records to permit recovery to a consistent state in the event of a failure. According to the present invention, each audit generator has an associated audit trail, that is, an ordered sequence of audit trail files for storing audit. The primary and backup ADPs control access to the disk volumes that store the audit trail files. The primary ADP normally has control of disk access, but in the event of a failure the backup takes over. To permit this takeover, the primary ADP sends its backup periodic checkpoints containing all of the disk state information needed for a smooth takeover.

【００１８】図２の処理は、図１の処理装置上で走行す
る。単一処理装置は、一つ以上の処理を走行する。処理
ペアのプライマリ処理及びバックアップ処理は、異なる
処理装置上で走行する。図２の処理は、メッセージング
システム２２２を介してメッセージを交換することによ
り相互に作用する。メッセージングシステム２２２の動
作は、構内通信処理が同じ処理装置上または異なる処理
装置上のどちらで動作するかには無関係である。必要な
らば、メッセージ情報は、システムバス１２８を介して
伝送される。図２は、本発明についての説明を意図した
ものであり、本発明は、多数の監査生成及び監査記憶処
理の組合せのコンテキストで動作しうる。図３は、本発
明によるマルチディスクボリュームにまたがる監査証跡
を確立しかつ動作する初期段階を示しているフローチャ
ートである。監査証跡構成管理処理２２０は、段階３０
０で監査証跡構成データ構造を生成する。この監査証跡
構成データ構造は、監査証跡に含まれるべきＡＤＰの識
別、各ＡＤＰにより管理されるべき監査証跡ファイルの
数、及び各ファイルの大きさを含む。好ましい実施例で
は、監査証跡は、１６のディスク記憶装置にまでまたが
りうる。しかしながら、監査証跡におけるディスク記憶
装置の最大数は、任意である。The processes of FIG. 2 run on the processing units of FIG. 1. A single processing unit may run one or more processes. The primary and backup processes of a process pair run on different processing units. The processes of FIG. 2 interact by exchanging messages via a messaging system 222. The operation of the messaging system 222 is independent of whether the communicating processes run on the same processing unit or on different processing units. When necessary, message information is transmitted over the system bus 128. FIG. 2 is intended to be illustrative of the present invention, which may operate in the context of many combinations of audit generating and audit storing processes. FIG. 3 is a flowchart showing the initial steps of establishing and operating an audit trail that spans multiple disk volumes according to the present invention. At step 300, the audit trail configuration management process 220 creates an audit trail configuration data structure. This data structure includes the identities of the ADPs to be included in the audit trail, the number of audit trail files to be managed by each ADP, and the size of each file. In the preferred embodiment, the audit trail can span up to 16 disk storage devices. However, the maximum number of disk storage devices in the audit trail is arbitrary.

【００１９】段階３１０で、監査証跡構成管理処理２２
０は、監査証跡におけるＡＤＰのそれぞれに開始メッセ
ージを送る。このとき、監査証跡に関連する全てのＡＤ
Ｐは、名前及び独自の現行監査証跡ファイルのシーケン
ス番号を学ぶ。段階３２０で、監査証跡構成管理処理２
２０は、監査証跡を使用する監査ジェネレータにメッセ
ージを送る。メッセージは、現行ＡＤＰ、現行監査証跡
ファイルへのアクセスを有しているプライマリＡＤＰの
識別を含む。図４は、本発明により段階３００で生成さ
れた代表的な監査証跡構成データ構造４００の部分を示
す。監査証跡構成データ構造４００は、アクティブ監査
証跡の内包のためにオペレーターにより選択されたディ
スク装置を制御しているＡＤＰに対応しているエントリ
ー４０４、４０６、及び４０８を含んでいるアクティブ
ボリューム・リンクド・リスト４０２を含む。各ＡＤＰ
エントリーは、ＡＤＰにより制御されるディスク記憶装
置上の予め割り当てられた監査証跡ファイルを識別して
いるエントリーＰ₁〜Ｐ₃の関連リンクド・リスト４１
０、４１２、及び４１４を有する。各ＡＤＰエントリー
は、常駐アクティブ監査証跡ファイルの関連リンクド・
リストも有するが、初期的にこのリストは、空白であ
る。At step 310, the audit trail configuration management process 220 sends a start message to each of the ADPs in the audit trail. At this time, every ADP associated with the audit trail learns the name and unique sequence number of the current audit trail file. At step 320, the audit trail configuration management process 220 sends a message to the audit generators that use the audit trail. The message includes the identity of the current ADP, that is, the primary ADP that has access to the current audit trail file. FIG. 4 shows a portion of a representative audit trail configuration data structure 400 created at step 300 according to the present invention. The audit trail configuration data structure 400 includes an active volume linked list 402 containing entries 404, 406, and 408, which correspond to the ADPs controlling the disk devices selected by the operator for inclusion in the active audit trail. Each ADP entry has an associated linked list 410, 412, 414 of entries P₁ to P₃ that identify the preallocated audit trail files on the disk storage device controlled by that ADP. Each ADP entry also has an associated linked list of resident active audit trail files, but initially this list is empty.
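As a minimal sketch, assuming Python lists stand in for the linked lists of FIG. 4, the configuration data structure created at step 300 might be modeled as follows. The names `AdpEntry`, `build_config`, and the `ADP1.P1` style file names are hypothetical and do not come from the specification; only the three lists and the 16-volume limit of the preferred embodiment do.

```python
from dataclasses import dataclass, field
from typing import List

MAX_VOLUMES = 16  # limit of the preferred embodiment; the text calls it arbitrary


@dataclass
class AdpEntry:
    """One entry on the active volume list (such as 404, 406, 408)."""
    name: str
    preallocated: List[str] = field(default_factory=list)  # P entries
    resident: List[int] = field(default_factory=list)      # F entries, by sequence number
    volume_up: bool = True


def build_config(adp_names, files_per_adp):
    """Step 300 (sketch): every ADP starts with preallocated files only."""
    if len(adp_names) > MAX_VOLUMES:
        raise ValueError("audit trail spans too many volumes")
    return [
        AdpEntry(name, [f"{name}.P{i}" for i in range(1, files_per_adp + 1)])
        for name in adp_names
    ]


config = build_config(["ADP1", "ADP2", "ADP3"], files_per_adp=3)
```

Each `resident` list starts empty, which mirrors the initially empty lists of resident active audit trail files.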

【００２０】図５は、最初のファイルが現行監査証跡フ
ァイルであるべく選択された後の監査証跡構成データ構
造４００の部分を示す。予め割り当てられたファイルエ
ントリーＰ₃により識別されたファイルは、監査証跡構
成管理処理２２０により改名されかつＡＤＰエントリー
４０４に関連する常駐アクティブ監査証跡ファイルの新
しいリンクド・リスト４１６におけるエントリーＦ₁に
よりここで識別される。エントリーＦ₁により識別され
たファイルは、それゆえに第１の現行監査証跡ファイル
でありかつＡＤＰ₁は、第１の現行ＡＤＰである。新し
く生成された監査は、現行ＡＤＰを介して現行監査証跡
ファイルに書き込まれる。現行ＡＤＰは、現行監査証跡
ファイルの成長を監視する。現行監査証跡ファイルがス
レショルドに達したときには、現行ＡＤＰは、次の監査
証跡ファイルを識別しかつ準備することを監査証跡構成
管理処理２２０に要求する。次の監査証跡ファイルは、
用いられていない予め割り当てられたものかまたは改名
に対して、所定の基準の下で、資格がある、アクティブ
監査証跡ファイルのいずれかでありうる。好ましい実施
例では、改名に対する資格を決定するために以下の基準
が用いられる。一般に、有資格であるためには、候補監
査証跡ファイルは、テープにダンプされていなければな
らない。（好ましい実施例では、オペレーターは、ダン
ピングが資格のために要求されない場合にはテープに監
査を保存しないことを選びうる。）候補ファイルは、
１）フォールトトレラント計算機システム２００を再起
動すること、２）監査ジェネレータのキャッシュに維持
されたデータを復元すること、または３）現在保留して
いるトランザクションを取り消すことに必要な監査を含
まなくてもよい。現行監査証跡ファイルも改名できな
い。好ましい実施例では、例えば遠隔バックアップ処理
のような、他の雑処理に対して、監査証跡ファイルを改
名するための資格を否定することもできる。FIG. 5 shows a portion of the audit trail configuration data structure 400 after the first file has been selected to be the current audit trail file. The file identified by preallocated file entry P₃ has been renamed by the audit trail configuration management process 220 and is now identified by entry F₁ in a new linked list 416 of resident active audit trail files associated with ADP entry 404. The file identified by entry F₁ is therefore the first current audit trail file, and ADP₁ is the first current ADP. Newly generated audit is written to the current audit trail file via the current ADP. The current ADP monitors the growth of the current audit trail file. When the current audit trail file reaches a threshold, the current ADP requests the audit trail configuration management process 220 to identify and prepare the next audit trail file. The next audit trail file may be either an unused preallocated file or an active audit trail file that is eligible, under predetermined criteria, for renaming. In the preferred embodiment, the following criteria determine eligibility for renaming. In general, to be eligible, a candidate audit trail file must have been dumped to tape. (In the preferred embodiment, the operator may choose not to archive audit to tape, in which case dumping is not required for eligibility.) The candidate file must not contain audit needed to 1) restart the fault tolerant computer system 200, 2) restore data held in the cache of an audit generator, or 3) back out a currently pending transaction. The current audit trail file cannot be renamed either. In the preferred embodiment, other processes, such as a remote backup process, can also deny eligibility for renaming an audit trail file.
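The eligibility criteria above can be read as a single predicate. The following sketch assumes that each criterion is available as a boolean flag on a file record; the `AuditFile` fields and the `rename_eligible` function are illustrative names, not part of the specification.

```python
from dataclasses import dataclass


@dataclass
class AuditFile:
    sequence: int
    is_current: bool = False
    dumped_to_tape: bool = False
    needed_for_restart: bool = False         # criterion 1
    needed_for_cache_recovery: bool = False  # criterion 2
    needed_for_backout: bool = False         # criterion 3
    held_by_other_process: bool = False      # e.g. a remote backup process


def rename_eligible(f: AuditFile, dumping_required: bool = True) -> bool:
    """Return True if the file may be renamed and reused."""
    if f.is_current:
        return False
    if dumping_required and not f.dumped_to_tape:
        return False
    if f.needed_for_restart or f.needed_for_cache_recovery or f.needed_for_backout:
        return False
    return not f.held_by_other_process
```

Passing `dumping_required=False` models the operator's choice not to archive audit to tape.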

【００２１】図６は、複数のファイルが監査で満たされ
た後の監査証跡構成データ構造４００の部分を示す。連
続アクティブ監査証跡ファイルは、シーケンス番号を増
加することによって識別され、かつ各ＡＤＰエントリー
に関連した（対応付けられた）常駐リンクド・リスト４
１６、４１８、及び４２０を通して分配されたＦ₁、Ｆ
₂、Ｆ₃．．．とマークされた関連エントリーを有す
る。図７及び図８は、本発明により次の現行監査証跡フ
ァイルを選択する段階を記述しているフローチャートを
示す。段階５００で、監査証跡構成管理処理２２０は、
次の監査証跡ファイルを準備するために現行ＡＤＰから
要求を受け取る。段階５０２で、監査証跡構成管理処理
２２０は、ボリュームリンクド・リスト４０２における
現行ＡＤＰエントリーを識別する。段階５０４で、監査
証跡構成管理処理２２０は、新しい現行ＡＤＰであるべ
き候補としてボリュームリンクド・リスト４０２におい
て次のＡＤＰエントリーを選択する。段階５０６で、監
査証跡構成管理処理２２０は、候補ＡＤＰエントリーに
より識別されたボリュームが立ち上がっているかどうか
を決定する。そのボリュームが立ち上がっているなら
ば、段階５０８で、候補ＡＤＰエントリーに関連した予
め割り当てられたファイルリンクド・リストにおけるエ
ントリーにより識別された予め割り当てられたファイル
が存在するかどうかを決定する。候補ＡＤＰエントリー
に対する予め割り当てられたファイルリンクド・リスト
がエントリーを有するならば、リストにおける最後のエ
ントリーにより識別されたファイルは、段階５１０で次
の現行監査証跡ファイルとして選択され、かつ候補ＡＤ
Ｐは、次の現行ＡＤＰである。FIG. 6 shows a portion of the audit trail configuration data structure 400 after several files have been filled with audit. Successive active audit trail files are identified by increasing sequence numbers and have associated entries, marked F₁, F₂, F₃, ..., distributed across the resident linked lists 416, 418, and 420 associated with the ADP entries. FIGS. 7 and 8 show a flowchart describing the steps of selecting the next current audit trail file according to the present invention. At step 500, the audit trail configuration management process 220 receives a request from the current ADP to prepare the next audit trail file. At step 502, the audit trail configuration management process 220 identifies the current ADP entry in the volume linked list 402. At step 504, it selects the next ADP entry in the volume linked list 402 as the candidate to be the new current ADP. At step 506, it determines whether the volume identified by the candidate ADP entry is up. If the volume is up, step 508 determines whether a preallocated file exists, identified by an entry in the preallocated file linked list associated with the candidate ADP entry. If the preallocated file linked list for the candidate ADP entry has an entry, the file identified by the last entry in the list is selected at step 510 as the next current audit trail file, and the candidate ADP is the next current ADP.

【００２２】次に、予め割り当てられたファイルリンク
ド・リストは、最後のエントリーを除去することによっ
て更新される。常駐ファイルリンクド・リストは、現行
監査証跡ファイルのシーケンス番号よりも一つ大きいフ
ァイルシーケンス番号をエントリーに加えることによっ
て更新される。予め割り当てられたファイルが利用でき
ないならば、監査証跡構成管理処理２２０は、候補ＡＤ
Ｐエントリーの常駐ファイルリンクド・リスト上のエン
トリーの中に改名有資格ファイルが存在するかどうか見
るために、段階５１２で、チェックする。この常駐ファ
イルリンクド・リストが改名有資格エントリーを有する
ならば、最も低いシーケンス番号を有する改名有資格フ
ァイルは、段階５１０で次の現行監査証跡ファイルとし
て識別される。次に、このファイルは、使用の前に改名
される。常駐ファイルリンクド・リストは、改名有資格
監査証跡ファイルを識別している以前のエントリーを取
り除きかつ現行監査証跡ファイルのシーケンス番号より
も一つ高いシーケンス番号を有するリストの終りに新し
いエントリーを追加することによって更新される。この
新しいエントリーは、次の現行監査証跡ファイルを識別
する。Next, the preallocated file linked list is updated by removing its last entry. The resident file linked list is updated by adding an entry with a file sequence number one greater than the sequence number of the current audit trail file. If no preallocated file is available, the audit trail configuration management process 220 checks, at step 512, whether a rename-eligible file exists among the entries on the candidate ADP entry's resident file linked list. If this resident file linked list has a rename-eligible entry, the rename-eligible file with the lowest sequence number is identified at step 510 as the next current audit trail file. This file is then renamed before use. The resident file linked list is updated by removing the old entry that identified the rename-eligible audit trail file and appending to the end of the list a new entry with a sequence number one higher than that of the current audit trail file. This new entry identifies the next current audit trail file.
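The per-candidate part of the selection (steps 508 to 512) can be sketched as below, assuming plain Python lists for the two linked lists and a set of sequence numbers already judged rename-eligible. The function name `pick_file` and its parameters are hypothetical.

```python
def pick_file(preallocated, resident, eligible, current_seq):
    """Try to take the next current audit trail file from one candidate ADP.

    preallocated: list of preallocated file names (modified in place)
    resident:     list of sequence numbers of resident active files (modified in place)
    eligible:     set of resident sequence numbers that may be renamed
    Returns the new sequence number, or None if this ADP has nothing to offer.
    """
    new_seq = current_seq + 1
    if preallocated:                       # step 508: a preallocated file exists
        preallocated.pop()                 # the last entry is used and removed
        resident.append(new_seq)
        return new_seq
    candidates = [s for s in resident if s in eligible]  # step 512
    if candidates:
        resident.remove(min(candidates))   # lowest-numbered eligible file is renamed
        resident.append(new_seq)           # new entry goes at the end of the list
        return new_seq
    return None
```

For example, with no preallocated files, resident files 1, 4, 7, files 4 and 7 eligible, and current sequence number 9, file 4 is reused as file 10.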

【００２３】図８を参照すると、候補ＡＤＰエントリー
によって識別されたボリュームがダウンであるかまたは
予め割り当てられたファイルまたは改名有資格ファイル
を有していないならば、監査証跡構成管理処理２２０
は、段階５１６で新しい候補ＡＤＰエントリーとしてボ
リュームリスト４０２上の次のＡＤＰエントリーを識別
する。次の候補ＡＤＰエントリーに対する検索は、次の
候補ＡＤＰエントリーがリストの最初のものでありうる
ようにボリュームリスト４０２の始めに循環しうる。段
階５１８で、監査証跡構成管理処理２２０は、それが監
査証跡ファイルに対するその検索においてリスト上の全
てのボリュームを通って循環したかどうかを見るために
チェックする。循環しなかったならば、実行は、最新の
候補ＡＤＰに関連した適切な次の現行監査証跡ファイル
を識別するために段階５０６へ進む。監査証跡構成管理
処理２２０がボリュームリスト上の全てのＡＤＰエント
リーを通って循環しかつ次の監査証跡ファイルであるべ
き適切な候補を識別しなかったならば、事象が段階５２
０で発行される。段階５２０で発行された事象は、監査
証跡ファイルが見つけられないことをシステム端末装置
１２６を介して説明する。次に、監査証跡構成管理処理
２２０は、利用可能なファイルを生成しうる事象を段階
５２２で待機する。そのような事象は、構成へのＡＤＰ
の追加、ダウンされたボリュームリスト上のＡＤＰが利
用できるようになること、ボリューム当たりのファイル
数の増加、以前無資格なファイルからオーバーフローへ
の監視の転送、または有資格状態における他の変化を含
む。利用可能なファイルを生成しうる事象の後で、監査
証跡構成管理処理２２０は、検索を再開すべく段階５０
２へ戻る。その結果、ボリュームリスト４０２によって
参照されるボリュームが動作可能のままでありかつ完全
監査証跡が適時間に基づいて改名に対して有資格になる
という前提で、監査の記憶は、ラウンドロビンで利用可
能なボリュームを通って回る。それゆえに、監査証跡
は、多くのボリュームにまたがって分配される。アーカ
イビング処理は、現行で生成された監査の記憶を有する
ディスクアクセスに対して争わない。オンライン回復に
利用可能な監査証跡容量は、あらゆる一つのディスクボ
リュームの記憶容量を越える。さらなる利点は、非現行
ＡＤＰまたはディスクボリュームの故障は、現行で生成
された監査の記憶を停止しないことである。それゆえ
に、本発明のマルチボリューム監査証跡技術は、平均故
障間隔（ＭＴＢＦ）における増加を導く。Referring to FIG. 8, if the volume identified by the candidate ADP entry is down, or has neither a preallocated file nor a rename-eligible file, the audit trail configuration management process 220 identifies, at step 516, the next ADP entry on the volume list 402 as the new candidate ADP entry. The search for the next candidate ADP entry may wrap around to the beginning of the volume list 402, so that the next candidate ADP entry may be the first one in the list. At step 518, the audit trail configuration management process 220 checks whether it has cycled through all of the volumes on the list in its search for an audit trail file. If it has not, execution proceeds to step 506 to identify a suitable next current audit trail file associated with the latest candidate ADP. If the audit trail configuration management process 220 has cycled through all of the ADP entries on the volume list without identifying a suitable candidate for the next audit trail file, an event is issued at step 520. The event issued at step 520 reports, via the system terminal device 126, that no audit trail file could be found. The audit trail configuration management process 220 then waits, at step 522, for an event that may make a file available. Such events include the addition of an ADP to the configuration, a downed ADP on the volume list becoming available, an increase in the number of files per volume, the transfer to overflow of the audit in a previously ineligible file, or some other change in eligibility status. After an event that may make a file available, the audit trail configuration management process 220 returns to step 502 to resume the search. As a result, provided that the volumes referenced by the volume list 402 remain operational and that full audit trail files become eligible for renaming in a timely manner, audit storage rotates round-robin through the available volumes. The audit trail is therefore distributed across many volumes. The archiving process does not contend for disk access with the storage of currently generated audit. The audit trail capacity available for online recovery exceeds the storage capacity of any single disk volume. A further advantage is that the failure of a non-current ADP or disk volume does not halt the storage of currently generated audit. The multi-volume audit trail technique of the present invention therefore leads to an increase in mean time between failures (MTBF).
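The round-robin search over the volume list (steps 504 to 520) might look like the following sketch. It assumes each volume is summarized by an `up` flag and a count of usable files, and that the search may come back to the current ADP itself once every other entry has been tried; both assumptions, and the name `find_next_adp`, are illustrative.

```python
def find_next_adp(volumes, current):
    """Return the index of the next ADP able to supply a file, or None.

    volumes: list of dicts with keys "up" (bool) and "available" (int, the
             number of preallocated or rename-eligible files on that volume)
    current: index of the current ADP entry on the volume list
    """
    n = len(volumes)
    for step in range(1, n + 1):             # steps 504/516: advance, wrapping around
        i = (current + step) % n
        v = volumes[i]
        if v["up"] and v["available"] > 0:   # steps 506, 508, 512
            return i
    return None                              # step 520: no file found, wait for an event
```

A `None` result corresponds to the case in which the event is issued and the process waits at step 522.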

【００２４】それらが起動するときに、マルチボリュー
ム監査証跡に関連した全てのＡＤＰは、現行監査証跡フ
ァイルの名前及びシーケンス番号を学ぶ。現行ＡＤＰが
その監査証跡ファイルが満杯になったということがわか
ったときには、それは、ファイルを満杯とマークして、
監査証跡のファイルのシーケンスに次のファイルを有す
るＡＤＰにロールオーバー要求を送る。ロールオーバー
要求は、新しく生成された監査証跡ファイルの名前及び
シーケンス番号を包含する。ロールオーバーに加わって
いる両方のＡＤＰは、新しい現行監査証跡ファイルの名
前及びシーケンス番号を示すべくそれらのデータ構造を
更新する。監査証跡構成管理処理２２０は、非同期通告
を介してロールオーバーに知らせる。ロールオーバーに
参加していない他の全てのＡＤＰは、新しい現行監査証
跡ファイルについて学ばないが、結果としてそれらは、
それら自身がロールオーバー要求を受け取るときにそれ
らのデータ構造を更新する。これは、監査証跡ファイル
がラウンドロビンで種々のＡＤＰ中で割り当てられるの
で、やがて起こる。以下の例は、どのファイルが現行で
あるかについてＡＤＰが学ぶ方法を示す。マルチボリュ
ーム監査証跡に参加している３つのＡＤＰ（ＡＤＰ₁、
ＡＤＰ₂、及びＡＤＰ₃）が存在すると想定する。ＡＤ
Ｐ起動時間でＡＤＰ₁は、現行ファイル名Ｆ₁（下付き
は、シーケンス番号を示す）を包含すると更に想定す
る。以下の表１は、一連のロールオーバーの間中の現行
監査証跡ファイルの各ＡＤＰの視点を示す。When they start up, all ADPs associated with a multi-volume audit trail learn the name and sequence number of the current audit trail file. When the current ADP finds that its audit trail file has become full, it marks the file as full and sends a rollover request to the ADP that holds the next file in the audit trail's sequence of files. The rollover request contains the name and sequence number of the newly created audit trail file. Both ADPs taking part in the rollover update their data structures to show the name and sequence number of the new current audit trail file. The audit trail configuration management process 220 is informed of the rollover via an asynchronous notification. All other ADPs, which did not take part in the rollover, do not learn of the new current audit trail file, but they eventually update their data structures when they themselves receive a rollover request. This happens in due course because audit trail files are allocated among the various ADPs round-robin. The following example shows how the ADPs learn which file is current. Assume that three ADPs (ADP₁, ADP₂, and ADP₃) participate in a multi-volume audit trail. Assume further that at ADP start time, ADP₁ holds the current file, named F₁ (the subscript indicates the sequence number). Table 1 below shows each ADP's view of the current audit trail file over a series of rollovers.

【００２５】[0025]

【表１】 ───────────────────────────────────時間 ADP₁の視点 ADP₂の視点 ADP₃の視点起動後 ADP₁上のF₁ ADP₁上のF₁ ADP₁上のF₁ 第１のロールオーバー後 ADP₂上のF₂ ADP₂上のF₂ ADP₁上のF₁ （ADP₁からADP₂へ）第２のロールオーバー後 ADP₂上のF₂ ADP₃上のF₃ ADP₃上のF₃ （ADP₂からADP₃へ）第３のロールオーバー後 ADP₁上のF₄ ADP₃上のF₃ ADP₁上のF₄ （ADP₃からADP₁へ）・・・・図９から図１１は、現行監査証跡ファイルが満たされか
つ次の監査証跡ファイルへのハンドオフに対する準備が
行われたときの、ＡＤＰによって交換されたメッセー
ジ、監査証跡構成管理処理２２０、及び監査ジェネレー
タを示す。ＡＤＰ _mとラベル付けされたエントリーは、
現行ＡＤＰを表わす。ＡＤＰ_m+1とラベル付けされたエ
ントリーは、ロールオーバーが行われた後で監査を書き
込むことを開始するＡＤＰを表わす。ＡＤＰ_m及びＡＤ
Ｐ_m+1は、バックアップが取って代わらない限りプライ
マリＡＤＰであることが理解される。ＤＰ_xとラベル付
けされたエントリーは、それらの監査をＡＤＰ_mに現行
送付する全監査生成ディスク処理を表わす。ＴＭＰとラ
ベル付けされたエントリーは、監査証跡構成管理処理２
２０を表わす。メッセージシステムトラフィックがペー
ジを横切って前後に動くと同時に時間は、ページを下に
動く。ラインセグメントのヘッドにおける矢印は、メッ
セージシステムトラフィックの方向を示す。ラインセグ
メントの末端におけるプラスサイン（＋）は、オリジナ
ルメッセージとしてそれに印を付け、ラインセグメント
の末端におけるコロン（：）は、応答としてそれに印を
付ける。各メッセージは、それをその対応している応答
に関連付けるための番号を有する。スレショルドは、太
字のイタリック体で現れる。ノーウェイト(no-wait) ・
メッセージは、それの記述においてキーワートＮＯＷＡ
ＩＴを有する。全ての他のメッセージは、ウェイテッド
(waited)・メッセージであると想定される。[Table 1]
───────────────────────────────────
Time                                    ADP₁'s view    ADP₂'s view    ADP₃'s view
After startup                           F₁ on ADP₁     F₁ on ADP₁     F₁ on ADP₁
After first rollover (ADP₁ to ADP₂)     F₂ on ADP₂     F₂ on ADP₂     F₁ on ADP₁
After second rollover (ADP₂ to ADP₃)    F₂ on ADP₂     F₃ on ADP₃     F₃ on ADP₃
After third rollover (ADP₃ to ADP₁)     F₄ on ADP₁     F₃ on ADP₃     F₄ on ADP₁
...
FIGS. 9 to 11 show the messages exchanged among the ADPs, the audit trail configuration management process 220, and the audit generators as the current audit trail file fills and preparations are made for the handoff to the next audit trail file. The entry labeled ADP_m represents the current ADP. The entry labeled ADP_{m+1} represents the ADP that begins writing audit after the rollover takes place. ADP_m and ADP_{m+1} are understood to be primary ADPs unless a backup has taken over. The entry labeled DP_x represents all audit generating disk processes that currently send their audit to ADP_m. The entry labeled TMP represents the audit trail configuration management process 220. Time moves down the page while message system traffic moves back and forth across the page. An arrow at the head of a line segment shows the direction of message system traffic. A plus sign (+) at the tail of a line segment marks it as an original message, and a colon (:) at the tail of a line segment marks it as a reply. Each message has a number that associates it with its corresponding reply. Thresholds appear in bold italics. A no-wait message has the keyword NOWAIT in its description. All other messages are assumed to be waited messages.
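The way views of the current file spread in Table 1 can be reproduced with a short simulation. This is a sketch under the assumption that only the two ADPs taking part in a rollover update their view; the function name `simulate` and the zero-based ADP indices are illustrative.

```python
def simulate(n_adps, rollovers):
    """Return every ADP's view after startup and after each rollover.

    A view is a pair (sequence number, index of the owning ADP); index 0
    stands for ADP1, index 1 for ADP2, and so on.
    """
    views = [(1, 0)] * n_adps              # after startup: all know F1 on ADP1
    history = [list(views)]
    current, seq = 0, 1
    for _ in range(rollovers):
        nxt = (current + 1) % n_adps       # files are handed out round-robin
        seq += 1
        views[current] = views[nxt] = (seq, nxt)   # only the participants update
        current = nxt
        history.append(list(views))
    return history


for row in simulate(3, 3):
    print(["F%d on ADP%d" % (seq, owner + 1) for seq, owner in row])
```

Each printed row corresponds to one row of Table 1, with the stale views visible in the column of the ADP that did not take part in the latest rollover.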

【００２６】ＡＤＰ_mは、監査ジェネレータから監査レ
コードのバッファ、メッセージ１＋を受け取り、かつ確
認応答、メッセージ：１を送る。ＡＤＰ_mは、現行監査
証跡ファイルに監査レコードのバッファを記憶する。そ
れが監査を記憶すると、ＡＤＰ_mは、現行監査証跡ファ
イルの成長を監視する。図９は、最初のスレショルドで
ある“準備ロールオーバー・スレショルド”に到達する
と、ＡＤＰ_mは、メッセージ２＋を監査証跡構成管理処
理２２０に送り、新しい監査証跡ファイルを準備すべく
それに依頼することを示す。監査証跡構成管理処理２２
０は、ＡＤＰ_mからシステム関連サービスを必要としう
るので、メッセージ２＋は、ノーウェイトを送られる。
ウェイテッド・メッセージは、監査証跡構成管理処理２
２０とＡＤＰ_mの間にデッドロックをもたらしうる。図
１０は、その監査ジェネレータからの監査、メッセージ
３＋を受け取るべく継続しているＡＤＰ_mを示すが、監
査証跡構成管理処理２２０は、現行監査証跡ファイルが
第２のスレショルドである“監査保持スレショルド”に
到達するときまでに新しい監査証跡ファイルを準備して
いない。監査保持スレショルドに到達すると、ＡＤＰ_m
は、現行送付された監査、メッセージ４＋、応答では、
メッセージ：４、を受容するが、それは、その監査ジェ
ネレータに全てのそれらの新しい監査を無期限に保持す
ることを告げる。この監査保持スレショルドは、監査証
跡構成管理処理２２０が新しい監査証跡ファイルを通常
準備しかつ監査保持スレショルドが到達されるかなり前
にＡＤＰ_mに知らせるので一般に到達されない。監査保
持応答に応じて、監査ジェネレータは、それらが監査を
送ることを再開できるときにそれを依頼している現行Ａ
ＤＰにノーウェイト・メッセージ５＋を送る。ADP_m receives a buffer of audit records from an audit generator, message 1+, and sends an acknowledgment, message :1. ADP_m stores the buffer of audit records in the current audit trail file. As it stores audit, ADP_m monitors the growth of the current audit trail file. FIG. 9 shows that upon reaching the first threshold, the "prepare-rollover threshold", ADP_m sends message 2+ to the audit trail configuration management process 220, asking it to prepare a new audit trail file. Message 2+ is sent no-wait because the audit trail configuration management process 220 may require system-related services from ADP_m; a waited message could cause a deadlock between the audit trail configuration management process 220 and ADP_m. FIG. 10 shows ADP_m continuing to receive audit from its audit generators, message 3+, in the case where the audit trail configuration management process 220 has not prepared a new audit trail file by the time the current audit trail file reaches the second threshold, the "audit retention threshold". Upon reaching the audit retention threshold, ADP_m accepts the audit currently being sent, message 4+, but in its reply, message :4, it tells its audit generators to hold all of their new audit indefinitely. The audit retention threshold is generally not reached, because the audit trail configuration management process 220 normally prepares a new audit trail file and informs ADP_m well before the audit retention threshold is reached. In response to the audit-hold reply, the audit generators send the current ADP a no-wait message 5+ asking it to tell them when they may resume sending audit.

【００２７】監査証跡構成管理処理２２０は、オーバー
フロー記憶に先に保存または転送されたＡＤＰ_m+1にア
クセス可能な監査証跡ファイルを改名することによって
新しい監査証跡ファイルを準備する。新しい監査証跡フ
ァイルは、現行監査証跡ファイルのシーケンス番号より
も一つ高いシーケンス番号が割当てられる。監査証跡構
成管理処理２２０が新しい監査証跡ファイルを準備した
すぐ後に、それは、応答メッセージ：２をＡＤＰ_mに送
ることによってＡＤＰ_mに知らせる。応答メッセージ：
２は、新しい監査証跡ファイルの名前及びシーケンス番
号を含む。現行ＡＤＰ_mが監査を保持すべくいずれかの
監査ジェネレータに告げたならば、それは、それらがそ
れらの監査を送ることを再開できる、メッセージ：５を
それらに知らせる。次に、監査ジェネレータは、再びＡ
ＤＰ_mにそれらの監査を送ることを開始する。図１１
は、ＡＤＰ_mが、現行監査証跡ファイルの残りのものに
合致しない監査レコードのバッファを書き込むための要
求８＋をそれが得るまで、監査、メッセージ７＋及び：
７を受け取ることを継続することを示す。換言すると、
監査証跡ファイルは、ファイル・フル・スレショルド(f
ile full threshold) に到達した。この地点で、ＡＤＰ
_mは、全ての先に受け取った未だ書き込まれていない監
査をディスクに書き込む。次に、それは、いっぱいであ
る（フル）としてファイルに印を付けてノーウェイト・
ロールオーバーメッセージ９＋をＡＤＰ_m+1に送り新し
いＡＤＰになることをそれに告げる。そして、それは、
ファイル・フル応答であるメッセージ：８で監査ジェネ
レータに応答して、ＡＤＰ_m+1の名前を示す。監査は、
ここでＡＤＰ_m+1、メッセージ１０＋及び：１０に指向
される。ロールオーバーの更なる詳細は、以下に図１２
を参照して説明する。The audit trail configuration management process 220 prepares a new audit trail file by renaming an audit trail file, accessible to ADP_{m+1}, whose contents were previously archived or transferred to overflow storage. The new audit trail file is assigned a sequence number one higher than that of the current audit trail file. As soon as the audit trail configuration management process 220 has prepared the new audit trail file, it informs ADP_m by sending it the reply message :2. Reply message :2 contains the name and sequence number of the new audit trail file. If the current ADP_m has told any audit generators to hold audit, it informs them, with message :5, that they may resume sending their audit. The audit generators then begin sending their audit to ADP_m again. FIG. 11 shows ADP_m continuing to receive audit, messages 7+ and :7, until it gets a request 8+ to write a buffer of audit records that does not fit in the remainder of the current audit trail file. In other words, the audit trail file has reached the file full threshold. At this point, ADP_m writes all previously received, not yet written audit to disk. It then marks the file as full and sends a no-wait rollover message 9+ to ADP_{m+1}, telling it that it is to become the new current ADP. It then replies to the audit generator with message :8, a file-full reply giving the name of ADP_{m+1}. Audit is now directed to ADP_{m+1}, messages 10+ and :10. Further details of the rollover are described below with reference to FIG. 12.
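The three thresholds of FIGS. 9 to 11 can be summarized in a small state model. The sketch below assumes sizes in arbitrary units and represents messages as strings in an `outbox` list; the class `Adp`, its attribute names, and the reply strings are hypothetical and only approximate the message exchange described above.

```python
class Adp:
    """Toy model of the current ADP's reaction to incoming audit buffers."""

    def __init__(self, max_size, prepare_threshold, hold_threshold):
        self.max_size = max_size
        self.prepare_threshold = prepare_threshold
        self.hold_threshold = hold_threshold
        self.size = 0                  # amount of audit stored in the current file
        self.prepare_requested = False
        self.next_file_ready = False   # would be set when reply :2 arrives
        self.full = False
        self.outbox = []               # no-wait messages sent by this ADP

    def receive(self, nbytes):
        """Handle one buffer of audit records and return the reply."""
        if self.full or self.size + nbytes > self.max_size:
            if not self.full:          # buffer does not fit: file full threshold
                self.full = True
                self.outbox.append("ROLLOVER")   # message 9+ to the next ADP
            return "FILE_FULL"         # sender must resend to the new ADP
        self.size += nbytes
        if self.size >= self.prepare_threshold and not self.prepare_requested:
            self.prepare_requested = True
            self.outbox.append("PREPARE")        # message 2+ to process 220
        if self.size >= self.hold_threshold and not self.next_file_ready:
            return "HOLD"              # reply :4: hold new audit
        return "OK"
```

With `Adp(100, 49, 70)` and six buffers of 20 units, the replies pass from "OK" to "HOLD" to "FILE_FULL", and the rollover message is queued only once.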

【００２８】“遅れている監査ジェネレータ”と通称さ
れる、他の監査ジェネレータは、それらが、ＡＤＰ_m+1
の名前を含むファイル・フル応答をその次にそれらに与
えるＡＤＰ_mに監査を送るまでロールオーバーについて
学ばない。次に、これらの監査ジェネレータは、ＡＤＰ
_m+1にそれらの監査を送ることを開始する。図１１にお
いて、メッセージ８への応答は、メッセージ９への応答
の前に発生するということに注目してほしい。古いＡＤ
Ｐは、それが新しいＡＤＰからロールオーバーメッセー
ジへの応答を受け取る前にその監査ジェネレータにファ
イル・フル応答を与える。これは、新しいＡＤＰがロー
ルオーバーメッセージを受け取る前に監査ジェネレータ
が新しいＡＤＰに監査を正確に送るという可能レース条
件へ導く。図１１を参照すると、これは、メッセージ９
＋の前にメッセージ１０＋がＡＤＰ_m+1に理論的に到着
しうるということを意味する。この情況は、破局的なＣ
ＰＵ故障の場合にだけ発生すべきである。それが現行
ＡＤＰであるということを認識する前にＡＤＰ_m+1が監
査レコードのバッファを受け取った場合には、それは、
現時点よりも一つ前に既知の現行ＡＤＰの名前と一緒に
ファイル・フル応答を与える。監査ジェネレータは、そ
れが現行ＡＤＰに戻りかつ現行ＡＤＰがそのロールオー
バーメッセージを受け取るまでトレール上の全てのＡＤ
Ｐのサイクリック・ツアー(cyclic tour) をそれゆえに
取る。Other audit generators, commonly called "lagging audit generators", do not learn of the rollover until they send audit to ADP_m, which then gives them a file-full reply containing the name of ADP_{m+1}. These audit generators then begin sending their audit to ADP_{m+1}. Note that in FIG. 11 the reply to message 8 occurs before the reply to message 9. The old ADP gives its audit generator the file-full reply before it receives the reply to the rollover message from the new ADP. This leads to a possible race condition in which an audit generator sends audit to the new ADP before the new ADP has received the rollover message. Referring to FIG. 11, this means that message 10+ could theoretically arrive at ADP_{m+1} before message 9+. This situation should arise only in the case of a catastrophic CPU failure. If ADP_{m+1} receives a buffer of audit records before it realizes that it is the current ADP, it gives a file-full reply with the name of the last current ADP known to it. The audit generator therefore takes a cyclic tour of all of the ADPs on the trail until it returns to the current ADP and the current ADP has received its rollover message.
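The cyclic tour can be illustrated by following file-full replies from ADP to ADP. In this sketch, `views[i]` is the index of the ADP that ADP i believes to be current, and an optional `pending` pair models a rollover message still in flight; the function `deliver` and both parameters are assumptions made for illustration.

```python
def deliver(views, start, pending=None, max_hops=100):
    """Follow file-full replies until an ADP accepts the audit.

    views:   views[i] is the ADP that ADP i believes to be current
    start:   the ADP to which the audit generator first sends its buffer
    pending: optional (hops, adp) pair; after that many redirects the
             rollover message reaches `adp`, which then regards itself as current
    Returns the list of ADPs visited, ending with the one that accepted.
    """
    views = list(views)
    path, target = [], start
    for hop in range(max_hops):
        if pending and hop == pending[0]:
            views[pending[1]] = pending[1]   # rollover message finally arrives
        path.append(target)
        if views[target] == target:          # this ADP knows it is current
            return path
        target = views[target]               # file-full reply names another ADP
    raise RuntimeError("no ADP accepted the audit")
```

With the views of the last row of Table 1, `deliver([0, 2, 0], 1)` visits ADP₂, ADP₃, and ADP₁ in turn. With a rollover from ADP₁ to ADP₂ still in flight, `deliver([1, 2, 0], 1, pending=(3, 1))` tours all three ADPs once before ADP₂ accepts.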

【００２９】ロールオーバーメッセージ９がウェイテッ
ドにされたならばそのようなサイクリック・ツアーは、
完全に回避されうるが、それがその次にその監査ジェネ
レータのいずれにも応答できる前に古いＡＤＰがロール
オーバーメッセージへの応答を待たなけらばならないの
で、これは、一般的な場合に性能を低減する。準備ロー
ルオーバー及び監査保持スレショルドの計算をここで説
明する。これらのスレショルドの計算の基本原理を理解
するために、ピン(pins)と以下称される、マルチ・コー
ポレーティング監査生成処理の集合として監査ジェネレ
ータを考えることは、有用である。監査ジェネレータ
は、ＡＤＰが監査を保持すべくディスクボリュームに告
げた後で短い時間についてその現行ＡＤＰに監査を送る
ことを継続しうる。ＡＤＰが監査を保持すべくそれに告
げたときに、そのピンのあるものが監査生成要求の代わ
りに実行しうるので、監査ジェネレータは、これを行
う。また、監査ジェネレータは、ＡＤＰが監査を保持す
べく告げたときにその監査バッファに監査レコードを既
に累積しうる。それゆえに、ＡＤＰは、それがこのエキ
ストラ監査をうまく受け入れることができるように監査
保持スレショルドを設定することが必要である。以下の
数式に従って監査保持スレショルドを設定することは、
エキストラ監査が受け入れられることを確実にする：ＡＨＴ＝ＭＦ−（Ｂ＊（ＤＶ＋ＤＶＰ））ここで、ＡＨＴは、監査保持スレショルド、ＭＦは、監
査証跡ファイルの最大の大きさ、Ｂは、好ましい実施例
では、約３２Ｋである、監査ジェネレータの監査バッフ
ァの大きさ、ＤＶは、ＡＤＰに監査を送る監査ジェネレ
ータの数、そしてＤＶＰは、ＡＤＰに監査を送るそれら
の監査ジェネレータに対するピンの総数である。監査証
跡構成管理処理２２０も監査を送るので、それは、数式
に対する単一ピン監査ジェネレータとして考えられるべ
きである。Such a cyclic tour could be avoided entirely if rollover message 9 were sent waited, but this would reduce performance in the common case, because the old ADP would have to wait for the reply to the rollover message before it could reply to any of its audit generators. The calculation of the prepare-rollover and audit retention thresholds is now described. To understand the rationale for these thresholds, it is useful to think of an audit generator as a set of multiple cooperating audit generating processes, hereinafter called pins. An audit generator may continue to send audit to its current ADP for a short time after the ADP has told the disk volume to hold audit. It does so because some of its pins may be executing on behalf of audit generating requests when the ADP tells it to hold audit. The audit generator may also already have accumulated audit records in its audit buffer when the ADP tells it to hold audit. The ADP therefore needs to set the audit retention threshold so that it can still accept this extra audit. Setting the audit retention threshold according to the following formula ensures that the extra audit can be accepted:

AHT = MF - (B * (DV + DVP))

where AHT is the audit retention threshold, MF is the maximum size of an audit trail file, B is the size of an audit generator's audit buffer (about 32K in the preferred embodiment), DV is the number of audit generators sending audit to the ADP, and DVP is the total number of pins of those audit generators. Because the audit trail configuration management process 220 also sends audit, it should be counted as a single-pin audit generator for the purposes of the formula.

【００３０】項（Ｂ＊（ＤＶ＋ＤＶＰ））は、それらの
監査を保持すべくその関連監査ジェネレータに告げた後
でＡＤＰが（少なくとも理論的に）受け取ることができ
る監査の量を表わす。ピンの数に監査ジェネレータの数
を加えることは、それがその次の要求を処理することを
始めたときに各監査ジェネレータのバッファが完全にい
っぱいでありかつ監査レコードで全バッファを満たす要
求を監査ジェネレータの各ピンが処理するという可能性
をアドレスする。上記に示した監査保持スレショルドの
数式は、非常に保守的である。それは、全監査の記憶を
保証するために一般に必要なものよりもさらに低いスレ
ショルドを生成する。好ましい実施例では、準備ロール
オーバー・スレショルドは、監査保持スレショルドの７
０％に設定される。また、この値は、柔軟に構成するこ
とができうるかまたは実際の監査生成割合い（監査生成
率）から導き出しうる。ＡＤＰが起動するときには、い
くつの監査ジェネレータがそれに監査を送るのかをそれ
は知らないし、各監査ジェネレータ上のピンの数もそれ
は知らない。それゆえに、ＡＤＰは、監査保持スレショ
ルド適応的に再計算しかつそれが新しい監査ジェネレー
タについて学ぶ毎にロールオーバー・スレショルドを準
備しなけらばならない。監査ジェネレータが監査をＡＤ
Ｐに送る度に、それは、その処理グループのピンの数を
示す。初めてＡＤＰが特定の監査ジェネレータから監査
を得たときには、それは、監査ジェネレータの数及びピ
ンの数（上記の数式のＤＶ及びＤＶＰ）を更新しかつ監
査保持スレショルドを再計算してロールオーバー・スレ
ショルドを準備する。The term (B * (DV + DVP)) represents the amount of audit that an ADP could (at least in theory) receive after telling its associated audit generators to hold their audit. Adding the number of audit generators to the number of pins addresses the possibility that every audit generator's buffer is completely full when it begins processing its next requests, and that every pin of every audit generator is processing a request that fills an entire buffer with audit records. The formula for the audit retention threshold given above is very conservative; it yields a threshold lower than is generally needed to guarantee that all audit is stored. In the preferred embodiment, the prepare-rollover threshold is set to 70% of the audit retention threshold. This value could also be made configurable or derived from the actual audit generation rate. When an ADP starts up, it does not know how many audit generators will send it audit, nor how many pins each audit generator has. The ADP must therefore adaptively recalculate the audit retention threshold and the prepare-rollover threshold each time it learns of a new audit generator. Each time an audit generator sends audit to the ADP, it indicates the number of pins in its process group. The first time the ADP gets audit from a particular audit generator, it updates the number of audit generators and the number of pins (DV and DVP in the formula above) and recalculates the audit retention threshold and the prepare-rollover threshold.
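The adaptive threshold calculation can be sketched as follows, assuming a buffer size B of exactly 32 × 1024 bytes and sizes measured in bytes. The class `Thresholds`, its method names, and the example file size of 100 MB are illustrative; only the formula AHT = MF - (B * (DV + DVP)) and the 70% figure come from the text.

```python
BUFFER_SIZE = 32 * 1024   # B: "about 32K" in the preferred embodiment (assumed bytes)


class Thresholds:
    """Recompute both thresholds as new audit generators become known."""

    def __init__(self, max_file_size, prepare_percent=70):
        self.max_file_size = max_file_size      # MF
        self.prepare_percent = prepare_percent
        self.pins_by_generator = {}             # generator id -> number of pins

    def note_audit_from(self, generator, pins):
        """Record the sender and its pin count each time audit arrives."""
        self.pins_by_generator[generator] = pins

    @property
    def audit_retention(self):
        dv = len(self.pins_by_generator)            # DV
        dvp = sum(self.pins_by_generator.values())  # DVP
        return self.max_file_size - BUFFER_SIZE * (dv + dvp)

    @property
    def prepare_rollover(self):
        return self.audit_retention * self.prepare_percent // 100


t = Thresholds(100 * 1024 * 1024)
t.note_audit_from("DP1", pins=4)
t.note_audit_from("TMP", pins=1)   # process 220 counts as a single-pin generator
```

Here DV is 2 and DVP is 5, so seven buffers of headroom are reserved below the maximum file size, and the prepare-rollover threshold follows at 70% of the result.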

【００３１】ＡＤＰが準備ロールオーバー・スレショル
ドを再計算する度に、それは、この値を現行監査証跡フ
ァイルの大きさと比較しなけらばならない。ファイルの
大きさが準備ロールオーバー・スレショルドを越えかつ
次のファイルがまだ準備されていないならば、ＡＤＰ
は、新しい監査証跡ファイルを準備することをそれに依
頼している監査証跡構成管理処理２２０に要求をすぐ送
らなけらばならない。ロールオーバーの間中にＡＤＰ_m
またはＡＤＰ_m+1の故障が、それがあたかも現行である
ようにＡＤＰが動作することがない情況、即ち“バトン
を落とす”か、一つ以上のＡＤＰがあたかもそれが現行
であるとして動作する情況、即ち“バトンを壊す”のい
ずれかをもたらさないということは、重要である。本発
明は、監査ジェネレータ、ＡＤＰ_m、ＡＤＰ_m+1、及び
それらのバックアップの中でフォールトトレラント・ロ
ールオーバーメッセージ・プロトコルを提供する。図１
２は、本発明によるフォールトトレラント・ロールオー
バーメッセージ・プロトコルを示す。円は、監査ジェネ
レータ７００、プライマリＡＤＰ_m７０２、プライマリ
ＡＤＰ_m+1７０４、及びそれらのバックアップ７０６及
び７０８のような、処理を表す。矢印は、エンティティ
間を前後に通過するメッセージを表わす。ロールオーバ
ーは、プライマリＡＤＰ_mとプライマリＡＤＰ_m+1の間
で発生する。Each time the ADP recalculates the prepare-rollover threshold, it must compare the new value with the size of the current audit trail file. If the file size exceeds the prepare-rollover threshold and the next file has not yet been prepared, the ADP must immediately send a request to the audit trail configuration management process 220 asking it to prepare a new audit trail file. It is important that a failure of ADP_m or ADP_{m+1} during a rollover results in neither a situation in which no ADP behaves as though it is current ("dropping the baton") nor a situation in which more than one ADP behaves as though it is current ("breaking the baton"). The present invention provides a fault tolerant rollover message protocol among the audit generators, ADP_m, ADP_{m+1}, and their backups. FIG. 12 shows the fault tolerant rollover message protocol according to the present invention. Circles represent processes: an audit generator 700, primary ADP_m 702, primary ADP_{m+1} 704, and their backups 706 and 708. Arrows represent messages passing back and forth between the entities. The rollover takes place between primary ADP_m and primary ADP_{m+1}.

【００３２】The current primary ADP, ADP_m, receives from an audit generator a buffer of audit records, message A, which corresponds to message 8+ of FIG. 6. As described with reference to FIG. 6, primary ADP_m must mark the current audit trail file as full and send a rollover request, message B (message 9+ of FIG. 6), to primary ADP_{m+1}. Primary ADP_{m+1} then sends a checkpoint, message C, to backup ADP_{m+1}, indicating that it has become the current ADP. Backup ADP_{m+1} returns an acknowledgment, message D. After primary ADP_{m+1} has returned an acknowledgment of the original rollover request, message E (corresponding to message :9 of FIG. 6), primary ADP_m sends a checkpoint, message F, to backup ADP_m, indicating that it is no longer the current ADP. When backup ADP_m returns an acknowledgment, message G, the rollover is complete and the protocol ends. ADP_m replies to the audit generator with message X, which corresponds to message :8 of FIG. 6, telling it that the audit trail file has become full and that it must resend the buffer of audit records to the new primary ADP. This reply can be made at any time after message B and before message F.
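The failure-free message order described above can be written down as data, together with a small check of the one degree of freedom the text allows, namely where reply X may fall. This is a minimal sketch: the constant and function names are hypothetical, and the message labels and meanings are taken from the paragraph above.

```python
# Failure-free rollover as (label, sender, receiver, meaning).
ROLLOVER_MESSAGES = [
    ("A", "audit generator", "primary ADP_m", "buffer of audit records"),
    ("B", "primary ADP_m", "primary ADP_m+1", "rollover request"),
    ("C", "primary ADP_m+1", "backup ADP_m+1", "checkpoint: now current"),
    ("D", "backup ADP_m+1", "primary ADP_m+1", "checkpoint acknowledgment"),
    ("E", "primary ADP_m+1", "primary ADP_m", "rollover acknowledgment"),
    ("F", "primary ADP_m", "backup ADP_m", "checkpoint: no longer current"),
    ("G", "backup ADP_m", "primary ADP_m", "checkpoint acknowledgment"),
]

# Messages exchanged among ADP processes only (message A and reply X involve the generator).
ADP_ONLY_MESSAGES = [m[0] for m in ROLLOVER_MESSAGES if "audit generator" not in m[1:3]]


def is_valid_order(labels):
    """Check an observed order of A..G plus reply X against the described protocol."""
    pos = {label: i for i, label in enumerate(labels)}
    core = [m[0] for m in ROLLOVER_MESSAGES]
    in_order = all(pos[a] < pos[b] for a, b in zip(core, core[1:]))
    # Reply X to the audit generator may come any time after B and before F.
    return in_order and pos["B"] < pos["X"] < pos["F"]
```

For example, `is_valid_order(list("ABXCDEFG"))` and `is_valid_order(list("ABCDEXFG"))` both hold, while an order that places X before B or after F is rejected.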

【００３３】This rollover protocol of the present invention involves only six messages (including acknowledgments) among the various ADP processes (including backups). It can be shown that a fault tolerant protocol based on process pairs cannot use fewer messages. The manner in which this rollover protocol handles failures is now described. The rollover protocol must be able to handle single CPU failures, but not double CPU failures. Because the primary and backup ADP run in different CPUs, at least one of them survives a single CPU failure. Failure of a backup ADP is harmless. When the CPU of a backup ADP fails, it is reloaded. As part of the CPU reload, the backup ADP is restarted, and the primary sends it checkpoint messages that bring all of its data structures up to date, so that it is once again ready to take over if the primary fails. As long as the backup ADP is completely restarted before its primary fails, the rollover protocol succeeds.

【００３４】Furthermore, at several points in the protocol the loss of a primary ADP clearly does not cause the protocol to fail. For example, if primary ADP_{m+1} fails before primary ADP_m sends message B or after backup ADP_{m+1} receives message C, backup ADP_{m+1} has sufficient information to take over successfully and become primary. Likewise, if primary ADP_m fails before it receives message A or after backup ADP_m receives message F, backup ADP_m has sufficient information to take over successfully and become primary. And if both primary ADP_m and primary ADP_{m+1} fail after primary ADP_m sends message B but before primary ADP_{m+1} sends message C, both backup ADP_m and backup ADP_{m+1} have sufficient information to take over successfully and become primary. Note that in this case the audit generator must resend message A to backup ADP_m, which takes over and becomes primary. Four failure scenarios remain to be considered:
I) Primary ADP_m fails after it sends message B but before it sends message F.
II) Primary ADP_{m+1} fails after it receives message B but before it sends message C.
III) Primary ADP_{m+1} fails after it sends message C but before it replies with message E.
IV) Primary ADP_m and primary ADP_{m+1} both fail after primary ADP_{m+1} sends message C but before primary ADP_m sends message F.

【００３５】In case I, backup ADP_m has not yet received message F, so when it takes over it still thinks it is current. Because primary ADP_{m+1} successfully received message B, it also thinks it is current. Both backup ADP_m and primary ADP_{m+1} therefore believe they are current. However, they do not both act as if they were current. Recall that primary ADP_m marked the current audit trail file as full before it sent message B. If backup ADP_m (now primary) receives new audit, it will recognize that the audit trail file is full and simply restart the rollover. Primary ADP_{m+1} will recognize that the sequence number of the rollover request is equal to or precedes the sequence number it already believes to be current. Primary ADP_{m+1} therefore acknowledges the request but otherwise ignores it, because it has already begun using an audit trail file with a sequence number equal to or greater than the one specified in the request.

【００３６】Suppose, however, that backup ADP_m (now primary) does not receive any audit that would cause it to restart the rollover. It then receives a rollover request itself, even though it already thinks it is current. Because the sequence number in this rollover request exceeds the sequence number it already believes to be current, it closes its current audit trail file and begins using the audit trail file specified in the rollover request. As a result, the audit trail returns to a state in which only one ADP thinks it is current. In case II, primary ADP_m is notified that primary ADP_{m+1} has failed, and it resends the rollover request to backup ADP_{m+1}, which has taken over and become primary. Because backup ADP_{m+1} never received message C, this is the first it has learned of the rollover, and it proceeds with the protocol as in the normal case. Case III is similar to case II. Primary ADP_m receives notification that primary ADP_{m+1} has failed and resends the rollover request to backup ADP_{m+1}, which takes over and becomes primary. Because backup ADP_{m+1} received message C, it is already current. On receiving the resent rollover request, backup ADP_{m+1} will recognize that the sequence number of the request is equal to or precedes the sequence number of the file it already believes to be current. Backup ADP_{m+1} therefore acknowledges the resent rollover request but otherwise ignores it, because it has already begun using an audit trail file with a sequence number equal to or greater than the one specified in the request.

【００３７】Case IV is similar to case I. Both backup ADP_m and backup ADP_{m+1} take over and become primary. Because backup ADP_{m+1} received message C, it will think it is current. Because backup ADP_m never received message F, it too will think it is current. Both ADPs think they are current, but the ADPs will eventually correct the situation. As in case I, the ADPs in question rely on the sequence number of the rollover request and on the fact that primary ADP_m marked the current audit trail file as full before initiating the rollover. If backup ADP_m (now primary) receives new audit, it will recognize that the audit trail file is full and simply restart the rollover. Backup ADP_{m+1} (now primary) will recognize that the sequence number of the rollover request is equal to or precedes the sequence number it already believes to be current. Backup ADP_{m+1} therefore acknowledges the request but otherwise ignores it, because it has already begun using an audit trail file with a sequence number equal to or greater than the one specified in the request.

【００３８】If backup ADP_m does not receive audit that would cause it to restart the rollover, it receives a rollover request itself, even though it already thinks it is current. Because the sequence number in this rollover request exceeds the sequence number it already believes to be current, it closes its current audit trail file and begins using the audit trail file specified in the rollover request. Again, the audit trail returns to a state in which only one ADP acts as the current ADP. The action that an ADP believing itself current takes on receiving a rollover request therefore depends on the sequence number. If an ADP does not believe itself to be current and receives a rollover request, three scenarios are possible. If the sequence number of the request is greater than the previous sequence number known to the ADP, this is the normal situation and this ADP should become current. If the sequence number of the request is equal to the previous sequence number, a protocol failure has occurred, since this cannot happen when all processes follow the rules described above, and the ADP fails. If the sequence number of the request is less than the known sequence number, the requester has fallen behind for some reason, and the ADP acknowledges the request but otherwise ignores it.
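The sequence-number rules in the preceding paragraphs can be condensed into a single decision function. The sketch below is illustrative only: the enum and function names are hypothetical, and sequence numbers are assumed to be plain integers that increase with each new audit trail file.

```python
from enum import Enum


class Action(Enum):
    BECOME_CURRENT = "become current"
    FAIL = "protocol error: fail immediately"
    ACK_AND_IGNORE = "acknowledge the request but do not process it"
    SWITCH_FILE = "close old file and use the file named in the request"


def on_rollover_request(believes_current: bool, known_seq: int, request_seq: int) -> Action:
    """Decide what an ADP does when a rollover request arrives."""
    if believes_current:
        if request_seq > known_seq:
            # This ADP is behind: the trail has moved past the file it holds.
            return Action.SWITCH_FILE
        # Equal or older sequence number: the requester is lagging.
        return Action.ACK_AND_IGNORE
    if request_seq > known_seq:
        return Action.BECOME_CURRENT      # normal case
    if request_seq == known_seq:
        return Action.FAIL                # cannot occur if all processes follow the rules
    return Action.ACK_AND_IGNORE          # lagging requester
```

Keeping the whole decision in one pure function makes it straightforward to check each of the five combinations of ADP state and sequence-number comparison independently.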

【００３９】[0039]

【表２】 [Table 2]
───────────────────────────────────
ADP state                          Sequence number condition                          Action
───────────────────────────────────
This ADP thinks it is not current  Request sequence number > number known to the ADP   Normal case. This ADP should become current.
This ADP thinks it is not current  Request sequence number = number known to the ADP   Protocol error! This should not occur, so the ADP fails immediately.
This ADP thinks it is not current  Request sequence number < number known to the ADP   Lagging requester. This ADP should acknowledge the request but not process it.
This ADP thinks it is current      Request sequence number > number known to the ADP   This ADP is behind. It should close the old audit trail file and begin using the new file specified in the request.
This ADP thinks it is current      Request sequence number <= number known to the ADP  Lagging requester. This ADP should acknowledge the request but not process it.
───────────────────────────────────

【００４０】A configured audit trail has limited capacity. The active audit trail capacity is calculated as n * m * x, where n is the number of active volumes in the audit trail, m is the number of audit trail files per volume, and x is the capacity of a single audit trail file. This capacity limit becomes important when the rate of audit generation exceeds the rate at which storage space becomes available for new audit, i.e. the rate at which full audit files become eligible for rename. For example, the operator may not be available to mount a tape for the audit dump, or there may be a sudden burst of audit generation that fills the audit trail. The present invention lets the system operator designate selected disk volumes for overflow audit trail storage. In operation, overflow storage is used once a configurable threshold percentage of the active audit trail capacity is occupied by full audit trail files that are not eligible for rename. The overflow threshold is generally configured to be somewhere between 50% and 100% of the audit trail capacity. An optimal threshold could also be computed from the audit generation rate and the active audit trail capacity. Audit trail records are moved to the overflow volumes to make room for newly generated audit.
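As a worked illustration of the capacity formula n * m * x and the overflow threshold, the following sketch computes both. The function names, the byte units, and the choice to trigger overflow when usage reaches (rather than strictly exceeds) the threshold are assumptions made for the example.

```python
def active_capacity(n_volumes: int, files_per_volume: int, file_capacity: int) -> int:
    """Active audit trail capacity: n * m * x."""
    return n_volumes * files_per_volume * file_capacity


def overflow_needed(pinned_bytes: int, capacity: int, threshold_percent: int) -> bool:
    """True once full files not eligible for rename occupy the threshold percentage.

    Integer arithmetic avoids floating-point rounding at the boundary.
    """
    return pinned_bytes * 100 >= capacity * threshold_percent
```

With 4 active volumes, 5 files per volume, and 100 units per file, the capacity is 2000 units; with an 80% overflow threshold, overflow storage comes into use once 1600 units are held by files that cannot yet be renamed.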

【００４１】The audit trail configuration management process 220 maintains an overflow audit configuration data structure. This data structure includes an overflow volume linked list that has an entry for each ADP responsible for an overflow volume. FIG. 13 shows the steps of using overflow audit trail storage according to the present invention. At step 800, the audit trail configuration management process 220 determines that the audit trail is filled to the threshold percentage or more. At step 802, the oldest audit trail file (lowest sequence number) that is not yet eligible for rename is copied to one of the volumes configured for that purpose and is marked as eligible for rename. The audit trail configuration management process 220 selects the target overflow volume by cycling through the overflow volume linked list until it finds an ADP entry that has space available on its disk storage. If for some reason overflow and regular audit trail storage share the same disk volume, and an active audit trail file on that disk volume is to be copied to overflow space on the same disk volume, the file is simply renamed as an overflow file and a new preallocated audit trail file is created in the overflow space of the disk volume.
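The selection of a target overflow volume by cycling through the list can be sketched as below. This is an assumption-laden illustration: the class names are hypothetical, a Python list with a moving cursor stands in for the circularly traversed linked list, and free space is tracked as a simple byte count.

```python
from dataclasses import dataclass


@dataclass
class OverflowEntry:
    adp_name: str      # ADP responsible for this overflow volume
    free_bytes: int    # space still available on its disk volume


class OverflowRing:
    """List of overflow entries traversed circularly from a remembered cursor."""

    def __init__(self, entries):
        self.entries = list(entries)
        self.cursor = 0

    def pick_target(self, file_size: int):
        """Return the next entry with room for file_size, or None if none has room."""
        for step in range(len(self.entries)):
            index = (self.cursor + step) % len(self.entries)
            entry = self.entries[index]
            if entry.free_bytes >= file_size:
                entry.free_bytes -= file_size
                # Resume after this entry next time so that copies rotate among volumes.
                self.cursor = (index + 1) % len(self.entries)
                return entry
        return None
```

Starting the next search after the entry just used spreads overflow files across the configured volumes instead of filling the first volume before touching the others.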

【００４２】At step 804, the system terminal 126 displays a warning that something must be done to eliminate the use of overflow space. The appropriate action depends on why the audit trail files are not eligible for rename. If audit has not been dumped, the operator should mount a tape. If a pending transaction is preventing a file from being renamed, the transaction should be terminated. If a CPU running an audit generator needs the audit for cache recovery, the operator should wait for the periodic flushing of the cache. If another process, such as remote backup, is denying rename eligibility for many audit trail files, that process should be investigated. In addition, if storage space is available, the operator may increase the number of files per volume or add volumes to the audit trail. Once the audit stored in an overflow file is no longer needed because it has been archived, the overflow file is purged. If the audit trail continues to fill despite the use of overflow space, the begin-transaction-disabled threshold may be reached. At this threshold, the audit trail configuration management process 220 does not allow new transactions. The threshold should be configured so that enough audit trail capacity remains to accommodate the audit resulting from pending transactions. When audit trail capacity becomes available such that the begin-transaction-disabled threshold is no longer exceeded, the audit trail configuration management process 220 again allows new transactions.
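The gating of new transactions at the begin-transaction-disabled threshold can be sketched as a small stateful check. The class name is hypothetical, and the choice to disable new transactions when usage reaches the threshold exactly is an assumption, since the text does not settle the boundary case.

```python
class TransactionGate:
    """Allows or refuses new transactions based on audit trail consumption."""

    def __init__(self, capacity: int, disable_percent: int):
        self.capacity = capacity                # active audit trail capacity
        self.disable_percent = disable_percent  # begin-transaction-disabled threshold
        self.pinned_bytes = 0                   # held by files not eligible for rename

    def update(self, pinned_bytes: int) -> None:
        """Record the latest consumption figure."""
        self.pinned_bytes = pinned_bytes

    @property
    def new_transactions_allowed(self) -> bool:
        # Re-evaluated on every access, so transactions resume as soon as
        # enough capacity has been freed.
        return self.pinned_bytes * 100 < self.capacity * self.disable_percent
```

Pending transactions are unaffected by this check; only the start of new transactions is refused, which is why the threshold has to leave room for the audit those pending transactions will still generate.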

【００４３】The system manager can also specify a set of disk volumes to hold audit trail files restored from audit dumps as part of a recovery procedure. The audit trail configuration management process 220 maintains another volume linked list containing entries for the ADPs designated to receive restored audit trail files. As with overflow, the audit trail configuration management process 220 cycles through this volume linked list to find the next ADP that has space to receive restored audit. The present invention provides the ability to reconfigure the audit trail without restarting the fault tolerant computer system 200. Volumes can be added to or deleted from the sets of volumes used to hold the active audit trail, its overflow space, and its restored files. The number of active files per volume, the overflow threshold, and the begin-transaction-disabled threshold can also be changed. When a new active volume is added to the audit trail, extra files are allocated on that volume, adding capacity to the active audit trail. Deleting an active volume from the audit trail has the net effect of reducing the number of active audit trail files.

【００４４】When an active volume is deleted, its entry on the volume linked list is marked so that it is no longer used to hold new audit trail files. However, any files already on the volume remain until they are no longer needed. Once none of the files on the volume is needed any longer, the volume is removed from any further role in the configuration. This means that a deleted volume is in a transitional "deleting" state for as long as it still contains audit trail files. When a previously deleted volume is restored to the active state, it is marked as such, becomes available again to hold new audit files, and retains its position in the linked list. FIG. 14 shows an audit trail status display 900 according to the present invention. The status display 900 includes an audit trail consumption bar graph 902, an audit trail file status chart 904, and a status information area 906. The audit trail consumption bar graph 902 shows the percentage of audit trail capacity consumed by audit trail files that are not eligible for rename. The overflow threshold and the begin-transaction-disabled threshold are clearly marked on the bar graph. As a result, the system operator is given an easily understood view of current audit trail activity.
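The life cycle of a deleted active volume described above (active, then a transitional deleting state, then removed once its files are no longer needed) can be sketched as a small state machine. All names here are hypothetical, and a simple counter of still-needed files stands in for the real bookkeeping.

```python
from enum import Enum


class VolumeState(Enum):
    ACTIVE = "active"
    DELETING = "deleting"   # transitional: still holds needed audit trail files
    REMOVED = "removed"     # no further role in the configuration


class ActiveVolume:
    def __init__(self, name: str):
        self.name = name
        self.state = VolumeState.ACTIVE
        self.files_still_needed = 0

    def accepts_new_files(self) -> bool:
        return self.state is VolumeState.ACTIVE

    def file_created(self) -> None:
        if not self.accepts_new_files():
            raise ValueError(f"{self.name} no longer holds new audit trail files")
        self.files_still_needed += 1

    def file_no_longer_needed(self) -> None:
        self.files_still_needed -= 1
        self._settle()

    def delete(self) -> None:
        self.state = VolumeState.DELETING
        self._settle()

    def restore(self) -> None:
        # Only a volume still in the transitional state can return to active.
        if self.state is VolumeState.DELETING:
            self.state = VolumeState.ACTIVE

    def _settle(self) -> None:
        if self.state is VolumeState.DELETING and self.files_still_needed == 0:
            self.state = VolumeState.REMOVED
```

The volume object is never taken out of its list while it is in the deleting state, which mirrors the statement that a restored volume keeps its position in the linked list.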

【００４５】The audit trail file status chart 904 lists the names of the audit trail files, their file states, and their dump states. The file state indicates whether a file is available (i.e. eligible for rename), active (not eligible for rename), or preallocated. The dump state is shown only for active files. Possible dump states include "dumped", "not dumped", "current", and "not dumping". The "not dumping" state indicates that the system was configured not to dump audit when the file was written. The status area 906 includes an indication of whether the current audit trail state is "normal" or "overflow in use". Here the system operator can also see the name of the oldest file that is not eligible for rename, the "first pinned file", together with a brief explanation of why that file is not eligible for rename. The display shown indicates that the oldest ineligible file is ineligible because it is the current audit trail file. While the above is a complete description of the preferred embodiment of the present invention, various alternatives, modifications and equivalents may be used. It should be evident that the invention is equally applicable with appropriate modifications to the embodiment described above. For example, the audit trail techniques described above may be applied to the storage of any continuously generated records that are appended sequentially. Therefore, the above description should not be taken as limiting the scope of the invention, which is defined by the metes and bounds of the appended claims.

【００４６】[0046]

【発明の効果】[Effects of the Invention] The fault tolerant computer system of the present invention comprises: an audit generator that generates audit records; a plurality of audit trail disk storage devices that store audit records in audit files; a plurality of audit trail storage processing means that receive audit records from the audit generator and direct the audit records to the audit trail disk storage devices; and an audit trail configuration process, coupled to the plurality of audit trail storage processing means, that, when a previously allocated audit file becomes full, selects a new audit file to be the currently allocated audit file and a new audit trail storage processing means to be the currently responsible audit trail storage processing means. Each audit record describes a change to a database accessed by the audit generator. Each audit trail storage processing means has access to at least one audit trail disk storage device. Audit records are directed from the audit generator through the currently responsible audit trail storage processing means to the currently allocated audit file. Each new audit file is located on a different audit trail disk storage device from the audit file immediately preceding it, and each new audit trail storage processing means receives a rollover message from the audit trail storage processing means immediately preceding it to initiate the handoff of responsibility. Accordingly, the fault tolerant computer system of the present invention distributes an audit trail containing audit records across an arbitrary number of disk volumes. After one audit trail file is filled, audit records are directed to the next audit trail file, which is stored on a different disk volume. Storage of newly generated audit trail records rotates among the available disk volumes. The contents of filled audit trail files are eventually archived, and their space becomes available again for storing newly generated audit records. The amount of audit available for online recovery after a failure is not limited by the storage capacity of any single disk volume. Furthermore, there is no contention for disk access between the archiving of filled audit trail files and the storage of newly generated audit records. The present invention also allows disk volumes to be designated as overflow audit trail storage. Overflow space is used in extreme circumstances, such as when the operator is not available to mount a tape for the audit dump, or when a sudden burst of audit generation fills the primary audit trail before the oldest file becomes eligible for rename. Audit trail records are moved to the overflow volumes, freeing space for newly generated audit. The system operator can also specify local disk volumes to be used to hold audit trail files restored from audit dumps as part of a recovery procedure. Furthermore, in the present invention, various audit trail configuration parameters, such as the number of active audit trail disk volumes and the number of files per volume, can be adjusted online. New audit generators can be added to an existing audit trail. A graphical user interface provides the operator with a display showing the state of the audit trail.

【００４７】[0047]

The present invention also provides a method, in a fault tolerant computer system comprising an audit generator, first and second primary audit trail storage processes, and first and second backup audit trail storage processes serving as backups for the first and second primary audit trail storage processes, in which the audit trail storage processes store audit records generated by the audit generator in audit files accessible to the audit trail storage processes, of switching the storage of currently generated audit records from the first primary audit trail storage process to the second primary audit trail storage process. The method comprises the steps of: receiving, at the first primary audit trail storage process, a buffer of audit records from the audit generator for storage in a first audit file accessible to the first primary audit trail storage process and the first backup audit trail storage process; upon receiving the buffer of audit records, determining at the first primary audit trail storage process that the first audit file is full and cannot accept the buffer of records; sending a rollover request message from the first primary audit trail storage process to the second primary audit trail storage process, the rollover request message including a unique sequence number identifying a second audit file accessible to the second primary audit trail storage process and the second backup audit trail storage process; sending a rollover announcement message from the first primary audit trail storage process to the audit generator, the rollover announcement message including an indication that the first audit file is full and information identifying the second primary audit trail storage process as the correct destination for currently generated audit records; upon receiving the rollover request message at the second primary audit trail storage process, sending a first checkpoint message from the second primary audit trail storage process to the second backup audit trail storage process, the first checkpoint message including locator information identifying the position in the second audit file at which storage of audit records is to begin; in response to the first checkpoint message, sending a first checkpoint acknowledgment message from the second backup audit trail storage process to the second primary audit trail storage process; in response to the rollover request message, sending a rollover acknowledgment message from the second primary audit trail storage process to the first primary audit trail storage process; sending a second checkpoint message from the first primary audit trail storage process to the first backup audit trail storage process, the second checkpoint message including an indication that the first audit file is full; and, in response to the second checkpoint message, sending a second checkpoint acknowledgment message from the first backup audit trail storage process to the first primary audit trail storage process.

Accordingly, the present invention provides a fault tolerant computer system that distributes an audit trail containing audit records across any number of disk volumes. After one audit trail file fills, audit records are directed to the next audit trail file, which is stored on a different disk volume. Storage of newly generated audit records rotates among the available disk volumes. The contents of filled audit trail files are eventually archived, and their space becomes available again for newly generated audit records. The amount of audit available for online recovery after a failure is not limited by the storage capacity of any single disk volume. Furthermore, there is no contention for disk access between the archiving of filled audit trail files and the storage of newly generated audit records. The present invention also allows disk volumes to be designated as overflow audit trail storage. Overflow space is used in extreme situations, such as when no operator is available to mount a tape for an audit dump, or when a sudden burst of audit generation fills the primary audit trail before the oldest file is eligible to be renamed. Audit trail records are transferred to the overflow volume, thereby freeing space for newly generated audit. The system operator can also specify a local disk volume used to hold audit trail files restored from an audit dump as part of a recovery procedure. Further, various audit trail configuration parameters, such as the number of active audit trail disk volumes and the number of files per volume, can be adjusted online. New audit generators can be added to an existing audit trail. A graphical user interface provides the operator with a display describing the progress of the audit trail.
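The rollover handoff between the first and second primary audit trail storage processes can be summarized as an ordered list of messages. The following Python sketch simply enumerates them in the order the description lists them; it does not model timing, failures, or the messaging system itself. All identifiers (`Message`, `rollover_handoff`, the process names, and the payload keys) are hypothetical and chosen only for illustration.

```python
from typing import NamedTuple


class Message(NamedTuple):
    sender: str
    receiver: str
    kind: str
    payload: dict


def rollover_handoff(next_sequence: int, start_position: int) -> list[Message]:
    """List the handoff messages in the order the description gives them."""
    p1, b1 = "primary_1", "backup_1"   # first primary/backup storage processes
    p2, b2 = "primary_2", "backup_2"   # second primary/backup storage processes
    gen = "audit_generator"
    return [
        # The request carries the sequence number identifying the second audit file.
        Message(p1, p2, "rollover_request", {"sequence": next_sequence}),
        # The generator learns that file 1 is full and where to send records next.
        Message(p1, gen, "rollover_announcement",
                {"file_full": True, "new_destination": p2}),
        # The second primary checkpoints the starting position to its backup.
        Message(p2, b2, "checkpoint_1", {"start_position": start_position}),
        Message(b2, p2, "checkpoint_1_ack", {}),
        Message(p2, p1, "rollover_ack", {}),
        # The first primary tells its backup that the first file is full.
        Message(p1, b1, "checkpoint_2", {"file_full": True}),
        Message(b1, p1, "checkpoint_2_ack", {}),
    ]


messages = rollover_handoff(next_sequence=7, start_position=0)
for m in messages:
    print(f"{m.sender:>15} -> {m.receiver:<15} {m.kind}")
```

Each primary checkpoints its new state to its own backup, so that either backup could take over with the information it needs if its primary fails during the handoff.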

【００４８】[0048]

The present invention further provides a method, in a fault tolerant computer system comprising an audit generator, a protocol management process, and a plurality of audit trail storage processes, in which the audit trail storage processes store audit records generated by the audit generator in audit files accessible to the audit trail storage processes, of rotating responsibility for storing currently generated audit records among the audit trail storage processes. The method comprises the steps of: a) using the protocol management process, assigning a selected audit trail storage process to be the currently assigned audit trail storage process and a selected audit file, accessible to the selected audit trail storage process, to be the currently assigned audit file; b) transmitting a buffer of audit records from the audit generator to the currently assigned audit trail storage process for storage in the currently assigned audit file; c) writing the buffer of audit records received from the audit generator from the currently assigned audit trail storage process to the currently assigned audit file; d) monitoring, at the currently assigned audit trail storage process, the growth of the currently assigned audit file as successive buffers are written; e) comparing, at the currently assigned audit trail storage process, the size of the currently assigned audit trail with a first threshold; and f) upon a determination in step e) that the size exceeds the first predetermined threshold, sending a first threshold warning message from the currently assigned audit trail storage process to the protocol management process.

Accordingly, the present invention provides a fault tolerant computer system that distributes an audit trail containing audit records across any number of disk volumes. After one audit trail file fills, audit records are directed to the next audit trail file, which is stored on a different disk volume. Storage of newly generated audit records rotates among the available disk volumes. The contents of filled audit trail files are eventually archived, and their space becomes available again for newly generated audit records. The amount of audit available for online recovery after a failure is not limited by the storage capacity of any single disk volume. Furthermore, there is no contention for disk access between the archiving of filled audit trail files and the storage of newly generated audit records. The present invention also allows disk volumes to be designated as overflow audit trail storage. Overflow space is used in extreme situations, such as when no operator is available to mount a tape for an audit dump, or when a sudden burst of audit generation fills the primary audit trail before the oldest file is eligible to be renamed. Audit trail records are transferred to the overflow volume, thereby freeing space for newly generated audit. The system operator can also specify a local disk volume used to hold audit trail files restored from an audit dump as part of a recovery procedure. Further, various audit trail configuration parameters, such as the number of active audit trail disk volumes and the number of files per volume, can be adjusted online. New audit generators can be added to an existing audit trail. A graphical user interface provides the operator with a display describing the progress of the audit trail.
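Steps c) through f) of the method above (write a buffer, track the growth of the file, compare its size with a threshold, and warn the protocol management process) can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the class `AuditFileWriter`, the byte-based threshold, the callback standing in for the messaging system, and the choice to send the warning only once are all hypothetical and are not specified by the text.

```python
from typing import Callable


class AuditFileWriter:
    """Hypothetical storage process that tracks the size of its audit file."""

    def __init__(self, threshold_bytes: int, notify: Callable[[dict], None]):
        self.threshold_bytes = threshold_bytes  # plays the role of the first threshold
        self.notify = notify                    # stands in for the messaging system
        self.size = 0                           # bytes accounted for so far
        self.warned = False                     # assumption: warn only once per file

    def write_buffer(self, buffer: bytes) -> None:
        # Steps c) and d): account for the buffer (a real process would write it
        # to disk) and track the growth of the file.
        self.size += len(buffer)
        # Steps e) and f): compare with the threshold and warn when it is exceeded.
        if not self.warned and self.size > self.threshold_bytes:
            self.warned = True
            self.notify({"kind": "threshold_warning", "size": self.size})


warnings: list[dict] = []
writer = AuditFileWriter(threshold_bytes=80, notify=warnings.append)
for chunk in (b"x" * 50, b"x" * 40, b"x" * 5):
    writer.write_buffer(chunk)
print(warnings)
```

Warning before the file is completely full gives the protocol management process time to select the next audit file and storage process ahead of the actual rollover.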

【００４９】[0049]

The present invention also provides a fault tolerant method of processing a rollover message received at a first audit trail storage process, in a fault tolerant computer system comprising an audit generator and a plurality of audit trail storage processes, in which: the audit trail storage processes store audit records generated by the audit generator in audit files accessible to the audit trail storage processes; as successive audit files become full, current responsibility for storing the audit records generated by the audit generator is transferred by sending a rollover message from the previously responsible audit trail storage process to the newly responsible audit trail storage process; successively used audit files are assigned unique sequence numbers in order; and each audit trail storage process stores a sequence number identifying the last known audit file used by one of the audit trail storage processes to store audit records. In this method the first audit trail storage process operates as if it were already the responsible audit trail storage process. The method comprises the steps of: a) receiving, at the first audit trail storage process, a rollover message from a second audit trail storage process, the rollover message including the audit file sequence number of the next audit file to receive audit records; b) extracting, at the first audit trail storage process, the audit file sequence number from the rollover message; c) comparing the received audit file sequence number with the last known audit file sequence number stored at the first audit trail storage process; d) upon a determination in step c) that the received audit file sequence number is greater than the stored last known audit file sequence number, proceeding to step f); e) otherwise, proceeding to step h); f) closing the audit file identified by the audit file sequence number stored in the first audit trail storage process; g) opening a new audit file, identified by the sequence number contained in the rollover message, to receive audit records from the first audit trail storage process; and h) ending the processing of the rollover message.

Accordingly, the present invention provides a fault tolerant computer system that distributes an audit trail containing audit records across any number of disk volumes. After one audit trail file fills, audit records are directed to the next audit trail file, which is stored on a different disk volume. Storage of newly generated audit records rotates among the available disk volumes. The contents of filled audit trail files are eventually archived, and their space becomes available again for newly generated audit records. The amount of audit available for online recovery after a failure is not limited by the storage capacity of any single disk volume. Furthermore, there is no contention for disk access between the archiving of filled audit trail files and the storage of newly generated audit records. The present invention also allows disk volumes to be designated as overflow audit trail storage. Overflow space is used in extreme situations, such as when no operator is available to mount a tape for an audit dump, or when a sudden burst of audit generation fills the primary audit trail before the oldest file is eligible to be renamed. Audit trail records are transferred to the overflow volume, thereby freeing space for newly generated audit. The system operator can also specify a local disk volume used to hold audit trail files restored from an audit dump as part of a recovery procedure. Further, various audit trail configuration parameters, such as the number of active audit trail disk volumes and the number of files per volume, can be adjusted online. New audit generators can be added to an existing audit trail. A graphical user interface provides the operator with a display describing the progress of the audit trail.
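The sequence number comparison in steps a) through h) is what makes handling of the rollover message tolerant of duplicates: a message whose sequence number is not greater than the stored one is ignored. The Python sketch below illustrates that logic only. The class and attribute names are hypothetical, file operations are replaced by log entries, and updating the stored sequence number after the new file is opened is an assumption that the text implies but does not list as a step.

```python
class AuditTrailStorageProcess:
    """Hypothetical storage process that handles rollover messages idempotently."""

    def __init__(self, last_known_sequence: int):
        self.last_known_sequence = last_known_sequence
        self.log: list[tuple[str, int]] = []    # stands in for real file operations

    def handle_rollover(self, message: dict) -> bool:
        received = message["sequence"]           # step b): extract the sequence number
        if received > self.last_known_sequence:  # steps c) and d): compare
            self.log.append(("close", self.last_known_sequence))  # step f)
            self.log.append(("open", received))                   # step g)
            self.last_known_sequence = received  # assumption: remember the new file
            return True
        return False                             # step e): stale or repeated message


process = AuditTrailStorageProcess(last_known_sequence=4)
first = process.handle_rollover({"sequence": 5})
repeat = process.handle_rollover({"sequence": 5})   # e.g. resent after a takeover
print(first, repeat, process.log)
```

Because the second delivery of the same message changes nothing, a sender that is unsure whether its rollover message arrived can safely send it again.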

[Brief description of drawings]

FIG. 1 is a diagram showing a fault tolerant computer system according to the present invention.

FIG. 2 is a process diagram showing the various processes running on a fault tolerant computer system according to the present invention.

FIG. 3 is a flowchart describing the steps of establishing a multi-volume audit trail according to the present invention.

FIG. 4 is a diagram showing an audit trail configuration data structure according to the present invention.

FIG. 5 is another diagram showing an audit trail configuration data structure according to the present invention.

FIG. 6 is another diagram showing an audit trail configuration data structure according to the present invention.

FIG. 7 is a flowchart describing the steps of selecting the next current audit trail file according to the present invention.

FIG. 8 is another flowchart describing the steps of selecting the next current audit trail file according to the present invention.

FIG. 9 is a diagram showing messages exchanged between fault tolerant computer system components during operation of a multi-volume audit trail according to the present invention.

FIG. 10 is another diagram showing messages exchanged between fault tolerant computer system components during operation of a multi-volume audit trail according to the present invention.

FIG. 11 is another diagram showing messages exchanged between fault tolerant computer system components during operation of a multi-volume audit trail according to the present invention.

FIG. 12 is a diagram showing a fault tolerant rollover message protocol according to the present invention.

FIG. 13 is a diagram showing the steps of using overflow audit trail storage according to the present invention.

FIG. 14 is a diagram showing an audit trail status display according to the present invention.

[Explanation of symbols]

100: fault tolerant computer system
102, 104, 106, 108: multi-processing units
110, 112, 114, 116: device controllers
118, 120, 122: disk storage devices (disk volumes)
124: tape storage device
126: system terminal
128: system bus

Continuation of front page

(72) Inventor: Robert Van der Linden, 128 South Nevera, Scotts Valley, California 95117, USA
(72) Inventor: William J. Curley, 457 Maplewood Avenue, San Jose, California 95117, USA
(72) Inventor: James A. Ryan, 1766 Merlin Way, San Jose, California 95125, USA
(72) Inventor: Matthew C. McLean, 4421 Northeast One Hundred and Forty-Seventh Place, Unit B2, Bellevue, Washington 98007, USA

Claims

[Claims]

1. A fault tolerant computer system, comprising:
an audit generator that generates audit records;
a plurality of audit trail disk storage devices that store the audit records in audit files;
a plurality of audit trail storage processing means that receive the audit records from the audit generator and direct the audit records to the audit trail disk storage devices; and
an audit trail configuration process, coupled to the plurality of audit trail storage processing means, that selects, when a previously assigned audit file becomes full, a new audit file to be the currently assigned audit file and a new audit trail storage processing means to be the currently responsible audit trail storage processing means;
wherein each audit record describes a change to a database accessed by the audit generator;
each audit trail storage processing means has access to at least one of the audit trail disk storage devices;
the audit records are directed from the audit generator, through the currently responsible audit trail storage processing means, to the currently assigned audit file;
each new audit file is located on a different audit trail disk storage device than the immediately preceding audit file; and
each new audit trail storage processing means receives a rollover message from the immediately preceding audit trail storage processing means to initiate a handoff of responsibility.

2. In a fault tolerant computer system comprising an audit generator, first and second primary audit trail storage processes, and first and second backup audit trail storage processes serving as backups for the first and second primary audit trail storage processes, wherein the audit trail storage processes store audit records generated by the audit generator in audit files accessible to the audit trail storage processes, a method of switching the storage of currently generated audit records from the first primary audit trail storage process to the second primary audit trail storage process, the method comprising the steps of:
receiving, at the first primary audit trail storage process, a buffer of audit records from the audit generator for storage in a first audit file accessible to the first primary audit trail storage process and the first backup audit trail storage process;
upon receiving the buffer of audit records, determining at the first primary audit trail storage process that the first audit file is full and cannot accept the buffer of records;
sending a rollover request message from the first primary audit trail storage process to the second primary audit trail storage process, the rollover request message including a unique sequence number identifying a second audit file accessible to the second primary audit trail storage process and the second backup audit trail storage process;
sending a rollover announcement message from the first primary audit trail storage process to the audit generator, the rollover announcement message including an indication that the first audit file is full and information identifying the second primary audit trail storage process as the correct destination for currently generated audit records;
upon receiving the rollover request message at the second primary audit trail storage process, sending a first checkpoint message from the second primary audit trail storage process to the second backup audit trail storage process, the first checkpoint message including locator information identifying the position in the second audit file at which storage of audit records is to begin;
in response to the first checkpoint message, sending a first checkpoint acknowledgment message from the second backup audit trail storage process to the second primary audit trail storage process;
in response to the rollover request message, sending a rollover acknowledgment message from the second primary audit trail storage process to the first primary audit trail storage process;
sending a second checkpoint message from the first primary audit trail storage process to the first backup audit trail storage process, the second checkpoint message including an indication that the first audit file is full; and
in response to the second checkpoint message, sending a second checkpoint acknowledgment message from the first backup audit trail storage process to the first primary audit trail storage process.

3. In a fault tolerant computer system comprising an audit generator, a protocol management process, and a plurality of audit trail storage processes, wherein the audit trail storage processes store audit records generated by the audit generator in audit files accessible to the audit trail storage processes, a method of rotating responsibility for storing currently generated audit records among the audit trail storage processes, the method comprising the steps of:
a) using the protocol management process, assigning a selected audit trail storage process to be the currently assigned audit trail storage process and a selected audit file, accessible to the selected audit trail storage process, to be the currently assigned audit file;
b) transmitting a buffer of audit records from the audit generator to the currently assigned audit trail storage process for storage in the currently assigned audit file;
c) writing the buffer of audit records received from the audit generator from the currently assigned audit trail storage process to the currently assigned audit file;
d) monitoring, at the currently assigned audit trail storage process, the growth of the currently assigned audit file as successive buffers are written;
e) comparing, at the currently assigned audit trail storage process, the size of the currently assigned audit trail with a first threshold; and
f) upon a determination in step e) that the size exceeds the first predetermined threshold, sending a first threshold warning message from the currently assigned audit trail storage process to the protocol management process.