Transaction Backup

From CometWiki


Latest revision as of 22:31, 2 February 2010

Transaction Backup/Checkpoint/Rollback

Transaction Backup is both simple and complex.

First, some definitions.

  • Live Data -- all of the directories and files that the file server could modify. This comprises every directory named by any user in a "DirAdd" call.
  • Backup repository -- a place, usually a single folder, containing a faithful copy of the live data at some given point in time.
  • Checkpoint -- the point in time at which all data in the backup repository matches the live data.
  • Transaction -- any file activity that changes a file.
    • Create a File
    • Create a Key
    • Erase a File
    • Clear a File
    • Rename a File
    • Delete a record
    • Write a record (includes all variants such as write, rewrite, update, and insert)
    • Windows -- Create Directory
    • Windows -- Create File
    • Windows -- Move File
  • Backup -- a process that makes an exact copy of the live data into the backup repository, creating a checkpoint.
  • Snapshot -- a process whereby transactions are applied to the backup repository, creating a checkpoint.
  • Rollback -- a process that copies all of the files that make up a Checkpoint back to Live Data and optionally applies transactions up to a chosen point, re-establishing the Live Data as of some past point in time.

Recording Transactions

The server will record every packet it receives that modifies data. The file server receives only a finite set of such packet types. These "transactions" will be written to a file. Each transaction will have a serial number that increments each time a transaction is written. The transaction file will be "cycled" periodically: a new transaction file will be created whose name contains the number of the first transaction in it. Cycling will be triggered by an action such as a Backup or Snapshot, by a time interval, or by a transaction count. This can produce many transaction files, but will allow them to be transmitted from server to server relatively easily.
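
The cycling rule can be sketched in Python as follows. This is only an illustration: the class name TransactionLog, the "txn-<number>.log" naming pattern, and the max_per_file threshold are assumptions, since the page says only that the file name contains the first transaction number in the file.

```python
from pathlib import Path


class TransactionLog:
    """Hypothetical writer that cycles transaction files by count."""

    def __init__(self, directory, max_per_file=1000, first_number=1):
        self.directory = Path(directory)
        self.max_per_file = max_per_file   # assumed cycling threshold
        self.next_number = first_number    # serial number of the next transaction
        self.count_in_file = 0
        self.current = None
        self.cycle()

    def cycle(self):
        """Start a new file named after the first transaction it will hold."""
        # 20 digits is wide enough for any unsigned 8-byte serial number,
        # so sorting the file names also sorts them by transaction number.
        self.current = self.directory / f"txn-{self.next_number:020d}.log"
        self.current.touch()
        self.count_in_file = 0
        return self.current

    def append(self, record: bytes) -> int:
        """Write one transaction record and return its serial number."""
        if self.count_in_file >= self.max_per_file:
            self.cycle()
        number = self.next_number
        with open(self.current, "ab") as f:
            f.write(record)
        self.next_number += 1
        self.count_in_file += 1
        return number
```

A Backup or Snapshot would call cycle() directly; a time-based trigger could do the same from a timer.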

  • Transactions are atomic: every transaction will contain enough information to change the file in question correctly using only the information in that transaction.
  • A Transaction File will be a binary file with variable-length records.
  • Each record in a Transaction File will contain:
    1. Length of entire record
    2. Length of this header
    3. Packed Data Length
    4. Unpacked Data Length
    5. This transaction number - up to 8 bytes
    6. Time this transaction was recorded
    7. Id of task - up to 4 bytes
    8. Function performed
    9. Client node -- up to 16 bytes
    10. Running program - up to 8 bytes
    11. Path to file - up to 260 bytes (If Keyed File, Points to the data file)
    12. Packed Data -- Depends on operation -- up to 8194 bytes -- could contain the Key
  • What about record location information? Is there sufficient information in the packed data to position the file correctly before the transaction is executed?
  • What about DOS calls?
  • What about encryption? Is this needed? If so, at what point in the process?
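
The record layout listed above could be encoded along these lines. This is a sketch under stated assumptions, not the server's format: the page gives maximum sizes for some fields only, so the byte order (little-endian), the 2-byte length fields, the 8-byte millisecond timestamp, the 1-byte function code, and the NUL-padded fixed-width text fields are all choices made for illustration. The packed data is treated as opaque bytes; no packing algorithm is assumed.

```python
import struct
from typing import NamedTuple

# Assumed header: four 2-byte lengths, 8-byte number, 8-byte time,
# 4-byte task id, 1-byte function, then 16/8/260-byte text fields.
HEADER = struct.Struct("<HHHHQQIB16s8s260s")


class Transaction(NamedTuple):
    number: int        # transaction serial number
    time_ms: int       # assumed: milliseconds since the Unix epoch
    task_id: int
    function: int      # assumed: numeric operation code
    node: bytes        # client node, up to 16 bytes
    program: bytes     # running program, up to 8 bytes
    path: bytes        # path to file, up to 260 bytes
    unpacked_len: int  # size of the data before packing
    packed: bytes      # packed data, up to 8194 bytes


def encode(t: Transaction) -> bytes:
    """Build one record: header followed by the packed data."""
    limits = ((t.node, 16), (t.program, 8), (t.path, 260), (t.packed, 8194))
    if any(len(value) > limit for value, limit in limits):
        raise ValueError("field exceeds its maximum size")
    total = HEADER.size + len(t.packed)
    header = HEADER.pack(total, HEADER.size, len(t.packed), t.unpacked_len,
                         t.number, t.time_ms, t.task_id, t.function,
                         t.node, t.program, t.path)
    return header + t.packed


def decode(buf: bytes, offset: int = 0):
    """Read one record at offset; return it and the next record's offset."""
    (total, header_len, packed_len, unpacked_len, number, time_ms,
     task_id, function, node, program, path) = HEADER.unpack_from(buf, offset)
    start = offset + header_len
    packed = bytes(buf[start:start + packed_len])
    t = Transaction(number, time_ms, task_id, function,
                    node.rstrip(b"\0"), program.rstrip(b"\0"),
                    path.rstrip(b"\0"), unpacked_len, packed)
    return t, offset + total
```

Storing both the total record length and the header length lets a reader skip a record without understanding it, and leaves room for the header to grow later.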

Checkpoint

Periodically, the server will perform a "checkpoint". That is, it will replicate all Live Data to the Backup Repository. This operation could take some time, so provision must be made for users to keep working while it runs.

A Checkpoint could be created in two ways:

Backup -- Make a full copy of all of the data.

The Checkpoint will be at the end of this process, not at the beginning. How important is this?

Perform the following steps:

  1. Remember the next transaction number (let's call it 1111).
  2. Start copying files from the Live Data directories to the Backup Repository.
  3. When copying is finished, remember the last transaction number (let's call it 2222).
  4. "Play" the transactions to the files in the Backup Repository from the first one to the last one (1111 to 2222).
  5. Make a Checkpoint file by compressing the Backup Repository. Name the compressed file using the last transaction number it contains (2222).
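
The five steps above can be sketched as a single Python function. The function name, the "checkpoint-<number>" archive name, the zip format, and the log and play hooks are hypothetical stand-ins for whatever the server actually provides.

```python
import shutil
from pathlib import Path


def backup_checkpoint(live_dir, repo_dir, archive_dir, log, play):
    """Create a Checkpoint by full copy, then replay what changed meanwhile.

    `log` and `play` are hypothetical hooks: log.next_number() and
    log.last_number() report transaction serial numbers, and
    play(repo_dir, first, last) applies transactions first..last.
    """
    first = log.next_number()                                # step 1
    shutil.copytree(live_dir, repo_dir, dirs_exist_ok=True)  # step 2
    last = log.last_number()                                 # step 3
    if last >= first:                                        # step 4
        play(repo_dir, first, last)
    base = Path(archive_dir) / f"checkpoint-{last}"          # step 5
    return shutil.make_archive(str(base), "zip", root_dir=repo_dir)
```

A file copied late in step 2 may already reflect some of the transactions in the range, so this sketch assumes that replaying them a second time is harmless.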

Snapshot -- Use the Transaction Logs to create the Checkpoint.

This method assumes that the Backup Repository already exists, as a result of either a previous Backup or Snapshot. This method will perform better than a Backup, and guarantees that the checkpoint will be up to date at the instant the checkpoint is initiated.

Perform the following steps:

  1. Remember the last transaction number (let's call it 2222).
  2. Look up the transaction number for the previous checkpoint (let's call it 1111).
  3. Start a new transaction file (2223).
  4. Remember this transaction number for the next checkpoint operation (2223).
  5. Play back all transactions from 1111 to 2222 to the Backup Repository.
    • If the destination file for a transaction does not exist, simply ignore the transaction. There may be some danger in this.
  6. Make a Checkpoint file by compressing the Backup Repository. Name the compressed file using the last transaction number it contains (2222).
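
The playback in step 5, including the rule for missing destination files, might look like the sketch below. The tuple shape (number, creates_file, rel_path, payload) and the apply hook are assumptions standing in for the real record format. The creates_file flag is also an assumption: it exempts operations such as Create a File, whose destination cannot exist yet. Skipped transaction numbers are returned instead of being dropped silently, so the danger noted above is at least visible.

```python
from pathlib import Path


def play_range(transactions, repo_dir, first, last, apply):
    """Apply transactions numbered first..last to the Backup Repository.

    `transactions` yields (number, creates_file, rel_path, payload) tuples;
    `apply(target, payload)` is a hypothetical hook that performs one
    operation. Returns the numbers of the transactions that were skipped
    because their destination file did not exist.
    """
    repo = Path(repo_dir)
    skipped = []
    for number, creates_file, rel_path, payload in transactions:
        if number < first or number > last:
            continue                      # outside the requested range
        target = repo / rel_path
        if not creates_file and not target.exists():
            skipped.append(number)        # ignored, but remembered
            continue
        apply(target, payload)
    return skipped
```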

Compressed Checkpoint files could be retained in an archive.


Rollback

At any time, the user can roll back the whole system by restoring the Live Data from a Checkpoint and then, optionally, playing transactions up to a desired point in time. This could be done on a completely different system from the main server, which would facilitate migrating to a new server, building a test system, and similar tasks.
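
Choosing which Checkpoint to restore and which transactions to replay can be sketched as below. The function name is hypothetical, and the sketch assumes the desired point in time has already been translated into a transaction number (for example by using the time recorded in each transaction). It relies on the naming rule that a Checkpoint file carries the last transaction number it contains.

```python
def plan_rollback(checkpoint_numbers, target):
    """Pick the Checkpoint to restore and the transaction range to replay.

    Each checkpoint is identified by the last transaction number it
    contains. Returns (checkpoint, first, last); when first > last the
    Checkpoint alone already matches the target and nothing is replayed.
    """
    usable = [n for n in checkpoint_numbers if n <= target]
    if not usable:
        raise ValueError("no checkpoint at or before the requested transaction")
    checkpoint = max(usable)   # newest checkpoint that does not overshoot
    return checkpoint, checkpoint + 1, target
```

With Checkpoints 1110, 2222, and 3333 on hand and a target of 2500, this selects Checkpoint 2222 and replays transactions 2223 through 2500.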

Added Benefits

There are several added benefits from this feature:

  • The Disaster Recovery Service could work with transaction files, potentially speeding up the whole process. Transaction files could be transmitted several times a day, making the whole service more valuable.
  • Developers could analyze transaction files to determine exactly how a given record was modified.