
Multiple files per database in Objectivity




Dave Malon heard at the RD45 meeting about the scheme that
Objectivity plans to implement for supporting multiple files per
database.  At SC98, I stopped at the Objectivity booth and asked
Leon Guzenga and another person (I forgot his name) to explain what they
plan to do and how it will affect the GC work.  Here is what I got.

It will be possible to split each database into multiple files.  A file
is made of an integral number of containers.  A max_file_size will be
specified for each database, as well as a max_database_size.  Thus,
dividing the database size by the file size determines how many files
can be in a database.  Dividing the file size by the container size
determines the number of containers per file.  One can choose to have
exactly one container per file, but multiple containers will be
supported as well.

For example, a container size of 100 MB, a file size of 1 GB, and a
database size of 50 GB imply 10 containers/file and a maximum of 50
files/database.
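
To make the arithmetic concrete, here is a small Python sketch of my own
(not Objectivity code).  The function name and the use of MB with
1 GB = 1000 MB are assumptions chosen to match the numbers above.

```python
def layout(container_mb, file_mb, database_mb):
    """Return (containers_per_file, max_files_per_database).

    All sizes are in MB.  Floor division is used because a file holds
    an integral number of containers (names here are illustrative).
    """
    containers_per_file = file_mb // container_mb
    max_files = database_mb // file_mb
    return containers_per_file, max_files


# Container 100 MB, file 1 GB, database 50 GB (taking 1 GB = 1000 MB).
print(layout(100, 1000, 50_000))  # (10, 50)
```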

Neither the containers nor the files have to be full.  For consecutive
writes the system will fill containers one after another.  Suppose that
for the above setup a database is written in 2 MB chunks.  Then,
after the first container is filled with 50 chunks, the system will
automatically fill the second container, etc.  This will continue until
all 10 containers are full.  Container 11 is then the beginning of the
second file, etc.
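
The sequential fill just described can be sketched as follows.  This is
only my reading of the scheme: the function name is made up, and I am
assuming containers and files are numbered from 1, as in the example.

```python
def locate_chunk(chunk_no, chunk_mb=2, container_mb=100, containers_per_file=10):
    """Return (container_no, file_no) for the chunk_no-th consecutive write.

    All numbers are 1-based.  Assumes every chunk has the same size and
    that chunk_mb divides container_mb evenly.
    """
    chunks_per_container = container_mb // chunk_mb          # 50 in the example
    container_no = (chunk_no - 1) // chunks_per_container + 1
    file_no = (container_no - 1) // containers_per_file + 1
    return container_no, file_no


print(locate_chunk(50))   # (1, 1)   last chunk of the first container
print(locate_chunk(51))   # (2, 1)   first chunk of the second container
print(locate_chunk(501))  # (11, 2)  container 11 starts the second file
```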

However, if one wishes to terminate a file before all of its containers
are full, that can be done by asking to write to the first container of
the next file.  For example, suppose one wishes to close the first file
after 60 writes of 2 MB each.  50 writes will fill the first container,
and 10 will go into the second.  One can then choose to write the next
chunk (the 61st chunk) into container 11, which is the first container
of the second file of the database.  Space will not be wasted, though.
Only the written pages make up the file.  So, effectively one can create
a file of any size up to the declared maximum.
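
Here is a toy Python model of that bookkeeping on the writer's side.
The class and method names are hypothetical; nothing here is an
Objectivity call.  It only tracks which container is being filled and
how many MB each file really holds, and it does not check the maximum
database size.

```python
class DatabaseWriter:
    """Toy model of writing chunks into containers grouped into files."""

    def __init__(self, container_mb=100, containers_per_file=10):
        self.container_mb = container_mb
        self.containers_per_file = containers_per_file
        self.container_no = 1   # container currently being filled (1-based)
        self.used_mb = 0        # MB already written into that container
        self.file_mb = {}       # file_no -> MB actually written

    def file_no(self):
        """File (1-based) that the current container belongs to."""
        return (self.container_no - 1) // self.containers_per_file + 1

    def write(self, chunk_mb):
        """Write one chunk; move to the next container if it does not fit."""
        if self.used_mb + chunk_mb > self.container_mb:
            self.container_no += 1
            self.used_mb = 0
        self.used_mb += chunk_mb
        f = self.file_no()
        self.file_mb[f] = self.file_mb.get(f, 0) + chunk_mb
        return self.container_no

    def start_next_file(self):
        """Terminate the current file early: the next write goes to the
        first container of the next file."""
        self.container_no = self.file_no() * self.containers_per_file + 1
        self.used_mb = 0


w = DatabaseWriter()
for _ in range(60):          # 50 chunks fill container 1, 10 go into container 2
    w.write(2)
w.start_next_file()
print(w.write(2))            # 11: the 61st chunk lands in container 11
print(w.file_mb)             # {1: 120, 2: 2}: only written data counts
```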

All files are closed at the same time when the database is closed.

The directory will not contain the names of individual files, only the
name of the database.  However, a function will be provided to get the
container size and the file size for each database.  Thus, given an
object_id, one can use the container_id and the sizes to calculate the
file_number.  Using the database_id, one can get the database_name from
the directory.  Concatenating the database_name and the file_number
gives the unique file name.
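
A sketch of that calculation in Python, under my own assumptions:
container ids and file numbers start at 1, the sizes are whatever the
promised function returns, and the name is a plain concatenation (the
optional separator is a guess, since I did not ask about the exact
format).  The database name "events" is made up.

```python
def file_number(container_id, container_size, file_size):
    """File (1-based) holding the given 1-based container_id."""
    containers_per_file = file_size // container_size
    return (container_id - 1) // containers_per_file + 1


def file_name(database_name, container_id, container_size, file_size, sep=""):
    """Concatenate the database name and the file number.

    sep is hypothetical; the actual naming format was not specified.
    """
    n = file_number(container_id, container_size, file_size)
    return f"{database_name}{sep}{n}"


# Container 11 with 100 MB containers and 1000 MB files is in file 2.
print(file_name("events", 11, 100, 1000))  # events2
```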

A consequence of this scheme is that all the files in the same database
must reside in the same directory.  

The reason for this scheme is to minimize the changes needed to the
Objectivity modules, while maintaining the current functionality.  A
choice of the file size to be the same as the database size is a default
that will support all current applications without change.

So, I think we can work with this setup with only minor adjustments to
the GC components.

Arie.

p.s.  I asked if it will be possible to have open_file, close_file
functions, so that when writing one does not have to keep track of which
containers to write to.  A "close" could be fake, only having the effect
of stopping the writes into a file.  An "open" will have the effect of
starting the following write to the first container of the next file. 
The answer was "no", but they said they'll put this request on the list
of requested features.  If this is important to people who generate
files, they should make it clear to Objectivity as early as possible.