
Storage manager to do list



Here is what I think was agreed to do to enhance the storage manager 
for MDC1.

1.  Cache status initialization.  

When the Storage Manager is initialized, the QM will initialize its
cache status, and the CM will initialize its available cache status.

To do that, a new method will be added between the QM and the CM,
asking "what's in the cache".  The response is a list of FIDs.

The CM will check what's in the cache, get the names and sizes, and
go to Objectivity to obtain the FIDs.  It will then update its
"available cache", and return the list of FIDs to the QM.
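The CM side of this exchange could be sketched roughly as below.  This
is only an illustration, not the agreed interface: the class name, the
method name, the byte-based accounting, and the dictionary standing in
for the Objectivity lookup are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class CacheManager:
    """Hypothetical CM sketch; names and units are assumptions."""
    capacity_bytes: int
    # Stand-in for the Objectivity lookup: file name -> FID (assumption).
    fid_catalog: dict
    available_bytes: int = field(init=False)

    def __post_init__(self):
        # Until the cache has been scanned, treat all space as available.
        self.available_bytes = self.capacity_bytes

    def whats_in_cache(self, cached_files):
        """Answer the QM's "what's in the cache" request.

        cached_files is an iterable of (name, size_bytes) pairs found
        in the cache.  Returns the list of FIDs for those files and
        updates the "available cache" as a side effect.
        """
        fids = []
        used = 0
        for name, size in cached_files:
            used += size  # every file on disk takes up space
            fid = self.fid_catalog.get(name)
            if fid is not None:  # assumption: unknown files are not reported
                fids.append(fid)
        self.available_bytes = self.capacity_bytes - used
        return fids
```

The QM would call whats_in_cache() once at initialization and use the
returned FID list to set up its own cache status.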

People involved: Luis, Alex.


2.  Get "what's in cache for this query" for estimation.

The QO can at any time ask for "query estimation".  At that time, the
QE will request "what's in cache for this query" from the QM.  The QE
passes the FID list and a query token.

The QM returns 2 lists:
1) list of files in cache, and 2) list of files to be cached.
If the query is not in execute status (i.e., it has not been launched
yet), the above lists are "what's in cache" and "the rest".  If the
query was launched, all the files that were "done" are first removed,
and then the above lists are "what's in cache" and "the rest".
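The splitting rule above might look like the following sketch.  The
function name and its parameters are hypothetical; in particular, the
query token is represented here only by the "launched" flag and the
list of "done" FIDs that the QM would look up from it.

```python
def whats_in_cache_for_query(query_fids, cached_fids,
                             done_fids=(), launched=False):
    """Split a query's FID list into (in cache, to be cached).

    If the query was launched, files already "done" are removed first.
    All names here are illustrative assumptions.
    """
    cached = set(cached_fids)
    done = set(done_fids) if launched else set()
    # Drop the files the query has already finished with.
    remaining = [fid for fid in query_fids if fid not in done]
    in_cache = [fid for fid in remaining if fid in cached]
    to_be_cached = [fid for fid in remaining if fid not in cached]
    return in_cache, to_be_cached
```

For a query that was not launched, any "done" list is ignored, so the
two lists always cover the whole FID list passed by the QE.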

What's needed: 
1) One more estimation variable: "what's in cache".
2) One more method from the QE to the QM, and an associated variable.

People involved: Henrik, Alex.


3.  Add estimation variables

Two variables were recommended: 
1) "time remaining to process query", and
2) "clustering efficiency" of the query.

For 1) the estimate is for moving data to cache only (no processing
or Objectivity retrieval included).  For MDC1, we assume a constant
time to cache a file.  This estimate assumes no competition from other
queries, and that all files are cached sequentially (no parallel
caching from HPSS).  Thus, it is the "best case estimate without
parallel caching", and all that will be calculated is: (number of
files to be cached) x (time to cache a file).
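As a small worked sketch of this calculation (the function name and
the choice of seconds as the unit are assumptions; the per-file time
is the constant parameter mentioned above):

```python
def estimate_time_remaining(files_to_be_cached, seconds_per_file):
    """Best-case estimate: files are cached one after another, with no
    competition from other queries and no parallel caching."""
    return len(files_to_be_cached) * seconds_per_file
```

For example, with 3 files still to be cached and an assumed constant
of 60 seconds per file, the estimate is 180 seconds.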

For 2) the estimate gives the ratio of "number of events that qualified
for this query" to "number of events that have to be moved to cache"
(regardless of what's in the cache).  For this purpose the QE will
have to maintain "total number of events" per file.
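One possible reading of this ratio is sketched below.  The function
name, the per-file event table, and the handling of an empty query are
assumptions; the denominator is taken to be the total number of events
in all files the query needs, whether or not they are already cached.

```python
def clustering_efficiency(qualifying_events, events_per_file, query_fids):
    """Ratio of qualifying events to events in the files the query needs.

    events_per_file maps FID -> "total number of events" in that file,
    which is the table the QE would have to maintain (assumed layout).
    """
    total_events = sum(events_per_file[fid] for fid in query_fids)
    if total_events == 0:
        return 0.0  # assumption: report 0 when the query needs no events
    return qualifying_events / total_events
```

A value near 1 would mean that almost every event brought into the
cache is used by the query; a small value would mean that many events
are moved for few qualifying ones.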

People involved: Henrik.