
What to measure



Enclosed are suggestions on what we might want to measure in general and
in MDC1 in particular.  These suggestions are based on the brief
discussion we had at the "integration workshop" as well as additional
discussions I had with various people.

Comments are welcome.

-------------------------------------------------------------------
Measurements to be performed for the HENP-GC

The measurements fall into 3 categories:

M1:  measure the benefit of using a clustering index,
    i.e. given a query, find the events that qualify, and access them as
    [set of (file:events)]

M2:  measure the benefit of using caching policies, including:
    1) reuse of files in cache
    2) schedule first the files that have the largest number of
       qualified events for a query (useful for queries that abort
       before completion)
    3) schedule first the files that are requested by the most queries
    etc.

M3:  measure the benefit of physical clustering of events on tapes
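
To make the [set of (file:events)] form and caching policies 2) and 3)
concrete, here is a minimal Python sketch.  The data layout (a query
result as a list of (file, event) pairs), the file names, and all
function names are assumptions made for illustration only; they are not
taken from the QE or the cache manager.

```python
from collections import Counter, defaultdict

def group_by_file(qualifying):
    """Turn [(file, event_id), ...] into {file: [event_id, ...]}."""
    by_file = defaultdict(list)
    for fname, event_id in qualifying:
        by_file[fname].append(event_id)
    return dict(by_file)

def order_by_qualified_events(by_file):
    """Policy 2: files with the most qualifying events come first."""
    # Ties are broken by file name so the order is deterministic.
    return sorted(by_file, key=lambda f: (-len(by_file[f]), f))

def order_by_demand(per_query):
    """Policy 3: files requested by the most queries come first."""
    demand = Counter()
    for by_file in per_query.values():
        demand.update(by_file.keys())  # one count per requesting query
    return sorted(demand, key=lambda f: (-demand[f], f))

# Hypothetical query results, for illustration only.
q1 = group_by_file([("f1", 10), ("f2", 11), ("f1", 12), ("f3", 13), ("f1", 14)])
q2 = group_by_file([("f2", 20), ("f3", 21), ("f2", 22)])
q3 = group_by_file([("f2", 30)])

print(order_by_qualified_events(q1))
print(order_by_demand({"q1": q1, "q2": q2, "q3": q3}))
```

Policy 2 looks at a single query at a time, while policy 3 needs the
pending requests of all queries, so the two orderings can disagree for
the same set of files.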

For MDC1:

Given a set of typical queries:

M1:
1) could be done using the tag data directly, but since the QE
   already holds that data, we'll use the QE.
2) we will not measure the relative speed of QE indexes vs.
   Objectivity indexing in MDC1, since the tag database is small.
3) we measure the time to perform the set of queries sequentially.
4) Before: no indexing; get the list of OIDs and read randomly
   through Objectivity (how?).  After: use the index.
M2:  
1) launch all queries simultaneously, compare to sequential time.
2) overlap the same query multiple times
3) overlap the same query, staggered over time

M3:
1) restructure index to reflect new organization
2) use query estimator to estimate benefit
3) no real runs
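
The sequential and overlapped timings for M1 and M2 could be collected
with a small harness along the following lines.  This is only a sketch:
run_query stands in for whatever actually executes a query (not
specified here), fake_query and the sleep time are made up, and using
threads to launch the queries is an assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def time_sequential(run_query, queries):
    """Wall-clock time to run the queries one after another."""
    start = time.perf_counter()
    for q in queries:
        run_query(q)
    return time.perf_counter() - start

def time_overlapped(run_query, queries, stagger=0.0):
    """Wall-clock time to run the queries concurrently.

    stagger is the delay in seconds between successive launches;
    0.0 launches them all at once.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(queries)) as pool:
        futures = []
        for i, q in enumerate(queries):
            if i and stagger:
                time.sleep(stagger)
            futures.append(pool.submit(run_query, q))
        for f in futures:
            f.result()  # re-raise any error from a query
    return time.perf_counter() - start

def fake_query(q):
    """Hypothetical stand-in for a real query: just waits."""
    time.sleep(0.05)

queries = ["q1", "q2", "q3", "q4"]
print("sequential:", time_sequential(fake_query, queries))
print("overlapped:", time_overlapped(fake_query, queries))
print("staggered: ", time_overlapped(fake_query, queries, stagger=0.01))
```

Passing the same query several times in the list covers the "overlap
the same query" cases, with stagger set for the staggered variant.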

Note:  We also need to measure the effect of using
Objectivity vs. files directly.  How?