Re: null values value



> I don't think we can come up with a standard null value but
> should let that be application specific.  I.e., the physicists working
> with some data "know" that ave_pt = -99 means a null value
> and will just take that into account for any range queries.
> The storage manager would not have to care for query estimation
> or execution.

In principle one can treat an "application specific null value" as
just another value.  However, the bit-sliced index uses bins.  Thus,
the index needs to know each null value so that it is given a bin of
its own, rather than being lumped into some arbitrary range.

In addition, the clustering analysis also needs to know what the null
values are.  This is because a range of valid values is used to
specify the binning.  Knowing the valid range also makes it possible
to identify "outliers" that are not null.
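
To make the two points above concrete, here is a rough sketch of a
binning step that reserves one bin for the null value and flags
non-null values outside the valid range as outliers.  It is not the
actual index code: the function name, the bin numbering and the bin
edges are all made up for illustration, and -99 is just the example
null value from the quoted message.

```python
from bisect import bisect_right

NULL_BIN = 0       # hypothetical: bin reserved for the null value
OUTLIER_BIN = -1   # hypothetical: marker for non-null values outside the valid range


def assign_bin(value, edges, null_value):
    """Return a bin number for value.

    edges is a sorted list of bin boundaries over the valid range.
    Bin i (for i >= 1) covers the half-open range [edges[i-1], edges[i]).
    """
    # Check for the null value first, so it never falls into a range bin.
    if value == null_value:
        return NULL_BIN
    # Anything else outside the valid range is an outlier, not a null.
    if value < edges[0] or value >= edges[-1]:
        return OUTLIER_BIN
    return bisect_right(edges, value)


# Assumed example: valid range 0..30 split into three bins, null value -99.
edges = [0, 10, 20, 30]
bins = [assign_bin(v, edges, -99) for v in (-99, 5, 10, -5, 30)]
```

Without the explicit null check, -99 would be indistinguishable from
any other value below the valid range.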

Having said that, I agree with Doug that we do not necessarily need
to support the concept of "null" in the query language, although it
could be nice to do that later on.  We can leave the burden on the
physicist to find out what the null values are when specifying queries.

Also, the value of a null value should indeed be "application
specific", i.e. defined for each variable.  That will complicate our
code, but I cannot see a way around that.  Perhaps for the initial
tests we can settle on some common null values, so this issue does not
slow down the tests.
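
One possible shape for this, as a sketch only: a per-variable table of
null values with a common fallback for the initial tests.  The names
below are hypothetical, "n_tracks" and its null value -1 are invented
for the example, and -99 as the common default is an assumption taken
from the ave_pt example in the quoted message.

```python
DEFAULT_NULL = -99  # assumed common null value for the initial tests

# Hypothetical per-variable null values; variables not listed here
# fall back to DEFAULT_NULL.
null_values = {
    "ave_pt": -99,
    "n_tracks": -1,
}


def null_value_for(variable):
    """Return the null value defined for a variable, or the common default."""
    return null_values.get(variable, DEFAULT_NULL)


def is_null(variable, value):
    """Return True if value is the null value of the given variable."""
    return value == null_value_for(variable)
```

With a lookup like this, the tests can start with only the default
defined, and per-variable entries can be added later without changing
the callers.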

Arie.