Set intersection oracle
A set intersection oracle (SIO) is a data structure which represents a collection of sets and can quickly answer queries about whether the set intersection of two given sets is non-empty. The input to the problem is n finite sets. The sum of the sizes of all sets is N (which also means that there are at most N distinct elements). The SIO should quickly answer any query of the form:
Minimum memory, maximum query timeWithout any pre-processing, a query can be answered by inserting the elements of Si into a temporary hash table and then checking for each element of Sk whether it is in the hash table. The query time is . Maximum memory, minimum query timeAlternatively, we can pre-process the sets and create an n-by-n table where the intersection information is already entered. Then the query time is , but the memory required is . A compromiseDefine a "large set" as a set with at least elements. Obviously there are at most such sets. Create a table of intersection data between every large set to every other large set. This requires memory. Additionally, for each large set, keep a hash table of all its elements. This requires additional memory. Given two sets, there are three possible cases:
In general, if we define a "large set" as a set with at least elements, then the number of large set is at most so the memory required is , and the query time is . Reduction to approximate distance oracleThe SIO problem can be reduced to the approximate distance oracle (DO) problem, in the following way.[1]
This graph has the following properties:
So, with a DO whose approximation factor of less than 2, we can solve the SIO problem. It is believed that the SIO problem does not have a non-trivial solution. I.e., it requires space to answer queries in time . If this conjecture is true, this implies that there is no DO with an approximation factor of less than 2 and a constant query time.[1] References
|
Portal di Ensiklopedia Dunia