NiagaraCQ: A Scalable Continuous Query System for Internet Databases
Jianjun Chen et al.
Computer Sciences Dept.
University of Wisconsin-Madison
SIGMOD 2000
Presented by
Mukund Agrawal
Continuous Queries
A triple (Q, A, Stop)
Scope also includes future data
Example
Inform me when there is a new publication related to multi-query optimization
A broad classification
- Change-based
- Timer-based
NiagaraCQ
A CQ system for the Internet
Continuous Queries on XML data sets
Scalable CQ processing
Incremental group optimization
Handles both change-based and timer-based queries in a uniform way
Outline
General strategy of incremental group optimization
Query split with materialized intermediate files
Incremental grouping of selection and join operators
System architecture
Experimental results
NiagaraCQ command language
Creating a CQ
Create CQ_name
XML-QL query
Do action
{ START start_time} { EVERY time_interval}
{ EXPIRE expiration_time}
Delete CQ_name
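As a rough illustration of what the command carries, the fields of a Create statement can be modelled as a small record. This is only a sketch: the class name, field names, and the rule that a zero interval means a change-based query are assumptions made for illustration, not part of the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContinuousQuery:
    """Hypothetical in-memory form of a 'Create CQ_name ...' command."""
    name: str                       # CQ_name
    query: str                      # XML-QL query text
    action: str                     # what to do with new results
    start: Optional[float] = None   # START start_time (seconds, assumed unit)
    every: float = 0.0              # EVERY time_interval (0 = not given)
    expire: Optional[float] = None  # EXPIRE expiration_time

    def is_timer_based(self) -> bool:
        # Assumption: a non-zero interval marks a timer-based query,
        # otherwise the query is treated as change-based.
        return self.every > 0

    def is_expired(self, now: float) -> bool:
        # A query without an expiration time never expires in this sketch.
        return self.expire is not None and now >= self.expire
```

A query created without EVERY would then be handled as change-based, and Delete CQ_name would simply drop the record.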
Incremental group optimization
General Strategy
Why can’t we regroup all queries when a new query is added?
Use of expression signatures for grouping
- Same syntax structure
- Different constant values
Expression Signature
Query examples
Where <Quotes><Quote><Symbol>INTC</></></>
element_as $g in "http://www.stock.com/quotes.xml"
construct $g

Where <Quotes><Quote><Symbol>MSFT</></></>
element_as $g in "http://www.stock.com/quotes.xml"
construct $g
Expression signature
Quotes.Quote.Symbol = constant, in quotes.xml
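The idea that the two example queries share one signature can be sketched by replacing the literal constant in the pattern with a placeholder. The function name and the regular expression below are illustrative assumptions; a real system would derive the signature from the query plan rather than from the query text.

```python
import re

# A literal (anything not starting with '$') that directly follows an
# opening tag such as <Symbol> and is closed by the XML-QL shorthand </>.
_CONSTANT = re.compile(r"(<\w+>)\s*[^<>$\s][^<>]*?\s*</>")


def expression_signature(where_clause: str) -> str:
    """Replace literal constants with the placeholder 'constant'.

    Variables such as $s are left untouched, so only the constant values
    are abstracted away while the syntax structure is preserved.
    """
    return _CONSTANT.sub(r"\1constant</>", where_clause)


intc = expression_signature("Where <Quotes><Quote><Symbol>INTC</></></>")
msft = expression_signature("Where <Quotes><Quote><Symbol>MSFT</></></>")
same_group = intc == msft  # both queries reduce to the same signature
```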
Query plans (one per query, read bottom-up)
Plan I: File Scan quotes.xml -> Select Symbol = "INTC" -> Trigger Action I
Plan J: File Scan quotes.xml -> Select Symbol = "MSFT" -> Trigger Action J
Group
Group Signature
- Common signature of all queries in the group
Group constant table
Constant_value | Dest_buffer
INTC           | Dest. I
MSFT           | Dest. J
The group plan
Incremental Grouping Algorithm
When a new query is submitted
  If the expression signature of the new query matches that of an existing group
    Break the query plan into two parts
    Remove the lower part
    Add the upper part onto the group plan
  Else create a new group
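The steps above can be sketched as a lookup keyed by signature, with each group holding a constant table that maps constants to destination buffers. The class and method names are hypothetical, and the "upper part" of each query is reduced here to a destination name.

```python
from collections import defaultdict


class Group:
    """One group: a shared signature plus a constant table."""

    def __init__(self, signature):
        self.signature = signature
        self.constant_table = defaultdict(list)  # constant -> destinations

    def add(self, constant, dest):
        self.constant_table[constant].append(dest)

    def route(self, rows, key):
        """Scan the source once and send each row to matching destinations."""
        out = defaultdict(list)
        for row in rows:
            for dest in self.constant_table.get(row[key], ()):
                out[dest].append(row)
        return dict(out)


class GroupOptimizer:
    def __init__(self):
        self.groups = {}  # signature -> Group

    def add_query(self, signature, constant, dest):
        # Existing groups are never rebuilt: the new query either joins
        # a group with a matching signature or starts a new one.
        group = self.groups.get(signature)
        if group is None:
            group = self.groups[signature] = Group(signature)
        group.add(constant, dest)
        return group
```

Adding the INTC and MSFT queries with the same signature yields a single group whose constant table has two entries, so quotes.xml is scanned once for both.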
Query split with materialized intermediate files
Why not use a pipeline scheme?
- Split operator may block simple queries
- Gives a single complicated execution plan
- A large portion of the query plan may not need to be executed at each invocation
- Does not work for grouping timer-based queries
Using intermediate files
- Cut the query plan into 2 parts at the split operator
- Add a file scan operator to the upper part to read the intermediate file
The query split scheme
Trade-offs
Other advantages of materialized intermediate files
- Only the necessary queries are executed
- Uniform handling of intermediate files and original data source files
Disadvantages
- Split operator becomes a blocking operator
- Extra disk I/Os
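A minimal sketch of the split scheme, assuming JSON-lines files as the intermediate format (the slides do not specify one) and hypothetical function names. The lower part scans the source once and materializes one file per destination; the upper part of a query is an ordinary file scan over its own intermediate file.

```python
import json
import os
import tempfile


def split_to_files(rows, constant_table, key, directory):
    """Lower part of the plan: route each matching row to its file.

    Returns the set of intermediate files that received data, so only
    the upper query parts reading those files need to be invoked.
    """
    handles = {}
    try:
        for row in rows:
            dest = constant_table.get(row[key])
            if dest is None:
                continue  # no query in the group asks for this constant
            if dest not in handles:
                handles[dest] = open(os.path.join(directory, dest), "w")
            handles[dest].write(json.dumps(row) + "\n")
    finally:
        for f in handles.values():
            f.close()
    return set(handles)


def file_scan(directory, name):
    """Upper part of the plan: read an intermediate file like any source."""
    with open(os.path.join(directory, name)) as f:
        return [json.loads(line) for line in f]
```

The sketch also shows both listed costs: nothing reaches an upper query until the split has finished writing, and every routed row is written to disk and read back.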
Incremental grouping of selection predicates
Multiple selection predicates in a query
- CNF for predicates on same data source
Incremental grouping
- Choose the most selective conjunct
Evaluation of other predicates
- Upper levels of continuous query
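The choice described above can be sketched as picking the conjunct with the smallest estimated selectivity for grouping and leaving the rest for the upper part of the query. The function name and the selectivity estimates are hypothetical; how selectivity is estimated is not covered by the slides.

```python
def plan_selection(conjuncts):
    """Split CNF conjuncts into (grouped conjunct, residual conjuncts).

    conjuncts: list of (predicate_text, estimated_selectivity) pairs,
    where a smaller selectivity means fewer rows pass the predicate.
    """
    ordered = sorted(conjuncts, key=lambda c: c[1])
    grouped = ordered[0][0]                    # used for incremental grouping
    residual = [pred for pred, _ in ordered[1:]]  # evaluated at upper levels
    return grouped, residual


# Assumed estimates: the equality on Symbol is far more selective
# than the range predicate on the price.
grouped, residual = plan_selection([
    ("$p < 100", 0.40),
    ('Symbol = "INTC"', 0.01),
])
```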
Example query
Where <Quotes><Quote><Symbol>"INTC"</>
<Current_Price>$p</></> element_as $g </>
in "quotes.xml", $p < 100
Construct $g
Range-query groups
Problem
- Intermediate files may contain duplicate tuples
Solution: Virtual intermediate files
- Virtual intermediate file stores value ranges
- One real intermediate file has a clustered index
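One way to picture the solution is a single sorted list standing in for the real intermediate file with its clustered index, and a virtual file that holds only a value range and resolves it with a binary search. All class names are assumptions, and an in-memory sorted list is a stand-in for an on-disk clustered index.

```python
import bisect
from dataclasses import dataclass


class RealIntermediateFile:
    """Stores each tuple once, ordered by the grouped attribute."""

    def __init__(self, rows, key):
        self.rows = sorted(rows, key=lambda r: r[key])
        self.keys = [r[key] for r in self.rows]

    def scan_range(self, low, high):
        """Return rows with low < value <= high."""
        start = bisect.bisect_right(self.keys, low)
        stop = bisect.bisect_right(self.keys, high)
        return self.rows[start:stop]


@dataclass
class VirtualIntermediateFile:
    """Holds only a value range; the tuples live in the real file."""
    low: float
    high: float = float("inf")

    def read(self, real):
        return real.scan_range(self.low, self.high)


real = RealIntermediateFile(
    [{"Change_Ratio": 0.11}, {"Change_Ratio": 0.02},
     {"Change_Ratio": 0.08}, {"Change_Ratio": 0.05}],
    key="Change_Ratio",
)
over_5 = VirtualIntermediateFile(low=0.05).read(real)
over_10 = VirtualIntermediateFile(low=0.10).read(real)
```

The tuple with ratio 0.11 satisfies both range queries but is stored only once, which is the duplication the virtual files avoid.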
Incremental grouping of join operators
A join query
Quotes.Quote.Change_Ratio constant in “quotes.xml”
Where <Quotes><Quote><Symbol>$s</></>
element_as $g </> in "quotes.xml",
<Companies><Company><Symbol>$s</></>
element_as $t</> in "companies.xml"
construct $g, $t
Queries that contain both join and selection
Example query:
Where <Quotes><Quote><Symbol>$s</>
<Industry>"Computer Service"</></>
element_as $g </> in "quotes.xml",
<Companies><Company><Symbol>$s</></>
element_as $t</> in "companies.xml"
construct $g, $t
Where to place the selection operator?
- Below the join
- Above the join
Grouping timer-based queries
Challenge
- Sharing common computation
Event List
- Stores time events sorted in time order
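A time-ordered event list can be sketched with a min-heap of firing times, where queries due at the same moment hang off one shared event so they can be invoked together. The class and method names are assumptions for illustration.

```python
import heapq
from collections import defaultdict


class EventList:
    def __init__(self):
        self._times = []                    # min-heap of distinct fire times
        self._queries = defaultdict(list)   # fire time -> query ids

    def schedule(self, time, query_id):
        # One heap entry per distinct time; later queries with the same
        # firing time attach to the existing event.
        if time not in self._queries:
            heapq.heappush(self._times, time)
        self._queries[time].append(query_id)

    def pop_due(self, now):
        """Remove and return all events with fire time <= now, in order."""
        due = []
        while self._times and self._times[0] <= now:
            t = heapq.heappop(self._times)
            due.append((t, self._queries.pop(t)))
        return due
```

A timer-based query with an EVERY interval would be rescheduled at its next firing time after each invocation.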
Incremental evaluation
Invoke queries only on changed data
For each file, NiagaraCQ keeps a delta file
Incremental evaluation of join operators requires complete data files
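Why the join needs the complete files can be seen from the standard insert-only identity: the new results are (delta R joined with the whole of S) plus (old R joined with delta S). The sketch below assumes insert-only deltas and uses a simple in-memory hash join; the function names are hypothetical.

```python
def hash_join(left, right, key):
    """Equi-join two lists of dict rows on a common key."""
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    return [(l, r) for l in left for r in index.get(l[key], [])]


def incremental_join(r_old, delta_r, s_old, delta_s, key):
    """New join results only, assuming the deltas contain insertions.

    New rows of R must be matched against all of S (old and new),
    and old rows of R against the new rows of S, so the delta files
    alone are not sufficient.
    """
    s_new = s_old + delta_s
    return hash_join(delta_r, s_new, key) + hash_join(r_old, delta_s, key)
```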
Memory Caching
Thousands of continuous queries can’t fit in memory
What should we cache?
- Grouped query plans
- What about non-grouped queries?
- Favor small delta files
- Front part of the event list
System Architecture
CQ processing
Experimental Results
Example query :
Where <Quotes><Quote><Symbol>"INTC"</></>
element_as $g </> in "quotes.xml", construct $g
Thank You
