http://www.codeproject.com/Articles/63929/Correlated-Sub-Query-vs-C...
A simple case study to compare the efficiency of using a Sub-Query vs a Case statement to perform data aggregation.
Introduction
This is a simple case study comparing the efficiency of a correlated sub-query with that of a CASE statement for data aggregation.
Background
Here is a common scenario: I have a news web-site that displays stories grouped by the usual categories:
World news
Local news
Sport
Lifestyle
Etc.
When a user leaves the site, the number of page views for each news item is stored in the database.
[Sample report: one row per date (2010-02-25 to 2010-02-27), with the columns Date, Local News, ..., Local Sport, World Sport, ..., Total Hits.]
To demonstrate the steps required, let's build a simple sample database named NewsSite:
USE Master
GO
CREATE DATABASE NewsSite
Go
-- Now add some tables...
USE NewsSite
GO
CREATE TABLE Category(
ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
Name VARCHAR(30) UNIQUE NOT NULL
)
GO
CREATE TABLE NewsItem(
ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
Cat_ID INT NOT NULL FOREIGN KEY REFERENCES Category(ID),
ItemDate DATETIME NOT NULL DEFAULT GETDATE(),
ItemTitle VARCHAR(100) NOT NULL -- named ItemTitle to match the INSERT statements below
)
GO
CREATE TABLE NewsHits(
ID INT IDENTITY(1,1) NOT NULL PRIMARY KEY,
Date DateTime Not Null Default GETDATE(),
News_ID INT NOT NULL FOREIGN KEY REFERENCES NewsItem(ID),
Hits INT NOT NULL
)
GO
Now we add some dummy data (I will skip some of the boring stuff, but you will get the picture):
USE NewsSite
GO
-- Category Table
INSERT Category(Name)
VALUES('Local News')
INSERT Category(Name)
VALUES('World News')
INSERT Category(Name)
VALUES('Local Sport')
INSERT Category(Name)
VALUES('World Sport')
INSERT Category(Name)
VALUES('Business')
INSERT Category(Name)
VALUES('Lifestyle')
INSERT Category(Name)
VALUES('Crime')
INSERT Category(Name)
VALUES('Weather')
-- NewsItem Table
INSERT NewsItem(Cat_ID, ItemDate, ItemTitle)
VALUES(1, '2/27/2010','Man Bites Dog')
INSERT NewsItem(Cat_ID, ItemDate, ItemTitle)
VALUES(1, '2/27/2010','Rat Bites Cat')
INSERT NewsItem(Cat_ID, ItemDate, ItemTitle)
VALUES(3, '2/27/2010','Rugby League Player Not in Trouble')
INSERT NewsItem(Cat_ID, ItemDate, ItemTitle)
VALUES(4, '2/27/2010','NFL Player Goes to Jail')
-- ...
-- ...
INSERT NewsItem(Cat_ID, ItemDate, ItemTitle)
VALUES(5, '2/27/2010','USD Plummets')
INSERT NewsItem(Cat_ID, ItemDate, ItemTitle)
VALUES(7, '2/27/2010','Murder in East LA')
INSERT NewsItem(Cat_ID, ItemDate, ItemTitle)
VALUES(8, '2/27/2010','Chile Tsunami Warning')
-- NewsHits Table
INSERT NewsHits(Date, News_ID, Hits)
VALUES('2/25/2010', 1, 2)
INSERT NewsHits(Date, News_ID, Hits)
VALUES('2/25/2010', 3, 1)
INSERT NewsHits(Date, News_ID, Hits)
VALUES('2/25/2010', 7, 4)
-- ...
-- ...
INSERT NewsHits(Date, News_ID, Hits)
VALUES('2/27/2010', 2, 2)
INSERT NewsHits(Date, News_ID, Hits)
VALUES('2/26/2010', 4, 7)
GO
The first version joins one aggregating sub-query per category to the outer query on the date:
USE NewsSite
GO
-- Outer Query
SELECT nh.Date,
CASE ISNULL(ln.LocalNews, 0)
WHEN 0 THEN 0
ELSE ln.LocalNews
END AS LocalNews,
CASE ISNULL(wn.WorldNews, 0)
WHEN 0 THEN 0
ELSE wn.WorldNews
END AS WorldNews,
CASE ISNULL(ls.LocalSport, 0)
WHEN 0 THEN 0
ELSE ls.LocalSport
END AS LocalSport,
CASE ISNULL(ws.WorldSport, 0)
WHEN 0 THEN 0
ELSE ws.WorldSport
END AS WorldSport,
CASE ISNULL(b.Business, 0)
WHEN 0 THEN 0
ELSE b.Business
END AS Business,
CASE ISNULL(l.Lifestyle, 0)
WHEN 0 THEN 0
ELSE l.Lifestyle
END AS LifeStyle,
CASE ISNULL(c.Crime, 0)
WHEN 0 THEN 0
ELSE c.Crime
END AS Crime,
CASE ISNULL(w.Weather, 0)
WHEN 0 THEN 0
ELSE w.Weather
END AS Weather,
SUM(nh.Hits) As TotalHits
FROM NewsHits nh
-- Inner Queries...
LEFT JOIN (SELECT nhh.Date, SUM(nhh.Hits) As LocalNews
FROM NewsHits nhh
JOIN NewsItem nii
ON nhh.News_ID = nii.ID
JOIN Category cc
ON nii.Cat_ID = cc.ID
WHERE cc.Name = 'Local News'
GROUP BY nhh.Date) ln
ON nh.Date = ln.Date
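LEFT JOIN ( SELECT nhh.Date, SUM(nhh.Hits) As WorldNews
FROM NewsHits nhh
JOIN NewsItem nii
ON nhh.News_ID = nii.ID
JOIN Category cc
ON nii.Cat_ID = cc.ID
WHERE cc.Name = 'World News'
GROUP BY nhh.Date) wn
ON nh.Date = wn.Date
LEFT JOIN ( SELECT nhh.Date, SUM(nhh.Hits) As LocalSport
FROM NewsHits nhh
JOIN NewsItem nii
ON nhh.News_ID = nii.ID
JOIN Category cc
ON nii.Cat_ID = cc.ID
WHERE cc.Name = 'Local Sport'
GROUP BY nhh.Date) ls
ON nh.Date = ls.Date
LEFT JOIN ( SELECT nhh.Date, SUM(nhh.Hits) As WorldSport
FROM NewsHits nhh
JOIN NewsItem nii
ON nhh.News_ID = nii.ID
JOIN Category cc
ON nii.Cat_ID = cc.ID
WHERE cc.Name = 'World Sport'
GROUP BY nhh.Date) ws
ON nh.Date = ws.Date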
LEFT JOIN ( SELECT nhh.Date, SUM(nhh.Hits) As Business
FROM NewsHits nhh
JOIN NewsItem nii
ON nhh.News_ID = nii.ID
JOIN Category cc
ON nii.Cat_ID = cc.ID
WHERE cc.Name = 'Business'
GROUP BY nhh.Date) b
ON nh.Date = b.Date
LEFT JOIN ( SELECT nhh.Date, SUM(nhh.Hits) As Crime
FROM NewsHits nhh
JOIN NewsItem nii
ON nhh.News_ID = nii.ID
JOIN Category cc
ON nii.Cat_ID = cc.ID
WHERE cc.Name = 'Crime'
GROUP BY nhh.Date) c
ON nh.Date = c.Date
LEFT JOIN ( SELECT nhh.Date, SUM(nhh.Hits) As Weather
FROM NewsHits nhh
JOIN NewsItem nii
ON nhh.News_ID = nii.ID
JOIN Category cc
ON nii.Cat_ID = cc.ID
WHERE cc.Name = 'Weather'
GROUP BY nhh.Date) w
ON nh.Date = w.Date
LEFT JOIN ( SELECT nhh.Date, SUM(nhh.Hits) As LifeStyle
FROM NewsHits nhh
JOIN NewsItem nii
ON nhh.News_ID = nii.ID
JOIN Category cc
ON nii.Cat_ID = cc.ID
WHERE cc.Name = 'Lifestyle'
GROUP BY nhh.Date) l
ON nh.Date = l.Date
WHERE nh.Date BETWEEN '2/24/2010' AND '2/28/2010'
GROUP BY nh.Date, ln.LocalNews, wn.WorldNews, ls.LocalSport,
ws.WorldSport, b.Business, c.Crime, w.Weather,
l.LifeStyle
GO
Ugly, isn't it? And according to my more learned friends, not a particularly brutal version of the species!
It is pretty simple to follow: the outer query returns the results of the sub-queries for each date in the range. Run the sample script and you will see output similar to the table above. I was quite happy to use this (or something similar but worse looking) in a production environment when it was kindly pointed out that, in this case at least, I might be better served by a simpler CASE statement construct.
USE NewsSite
Go
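-- Note: the WHEN conditions, the THEN values, the Crime and Weather columns and
-- everything from FROM onwards were lost from this listing. They are
-- reconstructed here (aliases ni and c are assumed) to mirror the sub-query version.
SELECT nh.[Date],
SUM(CASE WHEN c.Name = 'Local News' THEN nh.Hits ELSE 0 END) AS 'Local News',
SUM(CASE WHEN c.Name = 'World News' THEN nh.Hits ELSE 0 END) AS 'World News',
SUM(CASE WHEN c.Name = 'Local Sport' THEN nh.Hits ELSE 0 END) AS 'Local Sport',
SUM(CASE WHEN c.Name = 'World Sport' THEN nh.Hits ELSE 0 END) AS 'World Sport',
SUM(CASE WHEN c.Name = 'Business' THEN nh.Hits ELSE 0 END) AS 'Business',
SUM(CASE WHEN c.Name = 'Lifestyle' THEN nh.Hits ELSE 0 END) AS 'Lifestyle',
SUM(CASE WHEN c.Name = 'Crime' THEN nh.Hits ELSE 0 END) AS 'Crime',
SUM(CASE WHEN c.Name = 'Weather' THEN nh.Hits ELSE 0 END) AS 'Weather',
SUM(nh.Hits) AS TotalHits
FROM NewsHits nh
JOIN NewsItem ni
ON nh.News_ID = ni.ID
JOIN Category c
ON ni.Cat_ID = c.ID
WHERE nh.[Date] BETWEEN '2/24/2010' AND '2/28/2010'
GROUP BY nh.[Date]
GO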
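For readers who want to try the CASE construct without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. It is not the article's script: the schema is a simplified SQLite version, only four categories and four of the sample hit rows are loaded, and the function name hits_by_category is made up for illustration.

```python
import sqlite3


def hits_by_category():
    """Return (date, local_news, local_sport, world_sport, total) rows."""
    con = sqlite3.connect(":memory:")
    # Simplified SQLite version of the article's three tables (an assumption,
    # not the original T-SQL DDL), loaded with a few of the sample rows.
    con.executescript("""
        CREATE TABLE Category(ID INTEGER PRIMARY KEY, Name TEXT UNIQUE NOT NULL);
        CREATE TABLE NewsItem(ID INTEGER PRIMARY KEY,
                              Cat_ID INTEGER NOT NULL REFERENCES Category(ID),
                              ItemTitle TEXT NOT NULL);
        CREATE TABLE NewsHits(ID INTEGER PRIMARY KEY,
                              Date TEXT NOT NULL,
                              News_ID INTEGER NOT NULL REFERENCES NewsItem(ID),
                              Hits INTEGER NOT NULL);
        INSERT INTO Category(ID, Name) VALUES
            (1, 'Local News'), (2, 'World News'),
            (3, 'Local Sport'), (4, 'World Sport');
        INSERT INTO NewsItem(ID, Cat_ID, ItemTitle) VALUES
            (1, 1, 'Man Bites Dog'), (2, 1, 'Rat Bites Cat'),
            (3, 3, 'Rugby League Player Not in Trouble'),
            (4, 4, 'NFL Player Goes to Jail');
        INSERT INTO NewsHits(Date, News_ID, Hits) VALUES
            ('2010-02-25', 1, 2), ('2010-02-25', 3, 1),
            ('2010-02-27', 2, 2), ('2010-02-26', 4, 7);
    """)
    # One pass over the joined rows: each SUM only counts the hits whose
    # category matches its CASE condition, everything else contributes 0.
    rows = con.execute("""
        SELECT nh.Date,
               SUM(CASE WHEN c.Name = 'Local News'  THEN nh.Hits ELSE 0 END),
               SUM(CASE WHEN c.Name = 'Local Sport' THEN nh.Hits ELSE 0 END),
               SUM(CASE WHEN c.Name = 'World Sport' THEN nh.Hits ELSE 0 END),
               SUM(nh.Hits)
        FROM NewsHits nh
        JOIN NewsItem ni ON nh.News_ID = ni.ID
        JOIN Category c  ON ni.Cat_ID = c.ID
        GROUP BY nh.Date
        ORDER BY nh.Date
    """).fetchall()
    con.close()
    return rows


for row in hits_by_category():
    print(row)
```

Each date comes back as a single row with one column per category, which is the same shape as the report the sub-query version produces.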
                                  CSQ     CASE
Network Statistics
  ...                             8748    1990
  ...                             435     445
Time Statistics (ms)
  Client processing time          1.2     4.3
  Total execution time            9.6     5.2
  Wait time on server replies     8.4     0.9
The CASE statement runs nearly twice as fast as the Correlated Sub-Query (CSQ) example (5.2 ms vs 9.6 ms).
The CSQ spent 1.2 ms processing on the client and then 8.4 ms waiting on the server, whereas the CASE example spent 4.3 ms processing on the client and then a minuscule 0.9 ms waiting for the server.
This may not be significant in this small example, but in a situation where there may be thousands or more records, the CSQ starts to fade
into the distance in this race.
Using a similar structure on a large dataset, the CASE example was more than 100X faster than the CSQ. Granted, after adding an index
or two, I was able to reduce this to about 16:1.
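One way to check a comparison like this on your own data is to run both query shapes against the same rows, confirm they return identical results, and only then look at the timings. The sketch below does that with Python's sqlite3 module on synthetic data. Everything in it is an assumption made for illustration (the three-category list, the row counts, the helper names build_db, case_sql, subquery_sql and timed), and SQLite timings say nothing about how SQL Server will behave.

```python
import random
import sqlite3
import time

# Hypothetical category list, kept short for the sketch.
CATEGORIES = ["Local News", "World News", "Local Sport"]


def build_db(rows=5000, seed=1):
    """Create an in-memory database filled with synthetic hit rows."""
    rng = random.Random(seed)
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE Category(ID INTEGER PRIMARY KEY, Name TEXT UNIQUE NOT NULL);
        CREATE TABLE NewsItem(ID INTEGER PRIMARY KEY,
                              Cat_ID INTEGER NOT NULL REFERENCES Category(ID));
        CREATE TABLE NewsHits(ID INTEGER PRIMARY KEY, Date TEXT NOT NULL,
                              News_ID INTEGER NOT NULL REFERENCES NewsItem(ID),
                              Hits INTEGER NOT NULL);
    """)
    con.executemany("INSERT INTO Category(ID, Name) VALUES (?, ?)",
                    list(enumerate(CATEGORIES, start=1)))
    con.executemany("INSERT INTO NewsItem(ID, Cat_ID) VALUES (?, ?)",
                    [(i, 1 + i % len(CATEGORIES)) for i in range(1, 101)])
    con.executemany(
        "INSERT INTO NewsHits(Date, News_ID, Hits) VALUES (?, ?, ?)",
        [(f"2010-02-{rng.randint(1, 28):02d}", rng.randint(1, 100),
          rng.randint(1, 9)) for _ in range(rows)])
    return con


def case_sql():
    """Single-pass version: one SUM(CASE ...) column per category."""
    cols = ", ".join(
        f"SUM(CASE WHEN c.Name = '{name}' THEN nh.Hits ELSE 0 END) AS c{i}"
        for i, name in enumerate(CATEGORIES))
    return f"""SELECT nh.Date, {cols}, SUM(nh.Hits) AS TotalHits
               FROM NewsHits nh
               JOIN NewsItem ni ON nh.News_ID = ni.ID
               JOIN Category c ON ni.Cat_ID = c.ID
               GROUP BY nh.Date ORDER BY nh.Date"""


def subquery_sql():
    """Sub-query version: one aggregating sub-query joined per category."""
    n = range(len(CATEGORIES))
    cols = ", ".join(f"COALESCE(s{i}.Hits, 0) AS c{i}" for i in n)
    joins = "\n".join(
        f"""LEFT JOIN (SELECT nhh.Date, SUM(nhh.Hits) AS Hits
                       FROM NewsHits nhh
                       JOIN NewsItem nii ON nhh.News_ID = nii.ID
                       JOIN Category cc ON nii.Cat_ID = cc.ID
                       WHERE cc.Name = '{name}'
                       GROUP BY nhh.Date) s{i} ON nh.Date = s{i}.Date"""
        for i, name in enumerate(CATEGORIES))
    groups = ", ".join(f"s{i}.Hits" for i in n)
    return f"""SELECT nh.Date, {cols}, SUM(nh.Hits) AS TotalHits
               FROM NewsHits nh
               {joins}
               GROUP BY nh.Date, {groups} ORDER BY nh.Date"""


def timed(con, sql):
    """Run a query and return (rows, elapsed milliseconds)."""
    start = time.perf_counter()
    rows = con.execute(sql).fetchall()
    return rows, (time.perf_counter() - start) * 1000.0


con = build_db()
case_rows, case_ms = timed(con, case_sql())
sub_rows, sub_ms = timed(con, subquery_sql())
print("same results:", case_rows == sub_rows)
print(f"CASE: {case_ms:.1f} ms, sub-queries: {sub_ms:.1f} ms")
```

Checking that the two result sets match before comparing times guards against measuring a query that is fast only because it is wrong. For meaningful numbers, repeat each query several times and test against the real database engine and data volume.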
Points of Interest
This article came about because of a short discussion in the General Database Forum, and shows how the first "good" idea you may have when it comes to a solution may not always be the best one.
Oh, and CodeProject is a great technical resource, frequented by some talented and generous people - thanks to Mycroft Holmes and
i.jrussell for pointing me down this path.
History
Version 1
License
This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)