
Targeted Advertising on the Web with Inventory Management

David Maxwell Chickering David Heckerman


Microsoft Research, One Microsoft Way, Redmond, Washington 98052-6399
dmax@microsoft.com heckerma@microsoft.com
This paper was refereed.

Companies that maintain Web sites can make considerable revenue by running advertisements, and they therefore compete to attract advertisers. The ability to deliver high click-through rates on a site can attract advertisers and, under an appropriate pricing model, can also increase revenue directly. Consequently, companies can benefit from delivery systems that display advertisements selectively to those visitors most likely to click through. To satisfy contractual obligations, however, these systems must simultaneously manage inventory. We developed a delivery system that maximizes click-through rate given inventory-management constraints in the form of advertisement quotas. The system uses predictive segments in conjunction with a linear program to perform the constrained optimization. Using a real Web site (msn.com), we demonstrated the efficacy of the system. We can generalize our system to find revenue-optimal advertisement schedules under a wide variety of pricing models.
(Marketing: advertising, media. Programming: linear, applications.)

Advertising revenue on the Web is important for many companies that host Web sites. The revenue can provide those companies with profits without their charging visitors for using their sites. According to the Internet advertising revenue report (Interactive Advertising Bureau 2002), Internet-advertising revenues in the United States totaled $2.98 billion for the first six months of 2002. Hoffman et al. (1997) discuss the business model of sponsored content sites.

In a typical advertising scenario, a hosting Web site or publisher will have a system in place for delivering advertisements so that a visitor (a person using a browser that is directed to the hosting site) asking to view a page will obtain the normal content from the Web site plus one or more advertisements provided by a third-party advertiser. A visitor can click through the advertisement (direct the browser to the link associated with the advertisement) by either clicking on the advertisement with the mouse or by some other means.

The method the advertiser follows to pay the publisher to display advertisements is called the pricing model. A number of pricing models for Web advertising have been developed that differ by the relative degree of risk that the publisher and the advertiser take (Hoffman and Novak 2000). At one extreme, using the cost-per-thousand or CPM pricing model, the advertiser pays a fixed amount to the publisher every time an advertisement is displayed. In this model, the advertiser assumes all of the risk; the publisher is guaranteed per-impression (per-page-view) revenue regardless of the effectiveness of the advertisement, whereas the advertiser benefits only if the advertisement is effective in influencing those customers who view it. In contrast, performance-based models move some or most of the risk to the publisher; under this general class of pricing models, the advertiser pays the Web site only when a customer takes some action in response to the advertisement. For example, the advertiser could pay (1) a fixed amount for every click, (2) a fixed amount for every purchase,

0092-2102/03/3305/0071, 1526-551X electronic ISSN
Interfaces © 2003 INFORMS, Vol. 33, No. 5, September-October 2003, pp. 71-77
CHICKERING AND HECKERMAN
Targeted Advertising on the Web

or (3) a fixed percent of the purchase price for every purchase.

As described in the Internet advertising revenue report (Interactive Advertising Bureau 2002), roughly 40 percent of the 2002 second-quarter advertising revenue came from deals based on a so-called hybrid pricing model that combines CPM with a performance-based model. An example is a CPM deal that gives a bonus to the publisher for every click. Such a combination has the advantage of providing the publisher with guaranteed revenue and an incentive to place advertisements in a way that benefits the advertiser.

As Baudisch and Leopold (2000) discuss, many companies have turned to targeting to compete for advertising dollars; they employ advertisement-delivery systems that use information collected about the visitors to decide which advertisements to show. This information can include demographic information that the visitor has previously entered, it can include the set of pages previously visited on the publisher's site, or it can be simply the specific area of the publisher's site (the so-called content channel) that the user is navigating. For example, the advertisement-delivery system could serve sports-related advertisements on any sports-related page on the site.

Both the publisher and the advertiser can benefit from targeting, regardless of the pricing model. In a pure CPM pricing model, the publisher can implement differential (per impression) pricing based on the advertiser's desire to reach a particular type of visitor. For example, a tennis-clothing company advertising on a news site might be willing to pay a large amount for any advertisement shown in the sports section; the publisher can gain large per-impression revenue, while the advertiser pays to reach only those visitors of interest. In a hybrid pricing model or in a pure performance-based pricing model, the publisher can maximize revenue by showing advertisements based on expected revenue, and the advertiser can maximize revenue by adjusting per-click (or per-purchase or percent-of-purchase) payments.

Deals based on CPM or hybrid pricing models typically include quotas (minimum-impression guarantees). Furthermore, when they are targeting, publishers can face per-targeted-group quotas. That is, the publisher will promise to deliver a minimum number of impressions to visitors that interest the advertiser (for example, people reading a sports story). Given these quotas and given that a limited number of visitors come to the publisher's Web site during any period, the publisher will face inventory-management constraints: (1) a publisher must not oversell to advertisers, and (2) the delivery system must deliver all of the advertisements sold, regardless of the type of visitors that come to the site.

In the context of the CPM pricing model, Adler et al. (2002) consider the problem of allocating banner advertisements, given inventory-management constraints, when each impression is modeled as a constrained area into which varying numbers of banner advertisements of different sizes can be placed. Finding the revenue-optimal schedule, which Adler et al. (2002) show is NP-hard, is related to the well-known (and NP-hard) bin-packing problem; Adler et al. (2002) provide a 2-approximation for the optimal solution. Amiri and Menon (2001) and Kumar et al. (2001) consider variations of this problem and alternative heuristic solutions.

We developed a system for serving advertisements to maximize the overall number of clicks on a Web site; that is, our system maximizes revenue under a hybrid pricing model that gives the publisher a constant bonus for every advertisement clicked. We used information about the visitor in conjunction with a linear program to construct an advertisement-delivery system that maximizes the expected number of clicks given the inventory-management constraints. We developed the system because the managers of a real-world Web site (msn.com) wanted to increase click-through rates on their advertisements; the pricing model at the time was strict CPM, but the managers believed that the advertisers would appreciate higher click-through rates.

A linear program to solve a similar advertisement-delivery problem was developed independently by Langheinrich et al. (1999). Our system contains novel extensions to this approach as well as, to our knowledge, its first empirical validation in a real setting.


The Basic Approach

An important concept underlying our approach is the idea of an impression context. Intuitively, an impression context is the information about the impression (page view) that the targeting system uses to decide what advertisements to show. We can define impression contexts in a number of ways. An impression context may depend on the entire history of the visitor to that site who is about to receive the impression. Alternatively, an impression context may correspond to the area of the Web site on which the impression is being delivered. For example, a news site may have contexts corresponding to advertisements shown on front-page, finance, sports, entertainment, and weather sections of the site. An advantage to using this simple context is that we obtain some information about each user (that he or she is reading a story in a particular section of the site) without the need to track users across the site.

At the core of our approach is the use of predictive segments or clusters. We partition the impression contexts into a small number of segments and then estimate for every advertisement in each segment the click-through probability (the probability that a visitor shown the advertisement in the segment will click through on the advertisement). We then use these individual click-through probabilities to target delivery so as to increase the overall click-through probability on the site. In the approach we discuss, we use the simple impression context corresponding to the area of the Web site on which the impression is being delivered.

Our approach consists of two phases. In the first phase, the system delivers each advertisement with probability proportional to its quota (and without regard to segment) and collects statistics about click through. In particular, for each advertisement and segment the system records (1) the number of times that advertisement was shown in the segment, and (2) the number of times that a visitor shown the advertisement in the segment clicks through. Using these counts, we estimate the click-through probability for each advertisement in each segment. We need to run the first phase only long enough to get accurate probability estimates. The greater the number of segments, the longer we must run the collection phase.

In the second phase of our approach, we use the estimated click-through probabilities to construct a new schedule that maximizes the expected overall click-through probability for the site. To describe this phase in more detail, we need some notation. Assume there are m segments and n advertisements. We use $p_{ij}$, $i = 1, \ldots, n$, $j = 1, \ldots, m$, to denote the probability, estimated in the first phase, that a visitor will click on advertisement i shown in segment j. We define a particular delivery schedule by the set $X = \bigcup_{i,j} \{x_{ij}\}$, where $x_{ij}$ is the number of impressions of advertisement i to be shown in segment j in some specified amount of time T (for example, one day, one week, or one month).

Assuming that the click-through probabilities do not depend on the schedule, an observation we verify experimentally, we can express the expected overall click-through probability on the site, for any schedule, as

$$E(\text{overall click-through probability}) = \sum_{i=1}^{n} \sum_{j=1}^{m} p_{ij} \frac{x_{ij}}{N} \tag{1}$$

where N is the total number of impressions to be delivered in time T.

Let $q_i$ denote the quota for advertisement i, that is, the minimum number of all impressions (in time T) of advertisement i to be shown. In addition, let $s_j$ denote the capacity of segment j, that is, the maximum number of all impressions (in time T) that could be shown in segment j. The quantities $q_i$ are determined by the advertisers, whereas the quantities $s_j$ are determined by the amount of traffic on the site. Also, the capacities are uncertain, because they are determined by future traffic. Nonetheless, at least for large sites and for optimization problems in which T is fairly small (less than a month), capacities are stable over time and can be estimated with little error.

For each advertisement i, the quota $q_i$ imposes the following constraint on the delivery schedule:

$$\sum_{j=1}^{m} x_{ij} \ge q_i \tag{2}$$

That is, the number of impressions in which advertisement i is delivered must be at least the number


promised to the advertiser. Similarly, to avoid overbooking any section on the site, we have the constraint, for each segment j:

$$\sum_{i=1}^{n} x_{ij} \le s_j \tag{3}$$

We would like to find the schedule $X = \bigcup_{i,j} \{x_{ij}\}$ that maximizes Equation (1), subject to the inventory-management constraints expressed in Equations (2) and (3). Because of the enormous number of hits that typical Web sites receive per day, it is reasonable to treat each $x_{ij}$ as a continuous variable. For example, using a time unit of a day, the average $x_{ij}$ in our experiments was in the thousands; the difference in overall click-through probability between serving, say, 2,342.7 impressions of a particular advertisement per day to a segment versus serving 2,343 such impressions is insignificant. As a result, because the objective function is a linear function of X and both constraints are linear functions of X, we identify the optimal schedule using a linear program (Chvátal 1983).

Once we have identified the optimal schedule X, the delivery system must deliver $x_{ij}$ impressions of advertisement i to segment j. A straightforward way to determine approximately the right number of each advertisement to show follows. When delivering an impression in segment j, we randomly choose to serve advertisement i with probability

$$\frac{x_{ij}}{\sum_{i'} x_{i'j}}$$

With this approach, the system does not need to keep track of which advertisements it has already served. Furthermore, the random nature of the algorithm ensures that any particular visitor is likely to be shown a variety of advertisements.

Simple Extensions

Our basic approach has a potential problem: The solution to the linear program can be sensitive to small errors in the estimates of $p_{ij}$. For example, suppose that for two different segments, the true click-through probabilities for a particular advertisement are identical and equal to 0.5. Even with a reasonably large sample, we are almost guaranteed to have two different estimates for the two probabilities. Suppose that one of the estimated probabilities is 0.501 and the other is 0.499. In this case, the linear program is likely to place all impressions of this advertisement in the segment with the higher probability. We would prefer a more uniform placement for two reasons. First, visitors in a given segment will get a greater variety of advertisements. The managers of msn.com, the site we studied, found this property highly desirable. Second, we expect this approach to yield higher overall click-through probabilities, because we avoid overfitting the training data by making the solution less sensitive to fluctuations in individual click-through probabilities.

Extending the basic linear-program approach, Tomlin (2000) solves this problem by optimizing a nonlinear function of X that trades off expected overall click-through probability with the uniformity of the solution. Our approach is to bucket the $p_{ij}$ values. In particular, we partition the $p_{ij}$ values into sets or buckets of similar value and replace each $p_{ij}$ with the mean of the bucket into which it falls. We then optimize the delivery of advertisements with these new click-through probabilities.

Given a desired number of buckets k, we use a simple agglomerative clustering algorithm to identify the buckets. Initially, we place each $p_{ij}$ value in a separate bucket. Then, as long as we have more than k buckets, we merge the two buckets whose means are the closest. The best choice for k will depend on the domain and should be chosen empirically.

With bucketing added, there will likely be many optimal schedules because many click-through probabilities are equal. We break these ties by finding the most uniform of schedules among the optimal ones. That is, we first run the original linear program to identify an optimal schedule and note its expected overall click-through probability (C). Then, we define a second optimization that chooses an optimal schedule (which is likely to be different from the previously chosen optimal schedule) that is closest to the schedule in which each advertisement is shown the same number of times in each segment. In particular, we minimize the following objective function:

$$\sum_{i=1}^{n} \sum_{j=1}^{m} \left| x_{ij} - \frac{q_i}{m} \right| \tag{4}$$


subject to constraints to be discussed. Recall that $q_i$ is the number of impressions promised for advertisement i and that m is the total number of segments. Thus, if we place an equal number of impressions of a particular advertisement i in each of the m segments, we will have $x_{ij} = q_i/m$ for all j. Term (4) measures a distance in impressions, for each advertisement, from this uniform configuration.

The constraints of this optimization problem include the constraints from the original problem (Equations (2) and (3)) and the added constraint that the new schedule have the same (optimal) overall click-through probability as that identified by the first linear program. In particular, we include the constraint

$$\sum_{i=1}^{n} \sum_{j=1}^{m} p_{ij} \frac{x_{ij}}{N} = C$$

The secondary optimization thus identifies the most uniform delivery schedule, subject to the inventory-management constraints and the constraint that the schedule must have the optimal overall expected click-through probability. It is well known that the second optimization, which involves absolute-value terms, can be solved by a linear program as well.

Earlier we stipulated that advertisements should be served uniformly across segments in the data-collection phase of the process so that we would have estimates of $p_{ij}$ for all i and j. In fact, we do not need to estimate the probability $p_{ij}$ if we do not plan to show advertisement i in segment j. For example, suppose a segment corresponds to impressions in the sports area of a Web site, and an advertiser makes a specific request not to show any of its advertisements in this segment. We can implement this request as the linear constraint $x_{ij} = 0$. Equivalently (and more efficiently) we can simply remove all instances of $x_{ij}$ from the optimization.

We can add new advertisements dynamically to our system easily as long as the current schedule has not consumed the segments' capacity. In particular, we can collect data (phase one) for a new set of advertisements, while an existing (optimal) delivery schedule is in effect. After collecting these statistics, we can find a new optimal schedule that includes the new advertisements. Removing current advertisements from the schedule is even easier: We simply reoptimize with fewer advertisements using the $p_{ij}$ values that are still relevant.

Experimental Results

We used our approach to deliver banner advertisements on the msn.com Web site. At the time, the msn.com site was organized in about 20 sections, with each section corresponding to a broad class of news stories. The site was running roughly 500 advertisements at a time. Also, the site and the advertisers scheduled the advertisements manually. In particular, the advertisers chose, for each advertisement and for each segment, a daily quota that did not violate the capacity constraints of the site. In our experiments, we used segments that corresponded to these Web-site sections. For example, if msn.com delivered an impression in the sports section, we labeled that impression as belonging to the sports segment.

In a preliminary experiment, we performed a passive test of our approach. That is, without implementing our schedule on the site, we estimated the improvement in overall click-through probability that would have resulted had we implemented the schedule. We simulated the rescheduling of advertisements across the entire site.

We collected statistics from the Web logs of December 21 and 22, 1998. Approximately 1.5 million impressions were delivered on each day. We used counts extracted from the December 21 logs to estimate the click probabilities and segment capacities. We estimated each probability using an average of 4,000 impressions. Then, we ran the linear program to identify the schedule (with T = 1 day) that maximized the expected overall click-through probability. The linear program identified the optimal schedule in less than a minute on a Pentium II 200 MHz computer running the Windows NT 4.0 operating system.

We used counts extracted from the December 22 logs to estimate how well the resulting schedule would have worked. In particular, we used the data from the second day to reestimate each click-through probability $p_{ij}$ and then calculated the expected overall click-through probability for the optimized schedule via Equation (1) under the assumption that changes in schedule do not influence click-through probabilities. We compared this number to the actual overall click-through probability seen on the second day and found that our approach yielded an improvement of between 20 and 30 percent depending on the


method used to estimate (smooth) the $p_{ij}$ values. In addition, the schedule successfully avoided overbooking (because the expected and actual capacities were close) and thus fulfilled all quotas.

The managers of the msn.com site were quite pleased with these results and, in conjunction with a particular advertiser, authorized an active experiment. The advertiser had five advertisements running on msn.com and was interested in how much we could increase the overall click-through probability on these advertisements. For this experiment, we estimated the click-through probabilities using statistics from the entire weekend of May 15, 1999. We estimated each probability using roughly 15,000 impressions. We partitioned these probabilities into 10 buckets. (We chose k = 10 buckets by repeating the passive experiment for many values of k using earlier data from the site.) Then, we used our approach to identify a uniform schedule with maximum overall click-through probability. Because the number of advertisements was small, the linear programs ran in under a second. Finally, we implemented this schedule during the following weekend of May 22 (Figure 1).

The overall click-through probability for the nontargeted advertisements did not change significantly between the two weekends. In contrast, our approach yielded a 30 percent increase for the targeted advertisements. As in the passive experiment, the schedule avoided overbooking. In addition, we found that the click-through probabilities $p_{ij}$ were almost identical before and after the schedule change, thus validating one of the assumptions underlying our method.

As with the passive experiment, the managers of msn.com were quite pleased with these results. At their request, however, we do not report a net monetary gain from our approach.

[Figure 1 plot: click-through probability (vertical axis, starting at 0) for targeted ads and other ads on May 15-16 (pretargeting) and May 22-23 (posttargeting).]

Figure 1: An active implementation of our system on msn.com showed the relative overall click-through probabilities of targeted and nontargeted advertisements during the weekends of May 15 to 16, 1999 and May 22 to 23, 1999. (At the request of msn.com, we do not show the absolute magnitudes of the overall click-through probabilities.)

Additional Extensions

There are several straightforward extensions to our approach. We can use our method to optimize any linear function of X, not just the overall click-through probability. For example, we could add a constant, indexed by advertisement i and segment j, to each term in Equation (1) that weighs the importance of showing the given advertisement. The site could then give preferential treatment to, for example, advertisers who pay more.

Assuming the data is available, it is straightforward to construct an appropriate (linear) objective function to maximize for almost any pricing model. For example, if we redefine each $p_{ij}$ term from Equation (1) to denote the probability that an impression of advertisement i in segment j will result in a purchase, our approach can be applied directly to find the revenue-optimal schedule under a hybrid pricing model in which the publisher is paid a fixed bonus for every purchase. More generally, suppose that for each advertisement i and segment j, we can estimate the expected profit $r_{ij}$ that will result from showing the advertisement in the segment. We can then use our process to maximize the total expected revenue across advertisers ($\sum_{i,j} r_{ij} x_{ij}$) using the same inventory-management constraints used in the original formulation of the problem. We obtain a revenue-optimal schedule under a pure performance-based pricing model by further removing the quota constraints.

The schedule that maximizes the overall click-through probability across all advertisers may reduce the number of clicks for a particular advertiser. In another extension, we can explicitly prevent this from happening (in expectation) by adding the constraint


that the expected net click-through probability for each particular advertiser be no less than this probability in the pretargeted schedule. As another example, we can include targeted branding in our system by allowing advertisers to require that a certain number of advertisement impressions remain in particular segments while allowing the remaining impressions to be optimized for click throughs.

Finally, Interactive Public Relations (1996) shows that the click-through probability for an advertisement will depend (in a nonlinear way) on the number of times a user has seen that advertisement. In our approach, we do not model this effect. This omission likely does little harm in our msn.com application, because a user is unlikely to see the same advertisement more than once on this large site. Nonetheless, it would be interesting to extend our approach to include nonlinear optimization that could take these effects into account.

References

Adler, M., P. Gibbons, Y. Matias. 2002. Scheduling space-sharing for Internet advertising. J. Scheduling 5(2) 103-119.

Amiri, A., S. Menon. 2001. Internet banner advertisement scheduling via Lagrangean decomposition. Proc. Sixth INFORMS Conf. on Inform. Systems and Technology (CIST 2001). Miami Beach, FL, 133-140.

Baudisch, P., D. Leopold. 2000. Indifference, dislike, action: Web advertising involving users. Netnomics J. 2(1) 75-83.

Chvátal, V. 1983. Linear Programming. W. H. Freeman and Company, New York.

Hoffman, D. L., T. P. Novak. 2000. Advertising pricing models for the World Wide Web. D. Hurley, B. Kahin, V. Varian, eds. Internet Publishing and Beyond: The Economics of Digital Information and Intellectual Property. MIT Press, Cambridge, MA.

Hoffman, D., T. Novak, P. Chatterjee. 1997. Commercial scenarios for the Web: Opportunities and challenges. R. Kalakota, A. Whinston, eds. Readings in Electronic Commerce. Addison-Wesley, Reading, MA, 29-54.

Interactive Advertising Bureau. 2002. IAB Internet advertising revenue report: Second quarter results. http://www.iab.net/resources/ad_revenue.asp.

Interactive Public Relations. 1996. The sweet spot in banner advertising. Interactive Public Relations 2(18).

Kumar, S., V. S. Jacob, C. Sriskandarajah. 2001. Hybrid genetic algorithms for scheduling advertising on a Web page. Proc. 22nd Internat. Conf. Inform. Systems (ICIS). New Orleans, LA, 461-468.

Langheinrich, M., A. Nakamura, N. Abe, T. Kamba, Y. Koseki. 1999. Unintrusive customization techniques for Web advertising. Proc. 8th Internat. World Wide Web Conf., Toronto, Ontario, Canada, 181-194.

Tomlin, J. A. 2000. An entropy approach to unintrusive targeted advertising on the Web. Proc. 9th Internat. World Wide Web Conf., Amsterdam, The Netherlands, 767-774.
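
As a supplementary illustration of the quantities used in the paper, the sketch below evaluates Equation (1) for a given schedule, checks the quota and capacity constraints of Equations (2) and (3), and applies the randomized serving rule. It does not solve the linear program itself. The function names, the nested-list data layout, and the two-advertisement example are assumptions made for illustration; they are not the authors' implementation or data.

```python
import random


def expected_ctr(p, x, total_impressions):
    """Equation (1): sum over i, j of p[i][j] * x[i][j] / N."""
    clicks = sum(p[i][j] * x[i][j] for i in range(len(x)) for j in range(len(x[i])))
    return clicks / total_impressions


def satisfies_constraints(x, quotas, capacities, tol=1e-9):
    """Check Equation (2) (quotas met) and Equation (3) (capacities respected)."""
    n, m = len(quotas), len(capacities)
    quotas_met = all(sum(x[i]) >= quotas[i] - tol for i in range(n))
    within_capacity = all(
        sum(x[i][j] for i in range(n)) <= capacities[j] + tol for j in range(m)
    )
    return quotas_met and within_capacity


def serving_probabilities(x, j):
    """Probability of serving each advertisement in segment j: x[i][j] / sum_i' x[i'][j].

    Assumes at least one impression is scheduled in segment j.
    """
    column = [row[j] for row in x]
    total = sum(column)
    return [value / total for value in column]


def choose_advertisement(x, j, rng=random):
    """Randomly pick an advertisement index for one impression in segment j."""
    weights = serving_probabilities(x, j)
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]


# Hypothetical example: 2 advertisements (rows) and 2 segments (columns).
p = [[0.02, 0.01],
     [0.01, 0.03]]
quotas = [1000, 1000]
capacities = [1000, 1000]
uniform = [[500, 500], [500, 500]]
targeted = [[1000, 0], [0, 1000]]

for name, schedule in [("uniform", uniform), ("targeted", targeted)]:
    print(name,
          satisfies_constraints(schedule, quotas, capacities),
          expected_ctr(p, schedule, total_impressions=2000))
```

Both example schedules satisfy the constraints, but the targeted one places each advertisement in the segment where its estimated click-through probability is higher, which raises the value of Equation (1). Finding the best such schedule in general requires a linear-programming solver, which this sketch deliberately leaves out.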

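
The bucketing step described under Simple Extensions (start with one bucket per estimated probability, merge the two buckets with the closest means until k buckets remain, then replace each value with its bucket mean) can be sketched as follows. The function name and the flat-list input are assumptions for illustration. Because the values are one-dimensional, buckets are always contiguous runs of the sorted values, so the closest pair of means is always an adjacent pair.

```python
def bucket_probabilities(values, k):
    """Replace each value with the mean of its bucket after agglomerative merging.

    values: flat list of estimated click-through probabilities.
    k: desired number of buckets (k >= 1).
    """
    if k < 1:
        raise ValueError("k must be at least 1")

    def mean(bucket):
        return sum(values[idx] for idx in bucket) / len(bucket)

    # Each bucket holds indices into `values`; start with one bucket per value,
    # ordered by value so that neighbouring buckets have neighbouring means.
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    buckets = [[idx] for idx in order]

    while len(buckets) > k:
        means = [mean(bucket) for bucket in buckets]
        # Find the adjacent pair of buckets whose means are closest.
        t = min(range(len(means) - 1), key=lambda pos: means[pos + 1] - means[pos])
        buckets[t:t + 2] = [buckets[t] + buckets[t + 1]]

    smoothed = list(values)
    for bucket in buckets:
        bucket_mean = mean(bucket)
        for idx in bucket:
            smoothed[idx] = bucket_mean
    return smoothed


# Hypothetical estimates: 0.501 and 0.499 echo the paper's example of two
# noisy estimates of the same true probability.
estimates = [0.501, 0.499, 0.10, 0.11]
print(bucket_probabilities(estimates, k=2))
```

With k = 2 in this example, the two estimates near 0.5 end up in the same bucket and receive the same smoothed value, so an optimizer has no reason to prefer one segment over the other on the basis of noise alone.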
