

Publisher: Institute for Operations Research and the Management Sciences (INFORMS)
INFORMS is located in Maryland, USA
Operations Research
Optimizing Performance-Based Internet Advertisement Campaigns
Radha Mookerjee, Subodha Kumar, Vijay S. Mookerjee
To cite this article:

Radha Mookerjee, Subodha Kumar, Vijay S. Mookerjee (2017) Optimizing Performance-Based Internet Advertisement
Campaigns. Operations Research 65(1):38-54. http://dx.doi.org/10.1287/opre.2016.1553
Copyright 2016, INFORMS
OPERATIONS RESEARCH
Vol. 65, No. 1, January–February 2017, pp. 38–54
http://pubsonline.informs.org/journal/opre/ ISSN 0030-364X (print), ISSN 1526-5463 (online)
Radha Mookerjee,a Subodha Kumar,b Vijay S. Mookerjee a

a Naveen Jindal School of Management, University of Texas at Dallas, Richardson, Texas 75080; b Mays Business School, Texas A&M
University, College Station, Texas 77843
Contact: radham@utdallas.edu (RM); subodha@tamu.edu (SK); vaym@utdallas.edu (VSM)
Received: January 4, 2014
Revised: January 30, 2014, February 1, 2015, October 20, 2015, May 9, 2016
Accepted: June 10, 2016
Published Online in Articles in Advance: December 5, 2016
Subject Classifications: dynamic programming: applications; industries: computer/electronic; marketing: advertising and media
Area of Review: OR Practice
https://doi.org/10.1287/opre.2016.1553
Copyright: © 2017 INFORMS

Abstract. This study provides an approach to manage an ongoing Internet ad campaign that substantially improves the number of clicks and the revenue earned from clicks. The problem we study is faced by an Internet advertising firm (Chitika) that operates in the Boston area. Chitika contracts with publishers to place relevant advertisements (ads) over a specified period on publisher websites. Ad revenue accrues to the firm and the publisher only if a visitor clicks on an ad (i.e., we are considering the cost-per-click model in this study). This might imply that all visitors to the publisher's website be shown ads. However, this is not the case if the publisher imposes a click-through-rate constraint on the advertising firm. This performance constraint captures the publisher's desire to limit ad clutter on the website and hold the advertising firm responsible for the publisher's opportunity cost of showing an ad that did not result in a click. We develop a predictive model of a visitor clicking on a given ad. Using this prediction of the probability of a click, we develop a decision model that uses a threshold to decide whether or not to show an ad to the visitor. The decision model's objective is to maximize the advertising firm's revenue subject to a click-through-rate constraint. A key contribution of this paper is to characterize the structure of the optimal solution. We study and contrast two competing solutions: (1) a static solution, and (2) a rolling-horizon solution that re-solves the problem at certain points in the planning horizon. The static solution is shown to be optimal when accurate information on the input parameters to the problem is known. However, when the parameters to the model can only be estimated with some error, the rolling-horizon solution can perform better than the static solution. When using the rolling-horizon solution, it becomes important to choose the appropriate re-solving frequency. The implemented models operate in real time in Chitika's advertising network. Implementation challenges and the business impact of our solution are discussed. To present a head-to-head comparison of our implemented approach with the past practice at Chitika, we implemented our solution in parallel to the past practice.

Supplemental Material: The online appendix is available at https://doi.org/10.1287/opre.2016.1553.
Keywords: Internet advertising • performance constraints • visitor profiling • revenue optimization
1. Introduction
This paper builds on a preliminary investigation of the problem in Mookerjee et al. (2012), a finalist paper for the 2011 Daniel H. Wagner Prize for Excellence in Operations Research Practice. In Mookerjee et al. (2012), a heuristic solution to the problem is provided. In the current study, we present a full and rigorous treatment of the problem and report on further experience gathered from implementing the solution. More specifically, the current study develops on Mookerjee et al. (2012) by (1) finding (and mathematically characterizing) the optimal solution to the problem, (2) demonstrating that the true value of the proposed rolling-horizon approach lies in its ability to cope with incomplete knowledge of the problem's parameters, (3) providing insights into the solution depending on the publisher's volume of traffic and the stringency of the performance constraint imposed on the ad-network, and (4) presenting a head-to-head comparison of our implemented approach with the past practice at Chitika.

1.1. Background
In recent times, the Internet advertising revenue in the United States has increased significantly (Kumar 2016). In 2013, it increased by 17% over 2012 to reach $42.78 billion (Interactive Advertising Bureau 2014). This trend is expected to continue: eMarketer (2011) estimates that the Internet advertising revenue in the United States will reach $50 billion by 2015. Another estimate shows that the U.S. Internet advertising will reach $77 billion in 2016, and will comprise 35% of all advertising spending, overtaking television advertising (Hof 2011). Finally, Internet advertising is not
purely a U.S. phenomenon. In the United Kingdom, the online advertising revenues in 2013 increased by 15.2% over 2012 to reach almost £6.3 billion (Interactive Advertising Bureau UK 2014). According to a report by Digital TV Research, the global Internet advertising spending will reach $143 billion in 2017 (Kemp 2012).

Figure 1 describes the main players in the Internet advertising ecosystem. Other than the visitor, there are three main entities involved in Internet advertising: (1) the advertiser (whose advertisement is displayed), (2) the publisher (that provides the real estate where the advertisement is displayed), and (3) the advertising firm (usually referred to as an ad-network). The ad-network's business is one of monetizing the traffic for publishers (website owners, mobile application providers, etc.). An ad-network gathers data on visitor behavior from its large publisher network and uses this knowledge to target advertisements (hereafter also referred to as ads) to visitors on a variety of websites. Such firms collect ads from advertisers (or from firms that aggregate ads from many advertisers, namely, ad-aggregators) and display these ads to targeted visitors on different publisher sites. When a visitor clicks on an ad on a publisher's website, the advertiser pays the ad-aggregator a contractually agreed amount that corresponds to the cost-per-click for the advertiser (say, $1). A part of this revenue (say, $0.20) is retained by the ad-aggregator. The rest ($0.80) is paid to the ad-network. The ad-network shares a portion of this amount (say, $0.50) with the publisher and keeps the rest for itself ($0.30). If there is no click and the contract with the advertiser is on a cost-per-click basis, then no money exchanges hands.

Figure 1. (Color online) The main players in the Internet advertising ecosystem. [Diagram: ads flow from the demand side (advertisers) through the ad aggregator and the ad-network to the supply side (publishers) and their visitors.]

The attractiveness of the Internet as a medium for advertising arises, in large part, from the ability to track and measure the performance of an ad campaign. On the Internet, the display of an ad (or an impression) can be associated with a click or sometimes even a final business outcome such as a sale or a signup (or more generally, a conversion). As a result, performance-based pricing (where payment often depends on user clicks generated for an ad) is the leading ad pricing model, accounting for approximately 65% of the total ad revenues in 2013 (Interactive Advertising Bureau 2014, Atkinson 2014).

1.2. Problem and Motivation
Despite the obvious attractiveness (to advertisers) of a performance-based payment scheme (such as a cost-per-click model), in the end, it is the publisher that must bear the opportunity cost of an ad display that does not result in a click. To address this incentive misalignment, an emerging trend in the ad industry is to require the ad-network to manage ad display at a publisher's site such that an efficiency (or a click-through-rate) constraint is respected. Typically, the publisher and the ad-network enter into a contract. A publisher usually contracts with a single or a small number of ad-networks. This is because the more traffic an ad-network sees, the more accurately it can estimate important characteristics of the traffic. The better the knowledge of the traffic available to an ad-network, the better its ad targeting algorithms become, which leads to increased clicks. Publishers may contract with more than one ad-network to create some competition among these partners.

At the end of each month, the ad-network pays a publisher a (contractually agreed upon) fraction of revenue accrued from clicks that were generated on the ads displayed on the publisher's site. In addition, the publisher enforces a click-through-rate constraint on the ad-network, which requires that the monthly average click-through-rate (over the contract period that could be several months) is above a specified value. Such a constraint ensures that the publisher's space for ads on its website is used efficiently. If too many ads placed did not result in clicks, this is an indication of a wasted opportunity for the publisher. Either the publisher could have placed more relevant ads, or used the space for additional content. The click-through-rate constraint essentially balances two opposing goals of the publisher: (1) generate as much revenue as possible from ads, and (2) keep the website content interesting so that visitors continue to patronize the website. If a publisher becomes too greedy and shows an excessive number of ads, this could come at the expense of the content, the main driver of traffic to the website. In this case, the publisher would likely suffer in the long run. On the other hand, if a visitor clicks on an ad, it often indicates that she likes it. In such cases, the ad could be considered to be almost indistinguishable from content.

1.3. Overview of Solution
An ad-network usually manages multiple publishers. However, the problem for each publisher can be considered a separate problem as long as there is no constraint from the advertisers that links two publishers. In the problem setting considered in this paper, the advertisers do not impose any such constraint. Therefore, we analyze the problem for each publisher separately.

Essentially, any solution to the user profiling problem can only address the problem by some variant of an ad filter: using this filter, only some users are selected to be shown ads. In past studies, this filter is also referred to as targeting (Gerken 2008, Goldfarb and Tucker 2011). The actionable information associated with a user includes the available details of the impression. Based on these details, a decision needs to be made to show or not to show an ad. One convenient way to use the impression details is to map them to a click-probability. That is, given these details, what is the probability of a click for the ad associated with the impression? Next, for this click-probability, should we show the ad? If the decision is to show an ad, the best ad is picked given what the ad-network knows about the impression's details and the visitor. In cases where the ad-network decides not to show an ad, there are usually two possibilities. First, the ad unit could simply collapse and the publisher could use the extra space for displaying additional content, or expand its display of existing content. Second, the ad-network could make a programmed call to another ad-network (designated by the publisher, e.g., Chitika could call Google) that could potentially use the available slot for displaying another ad.

The above discussion suggests the use of a threshold policy. This policy can be stated as follows: If the click-probability of an impression is greater than a threshold, then show the ad, otherwise do not show it. We later show that such a threshold policy is optimal. Figure 2 illustrates the overall solution process. The figure depicts that the solution to the user profiling problem consists of two parts: (1) a step involving data analytics (to predict the click-probability, p), and (2) a follow-up step involving decision analytics (to choose the threshold, τ).

With regard to the problem of choosing the correct threshold, the natural question arises: should the threshold be held constant over the planning horizon, or should the value of the threshold be varied depending on the current state of the problem, i.e., the impressions and clicks that have been observed so far and the time left in the planning horizon (typically, this horizon is one month)? Put differently, should the display criterion (the threshold) be relaxed or tightened, depending on how much time is left in the planning horizon and whether the current click-through-rate is
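Figure 2. Overall solution process. [Flow diagram: a visitor with attributes X is assigned an estimated click probability p = L(X); if p meets the threshold τ, the ad is shown and results in a click or no click; if p < τ, the ad is not shown.]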
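The decision step depicted in Figure 2 can be illustrated with a minimal Python sketch. This is an illustration under assumptions, not Chitika's implementation: the function names and the example values are hypothetical, and the comparison uses "meets or exceeds" the threshold.

```python
from typing import List


def should_show_ad(click_probability: float, threshold: float) -> bool:
    """Threshold policy: show the ad only if the estimated click
    probability meets or exceeds the threshold."""
    if not 0.0 <= click_probability <= 1.0:
        raise ValueError("click_probability must lie in [0, 1]")
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must lie in [0, 1]")
    return click_probability >= threshold


def visitors_shown_ads(click_probabilities: List[float], threshold: float) -> List[float]:
    """Return the click probabilities of the visitors who pass the ad filter."""
    return [p for p in click_probabilities if should_show_ad(p, threshold)]


if __name__ == "__main__":
    # Hypothetical estimated click probabilities for five arriving visitors.
    estimates = [0.01, 0.03, 0.20, 0.005, 0.08]
    print(visitors_shown_ads(estimates, threshold=0.03))
```

Raising the threshold in this sketch shrinks the set of visitors who are shown ads, which is the revenue versus click-through-rate tension discussed in the text.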
above or below the target level that has to be achieved at the end of the month (on average)? If we are ahead (i.e., the current click-through-rate is above the target value), then should the threshold be lowered (so more ads can be shown) to potentially earn more ad revenue? On the other hand, if we are behind (i.e., the current click-through-rate is below the target value), then should the threshold be increased (so that ads are only shown to more interested visitors), in an attempt to meet the click-through-rate constraint? Increasing the threshold sacrifices ad revenue, but we may need to take this action to achieve the target click-through-rate at the end of the planning horizon.

1.4. Genesis of the Problem
The study in this paper is motivated by the user profiling problem at the Internet ad-network, Chitika (www.chitika.com). In early 2010, in an effort to grow its publisher base and to attract highly visible publishers (e.g., The Wall Street Journal), Chitika launched Chitika Premium, a new service offering publishers an innovative value proposition. In Chitika Premium, a publisher could (for the first time) control the average click-through-rate of Chitika's ads. To solve this problem, it was necessary to find a way to respect a publisher's click-through-rate constraint while collecting as much revenue as possible for Chitika and the publisher. More generally, it was necessary for Chitika to come up with a way to target only those users with ads if they had a reasonable chance of clicking on them.

1.5. Contributions of the Study
The main contribution of this study is to provide an approach to manage an Internet ad campaign that substantially improves the number of clicks and the revenue earned from clicks. Past research on managing ad campaigns has focused on solving the problem of choosing the best ad to a given visitor. The implicit assumption in most extant studies on campaign management is that ads are shown to every arriving visitor. Given the emphasis on efficiency (namely, the click-through-rate constraint), our approach decides when not to show an ad to a visitor. This filtering of ads must occur in a manner to balance the opposing goals of revenue and efficiency. If too many ads are filtered, the efficiency constraint would be met (or even exceeded), but opportunities to generate clicks would likely be lost. On the other hand, if not enough filtering is done, more clicks would likely be generated at the cost of not meeting the target efficiency constraint. A key contribution of this paper is to characterize the structure of the optimal solution to the problem of maximizing revenue subject to a click-through-rate constraint. Here, we study and contrast two competing solutions: (1) a static solution, and (2) a rolling-horizon solution. The static solution is shown to be optimal when accurate information on the input parameters to the problem is known. However, when the parameters to the model can only be estimated with some error, the rolling-horizon solution can perform better than the static solution. When using the rolling-horizon solution, it becomes important to choose the appropriate re-solving frequency. The solution approach proposed in this paper has been implemented in Chitika. The implementation details of the solution (together with the revenue impact on the company) have been presented.

2. Related Work
We briefly review related work in three areas: (1) optimization of Internet ad campaigns, (2) consumer choice models, and (3) multistage decision making.

2.1. Optimization of Internet Ad Campaigns
Several studies in this stream have focused on the display of ads on a variety of platforms (e.g., websites, mobile phones, Internet-enabled game consoles, etc.) to optimize a goal (e.g., clicks, revenue, etc.) over a given planning horizon. In addition to incorporating the characteristics of the ads to be displayed (such as size, location, etc.) and the challenges associated with fitting a set of ads in the available display space, ad-servers usually consider advertiser constraints that arise from ad saturation and competition between ads (Kumar et al. 2007, Turner et al. 2011, Turner 2012). In previous studies, the underlying optimization problem is cast as one of scheduling ads over a planning horizon while respecting a set of constraints that reflect advertiser interests (Dawande et al. 2003, 2005; Kumar et al. 2006). Ghosh et al. (2009b) design a bidding agent that acquires a given number of impressions for the advertiser with a given target budget. Unlike these studies, our problem is motivated from the joint perspective of the publisher and the ad-network. While the guarantees provided by the models in these studies are directed toward advertisers, our problem requires that we provide a guarantee to the publisher, namely, one of meeting or exceeding a specified click-through-rate constraint.

Some recent studies have also examined the problem from the perspective of publishers. For example, Najafi-Asadolahi and Fridgeirsdottir (2014) present a revenue model for the publisher to optimize the pricing of display ads. Further, Balseiro et al. (2014) analyze a trade-off for the publisher between the short-term revenue from an ad exchange and the long-term benefits of delivering good quality impressions to the advertisers. Ghosh et al. (2009a) model the publisher's problem of fulfilling guaranteed advance contracts by bidding in the spot market. Finally, Balseiro et al. (2015) study the competitive landscape that arises in ad exchanges and the implications for publishers'
decisions. Although these studies are related to our research, their focus is clearly different.

In addition to the practical issues concerning ad scheduling addressed in previous work, there are a growing number of patents that deal with inventions to measure the effectiveness of an ad campaign (Harvey et al. 2010, Gerken 2008, Lindsay et al. 2010, Srinivasan and Shamos 2010). Finally, beyond the research on optimizing Internet ad campaigns, there are numerous examples of academic studies that examine Internet advertising at a more microlevel: (1) impact of ad position on profitability (Agarwal et al. 2011), (2) targeting strategies, including privacy concerns (Evans 2009, Goldfarb and Tucker 2011), and (3) wear-in, wear-out effects of Internet ads (Chatterjee et al. 2003).

2.2. Consumer Choice Models
There is a vast body of work in the marketing area that deals with predicting a consumer's choice among a set of discrete alternatives. The predictive technique we employ in this study draws from the stream of literature on consumer choice models. Unlike previous studies, we employ logistic regression for predicting, in real time, a visitor's probability of a click. The reason that prediction must occur in real time is that a real-time decision (to show or not to show an ad) is contingent on this prediction. In previous studies, the use of logit is usually made to infer consumer trends (e.g., how display ads affect subsequent choices consumers make to consume brand-specific content (Bucklin and Sismeiro 2009)). The key difference in real-time prediction arises, of course, from the fact that the prediction must be done quickly. This time constraint naturally limits the set of independent variables that real-time predictive models can employ. We will discuss such limitations later in Section 6 where we describe actual implementation details.

2.3. Multistage Decision Making
A central feature of the problem in our study is that we need to manage an ongoing ad campaign, i.e., one that evolves over time. The two problem-solving approaches we propose in this paper (a static model and a rolling-horizon model) are inspired by models of multistage decision making that are often solved using dynamic programming techniques (Baldacci et al. 2013, Hwang et al. 2013, Lai et al. 2010, Moallemi and Saglam 2013). Typically, in past research, rolling-horizon solutions are considered to be a good way to solve complex, multistage optimization problems where analytical solutions prove to be intractable. On the other hand, we use the rolling-horizon approach as a way to cope with the presence of incomplete (inaccurate) knowledge of the problem's parameters. Although some of the past studies have used the rolling-horizon approach in a similar manner (e.g., Ghosh et al. 2009b, Besbes and Maglaras 2012, Wang et al. 2014), both our context and implementation are different from those studies.

As mentioned earlier, this paper builds on a preliminary investigation of the problem where a rolling-horizon solution is provided as a heuristic (Mookerjee et al. 2012). The current study contributes by finding (and mathematically characterizing) the optimal solution to the problem. In the current study, we also demonstrate that the true value of the rolling-horizon approach lies in its ability to cope with incomplete knowledge of the problem's parameters. We also provide insights into the manner in which the rolling-horizon approach should best be employed; the key is to carefully choose the update frequency of this approach. Essentially, in the presence of noisy parameters, there is a subtle trade-off between two effects that occur in the rolling-horizon solution: (1) the positive effect of being able to react to the actual state of the problem (and hence, being able to better cope with noise), (2) the negative force that arises because this solution departs from the optimal policy that is shown to be static. The positive force is more influential when the problem parameters are sufficiently inaccurate, but the negative force hurts the rolling-horizon approach if the problem parameters are known more accurately. To present a head-to-head comparison of our implemented approach with the past practice at Chitika, we implemented our solution in parallel to the past practice. The results show that our approach improves the number of clicks significantly.

3. Model and Solution
3.1. Description of the Data Analytic Model
Here we describe how the probability of a click and the distribution of these probabilities in the visitor population for a publisher is estimated.

3.1.1. Predicting the Probability of a Click. The prediction method uses a vector of observations collected from the visitor's cookie as well as metadata available from the HTTP header (Kolluri et al. 2013). These observations include variables such as the visitor's search string, Internet browser, operating system, previous click data, and so on. We use logistic regression (logit) to combine information associated with the visitor with other information about the publisher's website. One of the major strengths of an ad-network lies in its ability to develop a large publisher network. This enables the firm to have repeated interactions with visitors across different websites, thus enabling the firm to develop a profile of each visitor. The logit model uses this profile information to estimate the chances that a visitor will click on a given ad. This model can be expressed as a function p = L(X), where p is the estimated click-probability, and X is the vector of variables
Table 1. Important variables in the logit model.
Variable | Description | Possible values
operating_system | Which operating system is being used by the visitor? | Linux, Android, Mac OS, Microsoft Windows, etc.
browser | Which browser is being used by the visitor? | Internet Explorer, Firefox, Chrome, Safari, Opera, etc.
search_engine | Which search engine is used by the visitor? | Google, Yahoo, Bing, etc.
screen_resolution | Pixel density in visitor's screen | One among a few possibilities
bad_speller | Is the visitor a bad speller? | Yes, No
search-string_type | Does the search string have local intent? | Yes, No
keyword_interest | Total number of clicks by this visitor in the past for ads on similar search strings | Integer
domain_keyword_interest | Total number of clicks by all the visitors in the past for ads on similar search strings | Integer
ad_keyword_fit | Match between a given ad and keyword | Numeric
day | Day of the visit | Monday, Tuesday, ..., Sunday
time | Time of the visit | Morning, Afternoon, Evening, Night
height | Height of the ad unit | Numeric
width | Width of the ad unit | Numeric
loc_x | x-location of the ad | Numeric
loc_y | y-location of the ad | Numeric
user_clicks | Total number of clicks by this visitor in the past | Integer
user_imps | Total number of impressions for this visitor in the past | Integer
CLICK (dependent variable) | Did the visitor click on the ad shown? | Yes, No
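To illustrate how variables such as those in Table 1 could enter a logit model, the following Python sketch maps a small feature vector to a click probability through the logistic function. The intercept and coefficients are hypothetical values chosen only for illustration; they are not estimates from the paper, and the actual model uses over 50 variables.

```python
import math
from typing import Dict

# Hypothetical intercept and coefficients, for illustration only.
INTERCEPT = -4.0
COEFFICIENTS: Dict[str, float] = {
    "bad_speller": 0.3,        # 1.0 if the visitor is a bad speller, else 0.0
    "ad_keyword_fit": 1.5,     # numeric match between ad and keyword
    "keyword_interest": 0.2,   # past clicks on ads for similar search strings
}


def click_probability(features: Dict[str, float]) -> float:
    """Logit prediction: p = 1 / (1 + exp(-(intercept + sum_j b_j * x_j))).

    Features missing from the input are treated as 0.0 (an assumption of
    this sketch, not of the paper).
    """
    z = INTERCEPT + sum(
        coef * features.get(name, 0.0) for name, coef in COEFFICIENTS.items()
    )
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    visitor = {"bad_speller": 0.0, "ad_keyword_fit": 0.8, "keyword_interest": 2.0}
    print(f"Estimated click probability: {click_probability(visitor):.4f}")
```

Because scoring a visitor is a single weighted sum followed by one exponential, a model of this form is cheap enough to evaluate at the moment a visitor arrives.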
used for the prediction. The model uses over 50 different variables; some of the important ones are listed in Table 1.

3.1.2. The Click-Probability Distribution. Using the above logit model, the value of $p$ for any given visitor to the publisher's site can be estimated. We can estimate the click-probability distribution for a given publisher using a sample of these probabilities. The click-probability distribution provides us with vital information. As we show later, the optimal policy requires us to show an ad to a visitor only if the click-probability ($p$) meets or exceeds a given threshold value (say, $\tau$). That is, show the ad only if $p \ge \tau$, where $0 \le \tau \le 1$. For any given threshold, the click-probability distribution allows us to estimate the probability that an ad will be shown to a randomly selected visitor and the conditional probability that a visitor will click on an ad given that the click-probability exceeds the threshold. Conceptually, the probability that an ad will be shown is the upper tail of the distribution (i.e., the area under the density curve above $\tau$), and the click-probability that a visitor will click on an ad is the conditional expectation of the upper tail of the distribution.

Let $f(p)$ denote the probability density of $p$ and $\alpha(\tau)$ denote the probability that an ad is shown to a visitor for a given threshold $\tau$. Then, $\alpha(\tau)$ can be calculated as

$$\alpha(\tau) = \int_{\tau}^{1} f(p)\,dp. \tag{1}$$

It is clear from the expression for $\alpha(\tau)$ that the probability of showing an impression decreases with the threshold, $\tau$. Let $\beta(\tau)$ denote the conditional probability that a visitor will click on an ad, given that the click-probability exceeds the threshold $\tau$. This probability is given by

$$\beta(\tau) = \frac{\int_{\tau}^{1} p\,f(p)\,dp}{\alpha(\tau)}. \tag{2}$$

Now, in the following propositions, we present two important properties of $\beta(\tau)$. The proofs are provided in the electronic companion.

Proposition 1. The conditional probability of a click (i.e., $\beta(\tau)$) increases with $\tau$, i.e., $d\beta(\tau)/d\tau \ge 0$, $\forall\, \tau < 1$.

Proposition 2. A threshold of $\tau$, $\tau \in [0, 1)$, ensures that the click-through-rate will be greater than $\tau$ (i.e., $\beta(\tau) \ge \tau$).

The result in Proposition 1 highlights the trade-off between the revenue and the click-through-rate. As $\tau$ increases, the probability of showing an impression (i.e., $\alpha(\tau)$) decreases. Therefore, the expected number of clicks (hence, the revenue of the ad-network) also decreases with $\tau$. However, as shown in Proposition 1, the conditional probability of a click increases with $\tau$. Hence, as $\tau$ increases, the click-through-rate increases. Proposition 2 provides a clue to find the threshold needed to achieve a certain click-through-rate.

3.2. Description of the Decision Analytic Model
Table 2 summarizes the main notation in the decision analytic model. We divide the planning horizon into $K$
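The impression probability of Equation (1) and the conditional click probability of Equation (2) can be estimated from a sample of predicted click probabilities. The Python sketch below uses empirical frequencies in place of the integrals and then searches for the smallest sample value that, used as a threshold, meets a target click-through-rate. The function names and the sample values are hypothetical, and the search relies on the monotonicity stated in Proposition 1; it is an illustration, not the authors' procedure.

```python
from typing import List, Optional


def impression_probability(sample: List[float], threshold: float) -> float:
    """Empirical analogue of Equation (1): share of visitors with p >= threshold."""
    if not sample:
        raise ValueError("sample must be non-empty")
    return sum(1 for p in sample if p >= threshold) / len(sample)


def conditional_click_probability(sample: List[float], threshold: float) -> float:
    """Empirical analogue of Equation (2): mean of p among visitors with p >= threshold."""
    shown = [p for p in sample if p >= threshold]
    if not shown:
        raise ValueError("no visitor meets the threshold")
    return sum(shown) / len(shown)


def smallest_threshold_meeting_ctr(sample: List[float], target_ctr: float) -> Optional[float]:
    """Scan candidate thresholds in increasing order and return the first one
    whose conditional click probability meets the target. Returns None if
    even the highest candidate falls short."""
    for candidate in sorted(set(sample)):
        if conditional_click_probability(sample, candidate) >= target_ctr:
            return candidate
    return None


if __name__ == "__main__":
    # Hypothetical predicted click probabilities for a publisher's visitors.
    probs = [0.01, 0.02, 0.03, 0.10, 0.20]
    tau = smallest_threshold_meeting_ctr(probs, target_ctr=0.10)
    print(tau, impression_probability(probs, tau))
```

Scanning candidates from low to high returns the least restrictive threshold that satisfies the target, so as many visitors as possible are still shown ads.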
periods where each period corresponds to an arrival of a visitor to the publisher's website. For each arrival, we need to decide whether or not to show an ad. As discussed earlier, the objective is to maximize the expected number of clicks subject to the following click-through-rate constraint:

$$\mathbb{E}\!\left[\frac{\tilde{r}}{\tilde{m}}\right] \ge \omega. \tag{3}$$

In Equation (3), $\tilde{r}$ (respectively, $\tilde{m}$) represents the random variable for the number of clicks (respectively, impressions) that arise from an underlying click-probability distribution $f(p)$ and a given policy to show ads. An analytical expression for this constraint is difficult to obtain. However, it is possible to approximate the constraint very accurately using the ratio of expectations, rather than finding the expectation of the ratio. We present this result below.

Proposition 3. As the number of arrivals (i.e., K) approaches $\infty$, $\mathbb{E}[\tilde{r}/\tilde{m}]$ approaches $\mathbb{E}[\tilde{r}]/\mathbb{E}[\tilde{m}]$.

Table 2. Model parameters and variables.

Symbol              Definition
K                   Number of periods in the planning horizon. Depending on the situation being considered, each period could correspond to the arrival of one visitor or a certain number of visitors
$\tilde{m}$         Random variable for the number of impressions in the planning horizon
$\tilde{r}$         Random variable for the number of clicks in the planning horizon
$\tilde{p}$, $p$    Random variable for the click-probability, realization of the click probability
$f(p)$              Density function for the click-probability
$\tau$              Click-probability threshold in a static policy
$\phi(\tau)$        Probability of impression for threshold $\tau$
$\rho(\tau)$        Conditional probability of click, given that $p \ge \tau$
$\omega$            Publisher's click-through-rate constraint

We therefore consider the revised optimization problem with the left-hand side of Constraint (3) suitably replaced. Using this revised constraint, the optimization problem can be developed as follows.

Let $I[p, S_i^I]$ be the 1/0 (for show ad and not show ad, respectively) decision variable associated with the ith arrival, under the policy I. Here, $S_i^I$ is the state of the system (a vector) just before the ith arrival. The state represents the entire history of events (impression and click events at each prior arrival) that have occurred under the policy I, $i = 1, 2, \ldots, K$. Further, $p$ represents a realization from the distribution $f(p)$, the click-probability of an arriving visitor. Note that since the click-probability distribution $f(p)$ is the same for all arrivals, the realization $p_i$ (for the ith arrival) can simply be written as $p$. Then, we have

$$\mathbb{E}[\tilde{r}] = \mathbb{E}_{\tilde{p},\tilde{S}^I}\!\left[\sum_{i=1}^{K} I[\tilde{p}, \tilde{S}_i^I]\,\tilde{p}\right], \quad\text{and}\quad \mathbb{E}[\tilde{m}] = \mathbb{E}_{\tilde{p},\tilde{S}^I}\!\left[\sum_{i=1}^{K} I[\tilde{p}, \tilde{S}_i^I]\right].$$

In the above, the expectation is over the random click probability $\tilde{p}$ and the random state matrix $\tilde{S}^I$. Row i of this matrix is a random vector that can take values represented as a vector of states, $S_i^I$, $i = 1, 2, \ldots, K$. The optimization problem can now be formally expressed as

Problem P

$$\max_{I}\; \mathbb{E}_{\tilde{p},\tilde{S}^I}\!\left[\sum_{i=1}^{K} I[\tilde{p}, \tilde{S}_i^I]\,\tilde{p}\right],$$

$$\text{s.t.}\quad \mathbb{E}_{\tilde{p},\tilde{S}^I}\!\left[\sum_{i=1}^{K} I[\tilde{p}, \tilde{S}_i^I]\,\tilde{p}\right] - \omega\,\mathbb{E}_{\tilde{p},\tilde{S}^I}\!\left[\sum_{i=1}^{K} I[\tilde{p}, \tilde{S}_i^I]\right] \ge 0. \tag{4}$$

3.3. Solution
The Lagrange dual function for this problem can be written as

$$L(\lambda) = \max_{I}\left\{ \mathbb{E}_{\tilde{p},\tilde{S}^I}\!\left[\sum_{i=1}^{K} I[\tilde{p}, \tilde{S}_i^I]\,\tilde{p}\right] + \lambda\left( \mathbb{E}_{\tilde{p},\tilde{S}^I}\!\left[\sum_{i=1}^{K} I[\tilde{p}, \tilde{S}_i^I]\,\tilde{p}\right] - \omega\,\mathbb{E}_{\tilde{p},\tilde{S}^I}\!\left[\sum_{i=1}^{K} I[\tilde{p}, \tilde{S}_i^I]\right] \right) \right\}, \tag{5}$$

where $\lambda$ ($\lambda \ge 0$) is the Lagrange multiplier. In the above, I represents a policy to decide the show/no-show decisions. Note that the dual function $L(\lambda)$ is an upper bound for the optimal solution to Problem P. If $\lambda = 0$, then the objective functions for the dual function and Problem P are the same, but the constraint can be ignored for the dual function. If $\lambda > 0$, the value of the dual function for any feasible solution for Problem P will be greater than or equal to the value of the objective in Problem P. The dual function can be rewritten as

$$L(\lambda) = \max_{I}\; \mathbb{E}_{\tilde{p},\tilde{S}^I}\!\left[\sum_{i=1}^{K} I[\tilde{p}, \tilde{S}_i^I]\,\bigl(\tilde{p}(1+\lambda) - \lambda\omega\bigr)\right].$$

Let $I^*$ be an optimizer of $L(\lambda)$, assuming it exists. Later, we demonstrate its existence. Thus,

$$L(\lambda) = \mathbb{E}_{\tilde{p},\tilde{S}^{I^*}}\!\left[\sum_{i=1}^{K} I^*[\tilde{p}, \tilde{S}_i^{I^*}]\,\bigl(\tilde{p}(1+\lambda) - \lambda\omega\bigr)\right].$$

If we take the maximization step inside the summation, we will get an upper bound for $L(\lambda)$ because we are
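maximizing each component of this expression. Hence, we get

$$L(\lambda) \le \mathbb{E}_{\tilde{p},\tilde{S}^{I^*}}\!\left[\sum_{i=1}^{K} \max_{I_i}\, I_i[\tilde{p}, \tilde{S}_i^{I^*}]\,\bigl(\tilde{p}(1+\lambda) - \lambda\omega\bigr)\right].$$

In the above, we are allowing ourselves to choose the best policy $I_i$, associated with the ith arrival. This is a relaxation because although each component $I_i$ is being individually maximized, we are still taking an expectation over the matrix $\tilde{S}^{I^*}$, corresponding to the optimal policy. To maximize each component of the summation in the above, set $I_i[\tilde{p}, \tilde{S}_i^{I^*}] = 1$ if $\tilde{p}(1+\lambda) - \lambda\omega \ge 0$, otherwise, set $I_i[\tilde{p}, \tilde{S}_i^{I^*}] = 0$. Note that $I_i[\tilde{p}, \tilde{S}_i^{I^*}]$ does not depend on the state $\tilde{S}_i^{I^*}$, $\forall i$. This policy provides an upper bound for $L(\lambda)$. Substituting this policy, we get

$$L(\lambda) \le \sum_{i=1}^{K} \mathbb{E}_{\tilde{p}}\bigl[(\tilde{p}(1+\lambda) - \lambda\omega)\bigr]^{+}.$$

In words, the expression inside the summation in the above inequality is the expectation (over $\tilde{p}$) of $(\tilde{p}(1+\lambda) - \lambda\omega)$, in the region where this quantity is greater than or equal to zero. The random variable $\tilde{p}$ represents the click-probability of any given arrival, and follows the distribution $f(p)$. The inequality $(\tilde{p}(1+\lambda) - \lambda\omega) \ge 0$ implies that $\tilde{p}$ is greater than or equal to a threshold, i.e., $\tilde{p} \ge \lambda\omega/(1+\lambda)$. Denote this threshold as $\tau$ ($= \lambda\omega/(1+\lambda)$). In the range $\lambda \ge 0$, $\tau \in [0, \omega)$. Hence, we get

$$\mathbb{E}_{\tilde{p}}\bigl[(\tilde{p}(1+\lambda) - \lambda\omega)\bigr]^{+} = \int_{\tau}^{1} \bigl(p(1+\lambda) - \lambda\omega\bigr) f(p)\,dp = \frac{\omega\,\phi(\tau)}{\omega - \tau}\,\bigl(\rho(\tau) - \tau\bigr),$$

where the last step substitutes $\lambda = \tau/(\omega - \tau)$. Thus,

$$L(\lambda) \le \frac{K\omega\,\phi(\tau)}{\omega - \tau}\,\bigl(\rho(\tau) - \tau\bigr).$$

We know that an upper bound for the optimal solution to Problem P is

$$\min_{\lambda \ge 0} L(\lambda).$$

We can rewrite it as

$$\min_{\lambda \ge 0} L(\lambda) \le \min_{\tau \in [0, \omega)} \frac{K\omega\,\phi(\tau)}{\omega - \tau}\,\bigl(\rho(\tau) - \tau\bigr).$$

If we drop the constant K, in order to minimize the expression on the right-hand side, we need to find the minimum value of

$$\psi(\tau) = \frac{\omega\,\phi(\tau)}{\omega - \tau}\,\bigl(\rho(\tau) - \tau\bigr), \quad \tau \in [0, \omega).$$

Using the definitions of $\phi(\tau)$ and $\rho(\tau)$, we can write

$$\psi(\tau) = \frac{\omega}{\omega - \tau}\left[\int_{\tau}^{1} p f(p)\,dp - \tau \int_{\tau}^{1} f(p)\,dp\right].$$

It can be shown that the derivative of $\psi(\tau)$ with respect to $\tau$ is

$$\psi'(\tau) = \frac{\omega\,\phi(\tau)}{(\omega - \tau)^2}\,\bigl(\rho(\tau) - \omega\bigr).$$

Observe that the sign of this derivative depends only on the sign of the term $(\rho(\tau) - \omega)$. There are two cases to consider. For both of these cases, we will first assume that $\omega < 1$. Before presenting the cases, we first provide the following lemma.

Lemma 1. For a threshold policy (i.e., show ad if $p \ge \tau$), Problem P can be written as

$$\max_{\tau}\; K\phi(\tau)\rho(\tau),$$

$$\text{s.t.}\quad K\phi(\tau)\bigl(\rho(\tau) - \omega\bigr) \ge 0.$$

Case 1. $\rho(0) < \omega$
We need to find a solution for $\rho(\tau) - \omega = 0$, $\tau \in [0, \omega)$. From Proposition 2, we know that for any $\tau < 1$, $\rho(\tau) > \tau$. Hence, $\rho(\omega) > \omega$. Thus, there is a solution $\tau^*$ ($\rho(\tau^*) = \omega$) and it is unique since $\rho(\tau)$ increases in $\tau$. Also, note that if $\rho(\tau) < \omega$, $\psi(\tau)$ decreases in $\tau$. Thus, $\tau^*$ is a global and unique minimum.

To show that $\tau^*$ is optimal for Problem P, we first need to check that this solution is feasible for Problem P. As can be seen from Lemma 1, the solution $\tau^*$ ($\rho(\tau^*) = \omega$) is feasible for Problem P, i.e.,

$$K\phi(\tau^*)\bigl(\rho(\tau^*) - \omega\bigr) \ge 0.$$

In addition, we need to show that the objective function value of Problem P at $\tau^*$ is equal to $K\psi(\tau^*)$. For $\tau = \tau^*$, the objective function in Lemma 1 becomes $K\phi(\tau^*)\omega$. Also,

$$K\psi(\tau^*) = \frac{K\omega\,\phi(\tau^*)(\omega - \tau^*)}{\omega - \tau^*} = K\phi(\tau^*)\omega.$$

Therefore, $\tau^*$ is optimal for Problem P.

Case 2. $\rho(0) \ge \omega$
First, let $\rho(0) > \omega$. In this case, since $\psi'(\tau) > 0$, there is no solution for $\rho(\tau) = \omega$, in the range $\tau \in (0, \omega)$. Then, in this case, the minimizer $\tau^*$ should be set to 0 because $\psi(\tau)$ is increasing in $\tau$ if $\rho(0) > \omega$. In this case, the constraint presented in Lemma 1 is satisfied because $\phi(0) = 1$. Hence, $\tau^* = 0$ is feasible for Problem P. Also, the objective function in Lemma 1 becomes $K\phi(0)\rho(0)$, which is equal to

$$K\psi(0) = \frac{K\omega\,\phi(0)(\rho(0) - 0)}{\omega - 0} = K\phi(0)\rho(0).$$

Therefore, $\tau^* = 0$ is optimal for Problem P.

If $\rho(0) = \omega$, we also have $\tau^* = 0$. In a similar manner, it can be shown that this solution is optimal for Problem P.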
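As a rough numerical illustration of the two cases above, the following Python sketch finds a static threshold by bisection. It is only a sketch under stated assumptions, not the authors' implementation: the names `rho`, `static_threshold`, and `beta_2_50` are hypothetical, `omega` stands for the click-through-rate constraint and `tau` for the threshold, the density is an unnormalized Beta(2, 50) shape chosen purely for illustration, and the integrals are approximated with a midpoint rule.

```python
from typing import Callable

Density = Callable[[float], float]


def rho(density: Density, tau: float, n: int = 4000) -> float:
    """Midpoint-rule estimate of E[p | p >= tau] on [tau, 1].

    The density may be unnormalized because the constant cancels in the ratio.
    """
    h = (1.0 - tau) / n
    mass = 0.0    # approximates the integral of f(p) over [tau, 1]
    p_mass = 0.0  # approximates the integral of p * f(p) over [tau, 1]
    for i in range(n):
        p = tau + (i + 0.5) * h
        fp = density(p)
        mass += fp * h
        p_mass += p * fp * h
    return p_mass / mass


def static_threshold(density: Density, omega: float, iters: int = 60) -> float:
    """Threshold for 0 < omega < 1, following the two cases in the text."""
    if not 0.0 < omega < 1.0:
        raise ValueError("this sketch only covers 0 < omega < 1")
    # Case 2: showing every ad already meets the constraint.
    if rho(density, 0.0) >= omega:
        return 0.0
    # Case 1: rho(0) < omega and rho(omega) > omega, so bisect on [0, omega].
    lo, hi = 0.0, omega
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if rho(density, mid) < omega:
            lo = mid
        else:
            hi = mid
    return hi


def beta_2_50(p: float) -> float:
    """Illustrative (assumed) unnormalized Beta(2, 50) density; its mean is 2/52."""
    return p * (1.0 - p) ** 49
```

For instance, `static_threshold(beta_2_50, 0.01)` falls under Case 2 because the mean of this illustrative density (about 0.038) exceeds 0.01, whereas `static_threshold(beta_2_50, 0.06)` bisects for the point at which `rho` equals 0.06.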
Finally, if $\omega = 1$, then the only feasible solution is to set $\tau = 1$, implying that no visitor is shown ads. However, the constraint in Lemma 1 will be satisfied since $\phi(1) = 0$. Also, when $\tau = 1$, both the objective function in Lemma 1 and $K\psi(\tau)$ are equal to zero. Hence, this solution is optimal for Problem P.

Based on the above discussion, we present the following proposition.

Proposition 4. The optimal solution to Problem P is the following:
(a) If $\omega < 1$ and $\rho(0) \ge \omega$, $\tau^* = 0$.
(b) If $\omega < 1$ and $\rho(0) < \omega$, then $\tau^*$ is the solution of $\rho(\tau) = \omega$.
(c) If $\omega = 1$, $\tau^* = 1$.

Observe that the above proposition states that it is optimal to use a static, threshold policy. That is, a threshold policy is optimal and the threshold value is the same for different states. In the electronic companion, we provide the intuition for this result using a simple example of a problem with exactly two arrivals. Finally, we present the following remark.

Remark 1. An alternate way to express the constraint in Problem P is to require that the probability of obtaining a required click-through-rate is more than a certain specified value. With this revised constraint, a static threshold policy is no longer optimal.

The intuition for the above remark is as follows. In any realization of a problem, at the end, the required click-through-rate is achieved or not achieved. With the expectation constraint, the click-through-rate values achieved across different realizations of the problem matter, since the constraint requires that the click-through-rate, on average, must exceed a given threshold. On the other hand, with a probabilistic constraint, the only thing that matters in any realization is whether the click-through-rate is achieved or not. Thus, during the course of the problem (i.e., at any intermediate state of the problem), if it becomes clear that the click-through-rate constraint will surely be met irrespective of future events, the threshold should immediately be set to zero. This suggests that the optimal policy will change the threshold depending on the problem state, i.e., a static threshold policy is no longer optimal.

4. Impact of Inaccurate Problem Parameters
We now examine the impact of inaccurate problem parameters on the solution presented in Proposition 4. The inaccuracy concerns the estimation of the click-probability distribution $f(p)$. Specifically, we focus our attention on the impact of inaccurately estimating the parameters of this distribution. We consider that the parameters of the logit model are not subject to estimation error. In other words, we focus on the uncertainty of the number of arrivals of each type, not the uncertainty due to misspecifying our logit model. This allows us to not worry about censoring, i.e., when an arrival comes, we can use our logit model to estimate the click-probability for that arrival accurately.

With inaccurate parameters, let us assume that the true functions $\phi$ and $\rho$ are wrongly estimated as $\hat{\phi}$ and $\hat{\rho}$, respectively. To address inaccurate parameters, we propose a rolling-horizon approach. In a rolling-horizon approach, Problem P (presented in Section 3.2) is re-solved at certain points during the planning horizon. Formally, after s arrivals (or $(K - s)$ remaining arrivals), $s \in \{1, 2, \ldots, K - 1\}$, given that the impressions and the clicks that have occurred are $m_s$ and $r_s$, respectively, we need to maximize the expected number of clicks in the $(K - s)$ remaining arrivals. The constraint requires that $r_s$ plus the expected number of clicks in the remaining arrivals is greater than $\omega$ times $m_s$ plus the expected number of impressions in the remaining $(K - s)$ arrivals. Then, as shown earlier, the threshold that should be used for the $(s + 1)$th arrival is given by the solution of

$$\frac{r_s + (K - s)\,\hat{\phi}(\tau)\,\hat{\rho}(\tau)}{m_s + (K - s)\,\hat{\phi}(\tau)} = \omega. \tag{6}$$

The rolling-horizon approach can be expected to cope well in the presence of inaccurate model parameters because it uses actual state information to update the threshold in each period. We explain the basic idea behind varying the thresholds as follows. Imagine that we are at the beginning of the planning horizon, i.e., the first period of the planning horizon. We set the threshold at the smallest level such that the target click-through-rate level should just be achieved at the end of the month (on average). We keep this threshold value constant for the current period. At the end of the period, we collect data on the number of impressions and the number of clicks. The current click-through-rate value is the number of clicks divided by the number of impressions. If this value is better than the required click-through-rate at the end of the month, we can afford to lower the threshold and show more ads. Conversely, if the current click-through-rate value is below the required click-through-rate, we must set the next period's threshold higher. At the beginning of each period, we set the threshold to a value that, if kept constant for the rest of the planning horizon, would just achieve, on an expected basis, the revised target threshold for the rest of the planning horizon.

In Figure 3, we consider a simple example to illustrate how the rolling-horizon approach works. Imagine that there are three days in the planning horizon and that the value of the threshold ($\tau$) is updated on
a daily basis. Let the click-through-rate constraint be 0.01. At the beginning of the first day, we find the lowest value of $\tau$ that (if held constant) would just achieve the required click-through-rate. Suppose this value is $\tau_1$. We set the threshold to $\tau_1$ for the first day. At the end of the first day, we observe the number of clicks (say $r_1$) and the number of impressions (say $m_1$). We next use these values ($r_1$ and $m_1$) to find the lowest value of $\tau$ that, if held constant for the remaining two days, would just achieve the required click-through-rate of 0.01. Suppose this value is $\tau_2$. Intuitively, this value should be higher than $\tau_1$ if the click-through-rate in the first day was below the required target (0.01), otherwise it should be lower. After considering the fact that we are behind (ahead) on the constraint, we set the value of $\tau$ to $\tau_2$ for the second day. Similarly, at the beginning of the third day, we use the actual clicks and impressions that have occurred so far to calculate the lowest value of $\tau$ needed to achieve the click-through-rate constraint (say, $\tau_3$). For this problem, the static approach would have used a constant threshold value ($\tau_1$) for the entire duration, that is, the threshold value used by the rolling-horizon approach in the first period.

Figure 3. The mechanics of the rolling-horizon approach.

                      Day 1       Day 2       Day 3
Impressions           $m_0$       $m_1$       $m_2$
Clicks                $r_0$       $r_1$       $r_2$
Decision variables    $\tau_1$    $\tau_2$    $\tau_3$

The rolling-horizon approach has the advantage that it can increase or reduce the threshold to incorporate state information. In any particular state, we may be better or worse than what we need to be with respect to the target click-through-rate constraint. If we are doing better, we can afford to use a lower value of $\tau$. Conversely, if real-world feedback is such that we are behind on the constraint, we need to tighten (increase) the threshold to catch up. Although some of the past studies have used the rolling-horizon approach in a similar paper (e.g., Ghosh et al. 2009b, Besbes and Maglaras 2012, Wang et al. 2014), both our context and implementation are different from those studies.

5. Experimental Results
The primary goal of the numerical experiments is to provide answers to two questions that are of practical interest: (1) What update frequency should be used in the rolling-horizon approach? (2) In the presence of inaccurate problem parameters (or noise), which method (static or rolling-horizon) should be used? The first question essentially boils down to asking, within practical constraints, should the most frequent update frequency be used? The danger of updating too frequently (more importantly, if updating is based on the outcomes of a small number of arrivals) is that the threshold could be changed based upon an unlikely random draw (of impressions and clicks). Thus, in an attempt to cope with noise, an excessively frequent update policy could lead to poor decisions, and hurt overall performance. The second question may be especially relevant when we expect there to be a moderate or small amount of noise. In such situations, there is a tension between the benefits of using an optimal approach (i.e., static) versus the virtues of using a suboptimal one that is able to cope with noise. The details of the experimental setup and results are provided in Section EC.8 of the electronic companion. Below, we first summarize the experimental setup and results, and then present the final recommendations.

5.1. Summary of Experimental Setup and Results
For these experiments, we first estimate the click-probability distribution using the data from different publishers. We were able to estimate that the click-probability follows a Gamma distribution, with shape and scale parameters specialized for each publisher. The details are provided in Section EC.8.1 of the electronic companion. Then, in Section EC.8.2 of the electronic companion, we present the details of the experimental setup and some analytical results regarding the performance of the static approach in the presence of an incorrect shape parameter. In all the experiments, care is taken to ensure that the performance of the rolling-horizon approach indeed converges to its expected value before this value is compared with the performance of the static approach. Hence, in Section EC.8.3 of the electronic companion, we provide the details of the convergence procedure.

For some publishers that have relatively low traffic, the rolling horizon could suffer from having to update the threshold after a relatively small sample of arrivals (impressions, clicks). These updates may not be as reliable as the ones that were made with a large sample of impressions and clicks. Hence, in Section EC.8.4 of the electronic companion, we derive an expression for the update threshold that ensures that the number of arrivals is large enough such that the update decision is not based on an unlikely draw of random events. We refer to a publisher as a high-volume publisher if the total number of arrivals per month exceeds the update threshold for that publisher. On the other hand, if the monthly traffic is not sufficient to warrant at least one update, we refer to such a publisher as a low-volume one. In Sections EC.8.5 and EC.8.6 of the electronic companion, we present and discuss the results for high-volume and low-volume publishers, respectively.

At the end of each period (say, a month), the parameters of the click-probability distribution $f(p)$ could
potentially be updated. In Section EC.8.7 of the electronic companion, we present a Bayesian framework for this process. In the following subsection, we now present our final recommendations based on all the results presented in Section EC.8 of the electronic companion.

5.2. Final Recommendations
For low-volume publishers, the static approach is recommended to avoid the danger of updating based on small samples. Also, when $\omega$ is low, the static approach and the rolling-horizon approach have similar performance for any level of error. Hence, when $\omega \le kq$ (where k is the shape parameter and q is the scale parameter), it may be better to use the static approach to avoid the overhead of implementing an approach that attempts to update the threshold, but ends up not needing to since the value of $\omega$ is low. For high-volume publishers and high levels of $\omega$, the rolling-horizon approach outperforms the static approach for most error levels. In the case of overestimation errors, neither approach meets the click-through-rate constraint but the rolling-horizon approach comes closer to meeting this constraint. Since overestimation errors often lead to infeasible solutions, it is better to be conservative while estimating the parameters of the click-probability distribution for a new publisher.

To start off, for a new publisher, we use one month as a guess period. Beyond the first month, we estimate the model parameters using the past data. In the initial months for a new publisher, the rolling-horizon approach is appropriate, unless the click-through-rate constraint or the traffic volume is low. Then, if the click-probability distribution parameters for the publisher stabilize, the static approach can be used. Regardless of the approach used (static or rolling horizon), the different traffic parameters associated with a publisher (click-probability distribution parameters and traffic volume) are checked at the end of each month to see if these parameters have changed. If so, the publisher can be regarded as a new one (with a new set of parameters) from the standpoint of the approach that should be employed. The recommendations for a new publisher are summarized in Figure 4. We may consider updating the traffic parameters more frequently using a Bayesian learning process. However, more frequent updating of parameter values will create another level of noise. Hence, we choose to update traffic parameters only at the end of each month.

Figure 4. Final recommendations.

New publisher: be conservative in estimating click probability parameters.
    High volume publisher
        Low click-through-rate constraint ($\omega \le kq$): static approach.
        High click-through-rate constraint ($\omega > kq$): begin with rolling-horizon approach with appropriate update frequency. When the click probability parameters stabilize, switch to static approach.
    Low volume publisher: static approach.

6. Implementation Experience at Chitika
We begin this section with some background on Chitika's existing network architecture and some essential details of the ad data flow, i.e., the process of how an ad is served. Figure 5 shows the architecture and some details of the ad data flow. Currently, Chitika delivers ads using ad-servers in five data centers across the country. Each data center handles the ad traffic from a specific geographical region, e.g., Midwest, South, East, and so on. A geo-balancer placed at the head of the network ensures that ad-servers in the correct geographical region are contacted for ad delivery. Once the
ad-request goes to a data center in a particular geographical area, a load balancer within each data center ensures that no particular ad-server in the data center gets overloaded. There are several hundred ad-servers in each data center and each ad-server in a data center typically gets about 30–40 ad-requests per second. This request originates at a script that executes on the publisher's webpage at the time the page is being rendered for the incoming visitor.

Figure 5. (Color online) The ad-delivery architecture at Chitika.
[Diagram labels: geo-balancer; data centers 1 to 5, each with a load balancer, ad servers, and a log processor; Alpha; master log processor; rollup; decision analytic model; timing annotations of 1 second, 10 minutes, and 1 hour.]

6.1. The Ad Data Flow
When a visitor comes to a publisher's site, a request is made for an ad to one of the ad aggregators that partner with Chitika to serve ads. This ad aggregator returns the ad as an ad-unit (in about 200 milliseconds) from one of the many advertisers that it serves. This ad is rendered on the publisher's page. Sometimes, it is necessary to get an image associated with the ad. In such cases, the ad-server obtains the image by calling an Akamai (www.Akamai.com) edge-server that stores the image. After displaying the ad, a log of this impression is created. This log records detailed data on the specifics of the ad shown, time shown, and other information concerning the visitor. If the visitor clicks on the ad, the ad-server logs this event. It is important to note that the ad-server that records the impression may be different from the one that records the click (although both are likely to be in the same data center). Once every hour, all the ad-servers at a data center purge their logs to a local log processor at the data center. These local log processors (one for each data center), in turn, broadcast their contents to a master log processor that collects the entire log for the past hour. At the end of each hour, it takes about 10–15 minutes to collect the log data from each data center to get a consolidated picture of the events that transpired during the last hour. Thus at the end of each hour, we must process the raw log data to match impressions with clicks. The entire process, starting from the call to Chitika by a script on the publisher's page, to the rendering of the ad on the page typically takes less than 0.5 second. If the ad rendering takes more than a second, the impression is likely to be wasted, i.e., a click is unlikely to be generated for such a delayed ad.

6.2. Implementation Challenges
The above architecture poses several implementation challenges. With the new implementation, not all ad-requests result in an ad display. This is because an ad-server must first decide whether or not to call the ad partner for an ad. To make this decision, the ad-server calculates the probability of a click for that specific visitor and compares this value with the threshold for ad display for that publisher. If the click-probability is higher than the threshold, the ad-server makes a call to the ad partner to supply an ad. One of the main challenges is to decide how to implement the logic to calculate the value of the click-probability so that we can compare it against the threshold $\tau$. While the value of $\tau$ stays constant for a period (say, for a day), we need to calculate the value of the click-probability ($p$) in real time for every visitor. Recall that we need to calculate the click-probability using $p = L(X)$. This function can be quickly computed using the knowledge of X for every visitor.¹ A database lookup (for the historical information in X) is infeasible, given the time constraints. Fortunately, we can extract historical contents from the visitor's cookie and other (real-time) details from the http header and calculate the value of $p$ in about 50 milliseconds. Thus the evaluation of the rule $p > \tau$ is feasible given the constraints of the problem. However, we need to implement the entire logic at the front end, implying that every ad-server in Chitika's network must replicate this logic. For the rolling-horizon approach, we must consider the time taken for replication in deciding how frequently we wish to update the value of $\tau$. Note that to find a new value of $\tau$, we need to know the actual events (impressions and clicks) that occurred in the last period. As mentioned above, Chitika processes the logs of these events only once every hour. Thus it is not feasible to update the value of $\tau$ more than once every hour. However, Chitika engineers deemed updating more frequently than once a day impractical.

We also encountered another implementation issue concerning the click-through-rate constraint. As mentioned earlier, the threshold chosen in a given period depends on the click-through-rate constraint set by the publisher. In practice, we found that publishers were not very good at setting a reasonable value for this constraint. Clearly, the constraint has revenue implications. If the constraint is set too high, we will only show ads to interested visitors and, while the click-through-rate will likely be high, the total revenue from ads will suffer. To help publishers balance the trade-off between ad revenue and ad clutter, we provided publishers with a revenue slider. The revenue slider allows the publisher to slide a button and observe the revenue impact of a candidate value of the click-through-rate constraint. At each point along the slider (representing different values of the click-through-rate constraint), the slider application shows the expected revenue that the publisher will get at the end of a month, if the current value of the click-through-rate constraint is enforced. This revenue calculation uses the specific details of the publisher's traffic and the revenue sharing contract between Chitika and the publisher. For more details, please refer to Mookerjee et al. (2012).

6.3. Benefits and Impact
The implementation began in early 2010 and initially resulted in an increase in revenue for Chitika at the
rate of about $3,000 per day. Based on the data collected between March 2010 and September 2010, the total increase in revenue was estimated to be in the order of $1.2 million per year. This revenue increase came from Chitika being able to sign up more publishers under the Chitika Premium program. Chitika was also able to use its Premium program to partner with a very large ad aggregator to show ads in the United Kingdom. As part of the trial process, Chitika was asked to demonstrate a click-through-rate of 0.015, or 1.5%. Our methodology was able to provide a click-through-rate of 0.0151 or 1.51%. This accuracy won Chitika the contract and contributed to a huge revenue increase for the company.

In December 2010, Chitika offered another service called Chitika Select. Most of Chitika's publishers came on board to use Chitika Premium with the expectation that Chitika would show ads only to visitors coming to the site from search engines (i.e., search traffic). This was a good starting point. Although Chitika had ads for visitors who came to the site from other sources (e.g., by directly typing in the URL), Chitika was not able to show these ads to such visitors as it could dilute the click-through-rate. However, with Select, Chitika offered publishers the chance to expand the usage of ads (and hence drive more revenue) with the assurance that the expanded coverage would not dilute the Premium click-through-rate by more than 25%. Without a way to control the dilution in the click-through-rate, the expanded coverage of ads could have seriously hurt the click-through-rate and run the risk of losing some publishers completely. But, with our solution, Chitika was able to guarantee a certain level of click-through-rate, and hence able to give the publishers this option with assurance. The Select offering expanded usage of Chitika's service across a large percentage of the network traffic. Although with Chitika Premium, Chitika only took search traffic and collapsed the ads for non-search traffic, the Select service could show ads to a much larger traffic base. The Chitika Select offering gave the firm an additional 25% boost in revenue.

In the years 2012 and 2013, Chitika developed a patented product called Prophet that uses the ideas developed in this paper (Kolluri et al. 2013). Because many publishers have begun to sell some of their impressions on ad exchanges,² it has become possible to observe a publisher's characteristics without even entering into a performance contract with the publisher. Having observed the traffic characteristics of a publisher, Chitika is able to offer the publisher an attractive, customized contract for displaying ads. This approach is similar to the marketing strategy used by credit card companies (pioneered by Capital One): an offer is made to a potential customer rather than a customer approaching the company for a credit card. This reduces the problem of adverse selection. Prophet has been extremely successful and has contributed to a 15% increase in Chitika's revenue in each year following 2011. While all this success cannot be attributed to the methods developed in this paper, they have surely acted as a platform upon which new innovation has become possible. To isolate the impact of the proposed solution, we implemented our solution in parallel to the past practice at Chitika. Below, we present the results of our experiment. We begin by presenting the details of past practice at Chitika.

6.3.1. Chitika's Past Approach. Before our solution was devised, Chitika used a greedy approach to ensure that their solution satisfied the click-through-rate constraint. Before each arrival, a check was made to see if the click-through-rate achieved so far was above or below the required click-through-rate. When the click-through-rate achieved so far was found to be above the constraint, an ad was shown to the incoming visitor. However, if the click-through-rate was below the constraint, the ad display decision was made as follows. Based on certain easily measurable attributes of the visitor (such as the number of past clicks on similar ads), each visitor was considered to be in one of two categories: clicker and nonclicker. The ad was shown to clickers but not to nonclickers. Over time, based on past click performance, a clicker could become a nonclicker, and vice versa. Before the first arrival, the click-through-rate was assumed to be below the constraint.

6.3.2. Experimental Design for Head-to-Head Comparison. This experiment was conducted for one month with 30 different publishers. These publishers were selected from three different categories: (i) 10 large publishers with arrival rates around 1,000,000 per day, (ii) 10 medium publishers with arrival rates around 100,000 per day, and (iii) 10 small publishers with arrival rates around 10,000 per day. Initially, Chitika allowed us to select 5% of the traffic randomly for each of the publishers to conduct experiments. To conduct a head-to-head comparison between our proposed approach and Chitika's past approach, we decided to use our approach for half of the randomly selected traffic (i.e., 2.5% of the traffic for each publisher) and Chitika's past approach for the rest of the randomly selected traffic. However, for the last category of publishers (i.e., with arrival rates around 10,000 per day), this resulted in a small number of arrivals per day in each group. Therefore, we requested Chitika to experiment with 10% of the traffic for just the last category of publishers. Hence, for this category, we used our approach for half of the randomly selected traffic (i.e., 5% of the traffic for each publisher) and Chitika's past approach for the rest of the randomly selected traffic. The results are presented in the next subsection.
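The past practice of Section 6.3.1 and the traffic split described above can be sketched in a few lines of Python. This is an illustrative sketch only, not Chitika's implementation: the function names, the `is_clicker` flag (standing in for the attribute-based classification of visitors), the treatment of a click-through-rate exactly equal to the constraint as "not above," and the use of a uniform random draw for the split are all assumptions.

```python
import random
from typing import Optional


def greedy_show(clicks: int, impressions: int, omega: float, is_clicker: bool) -> bool:
    """Greedy rule: show to everyone while the running CTR is above omega,
    otherwise show only to visitors classified as clickers."""
    # Assumption: with no impressions yet, the CTR counts as below the constraint.
    ctr_above = impressions > 0 and clicks / impressions > omega
    return ctr_above or is_clicker


def assign_arm(rng: random.Random, sample_rate: float) -> Optional[str]:
    """Put a fraction `sample_rate` of arrivals in the experiment, split evenly."""
    u = rng.random()
    if u < sample_rate / 2:
        return "proposed"  # threshold policy developed in the paper
    if u < sample_rate:
        return "past"      # greedy rule above
    return None            # arrival is outside the experiment


def daily_arrivals_per_arm(arrivals_per_day: int, sample_rate: float) -> float:
    """Expected number of arrivals each approach sees per day."""
    return arrivals_per_day * sample_rate / 2
```

With a 10% sample for a small publisher, `daily_arrivals_per_arm(10_000, 0.10)` gives roughly 500 arrivals per day per approach, in line with the figure quoted in Section 6.3.3.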
6.3.3. Implementation Results. In Table 3, we present the results of our experiment. Both approaches achieve the click-through-rate constraint for each publisher. Therefore, we do not report the final click-through-rate in the table. As shown in the table, our proposed approach improves the number of clicks significantly. The maximum percentage improvement is more than 68%. Interestingly, our proposed approach provides inferior solutions for four publishers (out of 30 publishers). All of these publishers are from the category of publishers with arrival rates around 10,000 per day (i.e., the category with lowest arrival rates). Note that we implemented each approach for only 5% of the traffic for these publishers. Hence, for these publishers, the number of arrivals per day for each approach is just around 500. The inferior results may be attributed to the fact that the number of arrivals is too low to illustrate the true benefit of the proposed approach.

Finally, Table 4 presents the results for improvement within a category of publishers and over all the publishers. In this table, we also provide the t-statistics to assess the significance of improvement. These results indicate that the improvements are statistically significant both within each category of publishers and over all the publishers.

Table 4. Overall improvement and statistical significance.

                Number of clicks
Arrival rate    Improvement within a category of publishers    Overall improvement
1,000,000       1243.6                                         448.7
100,000         86.2
10,000          16.3

Notes. p < 0.05, p < 0.01.

7. Extensions and Ongoing Work at Chitika
We have begun to apply the idea developed in this study to other, related problems faced by ad-networks. We briefly discuss two extensions here. The first extension that we are considering is to use a slightly modified objective function for the user profiling problem, whereas the second extension deals with including advertiser constraints in the user profiling problem. For details of some other extensions, such as real-time media buying and fading ads, the readers can refer to Mookerjee et al. (2012). We describe these two extensions below.
Table 3. Improvement in the number of clicks for the experiments conducted at Chitika.

                                                        Number of clicks
Arrival rate  Shape      Scale      Click-through-rate  Chitika's past  Proposed  Percentage
              parameter  parameter  constraint          approach        solution  improvement
1,000,000     2.0        0.0045     0.011               4,982           6,264     25.73
1,000,000     2.0        0.0045     0.011               5,151           6,193     20.23
1,000,000     2.1        0.0046     0.012               5,367           6,591     22.81
1,000,000     2.1        0.0046     0.012               5,198           6,659     28.11
1,000,000     2.2        0.0048     0.013               5,447           7,209     32.35
1,000,000     2.2        0.0048     0.013               5,874           6,958     18.45
1,000,000     2.3        0.0050     0.014               6,793           7,782     14.56
1,000,000     2.3        0.0050     0.014               6,808           7,764     14.04
1,000,000     2.4        0.0050     0.015               6,683           7,923     18.55
1,000,000     2.4        0.0050     0.015               6,626           8,022     21.07
100,000       2.0        0.0045     0.011               556             599       7.73
100,000       2.0        0.0045     0.011               549             648       18.03
100,000       2.1        0.0046     0.012               560             675       20.54
100,000       2.1        0.0046     0.012               554             612       10.47
100,000       2.2        0.0048     0.013               563             676       20.07
100,000       2.2        0.0048     0.013               650             707       8.77
100,000       2.3        0.0050     0.014               766             767       0.13
100,000       2.3        0.0050     0.014               532             807       51.69
100,000       2.4        0.0050     0.015               785             824       4.97
100,000       2.4        0.0050     0.015               692             754       8.96
10,000        2.0        0.0045     0.011               96              130       35.42
10,000        2.0        0.0045     0.011               128             114       -10.94
10,000        2.1        0.0046     0.012               144             130       -9.72
10,000        2.1        0.0046     0.012               121             114       -5.79
10,000        2.2        0.0048     0.013               92              155       68.48
10,000        2.2        0.0048     0.013               139             127       -8.63
10,000        2.3        0.0050     0.014               128             151       17.97
10,000        2.3        0.0050     0.014               131             162       23.66
10,000        2.4        0.0050     0.015               144             159       10.42
10,000        2.4        0.0050     0.015               124             168       35.48
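The improvement figures reported in Table 4 can be reproduced from the click counts in Table 3 as the mean per-publisher difference in clicks (proposed minus past). The sketch below does this with the Python standard library. The t-statistic it prints is that of a paired t-test on the per-publisher differences; the exact form of the test used in the study is not stated in the text, so that part is an assumption and the printed t values are illustrative only.

```python
from math import sqrt
from statistics import mean, stdev

# (past clicks, proposed clicks) per publisher, copied from Table 3.
TABLE_3 = {
    1_000_000: [(4982, 6264), (5151, 6193), (5367, 6591), (5198, 6659),
                (5447, 7209), (5874, 6958), (6793, 7782), (6808, 7764),
                (6683, 7923), (6626, 8022)],
    100_000: [(556, 599), (549, 648), (560, 675), (554, 612), (563, 676),
              (650, 707), (766, 767), (532, 807), (785, 824), (692, 754)],
    10_000: [(96, 130), (128, 114), (144, 130), (121, 114), (92, 155),
             (139, 127), (128, 151), (131, 162), (144, 159), (124, 168)],
}


def differences(rows):
    """Per-publisher improvement in clicks (proposed minus past)."""
    return [proposed - past for past, proposed in rows]


def paired_t(diffs):
    """t-statistic of a paired t-test (assumed form of the test)."""
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))


category_diffs = {rate: differences(rows) for rate, rows in TABLE_3.items()}
all_diffs = [d for diffs in category_diffs.values() for d in diffs]

for rate, diffs in category_diffs.items():
    print(f"{rate:>9,}: mean improvement {mean(diffs):.1f}, "
          f"t = {paired_t(diffs):.2f}")
print(f"  overall: mean improvement {mean(all_diffs):.1f}, "
      f"t = {paired_t(all_diffs):.2f}")
```

The category means computed this way match the improvement values in Table 4, and the four negative differences all fall in the 10,000 arrivals-per-day category.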
7.1. Maximizing Weighted Sum of Clicks
In this extension, we use the expected revenue from the clicks, instead of using the expected number of clicks. Thus, in the modified objective function, we use a weighted sum of clicks, where the weights are the revenue-per-click values associated with each click. Here, similar to the base model, the publisher first sets the click-through-rate constraint. After that, the ad network determines the threshold for the click-probability that is a function of the revenue-per-click, which is determined by the advertiser. The threshold policy remains optimal for the new objective function considered in this problem.

7.2. Inclusion of Advertiser Constraints
In this extension, we include advertiser constraints in the user profiling problem. This problem is similar to the one described in the base model except that the solution must respect the performance constraints of both the publisher and the advertisers that advertise on the publisher's site. In addition to the publisher's constraint of exceeding a given click-through-rate, advertiser interests motivate an additional constraint, which can be referred to as a conversion constraint. This constraint requires that the conversion ratio be above a certain specified constant. The conversion ratio is defined as the number of conversions that are generated from clicks divided by the number of clicks. To keep advertisers happy, we require that the conversion ratio (of ads shown at the publisher's site) be above a specified fraction. To solve this problem, we plan to use the data analytics step to predict, for a given ad, both the probability of a click and the probability of a conversion. The decision to show an ad would depend on both these probabilities. In other words, we will use two thresholds (one for the click-probability and the other for the conversion ratio) to control ad display. The readers can refer to Figure 7 of Mookerjee et al. (2012) for the overall solution process.

8. Concluding Remarks
The main contribution of this study is to provide an approach to manage an ongoing Internet ad campaign that substantially improves the number of clicks and the revenue earned from clicks. The basic idea is to not show ads to every visitor, but show ads to only those visitors that have a reasonable chance of generating a click. Most publishers would like to maximize the revenue earned from ads (via clicks), but not clutter the website with too many ad impressions that do not generate a click. Thus, for the ad-network, which does not directly suffer from unclicked ad impressions, it is necessary to optimize the revenue earned from the ad campaign while respecting a click-through-rate constraint that is supplied by the publisher. We formulate this optimization problem and find the optimal value of the decision variable (a probability threshold that governs the display of ads). Since an ad campaign usually lasts several weeks (or even months), it raises the potential to dynamically manage the campaign (i.e., change the probability threshold across periods during the planning horizon). We present a rolling-horizon approach to vary the threshold and find that it is useful when the model parameters are not known accurately.

Endnotes
1 In some situations, not all model variables are available for an arriving visitor. In such situations, the missing variables are set to default values.
2 Ad exchanges are supply side agents that publishers use to sell their inventory of ad space on a real-time basis, i.e., ad space is auctioned one-by-one in real time when a visitor comes to the website.

References
Agarwal A, Hosanagar K, Smith MD (2011) Location, location, location: An analysis of profitability of position in online advertising markets. J. Marketing Res. 48(6):1057–1073.
Atkinson J (2014) The best ad networks: Display advertising and more. MonetizePros (July 31). Retrieved November 2, 2014, http://monetizepros.com/blog/2013/what-are-the-top-ad-networks-2013/.
Baldacci R, Mingozzi A, Roberti R, Calvo RW (2013) An exact algorithm for the two-echelon capacitated vehicle routing problem. Oper. Res. 61(2):298–314.
Balseiro S, Besbes O, Weintraub GY (2015) Repeated auctions with budgets in ad exchanges: Approximations and design. Management Sci. 61(4):864–884.
Balseiro S, Feldman J, Mirrokni V, Muthukrishnan S (2014) Yield optimization of display advertising with ad exchange. Management Sci. 60(12):2886–2907.
Besbes O, Maglaras C (2012) Dynamic pricing with financial milestones: Feedback-form policies. Management Sci. 58(9):1715–1731.
Bucklin RE, Sismeiro C (2009) Click here for Internet insight: Advances in clickstream data analysis in marketing. J. Interactive Marketing 23(1):35–48.
Chatterjee P, Hoffman DL, Novak TP (2003) Modeling the clickstream: Implications for web-based advertising efforts. Marketing Sci. 22(4):520–541.
Dawande M, Kumar S, Sriskandarajah C (2003) Performance bounds of algorithms for scheduling advertisements on a web page. J. Scheduling 6(4):373–394.
Dawande M, Kumar S, Sriskandarajah C (2005) Scheduling web advertisements: A note on the MINSPACE problem. J. Scheduling 8(1):97–106.
eMarketer (2011) Online advertising market poised to grow 20% in 2011. eMarketer.com (June 8). Retrieved January 4, 2014, http://www.emarketer.com/newsroom/index.php/online-advertising-market-poised-grow-20-2011/.
Evans DS (2009) The online advertising industry: Economics, evolution, and privacy. J. Econom. Perspect. 23(3):37–60.
Gerken DA (2008) System and method for selectively acquiring and targeting online advertising based on user IP address. United States Patent No: US 7,376,714 B1.
Ghosh A, McAfee P, Papineni K, Vassilvitskii S (2009a) Bidding for representative allocations for display advertising. Leonardi S, ed. Proc. Fifth Internat. Workshop on Internet and Network Econom., WINE '09 (Springer, Berlin), 208–219.
Ghosh A, Rubinstein BIP, Vassilvitskii S, Zinkevich M (2009b) Adaptive bidding for display advertising. Proc. 18th Internat. Conf. World Wide Web (ACM, New York), 251–260.
Goldfarb A, Tucker C (2011) Online display advertising: Targeting and obtrusiveness. Marketing Sci. 30(3):389–404.
Harvey WM, Despain GL, Lieberman L, Canning BP, Bochman P (2010) Analyzing return on investment of advertising campaign by matching multiple data sources. United States Patent No: US 7,729,940 B2.
Hof R (2011) Online ad spend to overtake TV by 2016. Forbes (August 26). Retrieved January 4, 2014, http://www.forbes.com/sites/roberthof/2011/08/26/online-ad-spend-to-overtake-tv/.
Hwang H, Ahn H, Kaminsky P (2013) Basis paths and a polynomial algorithm for the multistage production-capacitated lot-sizing problem. Oper. Res. 61(2):469–482.
Interactive Advertising Bureau (2014) IAB Internet advertising revenue report: An industry survey conducted by PwC and sponsored by the Interactive Advertising Bureau (IAB), 2013 Full Year Results. April. Retrieved November 2, 2014, http://www.iab.net/media/file/IAB_Internet_Advertising_Revenue_Report_FY_2013.pdf.
Interactive Advertising Bureau UK (2014) 2013 Full year digital adspend results. April 8. Retrieved November 2, 2014, http://www.iabuk.net/research/library/2013-full-year-digital-adspend-results.
Kemp S (2012) Study: Global online ad revenue to hit $143 billion by 2017. The Hollywood Reporter (November 14). Retrieved January 4, 2014, http://www.hollywoodreporter.com/news/study-global-online-ad-revenue-390377.
Kolluri V, Dorosario A, Mookerjee V, Mookerjee R, Caswell C (2013) Method and system for determining user likelihood to select an advertisement prior to display. United States Patent No: US 2013/0124344 A1.
Kumar S (2016) Optimization Issues in Web and Mobile Advertising: Past and Future Trends. Springer Briefs in Operations Management (Springer, Cham, Switzerland).
Kumar S, Jacob VS, Sriskandarajah C (2006) Scheduling advertisements on a web page to maximize revenue. Eur. J. Oper. Res. 137(3):1067–1089.
Kumar S, Dawande M, Mookerjee VS (2007) Optimal scheduling and placement of Internet banner advertisements. IEEE Trans. Knowledge Data Engrg. 19(11):1571–1584.
Lai G, Margot F, Secomandi N (2010) An approximate dynamic programming approach to benchmark practice-based heuristics for natural gas storage valuation. Oper. Res. 58(3):564–582.
Lindsay RT, Carriero T, Juan Y (2010) Measuring impact of online advertising campaigns. United States Patent No: US 2010/0306043 A1.
Moallemi CC, Saglam M (2013) OR Forum: The cost of latency in high-frequency trading. Oper. Res. 61(5):1070–1086.
Mookerjee R, Kumar S, Mookerjee VS (2012) To show or not show: Using user profiling to manage Internet advertisement campaigns at Chitika. Interfaces 42(5):449–464.
Najafi-Asadolahi S, Fridgeirsdottir K (2014) Cost-per-click pricing for display advertising. Manufacturing Service Oper. Management 16(4):482–497.
Srinivasan K, Shamos MI (2010) Determining the effectiveness of Internet advertising. United States Patent No: US 7,747,465 B2.
Turner J (2012) The planning of guaranteed targeted display advertising. Oper. Res. 60(1):18–33.
Turner J, Scheller-Wolf A, Tayur S (2011) Scheduling of dynamic in-game advertising. Oper. Res. 59(1):1–16.
Wang Z, Deng S, Ye Y (2014) Close the gaps: A learning-while-doing algorithm for single-product revenue management problems. Oper. Res. 62(2):318–331.

Radha Mookerjee holds a Ph.D. in management from Purdue University. Her current research interests include global distribution systems, e-commerce, Internet advertising, and business analytics. She has published, and has articles forthcoming, in Management Science, Operations Research, Information Systems Research, MIS Quarterly, INFORMS Journal on Computing, Communications of the ACM, and other information system and operations research journals.

Subodha Kumar earned his Ph.D. from the University of Texas at Dallas. His research interests include social media analytics, healthcare management, retailing, supply chain management, project management, electronic commerce, web advertisement, scheduling, combinatorial optimization, web pricing, database management, and network security. He has published several papers in reputed journals and refereed conferences. In addition, he has authored a book, and coauthored book chapters, Harvard Business School cases, and Ivey Business School cases. He also has a patent. He has been featured on the Indian School of Business Management Briefs, the University of Washington Television, and the Industrial Engineer Magazine.

Vijay S. Mookerjee holds a Ph.D. in management from Purdue University. His current research interests include social networks, optimal software development methodologies, storage and cache management, content delivery systems, and the economic design of expert systems and machine learning systems. He has published in and has articles forthcoming in several archival information system, computer science, and operations research journals.
