[Figure 1: Best-effort model in networking and storage systems. Networking: best-effort packet delivery (Internet Protocol). Computing: ?. Storage: best-effort query processing, relaxed consistency.]

also refer to the capability of a system to increase the total throughput (as load increases) in direct proportion to the addition of more resources (in the context of parallel computing systems, more processors). There are two broad methods of adding more computing resources to a system. Scale-up adds more resources to a single node in a system, typically involving the addition of processors or memory to a single computer. Scale-out adds more nodes to a system, such as adding a new computer to a cluster or distributed system. In the context of high performance computing, there are two other notions of scalability. Strong scaling refers to how solution time varies with the number of processors for a fixed total problem size. Weak scaling refers to how solution time varies when we fix the problem size per processor and add more processors. Unfortunately, adding more resources to a system does not necessarily translate to improved performance. Irrespective of the road taken to achieve scalability, or the metrics used to measure scalability, a fundamental challenge in designing computing systems is to translate increased resources into improved application performance. We pose the question: Is there a simple system design principle that has served us well over the past several decades to build scalable systems? While the focus of this paper is on computing systems, we answer the question by examining how scaling challenges have been addressed in networking and storage systems.

As shown in Figure 1, applications rely on functions provided by the networking, storage and compute sub-systems. Scalability of an application is intimately related to the scalability of each of these sub-systems. We briefly examine the scalability of networking and storage sub-systems to identify a common design principle for system scalability.

Networking: The Internet Protocol (IP), on which the Internet is built, was developed three decades ago with system scaling as a first-order priority. Towards that end, an important design decision was to make the end hosts responsible for reliability rather than the core network. Excessive workload in the network is dealt with in a simple manner, by allowing switches and routers in the network to drop packets. The IP protocol has been successful in managing ever-increasing volumes of end points and packet traffic for over three decades (exponential growth, with packet traffic doubling every two years). While this could be attributed to a combination of factors, one of the key design decisions to enable scalability was simply reserving the right to drop packets. By sacrificing guarantees, it has become possible to build simpler, faster and more scalable networks. When the IP protocol was developed, there were very few applications that could directly use the best-effort packet delivery service, and most applications used TCP to guarantee packet delivery. However, over the years, numerous new scenarios have emerged where applications prefer to routinely use the higher-performance, best-effort packet delivery protocols. For example, the UDP protocol is built on top of the IP protocol, and it does not guarantee reliability or correct sequencing of packets. Applications that use UDP handle reliability requirements, if any, on their own. DNS is the perfect illustration of this use case: the costs of connection set-up for reliable communication are too high, so DNS implements its own re-send mechanism for reliability. Another scenario is when delivering data that can be lost (without adversely affecting the application) because new data is being generated that will replace the lost data. Weather data, video ... file transfers, e-mail, and web traffic to streaming media, voice communications, social networking and collaborative computing [1].

Storage: Every popular website built on top of a traditional relational database experiences problems with scaling the storage back end [2, 3, 4, 5, 6, 7, 8]. The most important consideration for a data storage system is its ability to rapidly scale to handle more users. Here, scaling refers to the capability of the system to service more users while keeping the cost per user constant. Due to the interactive nature of users' queries, the response time for any given query must be independent of the number of users in the system. Data scale independence allows the user base to grow by orders of magnitude without changing the application. For example, consider the social networking web site Facebook, which has several billion dynamically generated page views per day [9]. Traffic of this magnitude results in over 23,000 page views a second (and each page view could result in many queries to the database). Facebook responded by developing their own complex, proprietary storage system [10]. Several functions and services that are commonly implemented in traditional relational data stores hindered the scalability of the storage system. Notably, their system provides only eventual or relaxed consistency, wherein a write to the storage system will not be seen by all users for an unspecified, variable period of time. For Facebook and many other popular collaborative applications like Flickr, Yelp etc., the relaxed consistency model offers superior scalability and performance as compared to a traditional database that provides full atomicity, consistency, isolation, and durability (ACID) compliance.

We see that there are really two reasons to place functionality in the network or storage sub-system rather than the end hosts or applications. Either all applications need it, or a large number of applications benefit from an increase in performance due to the clever implementation of a complex service or function in the sub-system. Unfortunately, when new applications like VoIP or Facebook use existing services and functions like guaranteed delivery or consistency, their performance deteriorates with increased load. Also, the workload characteristics of these new applications (VoIP, Facebook etc.) are inherently different. Therefore, new packet delivery and data consistency mechanisms with radically different semantics and goals were necessary to improve the scalability and performance of these applications. In general, new applications will continuously force us to re-evaluate the semantics and goals of functions and services that are included in common sub-systems.

Note that the adjective best-effort does sometimes get interpreted as highly unreliable, but this is unfortunate. In fact, in networking systems, best-effort packet delivery is referred to as a highly reliable delivery service [1, 11]. Therefore, equating best-effort with unreliability is neither appropriate nor factual. Reliability is not a discrete quantity with only two values, yes or no. Rather, it allows for a spectrum of possibilities that range from completely reliable to outright unreliable. As the networking and storage applications show, modest relaxations of traditional strict guarantees do not materially alter the core functionality of these applications. Rather, such relaxations have enabled disproportionate performance and scalability benefits. Our proposal of best-effort computing platforms should also be viewed in a similar spirit.

3. BEST-EFFORT COMPUTING MODEL

Since a large number of emerging applications already tolerate imperfections in the network or storage systems in return for higher, scalable performance, a natural question is whether computing platforms have to be perfect. From a different perspective, can some of the functions currently being handled by the computing platform be shifted to the applications in return for higher, more scalable computing performance? The best-effort computing paradigm is a proposal that offers applications an unprecedented, scalable computing platform by making a fundamental change in the contract between ...
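The DNS-style pattern described above, in which an application layers just as much reliability as it needs on top of best-effort UDP delivery, can be sketched in a few lines. This is a minimal illustration, not DNS's actual wire protocol; the function name and retry parameters are our own assumptions.

```python
import socket

def udp_request(payload, addr, retries=3, timeout=0.5):
    """Send a datagram over best-effort UDP and retry on loss.

    The network may silently drop the packet; reliability, if needed,
    is the application's job (as with DNS re-sends). Returns the
    response bytes, or None if every attempt was lost.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        for _ in range(retries):
            try:
                s.sendto(payload, addr)      # fire-and-forget datagram
                data, _ = s.recvfrom(2048)   # wait briefly for a reply
                return data
            except OSError:                  # timeout or ICMP error: treat as loss
                continue
    return None
```

Compared to setting up a TCP connection, this avoids handshake and teardown costs, at the price of the application deciding for itself how much reliability it actually needs.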
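The relaxed-consistency behavior described in the Storage discussion can be modeled in a few lines. This is a toy sketch of eventual consistency under assumed replica and propagation semantics; it is not a description of Facebook's actual storage system [10].

```python
import collections

class EventuallyConsistentStore:
    """Toy replicated key-value store with relaxed (eventual) consistency.

    A write is acknowledged after updating a single replica; it reaches
    the other replicas only when propagate() runs, so for a while a read
    from another replica can return a stale (or missing) value.
    """

    def __init__(self, num_replicas=3):
        self.replicas = [dict() for _ in range(num_replicas)]
        self.pending = collections.deque()      # writes awaiting replication

    def write(self, key, value, replica=0):
        self.replicas[replica][key] = value     # acknowledged immediately
        self.pending.append((key, value))       # replicated later, not now

    def read(self, key, replica=0):
        return self.replicas[replica].get(key)  # may be stale

    def propagate(self):
        """Background anti-entropy step: apply pending writes everywhere."""
        while self.pending:
            key, value = self.pending.popleft()
            for r in self.replicas:
                r[key] = value
```

The write path does no coordination across replicas, which is precisely what lets such a store scale; the application must tolerate the window during which replicas disagree.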
[Figure 3: Illustration of the forgiving nature of K-means clustering and GLVQ classification. Panel (a), "K-means clustering quality with centroid perturbation," plots clustering quality (87-92) against # epochs (iterations, 1-11) for error injection rates of 0%, 10%, 50% and 90%; panel (b) shows the corresponding results for GLVQ.]

a multi-dimensional space. The algorithm begins by picking a random set of cluster centroids, and performs the following operations to assign points to clusters in an iterative manner until the clustering does not change any more.

1. Compute the distance between every point and every cluster centroid.

2. Assign each point to the cluster with the closest centroid.

3. Re-compute the new centroid for each cluster as the mean of all the points in the cluster.

A common application of K-means is to segment images into regions with similar color and texture characteristics (each pixel represents a point in the K-means clustering algorithm). Image segmentation is a useful pre-processing step for image content analysis or compression. In order to evaluate the forgiving nature of the K-means algorithm, we executed a software implementation of K-means to perform image segmentation across several image data sets, while injecting errors in the cluster centroid that is computed as a result of each iteration.¹ Figure 3(a) shows how the quality of the clustering computed by K-means varies with different rates of error injection. Each curve corresponds to a different error injection rate, and shows the improvement of clustering quality as the algorithm iterates to convergence. The results suggest that executing K-means on a computing platform that introduces fairly large error rates (up to 10%) will have virtually no impact on the clustering quality. However, it is important to note that not all computations in the algorithm can be subject to errors; for example, the operations that determine whether to continue iterating must be executed without any errors. This corresponds to the view that most applications will consist of computations that may be executed on a best-effort basis, and others that may not.

¹The errors are injected to represent the impact of executing K-means on a best-effort computing platform. More detailed evaluations considering the exact nature of errors introduced are presented in [16, 17, 18, 19]. We believe that this simple model is adequate to illustrate the forgiving nature of K-means, since the net effect of errors in the computations of each iteration of the K-means algorithm is to change the clusters to which the points are assigned.

GLVQ is a supervised learning algorithm that is used for classifying input data into one of a pre-determined set of classes [20]. Similar to K-means, inputs are encoded as vectors in a multi-dimensional space. During training, the algorithm builds a model from the labeled training data set by creating a ... RV1 moves closer to the training vector, and RV2 moves farther from it. The above computations are performed for each training vector, and the process is iterated until the quality of the model does not improve. In the case of GLVQ, we evaluated the forgiving nature of the algorithm by executing a software implementation that was used to detect faces in images. During the execution of the algorithm, we injected errors in the results computed by each iteration, i.e., the reference vectors. Figure 3(b) presents the results of our evaluation. Again, the different curves represent different rates of error injection, and each curve shows how the quality of the model improves as the algorithm iterates to convergence. In the case of GLVQ, we note that error rates of up to 50% have very little impact on the final quality of the model constructed.

While we have focused on just two representative applications, we would like to reiterate that the forgiving nature is observed across a wide range of application domains, including digital signal processing, multimedia (image, video, and audio) processing, network processing, wireless communications, and recognition and data mining. The causes and extent of the forgiving nature may vary, but we believe that most if not all of these workloads can benefit from the proposed best-effort computing model.

5. BEST-EFFORT SOFTWARE AND HARDWARE: ILLUSTRATIONS

The forgiving nature of computing workloads suggests that the traditional interface of guaranteed execution of computations may be changed by the underlying computing platform. In this section, we provide concrete examples of how parallel hardware and software can be re-designed based on the best-effort service model. First, we address the problem of how to partition the computations in an application into best-effort and guaranteed computations through parallel programming models. Second, we discuss how a given computing platform can scale to handle larger workloads by dropping (not executing) some of the best-effort computations. Next, we discuss how scalable parallel execution can be obtained by taking advantage of the forgiving nature of the workload to alleviate the bottlenecks to parallel execution. We describe how the best-effort model can be taken down into the hardware by incorporating knobs that modulate the effort expended by hardware towards computing correct results. Finally, we discuss how the best-effort model may be used to build computing platforms from unreliable components, without incurring excessive overheads for fault tolerance.

5.1 Best-effort parallel programming models

Applications that execute on a best-effort computing platform must address the separation of computations into best-effort and guaranteed components. While there are several possible approaches to achieving this separation, we believe that approaches that minimize the additional burden on the programmer are desirable. Towards that goal, Meng et al. [16] proposed parallel programming templates such that the identification of best-effort computations is a natural by-product of specifying the algorithm using the template. For example, Figure 4 presents a template for iterative-convergence algorithms that naturally embodies the concept of best-effort computing.
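The three-step K-means loop and the centroid error injection described above can be sketched as follows. This is our own minimal one-dimensional sketch: the function name, the +/-0.5 perturbation magnitude, and the error model are illustrative assumptions, not the paper's implementation. Note that the loop control stays error-free, mirroring the guaranteed/best-effort split discussed above.

```python
import random

def kmeans_1d(points, k, epochs=20, error_rate=0.0, rng=None):
    """K-means in one dimension with best-effort centroid updates.

    With probability error_rate, a freshly computed centroid is
    perturbed, mimicking execution on an unreliable (best-effort)
    platform. Returns the sorted final centroids.
    """
    rng = rng or random.Random(0)
    centroids = rng.sample(points, k)            # random initial centroids
    for _ in range(epochs):                      # loop control: guaranteed
        clusters = [[] for _ in range(k)]
        for p in points:                         # steps 1-2: nearest centroid
            i = min(range(k), key=lambda j: abs(p - centroids[j]))
            clusters[i].append(p)
        for j, members in enumerate(clusters):   # step 3: recompute centroids
            if members:
                c = sum(members) / len(members)
                if rng.random() < error_rate:    # injected best-effort error
                    c += rng.uniform(-0.5, 0.5)
                centroids[j] = c
    return sorted(centroids)
```

For well-separated clusters, a perturbed centroid rarely changes which cluster a point is assigned to, which is the intuition behind the flat curves in Figure 3(a).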
[Figure 4: Template for iterative-convergence algorithms (fragment):]

iterate {
    mask[0:M] = filter();
    ...
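In the spirit of the surviving fragment above, the iterative-convergence template can be sketched as a generic driver in which a filter computes a mask over the M sub-computations of an epoch, and unmasked ones are simply dropped. The names and signatures below are our own illustrative reconstruction, not the actual API of [16].

```python
def iterate_best_effort(state, update, filter_mask, converged, max_epochs=100):
    """Generic iterative-convergence driver with best-effort semantics.

    Each epoch, filter_mask() plays the role of the template's
    "mask[0:M] = filter()": sub-computations whose mask entry is False
    are dropped. The convergence test itself is guaranteed, never dropped.
    """
    for epoch in range(max_epochs):
        mask = filter_mask(state, epoch)        # choose what to execute
        for i, live in enumerate(mask):
            if live:
                state = update(state, i)        # executed sub-computation
            # else: dropped on a best-effort basis
        if converged(state):                    # exact, guaranteed check
            break
    return state
```

For a forgiving workload, dropping a fraction of the updates in each epoch typically delays convergence slightly rather than breaking it.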
At the architecture level, the number of processing elements and the precision of each processing element are regulated in order to provide just enough computational accuracy. At the circuit level, the concept of voltage over-scaling (scaling the supply voltage while keeping the clock frequency fixed, thereby introducing errors in the outputs of processing elements) is utilized together with enabling circuit-level design techniques, in order to obtain improved energy efficiency at the cost of errors introduced into the operations of the algorithm. By combining these knobs, the hardware implementation can be regulated to expend just enough effort to achieve the desired output accuracy. As illustrated in Figure 7, energy improvements of 2X-3X are reported as compared to a well-optimized conventional hardware implementation.

5.5 Dealing with unreliable hardware

With continued scaling of feature sizes, transistors and interconnects are expected to become increasingly unreliable due to escalating defects, soft errors, and process variations. In the face of high defect or error rates, classical fault-tolerant design techniques (based on spatial and/or temporal redundancy) will become either very expensive or ineffective. Recently, there has been a proposal to design multi-core computing platforms by combining a small number of reliable processing cores with a large number of unreliable cores, and exploiting this asymmetric reliability using a suitable software architecture [21]. Such an architecture will have a larger number of cores, thereby potentially improving performance, compared to an architecture that only uses fully reliable cores (since reliable cores are significantly larger than unreliable cores). We have recently demonstrated that the best-effort computing model can be combined with the architecture proposed in [21] to take full advantage of the forgiving nature of computing workloads. The separation of computations into best-effort and guaranteed computations can be used to drive the assignment of computations to the reliable and unreliable cores. We have also demonstrated that simultaneously exploiting the forgiving nature of the workloads through software techniques (dropping computations, relaxing data dependencies) and execution on unreliable cores can lead to significantly better performance than either of these techniques alone [22].

6. SUMMARY AND DISCUSSION

The emergence of mainstream parallel computing mandates that applications scale in performance primarily by taking advantage of increasing numbers of cores. Motivated by approaches used to design large scale networking and storage systems, we propose best-effort computing as a model for future parallel computing platforms. Under the best-effort model, different layers of the computing platform stack are designed to provide a best-effort service, relaxing conventional guarantees of complete or correct execution. We presented a conceptual model for a best-effort computing platform, described characteristics of applications that can leverage such a platform, and provided concrete examples of parallel software and hardware design techniques that adopt the best-effort approach to significantly improve performance and energy efficiency. We believe that the best-effort model can also be exploited to optimize many other computing platform metrics, like the cost or system management effort of the computing infrastructure.

We would also like to mention some of the challenges that must be overcome in order to facilitate adoption of the best-effort model described in this paper. The best-effort model increases the burden on application developers, since they must partition their applications into computations that can work with a best-effort service model and computations that cannot. The development of programming models, such as the iterative-convergence template described in this paper, for a wide range of algorithms and computational patterns will alleviate this burden. Alternatively, for some applications it may be better to use programming language extensions to mark the regions in a program that correspond to best-effort computations. An important question that must be addressed is to what extent the underlying platform should actually take advantage of the forgiving nature of the best-effort computations (i.e., how aggressively should it drop or incorrectly execute these computations). In this context, we note that in the networking domain, best-effort does not equal highly unreliable: only a very small degradation in reliability is actually encountered in practice. Finding the right extent to which the forgiving nature of each application should be exploited will require further investigation. Similar to the concept of differentiated services in networking, it may be interesting to explore further classification of the best-effort computations into different categories that are treated differently by the computing platform. Finally, in the context of both parallel hardware and software, techniques used for verifying the implementation must account for the fact that numerical or Boolean equivalence may no longer be maintained. Despite these challenges, we believe that the potential benefits of the best-effort computing model (such as improvements in performance, energy efficiency, and scalability) make it very appealing as a direction for future research in parallel hardware and software systems.

Acknowledgment: We acknowledge Jiayuan Meng, Suren Byna, Srihari Cadambi, Hyungmin Cho, Vinay Chippa, Debabrata Mohapatra, and Kaushik Roy, whose inputs have greatly shaped our thoughts on best-effort computing.

7. REFERENCES

[1] S. Floyd and M. Allman. Comments on the Usefulness of Simple Best-Effort Traffic. IETF RFC 5290, http://tools.ietf.org/html/rfc5290, July 2008.
[2] B. Fitzpatrick. LiveJournal: Behind the scenes scaling storytime (Invited talk). In USENIX, 2007.
[3] G. Linden. Early Amazon: Splitting the website. Online: http://glinden.blogspot.com/2006/02/early-amazon-splitting-web.
[4] J. Newton. Scaling out like Technorati. http://newton.typepad.com/content/2007/09/scaling-out-lik.html.
[5] T. O'Reilly. Web 2.0 and databases part 1: Second Life. http://radar.oreilly.com/archives/2006/04/web-20-and-databases-part-1-se.html.
[6] T. O'Reilly. Database war stories 5: craigslist. http://radar.oreilly.com/archives/2006/04/database-war-stories-5-craigsl.html.
[7] T. O'Reilly. Database war stories 3: Flickr. http://radar.oreilly.com/archives/2006/04/database-war-stories-3-flickr.html.
[8] M. Armbrust et al. SCADS: Scale-independent storage for social computing applications. In Proc. 4th Biennial Conference on Innovative Data Systems Research (CIDR), 2009.
[9] Facebook infosession. Sponsored by the Industrial Relations Office (IRO), EECS Department, UC Berkeley, September 2008.
[10] J. Sobel. Scaling out. http://www.facebook.com/note.php?note_id=23844338919.
[11] S. Armstrong et al. Multicast Transport Protocol. RFC 1301, http://www.rfc-archive.org/getrfc.php?rfc=1301, Feb. 1992.
[12] L. Breslau and S. Shenker. Best-effort versus reservations: A simple comparative analysis. In Proc. SIGCOMM, pages 3-16, 1998.
[13] P. Dubey. A Platform 2015 Workload Model: Recognition, Mining and Synthesis Moves Computers to the Era of Tera. White Paper, Intel Corporation, 2005.
[14] Y.-K. Chen et al. Convergence of Recognition, Mining, and Synthesis Workloads and Its Implications. Proceedings of the IEEE, 96:790-807, 2008.
[15] J. B. MacQueen. Some methods for classification and analysis of multivariate observations. In Proceedings of the Berkeley Symposium on Mathematical Statistics and Probability, 1967.
[16] J. Meng, S. Chakradhar, and A. Raghunathan. Best-effort parallel execution framework for recognition and mining applications. In Proc. IEEE Int. Parallel and Distributed Processing Symp., May 2009.
[17] J. Meng, A. Raghunathan, S. Chakradhar, and S. Byna. Exploiting the forgiving nature of applications for scalable parallel execution. In Proc. IEEE Int. Parallel and Distributed Processing Symp., Apr. 2010.
[18] S. Byna, J. Meng, A. Raghunathan, S. Chakradhar, and S. Cadambi. Best-Effort Semantic Document Search on GPUs. In Proc. Third Wkshp. on General-Purpose Computation on Graphics Processing Units, Mar. 2010.
[19] V. Chippa, D. Mohapatra, A. Raghunathan, K. Roy, and S. T. Chakradhar. Scalable Effort Hardware: Exploiting Algorithmic Resilience for Energy Efficiency. In Proc. ACM/IEEE Design Automation Conf., June 2010.
[20] N. R. Pal, J. C. Bezdek, and E. C.-K. Tsao. Generalized clustering networks and Kohonen's self-organizing scheme. IEEE Trans. on Neural Networks, 4(4):549-557, Jul 1993.
[21] L. Leem, H. Cho, J. Bau, Q. Jacobson, and S. Mitra. ERSA: Error-Resilient System Architecture for Probabilistic Applications. In Proc. ACM/IEEE Design, Automation and Test in Europe, Mar. 2010.
[22] H. Cho, S. T. Chakradhar, and A. Raghunathan. Trading reliability for parallelism. Personal Communication, Sep. 2009.