Computer Science Division, University of California, Berkeley, Berkeley, CA
Yahoo! Research, Santa Clara, CA
Computer Systems Laboratory, Stanford University, Stanford, CA
3 Activity Tracking

Tracking power states and energy consumption of energy sinks over time shows where and when the energy is going, but leaves a semantic gap to the programmer of why the energy is being spent.

The key here is to attribute energy usage to entities, or resource principals, that are meaningful to the programmer. In traditional operating systems, processes or threads combine the roles of protection domain, schedulable unit, and resource principal, but there are many situations in which it is desirable that these notions be independent. This idea was previously explored in the context of high-performance network servers [2], but it is especially true in networked embedded systems.

We borrow from earlier work the concept of an activity as our resource principal. In the Rialto system in particular [19], an activity was defined as “the abstraction to which resources are allocated and to which resource usage is charged.” In other words, an activity is a set of operations whose resource consumption should be grouped together. In the environments we consider, where most of the resource consumption does not happen at the CPU, and sometimes not even on the same node that initiated an activity, it is fundamental to support activities that span different hardware components and multiple nodes.

We close the gap of why energy is spent by assigning the energy consumption to activities that are defined by the programmer at a high level. To do this we follow all operations related to an activity across hardware components on a single node and across the network.

3.1 Overview

To account for the resource consumption of activities, we track when a hardware component, or device, is performing operations on behalf of an activity. A useful analogy is to think of an activity as a color, and devices as being painted with the activity’s color when working on its behalf. By properly recording devices’ successive colors over time and their respective resource consumptions, we can assign to each activity its share of the energy usage.

Figure 4 shows an example of how activities can span multiple devices and nodes. In the figure, the programmer marks the start of an activity by assigning to the CPU the sensing activity (“painting the CPU red”). We represent activities by activity labels, which Quanto carries automatically to causally related operations. For example, when a CPU that is “painted red” invokes an operation on the sensor, the CPU paints the sensor red as well. The programmer may decide to change the CPU activity if it starts work on behalf of a new logical activity, such as when transitioning from sensing to sending (red to blue in the figure). Again the system will propagate the new activity to other devices automatically.

[Figure 4 diagram. Labels: CPU, Sensor, Radio, Node A; Act: sensing, Act: sending/storing; Proxy Rx Activity; Packet Tx.]
Figure 4: Activity tracking for sensing, sending, and storing a sample across two nodes. The developer chose sending as a separate activity. Receiving is part of a proxy activity until the CPU can decode the true activity and correctly bind the resource usage.

This propagation includes carrying activity labels on network messages, such that operations on node B can be assigned to the activity started on node A. This example also highlights an important aspect of the propagation, namely proxy activities. When the CPU on node B receives an interrupt indicating that the radio is starting to receive a packet, the activity to which the receiving belongs is not known. This is generally true in the case of interrupts and external events. Proxy activities are a solution to this problem. The resources used by a proxy activity are accounted for separately, and then assigned to the real activity as soon as the system can determine what this activity is. In this example the CPU can determine that it should be colored blue as soon as it decodes the activity label in the radio packet. It terminates the proxy activity by binding it to the blue activity.

The programmer can define the granularity of activities in a flexible way, guided by how she wants to divide the resource consumption of the system. Some operations do not clearly belong to specific activities, such as data structure maintenance or garbage collection. One option is to give these operations their own activities, representing this fact explicitly.

The mechanisms for tracking activities are divided into three parts, which we describe in more detail next,
    interface SingleActivityDevice {
      // Returns the current activity
      async command act_t get();

      // Sets the current activity
      async command void set(act_t newActivity);

    task void sensorTask() {
      call CPUActivity.set(ACT_HUM);
      call Humidity.read();
      call CPUActivity.set(ACT_TEMP);
      call Temperature.read();
    }
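The painting analogy and the proxy binding described in the overview can be illustrated with a small sketch. It is written in Python rather than nesC so that it runs standalone, and it is not Quanto's implementation: the class names `ActivityLedger` and `Device`, the label strings, and the energy figures are all hypothetical.

```python
from collections import defaultdict


class ActivityLedger:
    """Accumulates energy (in microjoules) per activity label."""

    def __init__(self):
        self.energy = defaultdict(int)

    def charge(self, activity, microjoules):
        self.energy[activity] += microjoules

    def bind(self, proxy, real):
        # Reassign everything charged to the proxy to the real activity.
        self.energy[real] += self.energy.pop(proxy, 0)


class Device:
    """A hardware component that is 'painted' with one activity at a time."""

    def __init__(self, name, ledger):
        self.name = name
        self.ledger = ledger
        self.activity = "idle"

    def set(self, activity):
        self.activity = activity

    def consume(self, microjoules):
        # Energy is charged to whatever activity the device is painted with.
        self.ledger.charge(self.activity, microjoules)


# Node B: the radio interrupt arrives before the true activity is known.
ledger = ActivityLedger()
cpu = Device("CPU", ledger)
cpu.set("B:pxy_RX")            # proxy activity for the reception
cpu.consume(20)                # hypothetical cost of receiving the packet
decoded = "A:send"             # label read from the packet once decoded
ledger.bind("B:pxy_RX", decoded)
cpu.set(decoded)
cpu.consume(5)                 # further work on behalf of node A's activity
```

After the bind, the 20 µJ spent under the proxy and the 5 µJ spent afterwards are both charged to `A:send`, and the proxy no longer appears in the ledger.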
[Figure 10 plot: two panels of current over time (ms) recorded with the oscilloscope; one panel is labeled "All LEDs On", with a mean of 6.30 mA.]
Figure 10: Current over time for two states of Blink recorded with the oscilloscope, showing the mean current and the iCount pulses that Quanto accumulates.

    X (LED0 LED1 LED2 Const.)    Y (mA)    XΠ (mA)
    1    1    0    1             5.53      5.53
    0    0    1    1             1.62      1.62
    1    0    1    1             4.15      4.12
    0    1    1    1             3.88      3.85
    1    1    1    1             6.30      6.36

    Π (mA):  LED1 2.23   LED2 0.83   Const. 0.79

Table 2: Oscilloscope measurements of the current for the steady states of Blink, and the results of the regression with the current draw per hardware component. The relative error ($\|Y - X\Pi\| / \|Y\|$) is 0.83%.

ferent nodes. We then look at three case studies in which Quanto exposes real-world effects and costs of application design decisions, and lastly we quantify some of the costs involved in using Quanto itself. In these experiments we processed Quanto data with a set of tools we wrote to parse and visualize the logs. We used GNU Octave to perform the regressions.

4.1 Calibration

We set up a simple experiment to calibrate Quanto against the ground truth provided by a digital oscilloscope. The goal is to establish that Quanto can indeed measure the aggregate energy used by the mote, and that the regression does separate this energy use by hardware components.

We use Blink, the hello world application in TinyOS. Blink is very simple; it starts three independent timers with intervals of 1, 2, and 4 s. When these timers fire, the red, green, and blue LEDs are toggled, such that in 8 seconds Blink goes through 8 steady states, with all combinations of the three LEDs on and off. The CPU is in its sleep state during these steady states, and only goes active to perform the transitions.

Using the Hydrowatch board (cf. Section 2.2), we connected a Tektronix MSO4104 oscilloscope to measure the voltage across a 10 Ω resistor inserted between the iCount circuit and the mote power input. We measured the voltage provided by the regulator for the mote to be 3.0 V.

We confirmed the result from [9] that the switching frequency of iCount varies linearly with the current. Figure 10 shows the current for two sample states of Blink. This curve has a wealth of information: from it we can derive both the switching frequency of the regulator, which is what Quanto measures directly, and the actual average current, $I_{avg}$. We verified over the 8 power states that $I_{avg}$, in mA, and the switching frequency $f_{iC}$, in kHz, have a linear dependency given by $I_{avg} = 2.77 f_{iC} - 0.05$, with an $R^2$ value of 0.99995. We can infer from this that each iCount pulse corresponds, in this hardware, at 3 V, to 8.33 µJ. We also verified that $I_{avg}$ was stable during each interval.

Lastly we tested the regression methodology from Section 2.5, using the average current measured by the oscilloscope in each state of Blink and the external state of the LEDs as the inputs. We also added a constant term to account for any residual current not captured by the LED state. Table 2 shows the results, and the small relative error indicates that for this case the linearity assumptions hold reasonably well, and that the regression is able to produce a good breakdown of the power draws per hardware device.

4.2 Two Illustrative Examples

4.2.1 Blink

We instrumented Blink with Quanto to verify the results from the calibration and to demonstrate a simple case of tracking multiple activities on a single node. We divided the application into 3 main activities: Red, Green, and Blue, which perform the operations related to toggling each LED. Each LED, when on, gets labeled with the respective activity by the CPU, such that its energy consumption can be charged to the correct activity. We also created an activity to represent the managing of the timers by the CPU (VTimer). We recorded the power states of each LED (simply on and off), and consider the CPU to only have two states as well: active and idle.

Figures 11(a) and (b) show details of a 48-second run of Blink. In these plots, the X axis represents time, and each color represents one activity. The lower part of (a) shows how each hardware component divided its time among the activities. The topmost portion of the graph shows the aggregate power draw measured by iCount. There are eight distinct stable draws, corresponding to the eight states of the LEDs.

Part (b) zooms in on a particular state transition spanning 4 ms, around 8 s into the trace, when all three LEDs
[Figure 11 plots. Axes: Power (mW) and Hardware Components (CPU, Led0, Led1, Led2) over Time (s in panel (a), ms in panels (b) and (c)). Activity legend: 1:Red, 1:Green, 1:Blue, 1:VTimer, 1:int_TIMER. Panel (c) legend: CPU, Led0 (Red), Led1 (Green), Led2 (Blue), Constant, Oscilloscope Trace.]
(a) Power draw measured by Quanto and the activities over time for each hardware component.
(b) Detail of a transition from all on to all off, showing the activities for each hardware component.
(c) Stacked power draw for the hardware components, with values from the regression, overlaid with the oscilloscope-measured power.
Figure 11: Activity and power profiles for a 48-second run of the Blink application on the Hydrowatch platform.
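The per-component breakdown in Figure 11(c) and Table 2 comes from a linear regression of the aggregate draw on the power states. The following is a minimal sketch of that idea in Python, not the authors' Octave code: it solves the normal equations directly, and the "true" per-component draws in `true_pi` are synthetic values chosen only so that the recovered result can be checked.

```python
from itertools import product


def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def least_squares(x, y):
    """Return pi minimizing ||y - x pi|| via (x^T x) pi = x^T y."""
    cols = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(cols)]
           for i in range(cols)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(cols)]
    return solve(xtx, xty)


# Synthetic per-component draws in mA: LED0, LED1, LED2, constant term.
true_pi = [2.5, 2.2, 0.8, 0.8]
# All 8 LED on/off combinations, plus a column of ones for the constant.
X = [list(bits) + [1] for bits in product([0, 1], repeat=3)]
# Aggregate draw per state; a real run would use measured averages here.
Y = [sum(xi * p for xi, p in zip(row, true_pi)) for row in X]

pi = least_squares(X, Y)
```

Because the eight Blink states exercise every LED combination, the columns of `X` are linearly independent and the system has a unique solution. If two components always switched together, `xtx` would be singular and their draws could not be separated.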
[Figure 12 plots: three panels showing the activity of each hardware component over Time (ms). Legend: 1:BounceApp, 4:BounceApp, 1:pxy_RX, 1:int_UART0RX, 1:int_TIMER, 1:VTimer.]
(a) A 2-second window on a run of Bounce at a node with id 1.
(b) Detail of a packet reception with an activity label from node 4.
(c) Detail of a packet transmission on node 1 as part of the activity started at node 4.
Figure 12: Activity tracking on Bounce. Each packet carries the activity current at the time it was generated, and the receiving node executes some operations as part of that remote activity.
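Labels such as 4:BounceApp pair an originating node with an activity, and each packet carries one. A minimal sketch of how such a label could travel with a packet follows. The 8-bit/8-bit split of the 16-bit label, the little-endian header, and the names `make_label`, `split_label`, `send`, and `receive` are assumptions made for illustration; they are not taken from the Quanto implementation.

```python
import struct

NODE_BITS = 8  # assumption: high byte is the origin node, low byte the activity


def make_label(node_id, activity_id):
    """Pack an origin node and an activity id into one 16-bit label."""
    return (node_id << NODE_BITS) | activity_id


def split_label(label):
    """Recover (origin node, activity id) from a 16-bit label."""
    return label >> NODE_BITS, label & ((1 << NODE_BITS) - 1)


def send(label, payload):
    """Prefix the payload with the sender's current activity label."""
    return struct.pack("<H", label) + payload


def receive(packet):
    """Return the carried label and the payload."""
    (label,) = struct.unpack_from("<H", packet)
    return label, packet[2:]


BOUNCE_APP = 1  # hypothetical activity id

packet = send(make_label(4, BOUNCE_APP), b"ping")  # generated on node 4
label, payload = receive(packet)                   # decoded on node 1
origin, activity = split_label(label)
```

Once node 1 has decoded the label, it can paint its CPU, LED, and radio with it, so that the work it does for this packet is charged to the activity that started on node 4.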
tribute to the overall energy consumption. The graph also shows an overlaid power curve measured with the oscilloscope for the same run. The graph shows a very good match between the two sources, both in the time and energy dimensions. We can notice small time delays between the two curves, on the order of 100 µs, due to the time Quanto takes to record a measurement.

4.2.2 Bounce

The second example we look at illustrates how Quanto keeps track of activities across nodes. Bounce is a simple application in which two nodes keep exchanging two packets, each one originating from one of the nodes. In this example we had nodes with ids 1 and 4 participate. All of the work done by node 1 to receive, process, and send node 4’s original packet is attributed to the ’4:BounceApp’ activity. Although this is a trivial example, the same idea applies to other scenarios, like protocol beacon messages and multihop routing of packets.

Figure 12 shows a 2-second trace from node 1 of a run of Bounce. The log at the other node is symmetrical. In part (a) we see the entire window, and the activities by the CPU, the radio, and two LEDs that are on when the node has “possession” of each packet. In this figure, node 1 receives a packet which carries the 4:BounceApp activity, and turns LED1 on because of that. The energy spent by this LED will be attributed to node 4’s original activity. The node then receives another packet, which carries its own 1:BounceApp activity. LED2’s energy spending will be assigned to node 1’s activity, as well as the subsequent transmission of this same packet.

Figures 12(b) and (c) show in detail a packet reception and transmission, and how activity tracking takes place in these two operations. Again, we keep the interrupt proxy activities separated, although when accounting for resource consumption we should assign the consumption of a proxy activity to the activity to which it binds. The receive operation starts with a timer interrupt for the start of frame delimiter, followed by a long transfer from the radio FIFO buffer to the processor, via the SPI bus. This transfer uses an interrupt for every 2 bytes. When finished, the packet is decoded by the radio stack, and the activity in the packet can be read and assigned to the CPU. The CPU then “paints” the LED with this activity and schedules a timer to send the packet.

Transmission in Bounce is triggered by a timer interrupt that was scheduled upon receive. The timer carries and restores the activity, and “paints” the radio. There are two main phases for transmission. First, the data is transferred to the radio via the SPI bus, and then, after a backoff interval, the actual transmission happens. When the transmission is done, the CPU then turns the LED off and sets its activity to idle.

4.3 Case Studies

Quanto allows a developer to precisely understand and quantify the effects of design decisions, and we discuss three case studies from the TinyOS codebase.

The first one is an investigation of the effect of interference from an 802.11 b/g network on the operation of low-power listening [25]. Low-power listening (LPL) is a family of duty-cycle regimes for the radio in which the receiver stays mostly off, and periodically wakes up to detect whether there is activity on the channel. If there is, it stays on to receive packets; otherwise it goes back to sleep. In the simplest version, a sender must transmit a packet for an interval as long as the receiver’s sleep interval. A higher level of energy in the channel, due to interference from other sources, can cause the receiver to falsely detect activity, and stay on unnecessarily. Since 802.11 b/g and 802.15.4 radios share the 2.4 GHz band, and the former generally has much higher power than the latter, this scenario can be quite common. We used Quanto to measure the impact of such interference. We set an 802.11 b access point to operate on channel 6,

[Figure 13 plot: energy detected over Time (s), with one curve for Channel 17 and one for Channel 26.]
Figure 13: 802.11 b/g interference on the mote 802.15.4 radio. In the top curve the mote was set to the 802.15.4 channel 17, and in the bottom curve, to channel 26. These are, respectively, the closest to and furthest from the 802.11 b channel 6 which was used in the experiment.

[Figure 14 plot: radio activity over Time (ms). Legend: Proxy Receive, VTimer.]
Figure 14: Detail of a normal wake-up period with no activity, in which the radio wakes up and returns to sleep, and of a false-positive activity detection. In the latter, the CPU keeps the radio on for about 100 ms, and turns it off when the timer expires and no packet was received.

      uint32_t time; // local time of the node
      uint32_t ic;   // icount: cumulative energy
      union {
        uint16_t act;        // for ctx changes
        uint16_t powerstate; // for powerstate changes
      };
    } entry_t;
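The log entry above and the logging figures reported in Section 4.4 can be cross-checked with a short calculation. This is a hedged sketch: the struct shown here is only the tail of the declaration, so the two leading one-byte fields (`kind` and `res_id`) are hypothetical placeholders chosen to reach the 12 bytes the paper reports, and the little-endian packed layout is likewise an assumption.

```python
import struct

# Assumed layout: two hypothetical 1-byte fields (kind, res_id), then the
# fields visible in the struct: time (u32), ic (u32), and the 16-bit union.
ENTRY = struct.Struct("<BBIIH")


def pack_entry(kind, res_id, time, ic, value):
    """Serialize one log entry; `value` is the activity or power state."""
    return ENTRY.pack(kind, res_id, time, ic, value)


entry = pack_entry(0, 1, 8000, 123456, 0x0401)

# RAM needed for the 800-entry buffer mentioned in Section 4.4.
buffer_bytes = 800 * ENTRY.size

# Synchronous logging cost for the 48-second Blink run (597 entries,
# 101.7 microseconds each, both figures from Section 4.4).
LOG_COST_US = 101.7
log_time_ms = 597 * LOG_COST_US / 1000
share_of_run_pct = 100 * log_time_ms / (48 * 1000)
```

Under these assumptions the buffer occupies 9600 bytes, and the computed logging time of about 60.71 ms over a 48 s run (roughly 0.13%) agrees with the totals reported in Section 4.4.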
MAC protocol. If two nodes A and B receive the same packet from a third node, and need to respond to it immediately, and if A uses DMA while B uses interrupt-driven communication, A will gain access to the medium more often than B, subverting MAC fairness.

4.4 Costs

We now look at some of the costs associated with our prototype implementation of Quanto. These are summarized in Table 4.

Table 4: Costs associated with logging to RAM.

Cost of logging. The design of Quanto decouples generating event information, like activity and power state changes, from tracking the events. We currently record a log of the events for offline processing. The cost of logging is divided in two parts, one synchronous and one asynchronous. Recording the time and energy for each event has to be done synchronously, as close to the event as possible. Dealing with the recorded information can be done asynchronously.

It is very important to minimize the cost of synchronously recording each sample, as this both limits the rate at which we can capture successive events, and delays operations which must be processed quickly. Our current implementation records a 12-byte log entry for each event, described in Figure 17. We measured the cost of logging to RAM to be 101.7 µs, using the same technique as in [9]. At 1 MHz, this translates to 102 cycles. This time includes 24 µs to read the iCount value, and 19 µs to read a timer value.

Because Quanto uses the CPU to keep track of state and to log changes to state, using it incurs a cost by delaying operations on the CPU, and spending more energy. For the run of Blink in Section 4.2, we logged 597 messages over 48 seconds. The total time spent on the logging itself was 60.71 ms, corresponding to 71.05% of the active CPU time, but only 0.12% of the total CPU time. The total energy spent with logging, assuming that logging is using the CPU and the Constant terms in the regression results, was 0.41 mJ, or 0.08% of the total energy spent. Although the 71% number is high, the majority of applications in these sensor network platforms strive to reduce the CPU duty cycle to save energy, and we expect the same trend of long idle periods to amortize the cost of logging.

The above numbers only concern the synchronous part. We still have to get the data out of the node for the current approach of offline analysis. We have two implementations for this. The first records messages to a fixed buffer in RAM that holds 800 log entries, periodically stops the logging, and dumps the information to the serial port or to the radio. The advantage of this is that the cost of logging, during the period being monitored, is only the cost of the synchronous part.

The second approach allows continuous logging. The processor still collects entries to the memory buffer, and schedules a low priority task to empty the log. This happens only when the CPU would otherwise be idle. Messages are written directly to an output port of the microprocessor, which drives an external synchronous serial interface. Like the Unix top application, Quanto can account for its own logging in this mode as its own activity. For the applications we instrumented, it used between 4 and 15% of the CPU time.

The rate of generated data from Quanto largely depends on the nature of the workload of the application. For the classes of applications that are common in embedded sensor networks, of low data-rate and duty cycle, we believe the overheads are acceptable.

Instrumentation costs. Finally, we look at the burden to instrument a system like TinyOS to allow tracking and propagation of activities and power states. Table 5 lists the main abstractions we had to instrument in TinyOS to achieve propagation of activity labels in our platform, and shows that the changes are highly localized and relatively small in number of lines of code.

                      Files   Diff LOC
    Modified Code
      Tasks              2        25     Concurrency
      Timers             2        16     Deferral
      Arbiter            5        34     Locks
      Interrupts        11        88
      Active Msg.        2         8     Link Layer
      LEDs               2        33     Device Driver
      CC2420 Radio      11       105     Device Driver
      SHT11              3        10     Sensor
    New code            28      1275     Infrastructure

Table 5: The cost of instrumenting most core primitives for activity and power tracking in TinyOS, as well as some representative device drivers, is low in terms of lines of code. New code represents the infrastructure code for keeping track of and logging activities and power states.

The complexity of the instrumentation task varies, and some device drivers with shadowed state that represents volatile state in peripherals can be more challenging to instrument. The CC2420 radio is a good example, as it has several internal power states and does some processing without CPU intervention. Other devices, like the LEDs and simple sensors, are much easier. We found that once the system is instrumented, the burden on the application programmer is small, since all that needs to be done is marking the beginning of relevant activities, which will be tracked and logged automatically.

5 Discussion

We now discuss some of the design tradeoffs and limitations of the approach, and some research directions enabled by this work.

5.1 Design Tradeoffs

Logging vs. counting. Quanto currently logs every power state and activity context change, which can result in a large volume of trace data. The data are useful for reconstructing a fine-grained timeline and tracing causal connections, but this level of detail may be unnecessary in many cases. The design, however, clearly separates the event generation from the event consumption. An alternative would be to maintain a set of counters on the nodes, accumulating time and energy spent per activity. In our initial exploration we decided to examine the full dataset offline, and leave as future work performing the regression and accounting of resources online, which would make the memory overhead fixed and practically eliminate the logging overhead.

Activity model. An important design decision in Quanto is that activities are not hierarchical. While giving more flexibility, representing hierarchies would mean that the system would propagate stacks of activity labels instead of single labels, a significant increase in overhead and complexity. If a module C does work on behalf of two activities, A and B, the instrumenter has two options: to give C its own activity, or to have C’s operations assume the activity of the caller.

Platform hardware support. All of the data in this paper were collected using the HydroWatch platform, but our experiences suggested that a more tailored design would be useful. In particular, we had the options of storing the logs in RAM, which has little impact but limited space, or logging to a processor port, which has a slightly higher cost and can be intrusive at very high loads. We have designed a new platform tailored for profiling with a fast, 128 KB-deep FIFO for full speed logging with very little overhead, which we plan to use in future experiments.

5.2 Limitations

Constant per-state power draws. The regression techniques used to estimate per-component energy usage assume the power draw of a hardware component is approximately constant in each power state. Fortunately, we verified that this assumption largely holds for the platform we instrumented, by looking at different length sampling intervals for each state. The regression may not work well when this assumption fails, but we leave quantifying this for future work.

Linear independence. The regression techniques also assume that tracking power states over time produces a set of linearly independent equations. If this is not the case, for example if unrelated actions always occur together, then regression is unlikely to disambiguate their energy usage. As a workaround, custom routines can be written to exercise different power states independently.

Modifications to systems. Quanto requires the OS, including device drivers, and applications, to be modified to perform tracking. The modifications to the system, however, can be shared among all applications, and the modifications to applications are, in most cases, simple. Device drivers have to be modified so that they expose
the power states of the underlying hardware components. If hardware power states are not observable, estimation errors may occur.

Energy usage visibility. Our approach may not generalize to systems with sophisticated power supply filtering (e.g. power factor correction or large capacitors) because these elements introduce a potentially non-linear phase delay between real and observed energy usage over short time scales, making it difficult to correlate short-lived activities with their energy usage.

Hardware energy metering. Our proposed approach requires hardware support for energy metering, which may not be available on some platforms. Fortunately, the energy meter design we use may be feasible on many systems that use pulse-frequency modulated switching regulators. However, even if hardware-based energy metering is not available, a software-based approach using hardware power models may still provide adequate visibility for some applications.

5.3 Enabled Research

Finding energy leaks. A situation familiar to many developers is discovering that an application draws too much power but not knowing why. Using Quanto, developers can visualize energy usage over time by hardware component, allowing one to work backward to find the offending code that caused the energy leak.

Tracking butterfly effects. In many distributed applications, an action at one node can have network-wide effects. For example, advertising a new version of a code image or initiating a flood will cause significant network-wide action and energy usage. Even minor local actions, like a routing update, can ripple through the entire network. Quanto can trace the causal chain from small, local cause to large, network-wide effect.

Real time tracking. An extension of the framework can include performing the regression online, and replacing the logging with accumulators for time and energy usage per activity. This approach would have significantly reduced bandwidth and storage requirements, and could be used as an always-on, network-wide energy profiler analogous to top.

Energy-Aware Scheduling. Since Quanto already tracks energy usage by activity, an extension to the operating system scheduler would enable energy-aware policies like equal-energy scheduling for threads, rather than equal-time scheduling.

Continuous Profiling. Quanto log entries are lightweight enough that continuous profiling is possible with even a modest speed logging back-channel [1].

6 Related Work

Our techniques borrow heavily from the literature on energy-aware operating systems, power simulation tools, power/energy metering, power profiling, resource containers, and distributed tracing.

ECOSystem [37] proposes the Currentcy model, which treats energy as a first class resource that cuts across all existing system resources, such as CPU, disk, memory, and the network, in a unified manner. Quanto leverages many of the ideas developed in ECOSystem, like tracking power states to allocate energy usage or employing resource containers as the principal to which resource usage is charged. But there are important differences as well. ECOSystem uses offline profiling to relate power state and power draw, and uses a model for runtime operation. In contrast, Quanto tracks the actual energy used at runtime, which is useful when environmental factors can affect energy availability and usage. While ECOSystem tracks energy usage on a single node, Quanto transparently tracks energy usage across the network, which allows network-wide effects to be measured. Finally, the focus of the two efforts is different, although similar techniques are used in both systems.

Eon is a programming language and runtime system that allows paths or flows through the program to be annotated with different energy states [29]. Eon’s runtime then chooses flows to execute, and their rates of execution, to maximize the quality of service under available energy constraints. Eon, like Quanto, uses real-time energy metering but attributes energy usage to flows, which are similar to the activities that Quanto uses.

Several power simulation tools exist that use empirically-generated models of hardware behavior. PowerTOSSIM [28] uses same-code simulation of TinyOS applications with power state tracking, combined with a power model of the different peripheral states, to create a log of energy usage. PowerTOSSIM provides visibility into the power draw based on its model of the hardware, but it does not capture the variability common in real hardware or operating environments, or simulate a device’s interactions with the real world. Quanto also addresses a different problem than PowerTOSSIM: tracing the energy usage of logical activities rather than the time spent in software modules.

The challenge in taking measurements in low-power, embedded systems that exhibit bursty operation is that until recently, the performance of available metering options was simply too poor, and the power cost was simply too high, to use in actual deployments. Traditional instrument-based power measurements are useful for design-time laboratory testing but impractical for everyday run-time use due to the cost of instruments, their physical size, and their poor system integration [14, 12, 35]. Dedicated power metering hardware can enable run-time energy metering, but it too comes at the expense of increased hardware costs and power draws [5, 18]. Using hardware performance counters as a proxy power meter is possible on high-performance microprocessors like the Intel Pentium Pro [20] and embedded microprocessors like the Intel PXA255 [7]. Quanto addresses these challenges with a new design based on a switching regulator [9].

Of course, if a system employs only one switching regulator, then the energy usage can be measured only in the aggregate, rather than by hardware component. This aggregated view of energy usage can present some tracking challenges as well. One way to track the distinct power draws of the hardware components is to instrument their individual power supply lines [34, 30]. These approaches, however, are best suited to bench-scale investigations since they require extensive per-system calibration, and the latter requires considerable additional hardware which would dominate the system power budget in our applications.

The RIALTO operating system [19] introduced activities as the abstraction to which resources are allocated and charged. Resource Containers [2] use a similar notion, and acknowledge that there is a mismatch between traditional OS resource principals, namely threads and processes, and independent activities, especially in high performance network servers. Quanto borrows the concept of activities and extends them across all hardware components and across the nodes in a network.

Several previous works have modeled the behavior of distributed systems as a collection of causal paths, including Magpie [3], Pinpoint [6], X-Trace [15], and Pip [27]. These systems reconstruct causal paths using some combinations of OS-level tracing, application-level annotation, and statistical inference. Causeway [4] instruments the FreeBSD OS to automatically carry metadata with the execution of threads and across machines. Quanto borrows from these earlier approaches and applies them to the problem of tracking network-wide energy usage in embedded systems, where resource constraints and energy consumption by hardware devices raise a number of different design tradeoffs.

unprecedented visibility into energy usage will enable empirical evaluation of the energy-efficiency claims in the literature, provide ground truth for lightweight approximation techniques like counters, and enable energy-aware operating systems research.

8 Acknowledgments

This material is based upon work supported by the National Science Foundation under grants #0435454 (“NeTS-NR”) and #0454432 (“CNS-CRI”). This work was also supported by a National Science Foundation Graduate Research Fellowship and a Microsoft Research Graduate Fellowship as well as generous gifts from Hewlett-Packard Company, Intel Research, Microsoft Corporation, and Sharp Electronics.

References

[1] Anderson, J. M., Berc, L. M., Dean, J., Ghemawat, S., Henzinger, M. R., Leung, S.-T. A., Sites, R. L., Vandevoorde, M. T., Waldspurger, C. A., and Weihl, W. E. Continuous profiling: where have all the cycles gone? ACM Trans. Comput. Syst. 15, 4 (1997), 357–390.
[2] Banga, G., Mogul, J. C., and Druschel, P. Resource containers: A new facility for resource management in server systems. In Proceedings of the Third Symposium on Operating Systems Design and Implementation (OSDI) (February 1999).
[3] Barham, P., Donnelly, A., Isaacs, R., and Mortier, R. Using Magpie for request extraction and workload modelling. In OSDI’04: Proceedings of the 6th conference on Symposium on Operating Systems Design & Implementation (Berkeley, CA, USA, 2004), USENIX Association, pp. 18–18.
[4] Chanda, A., Elmeleegy, K., Cox, A. L., and Zwaenepoel, W. Causeway: operating system support for controlling and analyzing the execution of distributed programs. In HOTOS’05: Proceedings of the 10th conference on Hot Topics in Operating Systems (Berkeley, CA, USA, 2005), USENIX Association, pp. 18–18.
[5] Chang, F., Farkas, K., and Ranganathan, P. Energy-driven statistical profiling: Detecting software hotspots. In Workshop on Power-Aware Computer Systems (February 2002).
[6] Chen, M. Y., Accardi, A., Kiciman, E., Lloyd, J., Patterson, D., Fox, A., and Brewer, E. Path-based failure and evolution management. In NSDI’04: Proceedings of the 1st conference on Symposium on Networked Systems Design and Im-
plementation (Berkeley, CA, USA, 2004), USENIX Association,
pp. 23–23.
7 Conclusion
[7] C ONTRERAS , G., AND M ARTONOSI , M. Power prediction for
intel xscale processors using performance monitoring unit events.
The techniques developed and evaluated in this paper – In ISLPED ’05: Proceedings of the 2005 international sympo-
breaking down the aggregate energy usage of a system sium on Low power electronics and design (New York, NY, USA,
by hardware component, tracking causally-connected en- 2005), ACM, pp. 221–226.
ergy usage of programmer-defined activities, and track- [8] D EAN , J., H ICKS , J. E., WALDSPURGER , C. A., W EIHL ,
ing the network-wide energy usage due to node-local ac- W. E., AND C HRYSOS , G. Profileme: hardware support for
instruction-level profiling on out-of-order processors. In MICRO
tions – collectively provide visibility into when, where,
30: Proceedings of the 30th annual ACM/IEEE international
and why energy is consumed both within a single node symposium on Microarchitecture (Washington, DC, USA, 1997),
and across the network. Going forward, we believe this IEEE Computer Society, pp. 292–302.
[9] DUTTA, P., FELDMEIER, M., PARADISO, J., AND CULLER, D. Energy metering for free: Augmenting switching regulators for real-time monitoring. In IPSN’08: International Conference on Information Processing in Sensor Networks (2008), pp. 283–294.

[10] DUTTA, P., TANEJA, J., JEONG, J., JIANG, X., AND CULLER, D. A building block approach to sensornet systems. In Proceedings of the Sixth ACM Conference on Embedded Networked Sensor Systems (SenSys’08) (Nov. 2008).

[11] ELSON, J., GIROD, L., AND ESTRIN, D. Fine-grained network time synchronization using reference broadcasts. In OSDI ’02: Proceedings of the 5th symposium on Operating systems design and implementation (New York, NY, USA, 2002), ACM, pp. 147–163.

[12] FARKAS, K. I., FLINN, J., BACK, G., GRUNWALD, D., AND ANDERSON, J. M. Quantifying the energy consumption of a pocket computer and a Java virtual machine. SIGMETRICS Perform. Eval. Rev. 28, 1 (2000), 252–263.

[13] FLINN, J., AND SATYANARAYANAN, M. Energy-aware adaptation for mobile applications. In Symposium on Operating Systems Principles (SOSP’99) (1999), pp. 48–63.

[14] FLINN, J., AND SATYANARAYANAN, M. PowerScope: A tool for profiling the energy usage of mobile applications. In WMCSA ’99: Proceedings of the Second IEEE Workshop on Mobile Computer Systems and Applications (Washington, DC, USA, 1999), IEEE Computer Society, p. 2.

[15] FONSECA, R., PORTER, G., KATZ, R. H., SHENKER, S., AND STOICA, I. X-Trace: A pervasive network tracing framework. In NSDI’07: Proceedings of the 4th USENIX/ACM Symposium on Networked Systems Design and Implementation (2007), USENIX.

[16] HEMPSTEAD, M., TRIPATHI, N., MAURO, P., WEI, G.-Y., AND BROOKS, D. An ultra low power system architecture for sensor network applications. In ISCA’05: 32nd International Symposium on Computer Architecture (2005).

[17] HILL, J., AND CULLER, D. E. Mica: a wireless platform for deeply embedded networks. IEEE Micro 22, 6 (Nov./Dec. 2002), 12–24.

[18] JIANG, X., DUTTA, P., CULLER, D., AND STOICA, I. Micro power meter for energy monitoring of wireless sensor networks at scale. In IPSN ’07: Proceedings of the 6th international conference on Information processing in sensor networks (New York, NY, USA, 2007), ACM Press, pp. 186–195.

[19] JONES, M. B., LEACH, P. J., DRAVES, R. P., AND BARRERA, J. S. Modular real-time resource management in the Rialto operating system. In HOTOS ’95: Proceedings of the Fifth Workshop on Hot Topics in Operating Systems (HotOS-V) (Washington, DC, USA, 1995), IEEE Computer Society, p. 12.

[20] JOSEPH, R., AND MARTONOSI, M. Run-time power estimation in high performance microprocessors. In Proceedings of the International Symposium on Low Power Electronics and Design (2001), pp. 135–140.

[21] KLUES, K., HANDZISKI, V., LU, C., WOLISZ, A., CULLER, D., GAY, D., AND LEVIS, P. Integrating concurrency control and energy management in device drivers. In SOSP ’07: Proceedings of twenty-first ACM SIGOPS symposium on Operating systems principles (New York, NY, USA, 2007), ACM, pp. 251–264.

[22] MADDEN, S., FRANKLIN, M. J., HELLERSTEIN, J. M., AND HONG, W. TAG: a tiny aggregation service for ad-hoc sensor networks. In OSDI ’02: Proceedings of the 5th symposium on Operating systems design and implementation (New York, NY, USA, 2002), ACM, pp. 131–146.

[23] MUSĂLOIU-E., R., JIANG, C.-J. M., AND TERZIS, A. Koala: Ultra-Low Power Data Retrieval in Wireless Sensor Networks. In Proceedings of the 7th International Symposium on Information Processing in Sensor Networks (IPSN) (2008).

[24] PETROVA, M., RIIHIJARVI, J., MAHONEN, P., AND LABELLA, S. Performance study of IEEE 802.15.4 using measurements and simulations. In Wireless Communications and Networking Conference 2006 (WCNC 2006) (April 2006), vol. 1, pp. 487–492.

[25] POLASTRE, J., HILL, J., AND CULLER, D. Versatile low power media access for wireless sensor networks. In Proceedings of the Second ACM Conference on Embedded Networked Sensor Systems (SenSys) (November 2004).

[26] POLASTRE, J., HUI, J., LEVIS, P., ZHAO, J., CULLER, D., SHENKER, S., AND STOICA, I. A unifying link abstraction for wireless sensor networks. In SenSys ’05: Proceedings of the 3rd international conference on Embedded networked sensor systems (New York, NY, USA, 2005), ACM, pp. 76–89.

[27] REYNOLDS, P., KILLIAN, C., WIENER, J. L., MOGUL, J. C., SHAH, M. A., AND VAHDAT, A. Pip: detecting the unexpected in distributed systems. In NSDI’06: Proceedings of the 3rd conference on 3rd Symposium on Networked Systems Design & Implementation (Berkeley, CA, USA, 2006), USENIX Association, pp. 9–9.

[28] SHNAYDER, V., HEMPSTEAD, M., RONG CHEN, B., WERNER-ALLEN, G., AND WELSH, M. Simulating the power consumption of large-scale sensor network applications. In Proceedings of the Second ACM Conference on Embedded Networked Sensor Systems (SenSys’04) (2004).

[29] SORBER, J., KOSTADINOV, A., GARBER, M., BRENNAN, M., CORNER, M. D., AND BERGER, E. D. Eon: a language and runtime system for perpetual systems. In SenSys ’07: Proceedings of the 5th international conference on Embedded networked sensor systems (2007), pp. 161–174.

[30] STATHOPOULOS, T., MCINTIRE, D., AND KAISER, W. The energy endoscope: Real-time detailed energy accounting for wireless sensor nodes. In IPSN’08: International Conference on Information Processing in Sensor Networks (2008), pp. 383–394.

[31] SZEWCZYK, R., POLASTRE, J., MAINWARING, A., AND CULLER, D. Lessons From A Sensor Network Expedition. In Proceedings of the First European Workshop on Wireless Sensor Networks (EWSN) (2004).

[32] TALZI, I., HASLER, A., GRUBER, S., AND TSCHUDIN, C. PermaSense: Investigating Permafrost with a WSN in the Swiss Alps. In Proceedings of the Fourth Workshop on Embedded Networked Sensors (EmNets) (2007).

[33] TOLLE, G., AND CULLER, D. Design of an Application-Cooperative Management System for Wireless Sensor Networks. In Proceedings of the Second European Workshop of Wireless Sensor Networks (EWSN) (2005).

[34] VIREDAZ, M. A., AND WALLACH, D. A. Power evaluation of a handheld computer. IEEE Micro 23, 1 (2003), 66–74.

[35] WERNER-ALLEN, G., SWIESKOWSKI, P., AND WELSH, M. MoteLab: a wireless sensor network testbed. In IPSN ’05: Proceedings of the 4th international symposium on Information processing in sensor networks (Piscataway, NJ, USA, 2005), IEEE Press, p. 68.

[36] YE, W., HEIDEMANN, J., AND ESTRIN, D. An energy-efficient MAC protocol for wireless sensor networks. In INFOCOM’02: The 21st Annual Joint Conference of the IEEE Computer and Communications Societies (June 2002).

[37] ZENG, H., ELLIS, C. S., LEBECK, A. R., AND VAHDAT, A. ECOSystem: Managing energy as a first class operating system resource. In Tenth International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS X) (2002), pp. 123–132.