
Each Image Matters, Even Among Millions: Scaling up QoE-driven Delivery of Image-rich Web Applications

By Parvez Ahammad

It comes as no surprise that when it comes to image-rich web applications, every single image
matters in defining the quality of experience (QoE) for the end user. So how does one offer
individually-tuned settings for optimal image delivery while being able to scale up to millions
of images across the entire web delivery pipeline? Theres some fun math that goes into
answering this question; at Instart Logic, we call it SmartVision technology. Today, aligned
with the public release of our first formal academic publication describing SmartVision
technology, let me use this blog post to give you the basic ideas behind the technical core of
this technology and how it enables optimized delivery of image-rich web applications as a
whole, while selecting individually-tuned settings for each image within a given web
application.
Intuitively speaking, the key to optimal delivery of an image is to have a content-dependent
signature (or hash code) for computing the impact of web delivery on the given image, and
using said signature to prioritize various constituent parts of the image file. In our work, we
developed a simple computational signature that captures the impact of the web delivery pipeline on image quality; we call it VoQS (variation of quality signature). In our experiments, we also discovered that large corpora of images can be effectively split into coherent clusters based on VoQS similarity. Taken together, these two simple insights yield an efficient algorithmic approach, SmartVision, for finding adaptive settings for each individual image delivered via a web delivery service.
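To make the idea concrete, here is a minimal sketch of what such a content-dependent signature could look like. Everything below (the `quantize` stand-in for a delivery setting, the PSNR-based score, the particular settings sweep) is our own illustration, not the actual VoQS definition from the paper:

```python
import math

def mse(a, b):
    """Mean squared error between two equal-length pixel sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def psnr(original, degraded, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images match."""
    err = mse(original, degraded)
    return float("inf") if err == 0 else 10 * math.log10(peak ** 2 / err)

def quantize(pixels, step):
    """Toy stand-in for one lossy delivery setting: a coarser step
    simulates a more aggressive (lower-quality) setting."""
    return [round(p / step) * step for p in pixels]

def voqs(pixels, steps=(1, 8, 16, 32, 64)):
    """Illustrative variation-of-quality signature: the vector of quality
    scores obtained by sweeping the image through the delivery settings."""
    return [psnr(pixels, quantize(pixels, s)) for s in steps]

flat = [128] * 64                            # flat patch: robust to degradation
busy = [(i * 37) % 256 for i in range(64)]   # textured patch: sensitive to it
```

Two images with similar signatures degrade similarly under the delivery pipeline, which is what makes clustering on the signature meaningful.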
For technical details on the algorithm and experimental results on empirical datasets, please
see the academic publication that we are presenting today at the ACM (Association for
Computing Machinery) Multimedia Conference. While there is a large body of research out
there on the topics of image categorization and computer vision-based image content
analysis, our paper is one of the first publications (to our knowledge) to directly address quality-dependent image categorization in the context of web delivery.
The following flowchart shows how the SmartVision algorithm works:

As you can see in the flowchart, the categorization part can be done offline (with
intermittent updates) to adapt to a changing image corpus pooled across the web delivery
service. The real-time aspect depends only on efficient computation of VoQS and a nearest-neighbor lookup against the pre-stored exemplars that were estimated during the offline categorization step.
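As a sketch, the online step reduces to a plain nearest-neighbor search over the pre-stored exemplars. The Euclidean distance, the cluster names, and the settings payloads below are illustrative assumptions of ours, not the exact metric or data model from the paper:

```python
import math

def distance(a, b):
    """Euclidean distance between two VoQS vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_exemplar(signature, exemplars):
    """Return the (cluster_id, (voqs_vector, settings)) entry whose exemplar
    signature is closest to the incoming image's signature."""
    return min(exemplars.items(),
               key=lambda kv: distance(signature, kv[1][0]))

# Hypothetical exemplars produced by the offline categorization step:
exemplars = {
    "photos":   ([40.0, 34.0, 28.0], {"quality": 70}),
    "graphics": ([55.0, 50.0, 47.0], {"quality": 85}),
}

cluster, (_, settings) = nearest_exemplar([41.0, 33.0, 29.0], exemplars)
```

The incoming image simply inherits the delivery settings tuned for its nearest exemplar, so the per-image online cost stays small.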
While message-passing algorithms such as Affinity Propagation [Frey & Dueck, 2007] have two advantages (one does not need to pre-specify the expected number of clusters, and cluster-specific exemplars emerge as a by-product), the algorithmic complexity of Affinity Propagation makes it impractical for very large image datasets (such as the ones we encounter with our Software-Defined Application Delivery service). In scenarios where the
image corpus is very large, one can use faster algorithms such as K-means (with appropriate
care and safety checks) for clustering, and choose the image exemplars by minimizing
aggregate distance in the VoQS metric space. It is worth noting that the entire algorithmic
flow (and the categorization aspect) happens in an unsupervised fashion, so it is highly
amenable to automation in the context of an always-on web delivery service. In our experiments, we found that we could quickly find optimal delivery thresholds for a large corpus of images while minimizing the loss of visual quality (see Figure 3 in our ACM Multimedia paper). In addition, our approach does not depend on any particular image format; thus, we can apply it to most of the popular image formats used by the web community.
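A minimal sketch of this offline step follows: a plain Lloyd's-style K-means over VoQS vectors (without the "care and safety checks" a production system would need), plus exemplar selection by minimizing aggregate distance within each cluster. All names and data here are our own illustration:

```python
import math
import random

def dist(a, b):
    """Euclidean distance in the VoQS metric space."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def kmeans(signatures, k, iters=50, seed=0):
    """Plain Lloyd's K-means over VoQS vectors; returns cluster assignments."""
    rng = random.Random(seed)
    centers = rng.sample(signatures, k)
    assign = [0] * len(signatures)
    for _ in range(iters):
        assign = [min(range(k), key=lambda c: dist(s, centers[c]))
                  for s in signatures]
        for c in range(k):
            members = [s for s, a in zip(signatures, assign) if a == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = mean(members)
    return assign

def medoid(members):
    """Exemplar choice: the member minimizing aggregate distance
    to everything else in its cluster."""
    return min(members, key=lambda m: sum(dist(m, other) for other in members))
```

Because the exemplar is an actual cluster member (rather than a synthetic centroid), its delivery settings are directly reusable for the images that map to it.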
At Instart Logic, we use the SmartVision algorithmic pipeline in two related but different
contexts. One application scenario (termed True Fidelity Image Streaming) is to divide the image into parts so that the bits of the image file most relevant to the user's quality of experience (QoE) are delivered up front in a first pass. This quick first pass allows an image-rich web application to load quickly and deliver fast user interaction. Meanwhile, Instart Logic's client-cloud architecture continually works in the background to enable a seamless backfill, so that the remaining details are incorporated into the image quickly, without impacting the interaction time, while ensuring that the full quality of the original image is delivered. (Note, though, that such a streaming approach requires the user to have our thin JavaScript-based client, Nanovisor.js, running in their web browser.)
So what can you do when the client isn't installed on the target device, as is the case
with a native mobile application?
For users who do not have an environment that can run our JavaScript client, we can use the
SmartVision technology to automatically determine the optimal threshold on the server-side,
and just send the part of the image file that delivers a good QoE compared to the original. In
congested mobile networks, for users with low-complexity devices, or in other scenarios
where network footprint comes at a premium, such a server-side approach can deliver
dramatic improvement in web application interactivity without significantly sacrificing the
visual quality-of-experience (QoE). We term this application scenario Image Transcoding with
SmartVision. This approach allows us to improve application delivery performance through a
server-side transformation.
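A sketch of the server-side decision, assuming the image is progressively encoded so that any byte prefix is itself a renderable approximation. The byte counts, quality scores, and the `pick_threshold` helper below are hypothetical numbers and names of our own, not values from our system:

```python
def pick_threshold(curve, target_quality):
    """Given (prefix_bytes, quality_score) points measured for a progressively
    encoded image, return the smallest prefix reaching the target quality, or
    the full file size if none does. Quality is assumed monotone in bytes."""
    for prefix_bytes, quality in sorted(curve):
        if quality >= target_quality:
            return prefix_bytes
    return max(b for b, _ in curve)

# Hypothetical scan-boundary measurements for one image:
curve = [(4_000, 0.62), (9_000, 0.81), (15_000, 0.93), (31_000, 1.00)]
threshold = pick_threshold(curve, target_quality=0.90)  # -> 15_000
```

The server then transmits only the first `threshold` bytes of the file, trading a small, bounded quality loss for a much smaller network footprint.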
For further technical details and empirical experimental results, see our ACM Multimedia publication.
