
FRANCO-GERMAN SUMMER UNIVERSITY FOR YOUNG RESEARCHERS 2011
(UNIVERSITÉ D'ÉTÉ FRANCO-ALLEMANDE POUR JEUNES CHERCHEURS / DEUTSCH-FRANZÖSISCHE SOMMERUNIVERSITÄT FÜR NACHWUCHSWISSENSCHAFTLER)

CLOUD COMPUTING:
CHALLENGES AND OPPORTUNITIES

Windows Azure as a Platform as a Service (PaaS)


17.7. - 22.7.2011
Jared Jackson
Microsoft Research

Before we begin: Some Results


[Figure: two pie charts of ice cream flavor percentages, "Ice Cream Consumption" and "Favorite Ice Cream" (attendee survey).]

Source: International Ice Cream Association (makeicecream.com)

Windows Azure Overview

Web Application Model Comparison

Ad Hoc Application Model:
- Machines running IIS / ASP.NET
- Machines running Windows Services
- Machines running SQL Server

Windows Azure Application Model:
- Web Role instances
- Worker Role instances
- Azure Storage (Blob / Queue / Table) and SQL Azure

Key Components

Fabric Controller
Manages hardware and virtual machines for the service

Compute
Web Roles: web application front end
Worker Roles: utility compute
VM Roles: custom compute role; you own and customize the VM

Storage
Blobs: binary objects
Tables: entity storage
Queues: role coordination

SQL Azure
SQL in the cloud

Key Components
Fabric Controller

Think of it as an automated IT department


Cloud Layer on top of:
Windows Server 2008
A custom version of Hyper-V called the Windows Azure Hypervisor
Allows for automated management of virtual machines

Its job is to provision, deploy, monitor, and maintain applications in data centers.
Applications have a shape and a configuration.

The service definition describes the shape of a service:

Role types
Role VM sizes
External and internal endpoints
Local storage

The service configuration configures a running service:

Instance count
Storage keys
Application-specific settings

Key Components
Fabric Controller

Manages nodes and edges in the fabric (the hardware)

Power-on automation devices


Routers / Switches
Hardware load balancers
Physical servers
Virtual servers

State transitions
Current State
Goal State
Does what is needed to reach and maintain the goal state

It's a perfect IT employee!

Never sleeps
Doesn't ever ask for a raise
Always does what you tell it to do in the service definition and configuration settings

Creating a New Project

Windows Azure Compute

Key Components - Compute


Web Roles
Web Front End
Cloud web server
Web pages
Web services

You can create the following types:


ASP.NET web roles
ASP.NET MVC 2 web roles
WCF service web roles
Worker roles
CGI-based web roles

Key Components - Compute


Worker Roles

Utility compute
Windows Server 2008
Background processing
Each role can define an amount of local storage.
Protected space on the local drive, considered volatile storage.
May communicate with outside services
Azure Storage
SQL Azure
Other Web services
Can expose external and internal endpoints

Suggested Application Model


Using queues for reliable messaging

Scalable, Fault Tolerant Applications


Queues are the application glue:
Decouple parts of the application, so each can be scaled independently
Resource allocation: different priority queues and backend servers
Mask faults in worker roles (reliable messaging)
A minimal sketch of this pattern follows below.
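A minimal sketch of the work-ticket flow, using the same 2011-era Microsoft.WindowsAzure.StorageClient library as the code later in this deck (the connection string, queue name, and helper names are illustrative, not part of the original):

using System;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

public class WorkTicketExample
{
    // Illustrative account; replace with your own connection string.
    static CloudStorageAccount account =
        CloudStorageAccount.Parse("DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=mykey");

    // Web role side: enqueue a small "work ticket" that points at a blob holding the real payload.
    public static void EnqueueTicket(string blobName)
    {
        CloudQueueClient queueClient = account.CreateCloudQueueClient();
        CloudQueue queue = queueClient.GetQueueReference("workitems");
        queue.CreateIfNotExist();
        queue.AddMessage(new CloudQueueMessage(blobName));   // the ticket stays well under the 8KB limit
    }

    // Worker role side: dequeue, process, and delete only after success (at-least-once semantics).
    public static void ProcessOneTicket()
    {
        CloudQueueClient queueClient = account.CreateCloudQueueClient();
        CloudQueue queue = queueClient.GetQueueReference("workitems");
        CloudQueueMessage msg = queue.GetMessage(TimeSpan.FromMinutes(5)); // visibility timeout
        if (msg == null) return;                                           // queue is empty

        ProcessBlob(msg.AsString);      // do the real work - make it idempotent
        queue.DeleteMessage(msg);       // if we crash before this, the message reappears for another worker
    }

    static void ProcessBlob(string blobName) { /* application-specific work */ }
}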

Key Components - Compute

VM Roles

Customized role: you own the box
How it works:
Download the base guest OS to Server 2008 Hyper-V
Customize the OS as you need to
Upload the difference VHD
Azure runs your VM role using the base OS plus your difference VHD

Application Hosting

Grokking the service model


Imagine white-boarding out your service architecture with boxes for nodes and arrows
describing how they communicate
The service model is the same diagram written down in a declarative format
You give the Fabric the service model and the binaries that go with each of those nodes
The Fabric can provision, deploy and manage that diagram for you
Find a hardware home

Copy and launch your app binaries


Monitor your app and the hardware
In case of failure, take action. Perhaps even relocate your app

At all times, the diagram stays whole

Automated Service Management


Provide code + service model
Platform identifies and allocates resources, deploys the service, manages
service health
Configuration is handled by two files
ServiceDefinition.csdef
ServiceConfiguration.cscfg

Service Definition

Service Configuration

GUI
Double click on Role Name in Azure Project

Deploying to the cloud


We can deploy from the portal or from script
VS builds two files.
Encrypted package of your code
Your config file
You must create an Azure account, then a service, and then
you deploy your code.
Can take up to 20 minutes
(which is better than six months)

Service Management API


REST based API to manage your services
X509-certs for authentication
Lets you create, delete, change, upgrade, swap, and more
Lots of community and MSFT-built tools around the API
- Easy to roll your own
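A rough sketch of calling the Service Management API directly with an HttpWebRequest and a management certificate; the subscription ID, certificate thumbprint, and version string below are placeholders, not values from the original deck:

using System;
using System.IO;
using System.Net;
using System.Security.Cryptography.X509Certificates;

class ListHostedServices
{
    static void Main()
    {
        string subscriptionId = "<subscription-id>";          // placeholder
        string thumbprint = "<management-cert-thumbprint>";   // placeholder

        // Find the management certificate that was uploaded to the subscription.
        X509Store store = new X509Store(StoreName.My, StoreLocation.CurrentUser);
        store.Open(OpenFlags.ReadOnly);
        X509Certificate2 cert = store.Certificates
            .Find(X509FindType.FindByThumbprint, thumbprint, false)[0];

        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(
            "https://management.core.windows.net/" + subscriptionId + "/services/hostedservices");
        request.Method = "GET";
        request.Headers.Add("x-ms-version", "2011-02-25");    // version header; check the currently supported value
        request.ClientCertificates.Add(cert);                  // the X509 cert authenticates the call

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (StreamReader reader = new StreamReader(response.GetResponseStream()))
        {
            Console.WriteLine(reader.ReadToEnd());              // XML list of hosted services
        }
        store.Close();
    }
}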

The Secret Sauce - The Fabric

The Fabric is the brain behind Windows Azure:
1. Process the service model
   1. Determine resource requirements
   2. Create role images
2. Allocate resources
3. Prepare nodes
   1. Place role images on nodes
   2. Configure settings
   3. Start roles
4. Configure load balancers
5. Maintain service health
   1. If a role fails, restart the role, based on policy
   2. If a node fails, migrate the role, based on policy

Storage

Durable Storage, At Massive Scale


Blob
- Massive files e.g. videos, logs

Drive
- Use standard file system APIs

Tables
- Non-relational, but with few scale limits
- Use SQL Azure for relational data
Queues
- Facilitate loosely-coupled, reliable, systems

Blob Features and Functions


Store Large Objects (up to 1TB in size)
Can be served through Windows
Azure CDN service
Standard REST Interface
PutBlob
Inserts a new blob, overwrites the existing blob

GetBlob
Get whole blob or a specific range

DeleteBlob
CopyBlob
SnapshotBlob
LeaseBlob

Two Types of Blobs Under the Hood


Block Blob
Targeted at streaming workloads
Each blob consists of a sequence of blocks
Each block is identified by a Block ID
Size limit: 200GB per blob

Page Blob
Targeted at random read/write workloads
Each blob consists of an array of pages
Each page is identified by its offset from the start of the blob
Size limit: 1TB per blob

Windows Azure Drive


Provides a durable NTFS volume for Windows Azure applications to use
Use existing NTFS APIs to access a durable drive
Durability and survival of data on application failover
Enables migrating existing NTFS applications to the cloud

A Windows Azure Drive is a Page Blob
Example: mount a Page Blob as X:\
http://<accountname>.blob.core.windows.net/<containername>/<blobname>

All writes to the drive are made durable to the Page Blob
The drive is made durable through standard Page Blob replication
The drive persists as a Page Blob even when not mounted
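A rough sketch of creating and mounting a drive from role code, assuming the 2011-era CloudDrive API; the local resource name, blob path, and sizes are illustrative:

using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.ServiceRuntime;   // RoleEnvironment
using Microsoft.WindowsAzure.StorageClient;    // CloudDrive and its extension methods

public class DriveExample
{
    public static string MountDataDrive(CloudStorageAccount account)
    {
        // Local resource (declared in ServiceDefinition.csdef) used as the drive's read cache.
        LocalResource cache = RoleEnvironment.GetLocalResource("DriveCache");   // "DriveCache" is an assumed name
        CloudDrive.InitializeCache(cache.RootPath, cache.MaximumSizeInMegabytes);

        // The drive is backed by a Page Blob, e.g. http://<account>.blob.core.windows.net/drives/data.vhd
        CloudDrive drive = account.CreateCloudDrive(
            account.BlobEndpoint + "/drives/data.vhd");

        try { drive.Create(1024); }        // create a 1GB drive the first time
        catch (CloudDriveException) { }    // the drive already exists - fine

        // Mount returns a path such as "X:\" that existing NTFS code can use directly.
        return drive.Mount(cache.MaximumSizeInMegabytes, DriveMountOptions.None);
    }
}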

Windows Azure Tables


Provides Structured Storage
Massively Scalable Tables
Billions of entities (rows) and TBs of data
Can use thousands of servers as traffic grows

Highly Available & Durable


Data is replicated several times

Familiar and Easy to use API


WCF Data Services and OData
.NET classes and LINQ
REST with any platform or language

Windows Azure Queues


Queues are performance-efficient, highly available, and provide reliable message delivery
Simple, asynchronous work dispatch
Programming semantics ensure that a message can be processed at least once

Access is provided via REST

Storage Partitioning
Understanding partitioning is key to understanding performance

Every data object has a partition key
Different for each data type (blobs, entities, queues)
Controls entity locality

Partition key is the unit of scale
A partition can be served by a single server
The system load balances partitions based on traffic pattern
Load balancing can take a few minutes to kick in
It can take a couple of seconds for a partition to become available on a different server

Server Busy
Use exponential backoff on Server Busy: either the system is load balancing to meet your traffic needs, or single-partition limits have been reached

Partition Keys In Each Abstraction

Blobs: Container name + Blob name
Every blob and its snapshots are in a single partition

Container Name | Blob Name
image | annarbor/bighouse.jpg
image | foxborough/gillette.jpg
video | annarbor/bighouse.jpg

Entities: TableName + PartitionKey
Entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind) | Name | CreditCardNumber | OrderTotal
Customer-John Smith | ... | John Smith | xxxx-xxxx-xxxx-xxxx |
Customer-John Smith | Order 1 | | | $35.12
Customer-Bill Johnson | ... | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
Customer-Bill Johnson | Order 3 | | | $10.00

Messages: Queue Name
All messages for a single queue belong to the same partition

Queue | Message
jobs | Message1
jobs | Message2
workflow | Message1

Scalability Targets

Storage Account
Capacity: up to 100 TB
Transactions: up to a few thousand requests per second
Bandwidth: up to a few hundred megabytes per second

Single Blob Partition
Throughput: up to 60 MB/s

Single Queue/Table Partition
Up to 500 transactions per second

To go above these numbers, partition between multiple storage accounts and partitions
When a limit is hit, the app will see "503 Server Busy"; applications should implement exponential backoff (a retry sketch follows below)
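One way to implement that backoff, sketched with the StorageClient library used elsewhere in this deck; the retry counts and delays are arbitrary choices, not values from the original:

using System;
using System.Threading;
using Microsoft.WindowsAzure.StorageClient;

public static class RetryHelper
{
    // Retry an operation with truncated exponential backoff when the storage
    // service reports a server-side problem such as "503 Server Busy".
    public static void WithBackoff(Action operation, int maxAttempts = 6)
    {
        TimeSpan delay = TimeSpan.FromMilliseconds(100);     // initial delay (arbitrary)
        TimeSpan maxDelay = TimeSpan.FromSeconds(10);        // truncation point (arbitrary)

        for (int attempt = 1; ; attempt++)
        {
            try
            {
                operation();
                return;
            }
            catch (StorageServerException)                   // 5xx from the storage service
            {
                if (attempt >= maxAttempts) throw;
                Thread.Sleep(delay);
                delay = TimeSpan.FromMilliseconds(
                    Math.Min(delay.TotalMilliseconds * 2, maxDelay.TotalMilliseconds));
            }
        }
    }
}

// Usage: RetryHelper.WithBackoff(() => container.CreateIfNotExist());

The StorageClient library also ships built-in retry policies (e.g. RetryPolicies.RetryExponential) that can be attached to a blob, table, or queue client for a similar effect.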

Partitions and Partition Ranges

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | ... | 2009
Action | The Bourne Ultimatum | ... | 2007
Animation | Open Season 2 | ... | 2009
Animation | The Ant Bully | ... | 2006
Comedy | Office Space | ... | 1999
SciFi | X-Men Origins: Wolverine | ... | 2009
War | Defiance | ... | 2008

Entities with the same PartitionKey form a partition; contiguous partitions form partition ranges that can be split across servers as the table grows.

Key Selection: Things to Consider


Scalability

Distribute load as much as possible


Hot partitions can be load balanced
PartitionKey is critical for scalability

Query Efficiency & Speed

Avoid frequent large scans


Parallelize queries
Point queries are most efficient

Entity group transactions

Transactions across a single partition


Transaction semantics & Reduce round trips

See http://www.microsoftpdc.com/2009/SVC09
and http://azurescope.cloudapp.net
for more information

Expect Continuation Tokens - Seriously!

Maximum of 1000 rows in a response
At the end of a partition range boundary
Maximum of 5 seconds to execute the query
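The StorageClient library used later in this deck can follow continuation tokens for you when the query is wrapped with AsTableServiceQuery(); a hedged sketch, with the entity type and table name made up for illustration:

using System.Linq;
using Microsoft.WindowsAzure.StorageClient;

public class MovieEntity : TableServiceEntity   // illustrative entity
{
    public string Title { get; set; }
    public int ReleaseYear { get; set; }
}

public static class ContinuationExample
{
    public static int CountActionMovies(TableServiceContext context)
    {
        CloudTableQuery<MovieEntity> query =
            (from m in context.CreateQuery<MovieEntity>("Movies")
             where m.PartitionKey == "Action"
             select m).AsTableServiceQuery();

        int count = 0;
        // Enumerating a CloudTableQuery issues the follow-up requests carrying the
        // continuation tokens; a plain DataServiceQuery would stop at the first
        // 1000-row / 5-second / partition-boundary cut-off.
        foreach (MovieEntity m in query)
            count++;
        return count;
    }
}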

Tables Recap

Benefits: efficient for frequently used queries; supports batch transactions; distributes load.

Best practices:
Select a PartitionKey and RowKey that help scale
Avoid append-only patterns: distribute by using a hash etc. as a prefix
Always handle continuation tokens: expect them for range queries
OR predicates are not optimized: execute the queries that form the OR predicates as separate queries
Implement a back-off strategy for retries: Server Busy means partitions are being load balanced to meet traffic needs, or the load on a single partition has exceeded the limits

WCF Data Services

Use a new context for each logical operation


AddObject/AttachTo can throw exception if entity is already being tracked
Point query throws an exception if resource does not exist. Use
IgnoreResourceNotFoundException
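For example, a small helper in the style of the TableHelper code later in this deck:

using Microsoft.WindowsAzure.StorageClient;

public static class ContextFactory
{
    public static TableServiceContext CreateContext(CloudTableClient client)
    {
        TableServiceContext context = client.GetDataServiceContext();
        // Without this, a point query (PartitionKey + RowKey lookup) that matches nothing
        // throws a DataServiceQueryException instead of simply returning no entity.
        context.IgnoreResourceNotFoundException = true;
        return context;
    }
}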

Queues
Their Unique Role in Building Reliable, Scalable Applications
Want roles that work closely together, but are not bound together.
Tight coupling leads to brittleness
This can aid in scaling and performance

A queue can hold an unlimited number of messages


Messages must be serializable as XML
Limited to 8KB in size
Commonly use the work ticket pattern

Why not simply use a table?

Queue Terminology

Message Lifecycle
A Web Role adds a message with PutMessage; a Worker Role retrieves it with GetMessage, which hides it for the visibility timeout, and removes it with DeleteMessage; if the timeout expires first, the message becomes visible again to other workers.

PutMessage request:
POST http://myaccount.queue.core.windows.net/myqueue/messages

GetMessage response:
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0
Date: Tue, 09 Dec 2008 21:04:30 GMT

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dG...dGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

RemoveMessage request:
DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw

Truncated Exponential Back Off Polling

Consider a back-off polling approach:
Each empty poll increases the polling interval by 2x
A successful poll resets the interval back to 1
Cap the interval at some maximum (hence "truncated")
A minimal sketch follows below.
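A minimal sketch of such a polling loop for a worker role; the interval choices are arbitrary:

using System;
using System.Threading;
using Microsoft.WindowsAzure.StorageClient;

public class BackoffPoller
{
    public void PollQueue(CloudQueue queue)
    {
        TimeSpan interval = TimeSpan.FromSeconds(1);        // starting interval
        TimeSpan maxInterval = TimeSpan.FromSeconds(32);    // truncation point

        while (true)
        {
            CloudQueueMessage msg = queue.GetMessage();
            if (msg != null)
            {
                Process(msg);                                // application-specific work
                queue.DeleteMessage(msg);
                interval = TimeSpan.FromSeconds(1);          // successful poll: reset the interval
            }
            else
            {
                Thread.Sleep(interval);                      // empty poll: wait, then double the interval
                interval = TimeSpan.FromSeconds(
                    Math.Min(interval.TotalSeconds * 2, maxInterval.TotalSeconds));
            }
        }
    }

    void Process(CloudQueueMessage msg) { /* application-specific work */ }
}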

Removing Poison Messages

Scenario: producers P1 and P2 put messages on queue Q; consumers C1 and C2 dequeue them with a 30-second visibility timeout.

1. C1: GetMessage(Q, 30 s) returns msg 1
2. C2: GetMessage(Q, 30 s) returns msg 2
3. C2 consumes msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashes
6. msg 1 becomes visible again 30 s after its dequeue
7. C2: GetMessage(Q, 30 s) returns msg 1
8. C2 crashes
9. msg 1 becomes visible again 30 s after its dequeue
10. C1 restarts
11. C1: GetMessage(Q, 30 s) returns msg 1
12. DequeueCount > 2, so msg 1 is treated as a poison message
13. DeleteMessage(Q, msg 1)
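The dequeue count used in step 12 is exposed on each message; a hedged sketch of the worker-side check, where the threshold and the dead-letter handling are application choices:

using System;
using Microsoft.WindowsAzure.StorageClient;

public class PoisonMessageGuard
{
    private const int MaxDequeueCount = 3;   // threshold is an application choice

    public void ProcessNext(CloudQueue queue)
    {
        CloudQueueMessage msg = queue.GetMessage(TimeSpan.FromSeconds(30));
        if (msg == null) return;

        if (msg.DequeueCount > MaxDequeueCount)
        {
            // Poison message: it has failed too many times. Log it or park it somewhere
            // (e.g. a blob or a separate "dead letter" queue), then remove it from this queue.
            queue.DeleteMessage(msg);
            return;
        }

        DoWork(msg);                 // must be idempotent - it may run more than once
        queue.DeleteMessage(msg);    // only delete after successful processing
    }

    private void DoWork(CloudQueueMessage msg) { /* application-specific work */ }
}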

Queues Recap

Make message processing idempotent: no need to deal with failures
Do not rely on order: invisible messages result in out-of-order processing
Use the dequeue count to remove poison messages: enforce a threshold on a message's dequeue count
Use a blob to store message data, with a reference in the message: handles messages > 8KB; batch messages; garbage collect orphaned blobs
Use the message count to scale: dynamically increase/reduce workers

Windows Azure Storage Takeaways


Blobs
Drives
Tables
Queues

http://blogs.msdn.com/windowsazurestorage/
http://azurescope.cloudapp.net

A Quick Exercise

Then let's look at some code and some tools

Code - AccountInformation.cs
public class AccountInformation
{
    private static string storageKey = "tHiSiSnOtMyKeY";
    private static string accountName = "jjstore";

    private static StorageCredentialsAccountAndKey credentials;

    internal static StorageCredentialsAccountAndKey Credentials
    {
        get
        {
            if (credentials == null)
                credentials = new StorageCredentialsAccountAndKey(accountName, storageKey);
            return credentials;
        }
    }
}

Code - BlobHelper.cs
public class BlobHelper
{
private static string defaultContainerName = "school";
private CloudBlobClient client = null;
private CloudBlobContainer container = null;
private void InitContainer()
{
if (client == null)
client = new CloudStorageAccount(AccountInformation.Credentials, false).CreateCloudBlobClient();
container = client.GetContainerReference(defaultContainerName);

container.CreateIfNotExist();
BlobContainerPermissions permissions = container.GetPermissions();
permissions.PublicAccess = BlobContainerPublicAccessType.Container;
container.SetPermissions(permissions);
}

Code - BlobHelper.cs
public void WriteFileToBlob(string filePath)
{
if (client == null || container == null)
InitContainer();
FileInfo file = new FileInfo(filePath);

CloudBlob blob = container.GetBlobReference(file.Name);


blob.Properties.ContentType = GetContentType(file.Extension);
blob.UploadFile(file.FullName);
// Or if you want to write a string replace the last line with:
// blob.UploadText(someString);
// And make sure you set the content type to the appropriate MIME type (e.g. text/plain)
}


Code - BlobHelper.cs
public string GetBlobText(string blobName)
{
if (client == null || container == null)
InitContainer();
CloudBlob blob = container.GetBlobReference(blobName);
try
{
return blob.DownloadText();
}
catch (Exception)
{
// The blob probably does not exist or there is no connection available
return null;
}
}


Application Code - Blobs


private void SaveToCloudButton_Click(object sender, RoutedEventArgs e)
{
StringBuilder buff = new StringBuilder();

buff.AppendLine("LastName,FirstName,Email,Birthday,NativeLanguage,FavoriteIceCream,YearsInPhD,Graduated");
foreach (AttendeeEntity attendee in attendees)
{
buff.AppendLine(attendee.ToCsvString());
}
blobHelper.WriteStringToBlob("SummerSchoolAttendees.txt", buff.ToString());
}

The blob is now available at:


http://<AccountName>.blob.core.windows.net/<ContainerName>/<BlobName>
Or in this case:
http://jjstore.blob.core.windows.net/school/SummerSchoolAttendees.txt

Code - TableEntities
using Microsoft.WindowsAzure.StorageClient;
public class AttendeeEntity : TableServiceEntity
{
public string FirstName { get; set; }
public string LastName { get; set; }
public string Email { get; set; }
public DateTime Birthday { get; set; }
public string FavoriteIceCream { get; set; }
public int YearsInPhD { get; set; }
public bool Graduated { get; set; }


Code - TableEntities
public void UpdateFrom(AttendeeEntity other)
{
FirstName = other.FirstName;
LastName = other.LastName;
Email = other.Email;
Birthday = other.Birthday;
FavoriteIceCream = other.FavoriteIceCream;

YearsInPhD = other.YearsInPhD;
Graduated = other.Graduated;
UpdateKeys();
}
public void UpdateKeys()
{
PartitionKey = "SummerSchool";
RowKey = Email;
}

Code - TableHelper.cs
public class TableHelper {
private CloudTableClient client = null;
private TableServiceContext context = null;
private Dictionary<string,AttendeeEntity> allAttendees = null;
private string tableName = "Attendees";
private CloudTableClient Client {
get {
if (client == null)
client = new CloudStorageAccount(AccountInformation.Credentials, false).CreateCloudTableClient();
return client;
}
}
private TableServiceContext Context {
get {
if (context == null)
context = Client.GetDataServiceContext();
return context;
} } }

Code - TableHelper.cs
private void ReadAllAttendees()
{
allAttendees = new Dictionary<string, AttendeeEntity>();
CloudTableQuery<AttendeeEntity> query =
Context.CreateQuery<AttendeeEntity>(tableName).AsTableServiceQuery();
try
{
foreach (AttendeeEntity attendee in query)
{
allAttendees[attendee.Email] = attendee;
}
}
catch (Exception)
{
// No entries in table - or other exception
}
}

Code - TableHelper.cs
public void DeleteAttendee(string email)
{
if (allAttendees == null)
ReadAllAttendees();
if (!allAttendees.ContainsKey(email))
return;
AttendeeEntity attendee = allAttendees[email];
// Delete from the cloud table
Context.DeleteObject(attendee);
Context.SaveChanges();

// Delete from the memory cache


allAttendees.Remove(email);
}


Code - TableHelper.cs
public AttendeeEntity GetAttendee(string email)
{
    if (allAttendees == null)
        ReadAllAttendees();

    if (allAttendees.ContainsKey(email))
        return allAttendees[email];

    return null;
}

Remember that this only works for tables (or queries on tables) that easily fit in memory
This is one of many design patterns for working with tables

Pseudo Code - TableHelper.cs


public void UpdateAttendees(List<AttendeeEntity> updatedAttendees) {
foreach (AttendeeEntity attendee in updatedAttendees) {
UpdateAttendee(attendee, false);
}
Context.SaveChanges(SaveChangesOptions.Batch);
}
public void UpdateAttendee(AttendeeEntity attendee) {
UpdateAttendee(attendee, true);
}
private void UpdateAttendee(AttendeeEntity attendee, bool saveChanges) {
if (allAttendees.ContainsKey(attendee.Email)) {
AttendeeEntity existingAttendee = allAttendees[attendee.Email];
existingAttendee.UpdateFrom(attendee);
Context.UpdateObject(existingAttendee);
} else {
Context.AddObject(tableName, attendee);
}
if (saveChanges) Context.SaveChanges();
}

Application Code - Cloud Tables

private void SaveButton_Click(object sender, RoutedEventArgs e)
{
    // Write to table
    tableHelper.UpdateAttendees(attendees);
}

That's it! Now your tables are accessible using REST service calls or any cloud storage tool.

Tools - Fiddler2

Best Practices

Picking the Right VM Size

Having the correct VM size can make a big difference in costs
Fundamental choice: fewer, larger VMs vs. many smaller instances
If you scale better than linearly across cores, larger VMs could save you money
Pretty rare to see linear scaling across 8 cores
More instances may provide better uptime and reliability (more failures needed to take your service down)
The only real right answer: experiment with multiple sizes and instance counts in order to measure and find what is ideal for you

Using Your VM to the Maximum

Remember:
1 role instance == 1 VM running Windows
1 role instance != one specific task for your code
You're paying for the entire VM, so why not use it?

Common mistake: splitting code into multiple roles, each not using much CPU
Balance between using up CPU vs. having free capacity in times of need
Multiple ways to use your CPU to the fullest

Exploiting Concurrency
Spin up additional processes, each with a specific task or as a unit of concurrency
May not be ideal if the number of active processes exceeds the number of cores
Use multithreading aggressively
In networking code, correct usage of NT IO Completion Ports will let the kernel schedule the precise number of threads
In .NET 4, use the Task Parallel Library (a short sketch follows below)
Data parallelism
Task parallelism
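A small .NET 4 example of both flavors; the work items here are stand-ins for real CPU-bound work:

using System;
using System.Threading.Tasks;

class TplExamples
{
    static void Main()
    {
        // Data parallelism: apply the same operation to every element,
        // letting the TPL spread iterations across the VM's cores.
        int[] inputs = { 1, 2, 3, 4, 5, 6, 7, 8 };
        Parallel.ForEach(inputs, item => Console.WriteLine(Compute(item)));

        // Task parallelism: run different pieces of work concurrently.
        Task<int> t1 = Task.Factory.StartNew(() => Compute(10));
        Task<int> t2 = Task.Factory.StartNew(() => Compute(20));
        Task.WaitAll(t1, t2);
        Console.WriteLine(t1.Result + t2.Result);
    }

    static int Compute(int x) { return x * x; }   // stand-in for real work
}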

Finding Good Code Neighbors

Typically code falls into one or more of these categories:
Memory intensive
CPU intensive
Network IO intensive
Storage IO intensive

Find code that is intensive with different resources to live together
Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage IO-intensive code

Scaling Appropriately

Monitor your application and make sure you're scaled appropriately (not over-scaled)
Spinning VMs up and down automatically is good at large scale
Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
Being too aggressive in spinning down VMs can result in poor user experience
Trade-off between the risk of failure or poor user experience from not having excess capacity and the cost of having idling VMs (performance vs. cost)

Storage Costs

Understand an application's storage profile and how storage billing works
Make service choices based on your app profile
E.g. SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
Service choice can make a big cost difference based on your app profile

Caching and compressing help a lot with storage costs

Saving Bandwidth Costs

Bandwidth costs are a huge part of any popular web app's billing profile
Saving bandwidth costs often leads to savings in other places
Sending fewer things over the wire often means getting fewer things from storage
Sending fewer things means your VM has time to do other tasks

All of these tips have the side benefit of improving your web app's performance and user experience

Compressing Content

1. Gzip all output content
All modern browsers can decompress on the fly
Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
Use Portable Network Graphics (PNGs)
Crush your PNGs
Strip needless metadata
Make all PNGs palette PNGs

Pipeline: Uncompressed content -> Gzip -> Minify JavaScript -> Minify CSS -> Minify images -> Compressed content
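One common way to do step 1 in an ASP.NET web role is to wrap the response stream when the client advertises gzip support; a sketch, wired up in Global.asax for example (IIS dynamic compression can achieve the same thing):

using System;
using System.IO.Compression;
using System.Web;

public class Global : HttpApplication
{
    protected void Application_BeginRequest(object sender, EventArgs e)
    {
        HttpRequest request = HttpContext.Current.Request;
        HttpResponse response = HttpContext.Current.Response;

        string acceptEncoding = request.Headers["Accept-Encoding"];
        if (!string.IsNullOrEmpty(acceptEncoding) && acceptEncoding.Contains("gzip"))
        {
            // Wrap the output stream so everything written to the response is gzip-compressed.
            response.Filter = new GZipStream(response.Filter, CompressionMode.Compress);
            response.AppendHeader("Content-Encoding", "gzip");
        }
    }
}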

Best Practices Summary


Doing less is the key to saving costs

Measure everything

Know your application profile in and out

Research Examples in the Cloud

on another set of slides

Map Reduce on Azure

Elastic MapReduce on Amazon Web Services has traditionally been the only option for Map Reduce jobs in the cloud
Hadoop implementation
Hadoop has a long history and has been improved for stability
Originally designed for cluster systems

Microsoft Research this week is announcing a project code-named Daytona for Map Reduce jobs on Azure
Designed from the start to use cloud primitives
Built-in fault tolerance
REST-based interface for writing your own clients

Project Daytona - Map Reduce on Azure

http://research.microsoft.com/en-us/projects/azure/daytona.aspx

Questions and Discussion

Thank you for hosting me at the Summer School



BLAST (Basic Local Alignment Search Tool)

The most important software in bioinformatics


Identify similarity between bio-sequences

Computationally intensive

Large number of pairwise alignment operations


A BLAST run can take 700 to 1000 CPU hours
Sequence databases are growing exponentially
GenBank doubled in size in about 15 months

It is easy to parallelize BLAST


Segment the input
Segment processing (querying) is pleasingly parallel

Segment the database (e.g., mpiBLAST)


Needs special result reduction processing

Large volume of data

A normal BLAST database can be as large as 10GB
100 nodes means the peak storage bandwidth could reach 1TB
The output of BLAST is usually 10-100x larger than the input

Parallel BLAST engine on Azure


Query-segmentation data-parallel pattern
split the input sequences
query partitions in parallel
merge results together when done

Follows the general suggested application model


Web Role + Queue + Worker

With three special considerations


Batch job management
Task parallelism on an elastic Cloud
Wei Lu, Jared Jackson, and Roger Barga, AzureBlast: A Case Study of Developing Science Applications on the Cloud, in Proceedings of the 1st Workshop
on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010

A simple Split/Join pattern

Leverage the multiple cores of one instance
The -a (number of processors) argument of NCBI-BLAST: 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes

Task granularity
Large partitions: load imbalance
Small partitions: unnecessary overheads (NCBI-BLAST start-up overhead, data transfer overhead)
Best practice: use test runs to profile, and set the partition size to mitigate the overhead

[Diagram: a splitting task fans work out to multiple BLAST tasks; a merging task joins their results.]

Value of visibilityTimeout for each BLAST task

Essentially an estimate of the task run time:
too small: repeated computation
too large: unnecessarily long waiting period in case of instance failure
Best practice:
Estimate the value based on the number of pair-bases in the partition and on test runs
Watch out for the 2-hour maximum limitation

Task size vs. performance
Benefit of the warm cache effect
100 sequences per partition is the best choice

Instance size vs. performance
Super-linear speedup with larger worker instances
Primarily due to the memory capability

Task size / instance size vs. cost
The extra-large instance generated the best and most economical throughput
Fully utilizes the resources

[Architecture diagram: a Web Role hosts the web portal and web service; a Job Management Role runs job registration, the job scheduler, and the scaling engine, recording jobs in a Job Registry (Azure Table) and dispatching work through a global dispatch queue; Worker instances execute the splitting task, BLAST tasks, and merging task; a database-updating role refreshes the NCBI databases; BLAST databases, temporary data, etc. are kept in Azure Blob storage.]

ASP.NET program hosted by a web role instance
Submit jobs
Track job status and logs

Authentication/authorization based on Live ID
An accepted job is stored into the job registry table
Fault tolerance: avoid in-memory state

Job Portal

[Diagram detail: the web portal and web service front the job registration, job scheduler, scaling engine, and job registry components.]

R. palustris as a platform for H2 production


Eric Shadt, SAGE

Sam Phattarasukol Harwood Lab, UW

Blasted ~5,000 proteins (700K sequences)


Against all NCBI non-redundant proteins: completed in 30 min
Against ~5,000 proteins from another strain: completed in less than 30 sec

AzureBLAST significantly saved computing time

Discovering Homologs
Discover the interrelationships of known protein sequences

All against All query


The database is also the input query
The protein database is large (4.2 GB size)
Totally 9,865,668 sequences to be queried

Theoretically, 100 billion sequence comparisons!

Performance estimation
Based on the sampling-running on one extra-large Azure instance
Would require 3,216,731 minutes (6.1 years) on one desktop

One of the biggest BLAST jobs as far as we know

This scale of experiment is usually infeasible for most scientists

Allocated a total of ~4000 instances
475 extra-large VMs (8 cores per VM) across four datacenters: US (2), West Europe, and North Europe
8 deployments of AzureBLAST; each deployment has its own co-located storage service
Divide the 10 million sequences into multiple segments; each segment is submitted to one deployment as one job for execution
Each segment consists of smaller partitions
When the load is imbalanced, redistribute it manually


Total size of the output result is ~230GB
The total number of hits is 1,764,579,487

Started on March 25th; the last task completed on April 8th (10 days of compute)
But based on our estimates, the real working instance time should be 6-8 days
Look into the log data to analyze what took place


A normal log record should be:

3/31/2010 6:14  RD00155D3611B0  Executing the task 251523...
3/31/2010 6:25  RD00155D3611B0  Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25  RD00155D3611B0  Executing the task 251553...
3/31/2010 6:44  RD00155D3611B0  Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44  RD00155D3611B0  Executing the task 251600...
3/31/2010 7:02  RD00155D3611B0  Execution of task 251600 is done, it took 17.27 mins

Otherwise, something is wrong (e.g., the task failed to complete):

3/31/2010 8:22   RD00155D3611B0  Executing the task 251774...
3/31/2010 9:50   RD00155D3611B0  Executing the task 251895...
3/31/2010 11:12  RD00155D3611B0  Execution of task 251895 is done, it took 82 mins

North Europe Data Center: 34,256 tasks processed in total
All 62 compute nodes lost tasks and then came back in a group
This is an update domain (~6 nodes in one group, ~30 mins)

West Europe Data Center: 30,976 tasks were completed before the job was killed
35 nodes experienced blob writing failures at the same time
A reasonable guess: the fault domain at work

MODISAzure:
Computing Evapotranspiration (ET) in the Cloud

"You never miss the water till the well has run dry" - Irish proverb

Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration, or evaporation through plant membranes, by plants.

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

ET = water volume evapotranspired (m^3 s^-1 m^-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K^-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m^-2)
cp = specific heat capacity of air (J kg^-1 K^-1)
ρa = dry air density (kg m^-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s^-1)
gs = conductivity of plant stoma, air (inverse of rs) (m s^-1)
γ = psychrometric constant (≈ 66 Pa K^-1)

Lots of inputs: big data reduction
Some of the inputs are not so simple
Estimating resistance/conductivity across a catchment can be tricky

Input datasets:
FLUXNET curated sensor dataset: 30GB, 960 files
FLUXNET curated field dataset: 2 KB, 1 file
Climate classification: ~1MB, 1 file
Vegetative clumping: ~5MB, 1 file
NCEP/NCAR: ~100MB, 4K files
NASA MODIS imagery source archives: 5 TB, 600K files
20 US years = 1 global year

Data collection (map) stage
Downloads requested input tiles from NASA ftp sites
Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile

Reprojection (map) stage
Converts source tile(s) to intermediate result sinusoidal tiles
Simple nearest neighbor or spline algorithms

Derivation reduction stage
First stage visible to the scientist
Computes ET in our initial use

Analysis reduction stage
Optional second stage visible to the scientist
Enables production of science analysis artifacts such as maps, tables, virtual sensors

[Pipeline diagram: scientists submit requests through the AzureMODIS service web role portal; a request queue feeds the data collection stage, which pulls source imagery from download sites via a download queue and records source metadata; a reprojection queue feeds the reprojection stage; reduction queues #1 and #2 feed the derivation and analysis reduction stages; science results are then available for download.]

http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx

[Diagram: a <PipelineStage> request goes to the MODISAzure Service (Web Role) and is persisted onto a <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>JobStatus and <PipelineStage>TaskStatus entries and dispatches work onto a <PipelineStage> Task Queue consumed by GenericWorker (Worker Role) instances reading <Input> Data Storage.]

MODISAzure Service is the Web Role front door
Receives all user requests
Queues each request to the appropriate Download, Reprojection, or Reduction job queue

Service Monitor is a dedicated Worker Role
Parses all job requests into tasks - recoverable units of work
Execution status of all jobs and tasks is persisted in Tables

All work is actually done by a Worker Role
Dequeues tasks created by the Service Monitor
Retries failed tasks 3 times
Maintains all task status
Sandboxes the science or other executable
Marshals all storage from/to Azure blob storage to/from local Azure Worker instance files

Reprojection Request

[Diagram: a reprojection request enters the job queue; the Service Monitor (Worker Role) parses and persists it, then dispatches tasks to the task queue for GenericWorker (Worker Role) instances, which read swath source data storage and write reprojection data storage.]

ReprojectionJobStatus: each entity specifies a single reprojection job request
ReprojectionTaskStatus: each entity specifies a single reprojection task (i.e. a single tile) and points to a ScanTimeList
ScanTimeList: query this table to get the list of satellite scan times that cover a target tile
SwathGranuleMeta: query this table to get geo-metadata (e.g. boundaries) for each swath tile

Computational costs are driven by data scale and the need to run the reduction multiple times
Storage costs are driven by data scale and the 6-month project duration
Small with respect to the people costs, even at graduate student rates!

[The pipeline diagram repeated, annotated with per-stage data volumes and costs:]

Data Collection Stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers; $50 upload, $450 storage
Reprojection Stage: 400 GB, 45K files, 3500 CPU hours, 20-100 workers; $420 CPU, $60 download
Derivation Reduction Stage: 5-7 GB, 5.5K files, 1800 CPU hours, 20-100 workers; $216 CPU, $1 download, $6 storage
Analysis Reduction Stage: <10 GB, ~1K files, 1800 CPU hours, 20-100 workers; $216 CPU, $2 download, $9 storage

Total: $1420

Clouds are the largest-scale computer centers ever constructed and have the potential to be important to both large- and small-scale science problems.
Equally important, they can increase participation in research, providing needed resources to users/communities without ready access.
Clouds are suitable for loosely coupled, data-parallel applications and can support many interesting programming patterns, but tightly coupled, low-latency applications do not perform optimally on clouds today.
They provide valuable fault tolerance and scalability abstractions.
Clouds act as an amplifier for familiar client tools and on-premises compute.
Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers.
