
 SVC08

Building Scalable and Reliable Applications with Windows Azure
Brad Calder
Director/Architect
Microsoft Corporation
Challenges for Building Scalable
Cloud Services
> High Availability
> Application and hardware failures

> Scalability
> Scale out to meet peak traffic demands

> Lifecycle management


> Upgrading, monitoring and debugging
Agenda
> Data Scalability
>
> Scalable Computation and Workflow
>
> Lifecycle Management –
Upgrade and Versioning
Data Building Blocks
> Volatile Storage
> Local storage
> Caches (e.g., AppFabric Cache and
memcached)
> Persistent Storage
> Windows Azure Storage
> Blobs
> Tables
> Queues
> Drives
> SQL Azure
> Relational DB
Fundamental Storage Abstractions
> Blobs – Provide a simple interface for
storing named files along with metadata
for the file
>
> Tables – Provide structured storage. A
Table is a set of entities, which contain a
set of properties
>
> Queues – Provide reliable storage and
delivery of messages for an application
>
Storage Account Performance at
Commercial Availability
> Capacity
> 100 TB
> Throughput
> Up to hundreds of megabytes per second
> Transactions
> Up to thousands of requests per second

> For high-throughput content use Windows Azure CDN for Blobs
> 18 locations globally (US, Europe, Asia, Australia and South America)

Scalable Storage:
Partitioning and Load Balancing
> We group your Blobs, Entities, and
Messages into Partitions
>
> Automatically load balance partitions
across our servers
> Monitor the usage patterns of partitions
and servers
> Adjust what objects are grouped
together as needed to further split
the load across servers
Automatic Load Balancing - Assignment
[Figure: a VIP in front of three front ends (FE), a Master System, and four back-end servers (BE1 to BE4). Labels: Offload Partitions, Reassign Partitions. Legend: partition, server load.]

Time between offload and reload is on the order of seconds
Time to decide to load balance is on the order of minutes
Goal is to reassign a partition only if the system has to
Automatic Load Balancing - Split
[Figure: a VIP in front of three front ends (FE), a Master System, and four back-end servers (BE1 to BE4). Labels: Split and Offload, Assign Partition. Legend: partition, server load.]
Partitioning of Data Objects
> Load balancing is an internal concept
to Windows Azure Storage
> Partitioning enables scalability
>
> What matters to the application is the
partitioning key used for objects
> All objects with the same partition key
value are always grouped into the
same partition
> Partition Key used
> Blobs – Blob Name
> Entities – Application defined Partition
Key for Table
> Messages – Queue Name
Choosing a Table Partition Key
> Granularity of Entity Group
Transactions
> Make the partition key only as big as
you need it for atomic batch
transactions
> Spread out load across partitions
> More partitions – makes it easier to
automatically balance load
> The two extremes
> Store all entities with same Partition
Key value
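The grouping rule above can be illustrated with a small in-memory model. This is only a sketch: the entity dictionaries, the `customer-N` keys, and both helper functions are hypothetical and are not part of any Windows Azure Storage API. It assumes, as the slide implies, that an atomic batch is possible only when all entities share one partition key value.

```python
from collections import defaultdict


def group_by_partition(entities):
    """Group entities by their PartitionKey value (in-memory model only)."""
    partitions = defaultdict(list)
    for entity in entities:
        partitions[entity["PartitionKey"]].append(entity)
    return dict(partitions)


def can_batch_atomically(entities):
    """True when every entity in the batch shares a single partition key."""
    return len({e["PartitionKey"] for e in entities}) == 1


# Hypothetical data: orders keyed by customer. One customer's orders can be
# updated together, while different customers land in different partitions
# that the system is free to spread across servers.
orders = [
    {"PartitionKey": "customer-1", "RowKey": "order-1"},
    {"PartitionKey": "customer-1", "RowKey": "order-2"},
    {"PartitionKey": "customer-2", "RowKey": "order-1"},
]
partitions = group_by_partition(orders)
```

Here the two `customer-1` orders form one partition and could be batched together, while a batch spanning both customers could not.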
Per Object/Partition Performance at
Commercial Availability
> Throughput
> Single Queue and Single Table Partition
> Up to 500 transactions per second
> Single Blob
> Small reads/writes up to 30 MB/s
> Large reads/writes up to 60 MB/s
>
> Latency
> Typically around 100 ms (for small transactions)
> But can increase to a few seconds
during heavy spikes while load
balancing kicks in

Improving Latency
> Use a cache in your application layer
to provide 10 ms latencies
> Can be very beneficial for user
interactive apps
>
> Have a caching layer serve the dominant
requests (e.g., AppFabric Cache,
memcached)
> You control the size and customize the
cache
> Fill cache misses from cloud storage
>
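A minimal sketch of the caching idea, assuming a simple in-process cache with a time-to-live. The `ReadThroughCache` class, the `profile:42` key, and the dictionary standing in for cloud storage are all hypothetical; a real deployment would more likely use AppFabric Cache or memcached as the slide suggests.

```python
import time


class ReadThroughCache:
    """Serve repeated reads from memory and fill misses from slower storage."""

    def __init__(self, fetch_from_storage, ttl_seconds=60.0, clock=time.monotonic):
        self._fetch = fetch_from_storage
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)
        self.misses = 0

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: no round trip to storage
        # Cache miss (or expired entry): fetch from storage and remember it.
        self.misses += 1
        value = self._fetch(key)
        self._entries[key] = (now + self._ttl, value)
        return value


# Hypothetical stand-in for cloud storage.
backing_store = {"profile:42": "Ada"}
cache = ReadThroughCache(backing_store.__getitem__)
first = cache.get("profile:42")   # miss, filled from storage
second = cache.get("profile:42")  # hit, served from memory
```

The time-to-live bounds how stale a cached value can get; its size is a choice the application makes, in line with "you control the size and customize the cache".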
Agenda
> Data Scalability
>
> Scaling Computation and
Workflow
>
> Lifecycle Management –
Upgrade and Versioning
Compute Service Model – What is Described?
> The topology of your service
> Types of roles and their binaries
> How the roles are connected
> Configuration of the service
> How many instances of each role type
> Application specific configuration
settings
> How many update domains you need
[Figure: a VIP in front of a Web Role connected to a Worker Role]
Best Practices for Scaling Out
Compute
> Instances can go down due to
application/node failure or roles being upgraded
> Use multiple instances of each role
type so availability is not affected
> Scaling out means deploying more
role instances as load increases
> Each instance of a role type performs
the same task and looks identical

Web + Worker Role Service Model
[Figure: a VIP in front of one Web Role and one Worker Role; scaled out, the same VIP fronts two Web Role instances and five Worker Role instances.]
Basic Workflow Pattern
> Break job into work items
(optional “Map” step)
> Feed the work items to the worker
roles
> Worker resolves the work item
> Aggregate work item results
(optional “Reduce” step)
Loosely Coupled Work with Queues
> Worker-Queue Model
> Load work in a queue
> Many workers consume the queue
[Figure: several Web Roles put work items on an input queue (Azure Queue), which several Worker Roles consume.]
Queue Workflow Concepts
> Windows Azure Queue Provides
> Guaranteed delivery (two-step
consumption)
1. Worker dequeues a message and marks it as
invisible
2. Worker deletes the message when finished
processing it
> If the Worker role crashes, the message becomes visible
again for another Worker to process
> Doesn’t guarantee “only once” delivery
> Doesn’t guarantee ordering
> Best effort FIFO
> Make work items idempotent
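The two-step consumption and the need for idempotent work items can be sketched with an in-memory queue. This is an illustration under stated assumptions, not the Windows Azure Queue API: the `VisibilityQueue` class, its 30-second timeout, the explicit `now` parameter, and the `resize-image-7` work item are all hypothetical.

```python
import itertools


class VisibilityQueue:
    """In-memory stand-in for a queue with two-step consumption."""

    def __init__(self, visibility_timeout=30.0):
        self._timeout = visibility_timeout
        self._messages = {}  # message id -> [body, visible_at]
        self._ids = itertools.count(1)

    def put(self, body):
        self._messages[next(self._ids)] = [body, 0.0]

    def get(self, now):
        """Step 1: return a visible message and hide it for the timeout."""
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                entry[1] = now + self._timeout
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        """Step 2: remove the message once processing has finished."""
        self._messages.pop(msg_id, None)


processed = set()   # record of completed work items
side_effects = []   # what the work actually did


def handle(item):
    """Idempotent handler: a redelivered item causes no second side effect."""
    if item in processed:
        return
    side_effects.append("did " + item)
    processed.add(item)


work_queue = VisibilityQueue(visibility_timeout=30.0)
work_queue.put("resize-image-7")

# Worker A dequeues and processes the item, then crashes before deleting it.
msg_id, body = work_queue.get(now=0.0)
handle(body)

hidden = work_queue.get(now=10.0)  # still invisible to other workers

# After the timeout the message reappears and Worker B receives it again.
msg_id_b, body_b = work_queue.get(now=31.0)
handle(body_b)
work_queue.delete(msg_id_b)
```

Because the message is delivered twice in this scenario, only the idempotent handler keeps the work from being done twice.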


Basic Workflow Pattern
[Figure: Web Roles feed an input queue of work items (Azure Queue) consumed by Worker Roles, which feed a second input queue consumed by further Worker Roles. Additional label: Job Manager.]
Workflow Job Manager
> Job Manager
> Generating the Load
> Divide the job into work items
> Distributing the load
> Send work items to Workers via a Queue
> Monitor progress
> Monitor the load distribution
> Manage resources
> Number of workers, queues, etc
> Aggregate results
> Take individual work item results and
aggregate
>
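The Job Manager responsibilities listed above can be sketched with threads and in-process queues standing in for Worker roles and Azure Queues. Everything here is an assumption made for illustration: the function names, the chunking scheme, and the sum-of-squares job are hypothetical.

```python
import queue
import threading


def split_job(numbers, chunk_size):
    """Generate the load: divide the job into work items."""
    return [numbers[i:i + chunk_size] for i in range(0, len(numbers), chunk_size)]


def worker(input_queue, output_queue):
    """Consume work items until a None sentinel arrives."""
    while True:
        item = input_queue.get()
        if item is None:
            break
        output_queue.put(sum(x * x for x in item))


def run_job(numbers, chunk_size=3, worker_count=4):
    input_queue, output_queue = queue.Queue(), queue.Queue()
    work_items = split_job(numbers, chunk_size)

    # Manage resources: the Job Manager decides how many workers to run.
    workers = [
        threading.Thread(target=worker, args=(input_queue, output_queue))
        for _ in range(worker_count)
    ]
    for thread in workers:
        thread.start()

    # Distribute the load through the input queue, then signal completion.
    for item in work_items:
        input_queue.put(item)
    for _ in workers:
        input_queue.put(None)
    for thread in workers:
        thread.join()

    # Aggregate results: one result was queued per work item.
    return sum(output_queue.get() for _ in work_items)


total = run_job(list(range(10)))
```

Progress monitoring is omitted from this sketch; in a real service the Job Manager would also watch queue lengths to decide whether to add workers or queues.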
Job Manager Workflow Pattern
[Figure: a Job Manager splits a large job into work items on an input queue (Azure Queue) consumed by Worker Roles. Workers read from an Input Blob Store, write to an Output Blob Store, and report finished items on output queues (Azure Queue) back to the Job Manager.]


RiskMetrics Case Study

> Focused on financial risk management


> Need to run daily financial and market
simulations
>
> They use the Job Manager Workflow model
> Currently feed the work items to 2,000
Worker roles
> Plan to run 10,000+ Worker roles next year
> Results are queued back to the Job
Manager, aggregated, and sent back to
the company
>
> They needed higher throughput from a
Scaling Queue Throughput
> Batch Work Items into Blobs
> Group together many work items into a
Blob
> Queue up pointer to blob
OR

> Use Multiple Queues


> Job Manager
> Responsible for adding and removing
queues
> Workers
> Determine what queues to use
> Random via List Queues, or queues assigned
by the Job Manager
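The first option, batching work items into blobs and queueing only a pointer, might look like the sketch below. The dictionary and deque are hypothetical stand-ins for a blob store and a queue, and the `batch-NNNNN` naming scheme is an assumption.

```python
import json
from collections import deque

blob_store = {}       # stand-in for blob storage: blob name -> bytes
work_queue = deque()  # stand-in for a queue: each message is a blob name


def enqueue_batched(work_items, batch_size):
    """Write each batch of work items to a blob and queue a pointer to it."""
    for n, start in enumerate(range(0, len(work_items), batch_size)):
        blob_name = f"batch-{n:05d}"
        batch = work_items[start:start + batch_size]
        blob_store[blob_name] = json.dumps(batch).encode("utf-8")
        work_queue.append(blob_name)


def dequeue_batch():
    """Take one pointer off the queue and load the work items it refers to."""
    blob_name = work_queue.popleft()
    return json.loads(blob_store[blob_name].decode("utf-8"))


items = [{"id": i} for i in range(250)]
enqueue_batched(items, batch_size=100)
queue_messages = len(work_queue)
first_batch = dequeue_batch()
```

With a batch size of 100, the 250 work items cost three queue messages instead of 250, which is where the throughput gain comes from.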
Continuation for Long Running Jobs
> Want to continue on failover
> High level approach
> Break into smaller and
repeatable steps
> Record progress after
each step
> Query progress after
failover
> Resume from the failed
step
[Figure: intermediate progress is kept in a persistent Progress Table; upon failover, read progress and resume.]
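A minimal sketch of the continuation approach, assuming a dictionary stands in for the persistent progress table. The `run_job` function, the `fail_at` crash simulation, and the `job-1` identifier are hypothetical.

```python
progress_table = {}  # stand-in for a persistent table: job id -> last completed step


def run_job(job_id, steps, fail_at=None):
    """Run the remaining steps of a job, recording progress after each one."""
    start = progress_table.get(job_id, -1) + 1  # query progress after failover
    executed = []
    for index in range(start, len(steps)):
        if index == fail_at:
            raise RuntimeError(f"simulated crash in step {index}")
        steps[index]()
        progress_table[job_id] = index  # record progress after each step
        executed.append(index)
    return executed


log = []
steps = [lambda i=i: log.append(i) for i in range(4)]

# First attempt crashes in step 2, after steps 0 and 1 were recorded.
try:
    run_job("job-1", steps, fail_at=2)
except RuntimeError:
    pass

# After failover, the job resumes from the failed step.
resumed = run_job("job-1", steps)
```

A crash between running a step and recording it would cause that step to run again on resume, which is why the steps need to be repeatable.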
Agenda
> Data Scalability
>
> Scaling Computation and Workflow
>
> Lifecycle Management –
Upgrade and Versioning
In-Place Rolling Upgrades
> Upgrade domains
> Breaks your roles evenly over a set of
upgrade domains
> Rolling Upgrade
> Walk each upgrade
domain one at a time
> Upgrade just the roles
in the current domain

Upgrade Domains
SERVICE                    UD0   UD1
Web Role – 6 instances       3     3
Workers – 9 instances        5     4
> Benefits
> Minimizes availability loss
> Only one domain of roles restarted at a time
> Allows local state to persist across upgrade
> Catches application upgrade issues early
> Detect upgrade issues after first few domains
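The even split and the domain-by-domain walk can be sketched as follows. Round-robin assignment is an assumption; the slide says only that roles are spread evenly. Both function names are hypothetical.

```python
def assign_upgrade_domains(instance_count, domain_count):
    """Spread instance indices evenly over upgrade domains (round-robin)."""
    domains = [[] for _ in range(domain_count)]
    for instance in range(instance_count):
        domains[instance % domain_count].append(instance)
    return domains


def rolling_upgrade(domains, upgrade_instance):
    """Walk the upgrade domains one at a time, upgrading only that domain."""
    for domain_index, instances in enumerate(domains):
        for instance in instances:
            upgrade_instance(domain_index, instance)
        # A real rollout would check health here before moving on, so that
        # upgrade issues are caught after the first few domains.


web = assign_upgrade_domains(6, 2)      # 3 and 3 instances
workers = assign_upgrade_domains(9, 2)  # 5 and 4 instances

order = []
rolling_upgrade(workers, lambda ud, inst: order.append((ud, inst)))
```

This reproduces the 3/3 and 5/4 split from the slide's example with two upgrade domains.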
Versioning with Rolling Upgrades
> Always assume you will have old and
new running side by side in your
service

> Version everything


> Protocols, Schemas, Messages, Data
Objects, etc
>
> Two common scenarios
> Protocol change between two roles
> Table schema change
Protocol Change with Rolling
Upgrade
> Have 2 roles talking protocol V1
> Want to switch them over to protocol V2
without losing availability when using
rolling upgrade
>
> Two step process
1. Upgrade roles to understand new and old
protocols
> Once done all nodes know how to speak the
old and new version.
> All nodes still initiate contact sending old
protocol version
> But if they receive the new version they will
respond with it
2. Then trigger the use of the new version,
either:
a. Release an upgrade that starts speaking the
new version
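The two-step process can be sketched with a toy message format. The `Node` class, the dictionary messages, and the `initiate_version` switch are hypothetical; the sketch does not show how the switch to the new version is triggered, only the behavior before and after it.

```python
SUPPORTED_VERSIONS = {1, 2}  # after step 1, every node understands both


class Node:
    """A role instance that understands both protocol versions."""

    def __init__(self, initiate_version=1):
        # Step 1: nodes still initiate contact using the old version.
        self.initiate_version = initiate_version

    def request(self, payload):
        return {"version": self.initiate_version, "payload": payload}

    def respond(self, message):
        version = message["version"]
        if version not in SUPPORTED_VERSIONS:
            raise ValueError(f"unsupported protocol version {version}")
        # Reply in whichever version the caller used.
        return {"version": version, "payload": "ack:" + message["payload"]}


# Step 1: both nodes understand V1 and V2 but initiate with V1.
a, b = Node(), Node()
step1_reply = b.respond(a.request("ping"))

# Step 2: node a starts initiating with V2; b has not switched yet but
# still answers in V2 because it already understands it.
a.initiate_version = 2
step2_reply = b.respond(a.request("ping"))
```

Because every node already understands both versions after step 1, the order in which nodes switch during step 2 does not matter.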
Protocol Change via Rolling
Upgrade
> Step 1: Upgrade roles to understand both
versions, and initiate only old version
> Step 2: Trigger the use of the new version
[Figure: Web Role and Cache Role instances spread across upgrade domains UD0, UD1 and UD2. Binary versions shown: Version 1, Version 1.5, Version 2. Protocol versions shown: Version 1, Version 2.]
Table Schema Change
> Have a version property in each entity

> Types of Schema Change


> Add Non-key Properties
> Perform two step upgrade process
> Use “IgnoreMissingProperties”
> Remove Non-key Properties
> Perform two step upgrade process
> Use “IgnoreMissingProperties” and
“ReplaceOnUpdate”
> Change in Partition Key or Row Key
> Copy all entities to new primary key

Adding Additional Property
Partition Key   Row Key   Version   …..   Property N   NEW Property
PK1             RK1       1
PK2             RK2       1
…….             …….       1
[Figure: clients at V1 and V1.5 access the same table.]
> Release a new version V1.5 of client


> Use the new class with additional
properties
> Automatically populates the new
property with default value on
insert/update
Schema Change – Upgrade to V1.5
Partition Key   Row Key   Version   …..   Property N   NEW Property
PK1             RK1       1
PK2             RK2       1                             Default
…….             …….       1
[Figure: clients at V1 and V1.5 read and write the same table during the rolling upgrade.]
> V1.5 Client


> Has class with new property in it
> If Entity version is V1
> Store the default value in the new property
> Do not upgrade the version of the entity
> V1 Client
> Ignores the new property, since it is using
“IgnoreMissingProperties”
Schema Change – Upgrade to V2
Partition Key   Row Key   Version   …..   Property N   NEW Property
[Figure: entities PK1, PK2, … move from version 1 to version 2 and the new property changes from Default to a real Value, while clients move from V1.5 to V2.]
> V2 Client
> Update all entities to V2 and start putting
real values in new property
> V1.5 Client
> If Entity version is V1
> Store the default value in the new property,
and don’t change version
> If Entity version is V2
> Use the new value and update it
Table Schema Rolling Upgrade
Summary
> Code V1
> Always uses version 1

> Code V1.5


> Creates version 1
> Processes an existing entity based on
its current version 1 or 2, and doesn’t
convert any entities
> Inserts default value for property for
version 1
>
> Code V2
> Converts entities to version 2 and always
uses version 2
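The behavior of code V1.5 and code V2 summarized above can be sketched on plain dictionaries. The `Nickname` property, the `unknown` default, and the three functions are hypothetical names chosen for illustration; no table storage client library is involved.

```python
NEW_PROPERTY = "Nickname"  # hypothetical name for the added property
DEFAULT_VALUE = "unknown"  # hypothetical default value


def create_with_v1_5(partition_key, row_key):
    """Code V1.5 creates version 1 entities, with the default in the new property."""
    return {
        "PartitionKey": partition_key,
        "RowKey": row_key,
        "Version": 1,
        NEW_PROPERTY: DEFAULT_VALUE,
    }


def update_with_v1_5(entity, new_value):
    """Code V1.5 acts on the entity's current version and never converts it."""
    updated = dict(entity)
    if entity["Version"] == 1:
        updated[NEW_PROPERTY] = DEFAULT_VALUE  # keep the default, stay at version 1
    else:
        updated[NEW_PROPERTY] = new_value      # version 2: use and update real values
    return updated


def update_with_v2(entity, new_value):
    """Code V2 converts the entity to version 2 and stores a real value."""
    updated = dict(entity)
    updated["Version"] = 2
    updated[NEW_PROPERTY] = new_value
    return updated


entity = create_with_v1_5("PK1", "RK1")
still_v1 = update_with_v1_5(entity, "ignored")
converted = update_with_v2(entity, "Ace")
touched_by_v1_5 = update_with_v1_5(converted, "Bee")
```

The last call shows why V1.5 must be fully deployed before V2 starts: a V1.5 node can safely update an entity that a V2 node has already converted.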
Takeaways
> Data Performance
> Leverage partitioning
> Scaling Computation
> Loosely coupled workflow with queues
> Upgrade and Versioning
> With in-place rolling upgrades, always
assume old and new running side by
side
> Version everything and use the two step
process
Call To Action
> Sign up for the Windows Azure CTP
> Go to https://windows.azure.com
> Redeem your CTP token
> Visit the Windows Azure developer
web site
> Go to http://dev.windowsazure.com
> Go to the Windows Azure lounge
> Try out the Hands on Labs
> Meet members of the Windows Azure
team

YOUR FEEDBACK IS IMPORTANT TO US!
Please fill out session evaluation forms online at MicrosoftPDC.com
Learn More On Channel 9
> Expand your PDC experience through
Channel 9

> Explore videos, hands-on labs, sample
code and demos through the new
Channel 9 training courses

channel9.msdn.com/learn
Built by Developers for Developers….
© 2009 Microsoft Corporation. All rights reserved. Microsoft, Windows, Windows Vista and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other
countries.
The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to
changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the
date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.
