
Introduction to Windows Azure

Cloud Computing Futures Group, Microsoft Research


Roger Barga, Jared Jackson, Nelson Araujo,
Dennis Gannon, Wei Lu, and Jaliya Ekanayake

Data centers range in size from edge facilities to megascale.

Economies of scale
Approximate costs for a small data center (~1,000 servers) and a large data center (~100,000 servers).
Technology       Cost in Small Data Center      Cost in Large Data Center      Ratio
Network          $95 per Mbps/month             $13 per Mbps/month             7.1
Storage          $2.20 per GB/month             $0.40 per GB/month             5.1
Administration   ~140 servers/administrator     >1000 servers/administrator    7.1

Each data center is 11.5 times the size of a football field.
A bunch of machines in data centers, managed by a highly available Fabric Controller (FC).

Fabric Controller
Owns all data center hardware
Uses its inventory to host services
Deploys applications to free resources
Maintains the health of those applications
Maintains the health of the hardware
If a node goes offline, the FC tries to recover it
If a failed node can't be recovered, the FC migrates role instances to a new node: a suitable replacement location is found, and existing role instances are notified of the change
Manages the service life cycle, starting from bare metal

Each node runs an optimized hypervisor, a host virtual machine, and up to 7 guest VMs.

At minimum (Small)
CPU: 1.5-1.7 GHz x64
Memory: 1.7 GB
Network: 100+ Mbps
Local storage: 500 GB

Up to (Extra Large)
CPU: 8 cores
Memory: 14.2 GB
Local storage: 2+ TB

Azure Platform

Compute
Web Role
Worker Role

Storage
Blobs
Tables
Queues
Drives
A closer look
An application runs on the compute fabric and reaches storage (blobs, drives, tables, queues) over HTTP.

Access
Data is exposed via .NET and RESTful interfaces
Data can be accessed by:
Windows Azure apps
Other on-premises applications or cloud applications
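For example, a blob in a publicly readable container can be fetched with a plain HTTP GET against the RESTful interface. A minimal sketch, assuming the Python requests library and a container created with public read access (authenticated calls would add a Shared Key or SAS signature):

```python
# Read a blob over the RESTful interface (anonymous access assumed).
import requests

account = "jared"          # storage account name, from the example below
container = "images"       # container name
blob = "PIC01.JPG"         # blob name

url = f"http://{account}.blob.core.windows.net/{container}/{blob}"
response = requests.get(url)
response.raise_for_status()

with open(blob, "wb") as f:
    f.write(response.content)   # save the downloaded blob locally
```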

Account > Container > Blob
Account: jared
  Container: images
    Blobs: PIC01.JPG, PIC02.JPG
  Container: movies
    Blob: MOV1.AVI

http://jared.blob.core.windows.net/images/PIC01.JPG

Number of Blob Containers
A storage account can have as many blob containers as will fit within the storage account limit.

Blob Container
A container holds a set of blobs
Access policies are set at the container level (private or publicly accessible)

Associate metadata with a container
Metadata are <name, value> pairs
Up to 8 KB of metadata per container
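A hedged sketch of enumerating the blobs in a container through the List Blobs REST operation, reusing the account and container names from the example above. It assumes the container's access policy allows public (anonymous) listing; otherwise the request must be signed:

```python
# List the blobs in a container via the REST interface.
import requests
import xml.etree.ElementTree as ET

account = "jared"
container = "images"

url = f"http://{account}.blob.core.windows.net/{container}?restype=container&comp=list"
response = requests.get(url)
response.raise_for_status()

# The listing comes back as XML; each <Blob><Name> element names one blob.
root = ET.fromstring(response.content)
for name in root.iter("Name"):
    print(name.text)      # e.g. PIC01.JPG, PIC02.JPG
```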

Block Blob
Targeted at streaming workloads
Each blob consists of a sequence of blocks
Each block is identified by a Block ID
Size limit: 200 GB per blob

Page Blob
Targeted at random read/write workloads
Each blob consists of an array of pages
Each page is identified by its offset from the start of the blob
Size limit: 1 TB per blob
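A hedged sketch of how a block blob is assembled: each block is uploaded with Put Block under a client-chosen Block ID, and Put Block List then commits the ordered list as the blob. The sas value is an assumption (a pre-generated shared access signature with write permission); real code could instead sign each request with the account key:

```python
# Upload a large file as a block blob in 4 MB blocks, then commit the list.
import base64
import requests

account, container, blob = "jared", "movies", "MOV1.AVI"
sas = "sv=..."                      # hypothetical, pre-generated SAS token
base = f"http://{account}.blob.core.windows.net/{container}/{blob}"

block_ids = []
with open("MOV1.AVI", "rb") as f:
    index = 0
    while True:
        chunk = f.read(4 * 1024 * 1024)                 # one 4 MB block
        if not chunk:
            break
        block_id = base64.b64encode(f"{index:06d}".encode()).decode()
        requests.put(f"{base}?comp=block&blockid={block_id}&{sas}",
                     data=chunk).raise_for_status()     # Put Block
        block_ids.append(block_id)
        index += 1

# Put Block List: only committed blocks become part of the blob.
body = ("<?xml version='1.0' encoding='utf-8'?><BlockList>"
        + "".join(f"<Latest>{b}</Latest>" for b in block_ids)
        + "</BlockList>")
requests.put(f"{base}?comp=blocklist&{sas}", data=body).raise_for_status()
```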

Account > Container > Blob > Block or Page
Account: jared
  Container: images
    Blobs: PIC01.JPG, PIC02.JPG
  Container: movies
    Blob: MOV1.AVI (stored as Block or Page 1, Block or Page 2, Block or Page 3, ...)

Queues connect producers to consumers
Scalable message paths
Provide loose synchronization
Any number of messages
One week of persistence
Maximum message size: 8 KB
Visibility timeout
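A hedged sketch of the queue REST operations behind this: a producer puts a message, a consumer retrieves it with a visibility timeout, then deletes it using the returned pop receipt. The sas query string and the queue name are assumptions; queue requests can also be signed with the account key:

```python
# Put, get (with visibility timeout), and delete one queue message via REST.
import base64
import requests
import xml.etree.ElementTree as ET
from urllib.parse import quote

account, queue = "jared", "workitems"
sas = "sv=..."                                   # hypothetical SAS token
base = f"http://{account}.queue.core.windows.net/{queue}/messages"

# Producer: enqueue a message (text is conventionally base64-encoded).
text = base64.b64encode(b"render PIC01.JPG").decode()
requests.post(f"{base}?{sas}",
              data=f"<QueueMessage><MessageText>{text}</MessageText></QueueMessage>"
              ).raise_for_status()

# Consumer: dequeue with a 30-second visibility timeout. The message stays
# in the queue but is hidden; if it is not deleted in time it reappears.
resp = requests.get(f"{base}?visibilitytimeout=30&{sas}")
resp.raise_for_status()
msg = ET.fromstring(resp.content).find(".//QueueMessage")
if msg is not None:
    message_id = msg.findtext("MessageId")
    pop_receipt = quote(msg.findtext("PopReceipt"))
    # ... process the work item, then delete it so it is not redelivered.
    requests.delete(f"{base}/{message_id}?popreceipt={pop_receipt}&{sas}"
                    ).raise_for_status()
```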

Provides Structured Storage

Massively Scalable Tables
Billions of entities (rows) and TBs of data
Can use thousands of servers as traffic grows
Data is replicated several times

Table
A storage account can contain many tables
Table names are scoped by the account
A table is a set of entities (i.e. rows)

Entity
A set of properties (columns)
Required properties: PartitionKey, RowKey and Timestamp
Entities are grouped into partitions (Partition 1, Partition 2, ...) by PartitionKey

Source: Windows Azure Table Programming Table Storage
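A hedged sketch of inserting an entity with the required PartitionKey and RowKey (Timestamp is maintained by the service). It targets the JSON flavour of the Table REST interface available in later service versions (the original interface used Atom), and the sas token, table name, and extra properties are assumptions:

```python
# Insert one table entity via the REST interface.
import json
import requests

account, table = "jared", "Photos"
sas = "sv=..."                                   # hypothetical SAS token
url = f"http://{account}.table.core.windows.net/{table}?{sas}"

entity = {
    "PartitionKey": "images",    # entities sharing this key form one partition
    "RowKey": "PIC01.JPG",       # unique within the partition
    "SizeBytes": 48211,          # any additional properties are just columns
    "Camera": "contoso-x100",
}

requests.post(url,
              data=json.dumps(entity),
              headers={"Content-Type": "application/json",
                       "Accept": "application/json;odata=nometadata",
                       "x-ms-version": "2015-12-11"}
              ).raise_for_status()
```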

A Windows Azure Drive is a Page Blob formatted as a single-volume NTFS Virtual Hard Drive (VHD)
Drives can be up to 1 TB

A VM can dynamically mount up to 8 drives
A Page Blob can only be mounted by one VM at a time for read/write

Remote access via the Page Blob
Upload a VHD to its Page Blob using the blob interface, then mount it as a Drive
Download the Drive through the Page Blob interface
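A hedged sketch of the upload path: a fixed-size VHD is written into a page blob with Put Page calls, after which a role instance can mount it as a Drive. Page ranges must be 512-byte aligned; the sas token, container, and file names are assumptions:

```python
# Create a page blob of the VHD's size, then fill it 4 MB at a time.
import os
import requests

account, container, blob = "jared", "drives", "data.vhd"
sas = "sv=..."                                    # hypothetical SAS token
base = f"http://{account}.blob.core.windows.net/{container}/{blob}"

size = os.path.getsize("data.vhd")                # must be a multiple of 512

# 1) Create the (empty) page blob with its final length.
requests.put(f"{base}?{sas}",
             headers={"x-ms-blob-type": "PageBlob",
                      "x-ms-blob-content-length": str(size),
                      "x-ms-version": "2015-12-11"}
             ).raise_for_status()

# 2) Write the contents with Put Page calls.
with open("data.vhd", "rb") as f:
    offset = 0
    while True:
        chunk = f.read(4 * 1024 * 1024)
        if not chunk:
            break
        requests.put(f"{base}?comp=page&{sas}",
                     data=chunk,
                     headers={"x-ms-range": f"bytes={offset}-{offset + len(chunk) - 1}",
                              "x-ms-page-write": "update",
                              "x-ms-version": "2015-12-11"}
                     ).raise_for_status()
        offset += len(chunk)
```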

A closer look
Web Role: receives HTTP requests through the load balancer; runs IIS hosting ASP.NET, WCF, etc., plus an agent
Worker Role: runs application code in a long-running main(), plus an agent
Both roles run in VMs managed by the fabric

Using queues for reliable messaging
To scale, add more instances of either role

1) The Web Role (ASP.NET, WCF, etc.) receives work
2) The Web Role puts the work in a queue
3) The Worker Role (main()) gets the work from the queue
4) The Worker Role does the work

(A sketch of the worker side of this loop follows.)
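A minimal sketch of the worker-role side of this pattern. The helpers get_message, delete_message, and do_work are hypothetical stand-ins for real queue calls (see the queue REST sketch earlier); the key point is that a message is deleted only after the work succeeds, so a crashed worker simply lets it reappear when the visibility timeout expires:

```python
import time

def get_message(visibility_timeout):
    """Hypothetical stand-in for the queue's Get Messages call."""
    ...

def delete_message(msg):
    """Hypothetical stand-in for Delete Message (needs the pop receipt)."""
    ...

def do_work(msg):
    """Application-specific processing of one work item."""
    ...

def worker_loop():
    while True:                                   # a worker role's main() never returns
        msg = get_message(visibility_timeout=30)  # 3) get work from the queue
        if msg is None:
            time.sleep(1)                         # back off while the queue is empty
            continue
        try:
            do_work(msg)                          # 4) do the work
            delete_message(msg)                   # delete only after success
        except Exception:
            # Leave the message alone: it becomes visible again when the
            # timeout expires, and another instance can retry it.
            pass
```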

Queues are the application glue
Decouple parts of the application so they are easier to scale independently
Resource allocation: different priority queues and backend servers
Mask faults in worker roles (reliable messaging)

Use inter-role communication for performance
TCP communication between role instances
Define your ports in the service model (see the socket sketch below)
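A generic sketch of direct TCP communication between role instances. The port number here is arbitrary; in a real service the port is declared as an internal endpoint in the service model, and instances look up each other's addresses through the role environment API rather than hard-coding them:

```python
# Plain TCP listener standing in for one role instance's internal endpoint.
import socket

def serve(port=10100):                       # hypothetical internal endpoint port
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("0.0.0.0", port))
    listener.listen(5)
    while True:
        conn, _ = listener.accept()
        with conn:
            data = conn.recv(4096)           # receive a request from a peer role
            conn.sendall(data)               # echo it back (placeholder for real work)
```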


Develop and run locally, at work or at home
Develop your app against the Development Fabric
Run it against the Development Storage
Keep versions in local source control
The application works locally before it is deployed

What is the value add?

Provide a platform that is scalable and available
Services are always running, with rolling upgrades/downgrades
Failure of any node is expected, so state has to be replicated
Failure of a role (app code) is expected, with automatic recovery
Services can grow to be large, so state management scales automatically
Handles dynamic configuration changes due to load or failure
Manages data center hardware: from CPU cores, nodes and racks to network infrastructure and load balancers

Key takeaways

Cloud services have specific design considerations
Always on, distributed state, large scale, fault tolerance
Scalable infrastructure demands a scalable architecture
Stateless roles and durable queues

Windows Azure frees service developers from many platform issues
Windows Azure manages both services and servers

Example: a BLAST service on Windows Azure

Web Role: web portal and web service for job registration
Job Management Role: scaling engine, job scheduler, and job registry (kept in an Azure Table)
Database updating Role: refreshes the NCBI databases
Worker roles: pick up tasks from a global dispatch queue
Azure Blob storage: BLAST databases, temporary data, etc.

(A simplified sketch of the dispatch step follows.)
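A hedged, simplified sketch of the dispatch step above: the job management role splits an input set of query sequences into fixed-size chunks and puts one message per chunk on the global dispatch queue for the workers. The enqueue helper and the chunk_size parameter are illustrative assumptions; chunk size is the "factoring" knob discussed in the lessons below:

```python
def enqueue(task):
    """Hypothetical stand-in for putting one task message on the dispatch queue."""
    ...

def dispatch_job(job_id, sequences, chunk_size=100):
    """Split a job's query sequences into chunk-sized tasks and enqueue them."""
    tasks = []
    for start in range(0, len(sequences), chunk_size):
        chunk = sequences[start:start + chunk_size]
        task = {"job": job_id,
                "range": (start, start + len(chunk)),  # which queries this task covers
                "queries": chunk}
        enqueue(task)                 # one queue message per chunk of work
        tasks.append(task)
    return tasks                      # e.g. recorded in the job registry
```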

Always design with failure in mind
- On large jobs it will happen, and it can happen anywhere

Factoring work into optimal sizes has large performance impacts
- The optimal size may change depending on the scope of the job

Test runs are your friend
- Blowing $20,000 of computation is not a good idea

Make ample use of logging features
- When failure does happen, it's good to know where

Cutting 10 years of computation down to 1 week is great!
- Little cloud development headaches are probably worth it

Thank you!
