About HyperStratus
Silicon Valley-based cloud computing consultancy
Founded by executives with deep experience in corporate IT, enterprise software, and global consultancy
We assist clients in establishing cloud computing strategies, cloud application architectures, system selection, and implementations
We also provide cloud computing training and workshops
Topics Covered
Introduction to Cloud Architecture
Basic Amazon AWS Concepts and Considerations
AWS Cloud Application Design and Best Practices
The elimination of an up-front commitment by Cloud users, thereby allowing companies to start small and increase hardware resources only when there is an increase in their needs
The ability to pay for use of computing resources on a short-term basis as needed (e.g., processors by the hour and storage by the day) and release them as needed
IT agility, as systems can be sized to meet demand: as load scales, system resources are easily obtained to ensure SLAs can be met
No longer face the tradeoff between overprovisioning (wasted capital) and underprovisioning (lost users)
Move IT payments from CAPEX to OPEX. Pay only for actual resources consumed. Tie IT cost to business benefit received
[Figure: spectrum of cloud deployment models. Labels: Less Structured, More Control, Shared, Isolated. Models: Public, Virtual Private Cloud, Internal Private Cloud, Private. Vendors shown: IBM, HP, Cisco/VMware, Microsoft, 3Tera, Eucalyptus]
Grows from 1MM to 100+ MM insurance claims/day in one week
Traditional solution: $750K new hardware + $30K/month maintenance/hosting
Cloud solution: $600/month Amazon Web Services
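The gap between the two solutions can be checked with simple arithmetic. A minimal sketch, using the dollar figures from the case study; the 12-month comparison window is an assumption made here for illustration, not something the case study states.

```python
# Dollar figures come from the case study; the 12-month horizon is an assumption.
TRADITIONAL_HARDWARE = 750_000   # one-time hardware purchase, USD
TRADITIONAL_MONTHLY = 30_000     # maintenance/hosting per month, USD
CLOUD_MONTHLY = 600              # Amazon Web Services per month, USD


def total_cost(up_front: int, monthly: int, months: int) -> int:
    """Total spend over a period: up-front cost plus recurring cost."""
    return up_front + monthly * months


months = 12
traditional = total_cost(TRADITIONAL_HARDWARE, TRADITIONAL_MONTHLY, months)
cloud = total_cost(0, CLOUD_MONTHLY, months)
print(f"Traditional: ${traditional:,}  Cloud: ${cloud:,}")
```

The cloud figure has no up-front term at all, which is the CAPEX-to-OPEX shift described earlier.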
Cloud Taxonomy
Source: Christofer Hoff, Cloud Security Alliance Security Guidance for Critical Areas of Focus in Cloud Computing, Page 22
IaaS/PaaS in Detail
Components Providers
Amazon AWS EC2 is an IaaS environment with RESTful Web Services API to allocate & manage resources
IaaS/PaaS in Detail
Components Providers
AWS SQS, SimpleDB, and CloudFront are PaaS Middleware
Google AppEngine and Microsoft Azure are PaaS AppServers
IaaS Taxonomy :
AWS Components
VM Images - Gold-Master Amazon Machine Images (AMI)
VM Compute - EC2 Instance Types
VM Storage - Default Local Disks, EBS, S3
Network - Regions, Availability Zones, Virtual NICs
IPAM (Internet Protocol Address Management) - Dynamic internal & external IP Addresses and fixed Elastic IP Addresses
DNS (Domain Name System) - Automatic AWS DNS name assignment
IaaS Taxonomy :
AWS Components (cont)
Security - Network Firewall Security Groups, S3 file ACLs
IAM/Auth (Identity Access Mgmt) - AWS Credentials & X.509 Certificates
VMM (Virtual Machine Mgmt) - Self-Discovery, Auto-Configuration
LB & Transport (Load Balancing) - AWS Auto-Scaling
API - Web API, Command-Line Tools
Mgmt - AWS Mgmt Console, Firefox Elasticfox plug-in
PaaS Taxonomy :
AWS Components
Availability Zones are distinct datacenter locations that are engineered to be insulated from failures in other Zones and that provide inexpensive, low-latency network connectivity to other Availability Zones in the same Region
E.g. us-east-1a, us-east-1b
Traffic between Availability Zones in a single Region stays on AWS-controlled redundant infrastructure
All traffic between Regions travels across public Internet infrastructure spanning multiple Tier-1 networks
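The zone names in the example are a Region name plus a single letter. A minimal sketch of using that naming pattern to tell which network path two instances would share; the function names are hypothetical, and the parsing assumes zone names always follow the pattern shown in the example.

```python
def region_of(zone: str) -> str:
    """Strip the trailing zone letter: 'us-east-1a' -> 'us-east-1'."""
    # Assumption: a zone name is a Region name followed by one letter.
    if len(zone) < 2 or not zone[-1].isalpha() or not zone[-2].isdigit():
        raise ValueError(f"unexpected Availability Zone name: {zone!r}")
    return zone[:-1]


def traffic_path(zone_a: str, zone_b: str) -> str:
    """Describe the network path between two zones, per the slide above."""
    if region_of(zone_a) == region_of(zone_b):
        return "AWS-controlled redundant infrastructure"
    return "public Internet"


print(traffic_path("us-east-1a", "us-east-1b"))
print(traffic_path("us-east-1a", "eu-west-1a"))
```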
AWS uses several benchmarks and tests to manage the consistency and predictability of the performance of an EC2 Compute Unit (EC2-CU)
Over time, there may be several different types of physical commodity hardware underlying EC2 instances, but EC2-CU performance should remain constant
Small (AWS name m1.small): 32-bit platform; Moderate I/O; 170GB unformatted instance storage (160GB plus 10GB root partition, 1 spindle)
Large (AWS name m1.large): 64-bit platform; High I/O; 910GB unformatted instance storage (2 x 450 GB plus 10GB root partition, 3 spindles); $0.34 per hour; $2978 a year, or $1961 a year Reserved
Extra Large (AWS name m1.xlarge): 15 GB memory (no swap); 64-bit platform; High I/O; $0.68 per hour; $5957 a year, or $3922 a year Reserved
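The yearly figures above match the $0.34 and $0.68 rates multiplied by the 8,760 hours in a non-leap year, i.e. an instance that runs continuously. A minimal sketch of that arithmetic, assuming those rates are per instance-hour; the Reserved figures are taken from the slide as given, since the slide does not say how they were derived.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years


def annual_on_demand_cost(hourly_rate: float) -> int:
    """Yearly cost in whole dollars of one instance running all year."""
    return round(hourly_rate * HOURS_PER_YEAR)


def reserved_saving(on_demand_annual: int, reserved_annual: int) -> float:
    """Fraction of the on-demand yearly cost saved by the Reserved price."""
    return (on_demand_annual - reserved_annual) / on_demand_annual


large = annual_on_demand_cost(0.34)
xlarge = annual_on_demand_cost(0.68)
print(large, xlarge)
print(f"{reserved_saving(large, 1961):.0%} saved with Reserved (Large)")
```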
High-CPU Medium (AWS name c1.medium): 5 EC2-CU (2 virtual cores with 2.5 EC2 Compute Units each); 32-bit platform; Moderate I/O; 370 GB unformatted instance storage (360 GB plus 10 GB root partition, 1 spindle)
c1.xlarge: 20 EC2-CU (8 virtual cores with 2.5 EC2 Compute Units each); High I/O; $0.68 per hour; $5957 a year, or $3922 a year Reserved
EC2, EBS, S3
EC2 Instance Default Local Storage: ephemeral virtual disks that are an integral part of the EC2 VM instance
Range from 170GB to 1.8TB total space, 1 to 5 disks
Elastic Block Storage (EBS): additional persistent disk volumes that can be attached to and mounted on a running EC2 VM
1TB max per volume, default quota of 20 volumes
S3 File Storage: reliable file storage accessible by web URL (e.g. <bucket>.s3.amazonaws.com/file_1.mpg)
Buckets are created in user-assigned Regions (e.g. us-east-1, eu-west-1)
Unlimited number of index folders and files (i.e. objects) per bucket, 5GB max per file
Files in a bucket are replicated to dispersed Zones in the bucket's Region
All Default Local instance storage devices (i.e. non-EBS EC2 volumes) are ephemeral, and all data on them is lost when the instance is terminated (or crashes and cannot be rebooted). Use S3, EBS, or SDB for permanent data.
Analogous to the file system lifecycle of a Linux Live-CD that uses RAM drives. However, default instance storage data is retained on reboot.
This is a major EC2 constraint that must be taken into consideration in an application's design.
EC2 saves a bootable VM root image as an Amazon Machine Image (AMI).
An AMI is digitally signed and encrypted by the owner using a private X.509 key. AWS has a copy of the corresponding public X.509 certificate for decrypting an AMI at EC2 instance launch time.
An AMI is equivalent to a Gold Master image of the configured VM for an EC2 instance.
Multiple EC2 instances can be launched from the same AMI.
Access to each S3 file is controlled by its own Access Control List (ACL).
ACL allows READ, WRITE, and FULL CONTROL (includes access to ACL) privileges on:
Everyone
Authenticated Users (only valid AWS users)
A list of individual AWS users or groups
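The grantee and privilege combinations can be pictured as a small lookup. This is a simplified in-memory model for illustration only, not the S3 API; the class, constants, and user names are hypothetical, and treating FULL CONTROL as implying READ and WRITE is an assumption of the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

EVERYONE = "Everyone"
AUTHENTICATED = "AuthenticatedUsers"
PERMISSIONS = {"READ", "WRITE", "FULL_CONTROL"}


@dataclass
class ObjectAcl:
    """Per-file ACL: maps a grantee (group or user id) to its privileges."""
    grants: Dict[str, Set[str]] = field(default_factory=dict)

    def grant(self, grantee: str, permission: str) -> None:
        if permission not in PERMISSIONS:
            raise ValueError(f"unknown permission: {permission}")
        self.grants.setdefault(grantee, set()).add(permission)

    def allows(self, user: Optional[str], permission: str) -> bool:
        """user is None for an anonymous (unauthenticated) request."""
        grantees = [EVERYONE]
        if user is not None:
            grantees += [AUTHENTICATED, user]
        for grantee in grantees:
            held = self.grants.get(grantee, set())
            # Assumption: FULL_CONTROL implies every other privilege.
            if permission in held or "FULL_CONTROL" in held:
                return True
        return False


acl = ObjectAcl()
acl.grant(EVERYONE, "READ")          # public download
acl.grant("alice", "FULL_CONTROL")   # hypothetical owner
print(acl.allows(None, "READ"), acl.allows("bob", "WRITE"))
```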
PaaS Messaging/Queuing Component : AWS SQS
Highly reliable message queuing service with built-in redundancy within user-assigned Regions
Messages accessible from anywhere via Web API
Up to 8 KB of Unicode data per message
Messages can be retained in queues for up to 4 days
Messages can be sent and read simultaneously, but FIFO is not guaranteed
Queues can be securely shared with other AWS accounts and anonymously. Queue sharing can also be restricted by IP address and time of day.
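Two of these constraints shape application code directly: the 8 KB message limit and the lack of a FIFO guarantee. A minimal sketch of how a producer and consumer might cope, assuming the limit is measured on the UTF-8 encoded body and that the application adds its own sequence numbers; both are assumptions of this sketch, and the function names are hypothetical.

```python
from typing import Iterable, List, Tuple

MAX_MESSAGE_BYTES = 8 * 1024  # 8 KB limit quoted on the slide


def fits_in_message(body: str) -> bool:
    """True if the body, encoded as UTF-8, is within the size limit."""
    return len(body.encode("utf-8")) <= MAX_MESSAGE_BYTES


def in_sequence(messages: Iterable[Tuple[int, str]]) -> List[str]:
    """Restore order from (sequence_number, body) pairs read in any order."""
    return [body for _, body in sorted(messages, key=lambda pair: pair[0])]


received = [(2, "resize image"), (1, "upload image"), (3, "notify user")]
print(in_sequence(received))
print(fits_in_message("x" * 9000))
```

Larger payloads would typically be stored elsewhere (for example in S3) with only a reference passed through the queue.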
Enhanced MyISAM-like database service
Simple web services interface to create and store multiple data sets and query your data
Data is automatically indexed
Data is stored in a Region and automatically replicated to dispersed Zones
Requests originating from an application running in the same Amazon Region will have near-LAN latency.
PaaS Database Component : AWS SimpleDB Beta (cont)
Similar to MyISAM with enhanced features
No SQL grammar support
No table JOIN
Simple WHERE criteria
100 domains (tables) quota per account, max 10GB per domain, max 256 attributes (columns) per row, max 1KB data per attribute (cell)
Typically used to store App logs, EC2 instance configurations, application state, instance status, analytics, indexes to S3 data
Scale-out is as simple as creating new domains, rather than building out new servers.
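The quotas and the "scale out by adding domains" idea can be sketched together: check a row against the limits before writing it, and spread rows across several domains by hashing the row name. This is plain Python, not the SimpleDB API; the function names and the app_data_ domain prefix are hypothetical, and measuring the 1KB limit in UTF-8 bytes is an assumption.

```python
import hashlib
from typing import Dict, List

MAX_DOMAINS = 100               # domains per account
MAX_ATTRIBUTES_PER_ROW = 256    # attributes (columns) per row
MAX_ATTRIBUTE_BYTES = 1024      # 1KB of data per attribute (cell)


def validate_row(attributes: Dict[str, str]) -> List[str]:
    """Return a list of quota violations; an empty list means the row fits."""
    problems = []
    if len(attributes) > MAX_ATTRIBUTES_PER_ROW:
        problems.append(f"{len(attributes)} attributes exceeds {MAX_ATTRIBUTES_PER_ROW}")
    for name, value in attributes.items():
        if len(value.encode("utf-8")) > MAX_ATTRIBUTE_BYTES:
            problems.append(f"attribute {name!r} exceeds {MAX_ATTRIBUTE_BYTES} bytes")
    return problems


def domain_for(row_name: str, domain_count: int, prefix: str = "app_data_") -> str:
    """Pick one of domain_count domains by hashing the row name."""
    if not 1 <= domain_count <= MAX_DOMAINS:
        raise ValueError("domain_count must be between 1 and 100")
    digest = hashlib.md5(row_name.encode("utf-8")).hexdigest()
    return f"{prefix}{int(digest, 16) % domain_count}"


print(domain_for("instance-0042", 4))
print(validate_row({"status": "running", "log": "x" * 2000}))
```

Note that changing domain_count moves most rows to a different domain, so a real deployment would need a migration plan or a more stable hashing scheme.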
Focus on your needs, not on hardware specs. As your needs change, so should your resources.
On-Demand Provisioning: Ask for what you need, exactly when you need it. Get rid of it when you don't need it.
Scalability: Design should allow for resources to scale up or down depending on usage needs.
No Up-Front Costs: No contracts or long-term commitments. Pay only for what you use, but design for the possibility of enhanced resource usage.
Dynamism: Each machine instance must be capable of dynamically identifying its configuration and relationship to other resources in the system.
Best Practices:
Don't just build apps in the cloud
[Figure: business tier with back-up instances]
Source: GigaSpace, Practical Guide for Developing Enterprise Application on the Cloud
Don't simply port traditional Apps to the Cloud
Traditional application stacks are architected in functional silos
Each silo has its own machines, network, management, and support
DB
Source: GigaSpace, Practical Guide for Developing Enterprise Application on the Cloud
Re-factor to use standardized VM containers.
Each instance should use self-discovery, be self-configurable, and be network independent
Use cloud standardized Messaging & DB when possible
Leverage inherent EBS replication and snapshots for DBMS
If it is OK to recover only from the most recent backup, consider restoring data from S3 at boot-up and backing up current data to S3 at shutdown.
If not OK, use EBS attached volumes for all persistent file data.
DBMS should always use EBS volumes
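The restore-at-boot, back-up-at-shutdown pattern can be sketched with ordinary directories. Here a local directory stands in for the S3 bucket so the example runs anywhere; a real instance would call an S3 client at these two points instead. All names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def restore_at_boot(backup_dir: Path, data_dir: Path) -> None:
    """Copy the most recent backup, if any, into the instance's data directory."""
    data_dir.mkdir(parents=True, exist_ok=True)
    if backup_dir.exists():
        shutil.copytree(backup_dir, data_dir, dirs_exist_ok=True)


def backup_at_shutdown(data_dir: Path, backup_dir: Path) -> None:
    """Copy current data to the backup location, overwriting older copies."""
    shutil.copytree(data_dir, backup_dir, dirs_exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    bucket = Path(tmp) / "bucket"        # stand-in for an S3 bucket
    first = Path(tmp) / "instance-1"     # ephemeral disk of the first instance
    second = Path(tmp) / "instance-2"    # ephemeral disk of its replacement

    restore_at_boot(bucket, first)       # first boot: nothing to restore yet
    (first / "state.txt").write_text("claims processed: 42")
    backup_at_shutdown(first, bucket)

    restore_at_boot(bucket, second)      # replacement instance picks up the state
    restored_text = (second / "state.txt").read_text()

print(restored_text)
```

Anything written after the last clean shutdown is lost if the instance crashes, which is why the slide reserves this pattern for data that can tolerate recovery from the most recent backup.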
If only small chunks of persistent storage are needed for each instance, consider using EBS volumes exported on EC2 NFS servers.
Attach an Elastic IP to Internet-facing EC2 instances (e.g. the HAProxy load-balancer instance)
Use dynamic DNS registration of the EC2 instance internal IP address, or use SDB
EC2 instances should only use the internal IP address for communicating with each other (free!).
Best Practices:
Design for Failure
"Everything fails, all the time" (Werner Vogels, CTO, Amazon.com)
Avoid single points of failure
Assume everything fails, and design backwards
Design for failure and your App won't fail
Create multiple DBMS slaves across Availability Zones
Use real-time monitoring (Amazon CloudWatch or RightScale)
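With replicas spread across Availability Zones, a client can treat any single failure as routine and move on to the next replica. A minimal sketch of that selection loop; the host names, the third zone, and the health-check callback are hypothetical placeholders for whatever monitoring signal is actually available.

```python
from typing import Callable, Iterable, Tuple

Replica = Tuple[str, str]  # (host name, Availability Zone)


def pick_replica(replicas: Iterable[Replica],
                 is_healthy: Callable[[Replica], bool]) -> Replica:
    """Return the first replica that passes its health check."""
    for replica in replicas:
        try:
            if is_healthy(replica):
                return replica
        except Exception:
            # A failing health check is treated the same as an unhealthy replica.
            continue
    raise RuntimeError("no healthy replica in any Availability Zone")


# Hypothetical read replicas, one per zone.
REPLICAS = [("db-1", "us-east-1a"), ("db-2", "us-east-1b"), ("db-3", "us-east-1c")]

# Simulate an outage of the whole us-east-1a zone.
chosen = pick_replica(REPLICAS, lambda replica: replica[1] != "us-east-1a")
print(chosen)
```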
Best Practices:
Design for Scalability
A scalable architecture is critical to take advantage of a scalable infrastructure
No central point of data storage contention
Shared Nothing
Sharding
Distributed Caching
Design for Scalability : Use AWS Elastic Features
Use Load Balancing on multiple layers: either your own (e.g. HAProxy EC2 instance) or AWS Elastic Load Balancing
Use Cloud monitoring systems: either your own (e.g. CollectD) or AWS CloudWatch
Use Auto-scaling technology (free with CloudWatch)
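Monitoring and auto-scaling fit together as a simple control rule: read a metric, compare it against thresholds, and adjust the instance count within fixed bounds. A minimal sketch of such a rule; the thresholds, bounds, and one-instance step are hypothetical values chosen for illustration, not AWS defaults.

```python
def desired_instances(current: int, avg_cpu_percent: float,
                      scale_up_at: float = 70.0, scale_down_at: float = 30.0,
                      minimum: int = 2, maximum: int = 10) -> int:
    """Return the instance count to aim for, given average CPU load."""
    if avg_cpu_percent > scale_up_at:
        target = current + 1      # add capacity under heavy load
    elif avg_cpu_percent < scale_down_at:
        target = current - 1      # release capacity when idle
    else:
        target = current          # inside the comfort band: no change
    # Never drop below the minimum or grow past the maximum.
    return max(minimum, min(maximum, target))


for load in (85.0, 50.0, 10.0):
    print(load, desired_instances(4, load))
```

Keeping the minimum at two or more instances also serves the earlier advice to avoid single points of failure.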
Source: RightScale
Best Practices:
Build Loosely Coupled Systems
Use independent components
Design everything as a Black Box with well-defined inputs and outputs
Use subsystem de-coupling for Hybrid models
Use load-balanced clusters of Black Boxes to maximize plug & play
Loose Coupling:
Use Message Queues
[Figure: Tight Coupling (Controllers A, B, and C connected directly) versus Loose Coupling using Queues (queues Q1, Q2, and Q3 feeding clusters of Controllers A, B, and C)]
Use a message queue system such as Amazon SQS or Gearman to pass along requests
Each message queue consumer can be a cluster of EC2 instances
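The queue-based arrangement can be sketched in-process, with Python's standard queue.Queue standing in for Amazon SQS or Gearman and one thread standing in for each controller cluster. The stage functions are hypothetical placeholders; the point is that no controller calls another directly.

```python
import queue
import threading


def run_stage(inbox, outbox, work):
    """Consume items until a None sentinel arrives, then pass the sentinel on."""
    while True:
        item = inbox.get()
        if item is None:
            if outbox is not None:
                outbox.put(None)
            break
        result = work(item)
        if outbox is not None:
            outbox.put(result)


q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
results = []

stages = [
    threading.Thread(target=run_stage, args=(q1, q2, lambda x: x + 1)),   # Controller A
    threading.Thread(target=run_stage, args=(q2, q3, lambda x: x * 2)),   # Controller B
    threading.Thread(target=run_stage, args=(q3, None, results.append)),  # Controller C
]
for stage in stages:
    stage.start()

for request in (1, 2, 3):
    q1.put(request)
q1.put(None)  # signal that no more requests are coming

for stage in stages:
    stage.join()
print(results)
```

Because each stage only knows its inbox and outbox, any stage could be replaced by several workers reading the same queue, which is the "cluster of EC2 instances" case.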
Best Practices:
Design for Dynamism
Don't assume health or fixed location of components
Use designs that are resilient to reboot and relaunch
Bootstrap your instances based on self-discovery (e.g. EC2 Metadata API)
Store configurations in SimpleDB to bootstrap instances
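Self-discovery at boot can be sketched as a handful of HTTP reads against the EC2 instance metadata endpoint. A minimal sketch, assuming the plain request style at http://169.254.169.254/latest/meta-data/ (instances configured to require session tokens need an extra step not shown here); the function names and the returned dictionary layout are hypothetical. The fetch function is injectable so the logic can be exercised off-instance.

```python
import urllib.request
from typing import Callable, Dict

METADATA_BASE = "http://169.254.169.254/latest/meta-data/"


def fetch_metadata(path: str, timeout: float = 2.0) -> str:
    """Read one metadata value; only reachable from inside an EC2 instance."""
    with urllib.request.urlopen(METADATA_BASE + path, timeout=timeout) as response:
        return response.read().decode("utf-8")


def discover_self(fetch: Callable[[str], str] = fetch_metadata) -> Dict[str, str]:
    """Collect the facts an instance needs to configure itself at boot."""
    return {
        "instance_id": fetch("instance-id"),
        "internal_ip": fetch("local-ipv4"),
        "zone": fetch("placement/availability-zone"),
    }


# Off-instance demonstration with canned values instead of real HTTP calls.
canned = {
    "instance-id": "i-12345678",
    "local-ipv4": "10.0.0.12",
    "placement/availability-zone": "us-east-1a",
}
identity = discover_self(fetch=canned.__getitem__)
print(identity)
```

The instance ID could then serve as the lookup key for configuration stored in SimpleDB.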
Best Practices:
Security in every component
Use a de-perimeterized security model
Create distinct network Security Groups for each Amazon EC2 instance cluster
Use group-based network rules for controlling access between components
Restrict external access to specific IP ranges
Encrypt data at rest in Amazon S3
Encrypt data in transit (SSL)
Consider encrypted EBS file systems for sensitive data
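Group-based rules and IP-range restrictions can be modelled as one rule list per cluster, where each rule's source is either another group's name or a CIDR block. This is a simplified in-memory model for reasoning about the design, not the EC2 API; the group names, ports, and address ranges are hypothetical.

```python
import ipaddress
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class Rule:
    port: int
    source: str  # a CIDR block such as "203.0.113.0/24", or another group's name


def is_allowed(rules: Iterable[Rule], port: int, source_ip: str,
               source_groups: FrozenSet[str] = frozenset()) -> bool:
    """True if any rule admits this port from this address or group."""
    for rule in rules:
        if rule.port != port:
            continue
        if rule.source in source_groups:
            return True                      # group-based rule matched
        try:
            network = ipaddress.ip_network(rule.source)
        except ValueError:
            continue                         # source is a group name, not a CIDR
        if ipaddress.ip_address(source_ip) in network:
            return True                      # IP-range rule matched
    return False


# Hypothetical rules for a database cluster's Security Group.
DB_TIER_RULES = [
    Rule(port=3306, source="web-tier"),       # only the web cluster may query
    Rule(port=22, source="203.0.113.0/24"),   # admin access from one office range
]

print(is_allowed(DB_TIER_RULES, 3306, "10.0.0.5", frozenset({"web-tier"})))
print(is_allowed(DB_TIER_RULES, 3306, "10.0.0.9"))
```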
Best Practices:
Leverage Storage Solutions
Amazon S3: large static objects
Amazon CloudFront: content distribution
Amazon SimpleDB: simple data indexing/querying
Amazon EC2 local disk drive: transient data
Amazon EBS: RDBMS persistent storage + S3 snapshots
Best Practices:
Leverage Best AWS Mgt Tools
Management of any but the simplest cloud application configurations is very cumbersome without advanced tools.
RightScale is a script-based instance provisioning, monitoring, & auto-scaling system
Supports collaborative sharing & reuse of scripts
Kaavo Infrastructure & Middleware On Demand (IMOD) is an Application Centric Management System
Manages a multitier cloud application system as though it were a monolithic application
Best Practices:
Don't fear cloud constraints
Think out of the box and leverage cloud features to solve EC2 constraints
Components expect static IP addresses?
Use a boot script for software reconfiguration from SimpleDB, or use Dynamic DNS
CloudBerry Explorer Windows S3 file upload/download application, slightly better than S3 Organizer
RightScale Lifecycle Mgmt Pattern
RightScale uses an Injection Pattern to push individual command scripts into a running EC2 instance or an entire deployed cluster of instances
Boot Scripts are automatically run at instance launch after the OS boot_finished event
Operational Scripts are run during automated event handling or manual operations
Decommissioning Scripts are automatically run prior to instance termination
www.hyperstratus.com
White Paper: Migrating Applications to the Cloud: An Amazon Web Services Case Study
Cloud Computing Workshops (via Unitek Education) Jorge.Noa@hyperstratus.com