
Scaling out MySQL:

Hardware today
and tomorrow

Jeremy Cole, Eric Bergen


{jeremy,eric}@provenscaling.com
Overview
• A look at hardware out there today

• What’s important for MySQL?


The big questions
What about 64-bit?
• Make absolutely everything 64-bit
• Every server you buy now will have 64-bit CPUs
• Except in a few corner cases, it won’t hurt, but may
not help
• For MySQL servers, it will absolutely make your life
easier
• Caveat: If you use third-party software, it may not
work properly due to library issues etc.
How many cores?
• MySQL has problems scaling on many-core CPUs
• Peter Zaitsev and Mark Callaghan have addressed
the issues many times in blog posts and
conference sessions
• We normally recommend dual dual-core or dual
quad-core
• Unless you are highly concurrent and CPU-bound,
dual dual-core at a faster clock speed should
perform better than dual quad-core at a slower
clock speed
How much memory?
• As much as you can!
• Memory is quite cheap these days, as 4GB DIMMs
have come down in price by about 80% or more
• Typical servers can hold up to 32GB, go for it!
Shared storage?
• This is usually the biggest question: Should I buy
this big expensive SAN, or should I put some disks
in RAID in each server?
• Shared storage places a lot of trust in a single
system
• Reliability can be more difficult to achieve when a
single system failure affects multiple other
systems
• Storage shared across many tasks will make it very
difficult to provide reliable service to MySQL
• I/O latency is much higher on SAN or NAS systems
Which vendor?
• Major server vendors: Dell, HP, IBM, Sun
• Smaller server vendors: SuperMicro, Rackable,
Silicon Mechanics, iX Systems, etc.

• Bigger vendors can generally provide equipment
much faster in a pinch
• Bigger vendors will have an easier time providing
the same type of machines over a longer period of
time
• Smaller vendors may be more willing to work with
you on custom configurations or special needs
Acronym Soup
RAID
• “Redundant Array of Inexpensive Disks”
• Different RAID levels: 0, 1, 5, 10 are common
• For databases, 5 and 10 are the most common
• Can be connected via IDE, SATA, SCSI, SAS
• Can be internal or external (“shelf”)
• Can be implemented in hardware (LSI, 3ware,
Adaptec, etc.) or software (Linux kernel, etc.)
RAID: Common Levels
• RAID 0 - Striping
• RAID 1 - Mirroring

• RAID 10 (1+0) - Mirroring + Striping


• RAID 0+1 - Striping + Mirroring

• RAID 5 - Distributed Parity
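
The capacity trade-off between these levels can be sketched with a little arithmetic. This is a minimal illustration, not a sizing tool: the function name `usable_capacity_gb` and the eight-disk, 73GB example are assumptions chosen for illustration, and it assumes identical disks with no hot spares.

```python
def usable_capacity_gb(level: str, disks: int, disk_gb: int) -> int:
    """Usable capacity of an array of identical disks (hypothetical helper)."""
    if level == "0":
        # Striping only: every disk holds unique data, no redundancy.
        return disks * disk_gb
    if level == "1":
        # Mirroring: every disk holds a copy of the same data.
        if disks < 2:
            raise ValueError("RAID 1 needs at least 2 disks")
        return disk_gb
    if level in ("10", "0+1"):
        # Mirroring combined with striping: half the disks hold copies.
        if disks < 4 or disks % 2:
            raise ValueError("RAID 10 / 0+1 needs an even number of disks, 4 or more")
        return (disks // 2) * disk_gb
    if level == "5":
        # Distributed parity: one disk's worth of space goes to parity.
        if disks < 3:
            raise ValueError("RAID 5 needs at least 3 disks")
        return (disks - 1) * disk_gb
    raise ValueError(f"unknown RAID level: {level}")


for level in ("0", "10", "5"):
    print(f"RAID {level}: {usable_capacity_gb(level, 8, 73)} GB usable")
```

With eight 73GB disks this gives 584GB for RAID 0, 292GB for RAID 10, and 511GB for RAID 5. Capacity is only one axis; the levels also differ in write cost and in how many disk failures they survive.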


DAS
• “Direct-Attached Storage”
• Usually refers to a set of many RAIDed disks
• RAID isn’t necessarily a prerequisite to being DAS,
you could have a JBOD DAS
• “Direct-Attached” because it’s attached to the host
that will use the disks, not to a “headend” or other
interim host
JBOD
• “Just a Bunch of Disks”
• Disks that are not RAIDed or part of a SAN or NAS
system
• The OS will see each individual disk and is
responsible for combining them if necessary
(using e.g. software RAID or LVM)
BBWC, TB[B]U
• “Battery-Backed Write Cache”
• “Transportable Battery [Backup] Unit”
• A cache to hold writes while queuing them to be
written to the actual disks
• Usually present in RAID cards
• Almost always present in SAN or other solutions
BBWC:
Write Back vs. Through
• A BBWC can be in “write back” or “write through”
mode
• “Write Back” uses the cache without writing the
data to physical disk immediately (very dangerous
without working battery) -- but drastically increases
performance on sequential, individually committed
writes (such as binary logs, InnoDB logs)
• “Write Through” requires data to be written to the
physical disk before acknowledging writes -- but is
slow
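
To get a feel for why write through is slow for individually committed sequential writes, here is a rough back-of-the-envelope sketch. It rests on a simplifying assumption that is not from the slides: each commit to the log has to wait for about one full platter rotation before the next sequential write can land, with no group commit. The function name is hypothetical.

```python
def max_commits_per_second_write_through(rpm: int) -> float:
    """Rough upper bound under the one-commit-per-rotation assumption."""
    # Rotations per second = rotations per minute / 60.
    return rpm / 60.0


for rpm in (7200, 10000, 15000):
    bound = max_commits_per_second_write_through(rpm)
    print(f"{rpm} RPM: about {bound:.0f} commits/sec at best")
```

Under that assumption even a 15k RPM disk tops out at roughly 250 individually committed writes per second, which is the ceiling a write-back cache removes by acknowledging the write from cache.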
SAN
• “Storage Area Network”
• Generally either FC (Fibre Channel) or iSCSI (SCSI
over IP, often via Gigabit Ethernet)
• Provides a volume to the host as a block device
• SANs are typically shared by many machines, but
each volume on a SAN is normally only used by
one host (“initiator”) at a time
• SANs may provide the host with the ability to take
copy-on-write snapshots
NAS
• “Network Attached Storage”
• Generally NFS and/or CIFS
• Provides the host a view of files via a high-level
export protocol
• NASes are typically shared by many machines, and
a single volume may be shared by many hosts
• NASes coordinate access to files
Out with the old:
PATA, SCSI
• “Parallel ATA”
• Older host interface, primarily used in desktop
machines

• “SCSI” :) (ok, technically “Small Computer System
Interface”)
• Older host interface, primarily used in servers
• Allows for hot swapping
• High pin count, requires terminators, etc.
In with the new:
SATA, SAS
• “Serial ATA”
• New version of ATA using a serial protocol at 1.5
Gbps and 3.0 Gbps
• Very low pin count, simple cables, hot swappable

• “Serial Attached SCSI”


• Same basic host interface as SATA
• SAS hosts can connect to SATA disks seamlessly
• SAS has additional features, such as multiple
attachment, and a richer command set
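
The SATA line rates above translate into payload bandwidth roughly as follows. This sketch assumes the 8b/10b line encoding used by SATA at these speeds (10 bits on the wire per 8 bits of data) and ignores protocol overhead; the function name is made up for illustration.

```python
def sata_payload_mb_per_s(line_rate_gbps: float) -> float:
    """Approximate payload bandwidth for a given SATA line rate."""
    # 8b/10b encoding: only 8 of every 10 line bits carry data.
    data_bits_per_s = line_rate_gbps * 1e9 * 8 / 10
    # Convert bits/sec to megabytes/sec (decimal megabytes).
    return data_bits_per_s / 8 / 1e6


for rate in (1.5, 3.0):
    print(f"{rate} Gbps link: about {sata_payload_mb_per_s(rate):.0f} MB/s")
```

So the two generations work out to about 150 MB/s and 300 MB/s per link, before any command overhead.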
SSD
• “Solid State Disk”
• Uses flash memory to store data
• Capable of very low latency for random “seek”
• Commercially available versions are much better
suited to high random read environments than
random writes
• Kevin Burton did lots of research on available
SSDs, conclusion:
  - Not fast enough for high random write environments
    yet
  - InnoDB needs work to really take advantage
MySQL Stuff
Typical MySQL
Requirements
• Assuming high write needs, fairly large database

• BBWC to allow InnoDB to commit without disk
head movement
• Lots of memory to allow for a large InnoDB buffer
pool
• Storage with low latency and high random write
throughput
• Decent (but not awesome) CPUs
Memory Allocation
• Assuming an InnoDB-only system

• We normally recommend allocating system memory
minus perhaps 2GB to the InnoDB buffer pool
• Very little memory needed for anything else -
really!
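
As a minimal sketch of the rule of thumb above, assuming an InnoDB-only server: the helper name and the 2GB default reserve are illustrative, and the printed line uses the `innodb_buffer_pool_size` server variable as it could appear in an option file.

```python
GIB = 1024 ** 3


def innodb_buffer_pool_bytes(system_memory_gb: int, reserved_gb: int = 2) -> int:
    """System memory minus a small reserve (hypothetical helper)."""
    if system_memory_gb <= reserved_gb:
        raise ValueError("not enough memory left for a buffer pool")
    return (system_memory_gb - reserved_gb) * GIB


size = innodb_buffer_pool_bytes(32)
# Emit a line suitable for the [mysqld] section of an option file.
print(f"innodb_buffer_pool_size = {size // GIB}G")
```

For a 32GB machine this prints `innodb_buffer_pool_size = 30G`. Treat the reserve as a starting point and adjust it if the host runs anything besides MySQL.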
Shared vs. Independent
• Shared storage systems can be used in
combination with Linux HA to achieve failover
• Independent storage can be used in combination
with MySQL replication to achieve failover
• On shared storage systems, failover will require a
recovery of MySQL databases
• On independent storage, failover can be nearly
instantaneous
An Example Machine
• Dell PowerEdge 2950
• Dual Quad Core E5430 @ 2.66GHz
• 32GB 667MHz RAM
• 8 x 73GB 15k RPM 2.5” SAS
• Dual power supplies
• Rack mount kit

• List price: $8400


• Real price: ~$6k
• Power consumption: typical 440W (3.83A @ 115V)
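
The power figure can be checked, and extended to a yearly energy estimate, with a short calculation. This is a sketch only: it assumes the machine draws its typical 440W continuously, and the function names are invented for illustration.

```python
def current_amps(watts: float, volts: float) -> float:
    """Current drawn at a given power and supply voltage (I = P / V)."""
    return watts / volts


def annual_energy_kwh(watts: float) -> float:
    """Energy used in a year at constant draw (assumption: 24x365 operation)."""
    hours_per_year = 24 * 365
    return watts * hours_per_year / 1000


print(f"{current_amps(440, 115):.2f} A at 115V")
print(f"{annual_energy_kwh(440):.1f} kWh per year")
```

This reproduces the 3.83A figure on the slide and works out to roughly 3854 kWh per machine per year, which is worth multiplying by your server count and power price when comparing configurations.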
Special Hardware
Kickfire
• Execute queries on a SQL-processing custom chip
• Massive access to large memories

• Very cool tech!


Violin Memory
• Half a TB of DRAM in a 2U
• Accessible as a block device

• Very cool tech as well!


High-speed
Interconnects
• InfiniBand
• Dolphin Interconnect

• Both very interesting for clustered systems,
providing low-latency, high-throughput network
access
• Software has to be written specifically to use either