
My Data Warehouse Dream Machine: Building the Ideal Data Warehouse
Michael R. Ault, Oracle Guru
Texas Memory Systems, Inc.

Introduction
Before we can begin to discuss what is needed in a data warehouse (DWH) system, we
need to pin down exactly what a data warehouse is and what it is not. Many companies
accumulate a large amount of data, and when they reach a certain threshold of size or
complexity they feel they have a data warehouse. In the pure sense of the term, however,
they have a large database, not a data warehouse. Unfortunately, a large non-conforming
database masquerading as a data warehouse will probably have more severe performance
issues than a database designed from the start to be a data warehouse.

Most experts will agree that a data warehouse has to have a specific structure, though the
exact structure may differ. For example, the star schema is a classic data warehouse
structure. A star schema uses a central “fact” table surrounded by “dimension” tables.
The central fact table contains the key values from each of the dimension tables that
correspond to a particular combination of dimensions. For example, the sales of pink
sweaters in Pittsburgh, PA, on Memorial Day in 2007 are the intersection of the STORES,
ITEMS, SUPPLIERS, and DATE dimensions with a central SALES table, as shown in Figure 1.

Figure 1: Example Star Schema
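For readers who like to see the structure as DDL, the following is a minimal sketch of the
schema in Figure 1; the table and column names are assumptions for illustration, not taken
from any particular system.

    CREATE TABLE stores    (store_id NUMBER PRIMARY KEY, store_name VARCHAR2(100), city VARCHAR2(50), state VARCHAR2(2));
    CREATE TABLE items     (item_id NUMBER PRIMARY KEY, item_name VARCHAR2(100), color VARCHAR2(30));
    CREATE TABLE suppliers (supplier_id NUMBER PRIMARY KEY, supplier_name VARCHAR2(100));
    CREATE TABLE dates     (date_id NUMBER PRIMARY KEY, calendar_date DATE, holiday_name VARCHAR2(50));

    -- The central fact table carries the dimension keys plus the measures
    CREATE TABLE sales (
      store_id    NUMBER NOT NULL REFERENCES stores,
      item_id     NUMBER NOT NULL REFERENCES items,
      supplier_id NUMBER NOT NULL REFERENCES suppliers,
      date_id     NUMBER NOT NULL REFERENCES dates,
      quantity    NUMBER,
      sale_amount NUMBER(12,2)
    );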


Generally speaking, data warehouses will be either of the Star or Snowflake (essentially a
collection of related Star schemas) design.

Of course, we need to understand the IO and processing characteristics of the typical
DWH system before we describe an ideal system. We must also understand that the ideal
system will depend on the projected size of the DWH. For this paper we will assume
a size of 300 gigabytes, which is actually small compared with many companies' terabyte
or multi-terabyte data warehouses.

IO Characteristics of a Data Warehouse


The usual form of access for a data warehouse in Oracle will be through a bitmap join
across the keys stored in the fact table, followed by a specific retrieval of both the related
data from the dimension table and the data at the intersection of the keys in the fact table.
This outside-to-inside (dimension to fact) access path is called a Star Join.
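As a hedged illustration of that access path, assuming the example schema sketched earlier:
bitmap indexes on the fact table's foreign keys, combined with the star transformation, let
Oracle resolve the dimension filters first and then probe the fact table through a bitmap
join.

    -- Bitmap indexes on the fact table foreign keys enable the star transformation
    CREATE BITMAP INDEX sales_store_bx ON sales (store_id);
    CREATE BITMAP INDEX sales_item_bx  ON sales (item_id);
    CREATE BITMAP INDEX sales_date_bx  ON sales (date_id);

    ALTER SESSION SET star_transformation_enabled = TRUE;

    -- "Pink sweaters in Pittsburgh on Memorial Day 2007" style star join
    SELECT /*+ STAR_TRANSFORMATION */
           SUM(s.sale_amount) AS total_sales
    FROM   sales s
           JOIN stores st ON st.store_id = s.store_id
           JOIN items  it ON it.item_id  = s.item_id
           JOIN dates  dt ON dt.date_id  = s.date_id
    WHERE  st.city = 'Pittsburgh'
    AND    st.state = 'PA'
    AND    it.item_name = 'Sweater'
    AND    it.color = 'Pink'
    AND    dt.holiday_name = 'Memorial Day'
    AND    EXTRACT(YEAR FROM dt.calendar_date) = 2007;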

Access to a data warehouse will often be through index scans followed by table scans,
generating large IO profiles. Access will be to several large objects at once: the indexes,
the dimensions, and the fact table(s). Touching this many objects simultaneously leads to
large numbers of input and output operations per second (IOPS).

For example, in a 300 gigabyte TPC-H test (TPC-H is a DSS/DWH benchmark) the IO rate
can exceed 200,000 IOPS. A typical disk drive (15K RPM, 32-148 gigabytes, Fibre Channel)
will deliver a peak of around 200 random IOPS. It is not unusual to see a ratio of
provided storage capacity to actual database size of 30-40 to 1 to ensure that the required
number of IOPS can be reached to satisfy the performance requirements of a large DWH. The
IO profile for a 300 GB system with the temporary tablespace on solid state disk (SSD) is
shown in Figure 2.

[Chart: IOPS (log scale, 1 to 100,000) versus elapsed seconds, with series for Perm IO, Temp IO, and Total IO.]
Figure 2: IOPS for a 300GB TPC-H on Hard Drives


The same 300GB TPC-H run with all tablespaces (data, index, and temporary) on SSD is
shown in Figure 3.

[Chart: IOPS (log scale, 1 to 1,000,000) versus elapsed seconds, with series for D-DI, D-TI, and D-Total.]
Figure 3: IOPS from a 300GB TPC-H on SSD

Note that the average IOPS in Figure 2 hover around 1,000-2,000, with peak loads (mostly
to the SSD temporary file) of close to 10,000 IOPS, while the SSD-based test hovers around
10,000 IOPS with peak loads nearing 100,000 IOPS or more. The numbers in the charts in
Figures 2 and 3 were derived from the GV$FILESTAT and GV$TEMPSTAT views in a 4-node RAC
cluster. It was assumed that the raw IOPS as recorded by Oracle underwent a 16-fold
reduction due to IO grouping by the HBA and IO interfaces. The hard drive storage
consisted of two 14-disk sets of 15K RPM, 144 GB drives configured as two RAID 5 arrays.
The SSD subsystem consisted of a single 1 terabyte RamSan-500, a single 128 GB
RamSan-400, and a single 128 GB RamSan-320.
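For reference, a minimal sketch of the kind of query that can be sampled at intervals
against these views to build such a profile is shown below; successive snapshots are
differenced to get IOPS per interval, and the 16-fold grouping adjustment mentioned above
is applied afterward. Only standard GV$FILESTAT and GV$TEMPSTAT columns are used.

    -- Cumulative physical reads and writes against permanent datafiles, per instance
    SELECT inst_id, SUM(phyrds + phywrts) AS perm_io
    FROM   gv$filestat
    GROUP  BY inst_id;

    -- The same figure for temporary files
    SELECT inst_id, SUM(phyrds + phywrts) AS temp_io
    FROM   gv$tempstat
    GROUP  BY inst_id;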

To get 100,000 IOPS, a modern disk drive based system may require up to 500 disk
drives, not allowing for mirroring (RAID). To put this in perspective: to properly spread
the IOPS in a 300 GB data warehouse (actually close to 600 GB when indexes are added)
you will require 300*40, or 12,000 GB (12 terabytes), of storage to meet IO requirements.
At 200 IOPS per disk that maps to 500 disk drives for 100,000 IOPS if no caching or other
acceleration technologies are utilized. In actual tests, EMC reached 100,000 IOPS with
495 disks (3 CX30 cabinets' worth) in a non-mirrored RAID configuration
(http://blogs.vmware.com/performance/2008/05/100000-io-opera.html). The EMC results
are shown in Figure 4. Assuming linear scaling as disks (and HBAs, cabinets, and
controllers) are added, they should be able to get close to 200,000 IOPS with 782
disks, which is reasonably close to our 1,000 disk estimate. However, their latency
would likely be 15 milliseconds or more if the trend shown in the graph in Figure 4
continues.
Figure 4: EMC IOPS and Latency (From:
http://blogs.vmware.com/performance/2008/05/100000-io-opera.html)
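As a worked version of the sizing arithmetic above (using the assumed figures of a 40-to-1
capacity ratio and 200 IOPS per drive):

    SELECT 300 * 40           AS storage_gb_to_spread_io,   -- 12,000 GB (12 TB)
           CEIL(100000 / 200) AS drives_for_100k_iops,      -- about 500 drives
           CEIL(200000 / 200) AS drives_for_200k_iops       -- about 1,000 drives
    FROM   dual;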

Since most database systems such as Oracle use a standard IO profile, increasing the
amount of data in the warehouse means the available IOPS and bandwidth must increase as
well. Most companies project that their data warehouse will double in size within 3 years
or less.

Of course, the number of IOPS available will determine the response time for your queries.
If you can afford high response times then your IOPS can be lower; conversely, if you need
low response times then your IOPS must be higher. In today's business environment, the
sooner you can get answers to typical DWH queries, the sooner you can make strategic
business decisions. This leads to the corollary that the DWH should have the highest
possible IOPS. Figure 5 shows a comparison of IOPS and latency for various SAN systems.

Figure 5: IOPS Comparison for Various SANs


(Source: www.storageperformance.org)
As you can see, the IOPS and latency numbers from an SSD-based SAN are better than
those for more expensive hard disk based SANs. Even with disk form-factor SSDs, an
SSD system designed from the ground up for performance is still superior. As shown in
Figure 6, the latency from the new EMC Flash based drives still cannot compete with
SSDs built from the start to perform.

Figure 6: EMC SSD Response time: 1-2 MS EMC HDD Response time: 4-8 MS
(Source: ”EMC Tech Talk: Enterprise Flash Drives”, Barry A. Burke, Chief
Strategy Officer, Symmetrix Product Group, EMC Storage Division, June 25, 2008)

Processing Characteristics of a Data Warehouse System


DWH systems usually provide summarizations of data: total sales, total uses, or the
number of people doing X at a specific time and place. This use of aggregation generally
leads to requirements for large amounts of sort memory and temporary tablespace. Thus the
capability to rapidly sort, summarize, and characterize data is a key component of a data
warehouse system.
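As an example of the kind of summarization involved (again using the assumed schema from
Figure 1), a rollup such as the one below drives large sort and aggregation work areas
and, when those overflow memory, heavy temporary tablespace IO:

    -- Sales summarized by state and date, with subtotals and a grand total
    SELECT st.state,
           dt.calendar_date,
           SUM(s.sale_amount) AS total_sales,
           COUNT(*)           AS sale_count
    FROM   sales s
           JOIN stores st ON st.store_id = s.store_id
           JOIN dates  dt ON dt.date_id  = s.date_id
    GROUP  BY ROLLUP (st.state, dt.calendar_date);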

In all TPC-H tests we see the use of large numbers of CPUs, large core memories, and
parallel query and partitioning options to allow DWH systems to process the large
amounts of data. Most TPC-H tests are run using clustered systems of one type or
another. For a 300 GB TPC-H we usually see a minimum of 32 CPUs spread evenly
amongst several servers.
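A hedged sketch of how partitioning and parallel query are typically combined on the fact
table follows; the partitioning scheme and the degree of 32 (matching the CPU count above)
are illustrative assumptions.

    -- A partitioned variant of the SALES fact table: range partitioning on the
    -- date key allows partition pruning and parallel scans of a subset of the data.
    CREATE TABLE sales_part (
      store_id NUMBER, item_id NUMBER, supplier_id NUMBER, date_id NUMBER,
      quantity NUMBER, sale_amount NUMBER(12,2)
    )
    PARTITION BY RANGE (date_id) (
      PARTITION sales_2006 VALUES LESS THAN (20070101),
      PARTITION sales_2007 VALUES LESS THAN (20080101),
      PARTITION sales_max  VALUES LESS THAN (MAXVALUE)
    )
    PARALLEL 32;

    -- Queries can then run with a matching degree of parallelism
    SELECT /*+ PARALLEL(s, 32) */ SUM(s.sale_amount)
    FROM   sales_part s;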

Technologies such as blade servers offer great flexibility but also tie us to a specific
vendor and blade type. In addition, expansion will eventually be limited by the blade
enclosure and the underlying bus structures of its backplane.

What Have We Found So Far?


So far we have defined the following general characteristics for an ideal DWH system:

1. Large data storage capacity


2. Able to sustain large numbers of IOPS
3. Able to support high degrees of parallel processing (supports large numbers of
CPUs)
4. Large core memory for each server/CPU
5. Able to easily increase data size and IOPS capacity
6. Able to easily increase processing capability

The above requirements call for the following in an Oracle environment:


1. Real Application Clusters (RAC)
2. Partitioning
3. Parallel Query

Given the Oracle requirements, the system server requirements would be:

1. Servers with multiple high-speed CPUs
2. Multiple servers
3. A high-speed interconnect, such as InfiniBand, between the servers
4. Multiple high-bandwidth connections to the IO subsystem

Given the above requirements, the IO subsystem should:

1. Be easily expandable
2. Provide high numbers of low latency IOPS
3. Provide high bandwidth

Notice we haven't talked about network requirements for a DWH system. Generally
speaking, DWH systems will have a small number of users in comparison to an online
transaction processing system, so a single 1 Gigabit Ethernet connection is generally
sufficient for user access.

Software
Of course, since we are working with Oracle it is assumed that we will stay with Oracle.
But for long-term planning, the idea that we might one day move away from Oracle
should be entertained. Therefore our system should support multiple solutions, should the
need arise. In the days when processors seemed to be increasing in speed on a daily basis
and we were jumping from 8 to 16 to 32 to 64 bit processing, the idea of keeping a system
much beyond three years was virtually unheard of, unless it was a large SMP machine or
a mainframe.

While processors are still increasing in speed, we aren't seeing the huge leaps we used to.
Now we are seeing the core wars: each manufacturer seems to be placing more and more
cores in a single chip footprint, as the dual- and quad-core chips already available show.
Of course, it seems that as the number of cores on a single chip increases, the speed of
the individual cores actually decreases. For example, a single-core chip may run at 4 GHz
while a dual-core chip may only run at 2 GHz per core. However, software will usually take
advantage of whatever CPUs are offered, so as far as the CPUs and their related servers
are concerned, simply choosing the best high-speed processors on a supported platform
lets us run almost any software our operating system will support.

Of course disk-based systems used to be fairly generic and only required reformatting or
erasure of existing files to be used for a new database system. Now we are seeing the
introduction of database specific hardware such as the Exadata cell from Oracle that
requires Oracle parallel query software at the cell level in order to operate. Needless to
say, using technology that locks you into a specific software vendor may be good for the
software vendor but it may not be best in the long run for a company that buys it.

Let’s Build the Ideal System


Let’s start with the servers for the ideal system.

Data Warehouse Servers


We want the system to be flexible as far as upgrades are concerned. While blade systems
have a lot to offer, they lock you into a specific blade cabinet and blade type, so we
will use individual rack-mount servers instead. The use of individual rack-mount servers
gives us the flexibility to change our servers without having to re-purchase support
hardware such as the blade cabinet and other support blades.

The server I suggest is the Dell PowerEdge R905. The R905 supports four quad-core 3.1 GHz
Opteron 8393 processors, arguably the fastest quad-core chips and all-around best
processors available for the money. The complete specifications for the suggested
quad-socket, 16-core configuration are shown in Appendix A; it includes dual 1 Gb NICs and
two dual-channel 4 Gb Fibre Channel connections. Also included is a 10 GbE NIC for the
Real Application Clusters interconnect. Since we will want the capability to parallelize
our queries, we will also want more than 16 CPUs, so for our ideal configuration I suggest
starting with 2 servers, giving us 32 processor cores at 3.1 GHz. To maximize our SGA
sizes for the Oracle instances, I suggest the 32 gigabyte memory option with the fastest
memory available. At currently available pricing this 2-server configuration will cost
around $36K with 3 years of maintenance.
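A hedged example of how the 32 GB per node might be divided between the Oracle memory
areas follows; the values are assumptions to be tuned for the workload, and a large PGA
target matters as much as the SGA for sort-heavy DWH work.

    -- Per-instance memory and parallelism targets on a 32 GB node (illustrative values)
    ALTER SYSTEM SET sga_target           = 16G SCOPE = SPFILE;
    ALTER SYSTEM SET pga_aggregate_target = 10G SCOPE = SPFILE;
    ALTER SYSTEM SET parallel_max_servers = 128 SCOPE = SPFILE;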

IO Subsystem
Call me a radical but rather than go the safe route and talk about a mixed environment of
disks and solid state storage, I am going to go out on a limb and propose that all active
storage be on a mix of Flash and DDR memory devices. We will have disks, but they will
be in the backup system. Figure 7 shows the speed/latency pyramid with Tier Zero at the
peak.

Figure 7: The Speed/Latency Pyramid

First, let’s look at what needs to be the fastest, lowest latency storage, Tier Zero.

Tier Zero
As a Tier Zero device I propose a RAM-based solid state system such as the RamSan-440
to provide storage for temporary tablespaces, redo logs, and undo segments, as well as
any indexes that will fit after the write-dependent objects have been placed. The prime
index candidates for the Tier Zero area would be the bitmap indexes used to support the
star or snowflake schema for fast star join processing. I propose a set of four
RamSan-440s, each with 512 gigabytes of available storage, in a mirrored configuration to
provide a 1 terabyte Tier Zero storage area. At current costs this would run $720K. The
RamSan-440 provides up to 600,000 IOPS at 0.015 millisecond latency. Now let's move on to
the next fastest storage, Tier 1.

Tier 1
Tier 1 storage will hold the data and index areas of the data warehouse. Tier 1 of this
ideal system will be placed on Flash memory. A Flash system such as the RamSan-620
will provide up to 5 terabytes of Flash with RAM-based cache in front of it to enhance
write performance.

We would utilize two 2 TB RamSan-620s in a mirrored configuration. With our 300 gigabytes
of data and around 250 gigabytes of indexes, this would provide 2 terabytes of mirrored
space to allow for growth and reliability. At current costs this would be $202K (2 TB
option with 3 years of maintenance and 1 extra dual-port FC card). The RamSan-620 provides
250,000 IOPS with a worst-case 0.25 millisecond read latency.
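A minimal sketch of placing objects across the two tiers follows; the mount points,
tablespace names, and sizes are assumptions for illustration (ASM disk groups could be
used instead of file system paths), and the bitmap index rebuilt below is the example
index created earlier.

    -- Tier Zero (RAM SSD): temporary tablespace, redo, undo, and bitmap indexes
    CREATE TEMPORARY TABLESPACE temp_t0
      TEMPFILE '/tier0/oradata/dwh/temp01.dbf' SIZE 200G;

    ALTER DATABASE ADD LOGFILE GROUP 5
      ('/tier0/oradata/dwh/redo05a.log') SIZE 4G;

    CREATE UNDO TABLESPACE undo_t0
      DATAFILE '/tier0/oradata/dwh/undo01.dbf' SIZE 100G;

    CREATE TABLESPACE t0_index
      DATAFILE '/tier0/oradata/dwh/t0_index01.dbf' SIZE 100G;

    ALTER INDEX sales_store_bx REBUILD TABLESPACE t0_index;

    -- Tier 1 (Flash): data and b-tree index tablespaces
    CREATE BIGFILE TABLESPACE dwh_data
      DATAFILE '/tier1/oradata/dwh/dwh_data01.dbf' SIZE 600G;

    CREATE BIGFILE TABLESPACE dwh_index
      DATAFILE '/tier1/oradata/dwh/dwh_index01.dbf' SIZE 400G;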

Assuming we could add enough HBAs, we can achieve 2.9 million low latency IOPS
from our Tier 0 and Tier 1 systems using the above configuration.

Tier 2
Tier 2 storage would be our backup area. I suggest using compression and de-duplication
hardware/software to maximize the amount of backup capability while minimizing the
amount of storage needed. The manufacturer that comes to the top of the pile for this type
of backup system is DataDomain. The DD120 system would fulfill our current needs for
backup on this project system. The list price for the DataDomain DD120 appliance is
$12.5K.

All of this tier talk is fine, but how are we going to hook it all together?

Switches
As a final component we need to add in some Fibre Channel switches, probably 4-16
channel 4 GB switches to give us redundancy in our pathways. A QLogic SanBox 5600Q
provides 16-4GB ports. Four 5600Q’s would give us the needed redundancy and provide
the needed number of ports at a cost of around $3,275.00 each for a total outlay on SAN
switches of $13.1K. The cost of the XG700 10 gigbit Ethernet 16 port switch from
Fujitsu is about $6.5K, so our total outlay for all switches is $19.6K

Final Cost of the Dream Machine


Let’s run up the total bill for the data warehouse dream machine:

Servers:                       $36,484.00
RamSan-440:                   $720,000.00
RamSan-620:                   $202,000.00
DataDomain:                    $12,500.00
Switches:                      $19,600.00
Misc. (cables, rack, etc.):     $1,500.00
Total:                        $992,084.00

So for $992K we could get a data warehouse system capable of over 2,000,000 IOPS, with
32 3.1 GHz processor cores, a combined memory capacity of 64 gigabytes, and 3 terabytes of
online, low latency storage that is database and (generally speaking) OS agnostic, is
expandable, and provides its own backup, de-duplication, and compression services. Not
bad.

What about Oracle?


I am afraid Oracle licenses are a bit confusing, depending on what existing licenses you
may have, time of year, where you are located, and how good a negotiator your buyer is.
The best I can do is an estimate based on sources such as TPC-H documents
(www.tpc.org). A setup similar to what we have outlined with RAC and Parallel Query
will cost about $440K as of June 3, 2009 for three years of maintenance and initial
purchase price. I took out the advanced compression option since we really don’t need it.

So adding Oracle licenses into the cost brings our total for the system and software to
slightly less than 1.5 million dollars ($1,442,084.00).

How does this compare to the Oracle Database Machine?
To match the ideal solution's IOPS would require 741 Exadata cells, at roughly $18 million
in hardware cost and $44 million in software cost, plus support and switches. Since that
would be an unfair comparison, we will compare on data volume instead of IOPS, even though
it puts the ideal solution at a disadvantage.

The Exadata-based ODM has a quoted hardware price of $600K; for a full system with license
costs it could run anywhere from 2.3 to over 5 million dollars. However, this is for a
fully loaded machine with 14-42 terabytes of usable capacity. On the storage side, we
could probably get by with 2 Exadata cells offering between 1 and 2 terabytes of storage
each. This would provide 2 terabytes of high-speed (with the 1 TB disks) mirrored Tier
Zero and Tier 1 space. We would still need a second set of Exadata cells or some other
solution for the backup Tier 2. Each cell with the high-speed disks is only capable of
2,700 IOPS, so our total will be 5,400 IOPS.

Essentially we would be using a little more than a quarter-size ODM, cutting back to four
8-CPU servers from the full-size ODM total of eight 8-CPU servers, and only using 4
instead of 14 Exadata cells (2 for the system, 2 for backup).

The best price estimates I have seen come from the website
http://www.dbms2.com/2008/09/30/oracle-database-machine-exadata-pricing-part-2/, which has
been blessed by various ODM pundits such as Kevin Closson. Table 1 summarizes the
spreadsheet found at that location. Note that I have added in new costing figures from
TPC-H documents, which may be lower than posted prices for the per-disk license costs for
the Exadata cells and the Oracle software. The actual price is somewhere between what I
have here and essentially double the license cost per disk for the Exadata cells, which
would take their total from $240K to $480K. The general Oracle software pricing is based
on a 12 named-user license scheme rather than per-processor. The additional cost of
support for the Exadata cells is somewhere between $1,100 and $2,200 per cell, which adds
an additional $12-24K to the three-year cost.

Config                                  Partial ODM     Full ODM
Exadata Servers (cells)                 4               14
Small DB Servers (4 core)               0               0
Medium DB Servers (8 core)              4               8
Large DB Servers (16 core)              0               0
Total DB Servers                        4               8
Total Cores                             32              64

1 Exadata Server cost                   24,000          24,000
Total storage cost                      96,000          336,000
1 DB server cost                        30,000          30,000
Total DB servers cost                   120,000         240,000
Other items                             50,000          74,000
Total HW price                          266,000         650,000

Software price
Exadata server software per drive       5,000           5,000
Exadata server software per cell        60,000          60,000
Total Exadata server software           240,000         840,000

Oracle licenses (per processor)
Oracle Database, Enterprise Edition     11,875          11,875
RAC option                              5,750           5,750
Partitioning option                     2,875           2,875
Advanced compression option             2,875           2,875
Tuning pack                             3,500           3,500
Diagnostics pack                        3,500           3,500
Total price per processor               30,375          30,375
After dual-core discount (50%)          15,188          15,188
Oracle License Cost                     $486,016        $972,032

Total Software price                    $725,760        $1,812,032
Total System price                      $992,769        $2,902,032

Table 1: ODM Solution Cost Projections

So it appears that the Exadata solution will cost less initially (by about $450K) for
fewer total IOPS (5,400 versus 2,000,000), with higher latency (5-7 ms versus
0.015-0.25 ms) and less capable servers. However, some of the latency issues are mitigated
by the advanced query software that is resident in each cell.

When looking at costs, remember that support for the Exadata cell software is paid yearly
in perpetuity, so that cost needs to be added into the overall picture.
Green Considerations
The energy consumption of an Exadata cell is projected to be 900 watts; for 4 cells that
works out to 3.6 kW, compared to 600 W for each of the four RamSan-440 systems and about
325 W for each of the two RamSan-620s, for a total of 3.05 kW. Over one year of operation
the difference in energy and cooling costs alone could be close to $2K. Once all of the
license and long-term ongoing costs are considered, the ideal solution provides dramatic
savings.

Note that aggressive marketing from Oracle sales may do much to reduce the hardware and
initial licensing costs for the Exadata solution; however, the ongoing costs will still be
a significant factor. Expansion of the Exadata solution is by sets of 2 cells for
redundancy. The ideal solution can expand to 5 terabytes on each of the RamSan-620s by
adding cards, and additional RamSan-620s can be added in mirrored pairs of 2 TB base
units.

You must use ASM and RMAN as a backup solution with the Exadata Database
Machine. If your projected growth exceeds the basic machine we have proposed then you
will have to add in the cost of more Exadata Cells and associated support costs, driving
the Exadata solution well above and beyond the RamSan solution in initial and ongoing
costs.

Remember that with the Exadata solution you must run Oracle 11g version 11.1.0.7 at a
minimum, and for now it is limited to the Oracle-supported Linux OS, so you have also
given up the flexibility of the ideal solution.

Score Card
Let’s sum up the comparison between the ideal system and the Oracle Data Warehouse
Machine. Look at the chart in Table 2.

Consideration Ideal Configuration Oracle DWHM


OS Flexible Yes No
DB Flexible Yes No
Expandable Yes Yes
High IO Bandwidth Yes Yes
Low Latency Yes No
High IOPS Yes No
Initial cost Higher Best
Long term cost Good Poor

Table 2: Comparison of Ideal with Oracle DWHM

From the chart in Table 2 we can see that the ODM is on par with the ideal system on only
a few of the considerations. However, the ODM does offer great flexibility and
expandability as long as you stay within the supported OS and with Oracle 11g databases.
The ODM also offers better performance than standard hard disk arrays within its
limitations.

Summary
In this paper I have shown what I would consider to be the ideal data warehouse system
architecture. One thing to remember is that this type of system is a moving target, as
technologies and definitions of what a data warehouse is supposed to be change. An ideal
architecture allows high IO bandwidth, low latency, and a high degree of parallel
operation, and provides flexibility as far as future database systems and operating
systems are concerned. The savvy system purchaser will weigh all factors before selecting
a solution that may block future movement to new operating systems or databases as they
become available.
Appendix A: Server Configuration
PowerEdge R905
Starting Price $20,224
Instant Savings $1,800
Subtotal $18,424
Preliminary Ship Date: 7/30/2009
Date: 7/23/2009 9:36:05 AM Central Standard Time
Catalog items (Description; Product Code; Qty; SKU):

PowerEdge R905: 2x Quad Core Opteron 8393SE, 3.1Ghz, 4x512K Cache, HT3; 90531S; 1; [224-5686]
Additional Processor: Upgrade to Four Quad Core Opteron 8393SE 3.1GHz; 4PS31; 1; [317-1156]
Memory: 32GB Memory, 16X2GB, 667MHz; 32G16DD; 1; [311-7990]
Operating System: Red Hat Enterprise Linux 5.2AP, FI x64, 3yr, Auto-Entitle, Lic & Media; R52AP3; 1; [420-9802]
Backplane: 1X8 SAS Backplane, for 2.5 Inch SAS Hard Drives only, PowerEdge R905; 1X825HD; 1; [341-6184]
External RAID Controllers: Internal PERC RAID Controller, 2 Hard Drives in RAID 1 config; PRCR1; 1; [341-6175][341-6176]
Primary Hard Drive: 73GB 10K RPM Serial-Attach SCSI 3Gbps 2.5-in HotPlug Hard Drive; 73A1025; 1; [341-6095]
2nd Hard Drive: 73GB 10K RPM Serial-Attach SCSI 3Gbps 2.5-in HotPlug Hard Drive; 73A1025; 1; [341-6095]
Rack Rails: Dell Versa Rails for use in Third Party Racks, Round Hole; VRSRAIL; 1; [310-6378]
Bezel: PowerEdge R905 Active Bezel; BEZEL; 1; [313-6069]
Power Cords: 2x Power Cord, NEMA 5-15P to C14, 15 amp, wall plug, 10 feet / 3 meter; 2WL10FT; 1; [310-8509]
Integrated Network Adapters: 4x Broadcom® NetXtreme II 5708 1GbE Onboard NICs with TOE; 4B5708; 1; [430-2713]
Optional Feature Upgrades for Integrated NIC Ports: LOM NICs are TOE, iSCSI Ready (R905/805); ISCSI; 1; [311-8713]
Optional Network Card Upgrades: Intel PRO 10GbE SR-XFP Single Port NIC, PCIe-8; 10GSR; 1; [430-2685]
Optional Optical Drive: DVD-ROM Drive, Internal; DVD; 1; [313-5884]
Documentation: Electronic System Documentation, OpenManage DVD Kit with DMC; EDOCSD; 1; [330-0242][330-5280]
Hardware Support Services: 3Yr Basic Hardware Warranty Repair: 5x10 HWOnly, 5x10 NBD Onsite; U3OS; 1; [988-0072][988-4210][990-5809][990-6017][990-6038]
Appendix B: RamSan 440 Specs
RamSan-440 Details

RamSan-440 highlights:

• The World's Fastest Storage®


• Over 600,000 random I/Os per second
• 4500 MB/s random sustained external throughput
• Full array of hardware redundancy to ensure availability
• IBM Chipkill technology protects against memory errors up to and including loss of a memory
chip.
• RAIDed RAM boards protect against the loss of an entire memory board.
• Exclusive Active Backup® software constantly backs up data without any performance
degradation. Other SSDs only begin to back up data after power is lost.
• Patented IO2 (Instant-On Input Output) software allows data to be accessed during a recovery.
Customers no longer have to wait for a restore to be completed before accessing their data.

I/Os Per Second


600,000

Capacity
256-512 GB

Bandwidth
4500 MB per second

Latency
Less than 15 microseconds

Fibre Channel Connection

• 4-Gigabit Fibre Channel (2-Gigabit capable)


• 2 ports standard; up to 8 ports available
• Supports point-to-point, arbitrated loop, and switched fabric topologies
• Interoperable with Fibre Channel Host Bus Adaptors, switches, and operating systems

Management

• Browser-enabled system monitoring, management, and configuration


• SNMP supported
• Telnet management capability
• Front panel displays system status and provides basic management functionality
• Optional Email home feature

LUN Support

• 1 to 1024 LUNs with variable capacity per LUN


• Flexible assignment of LUNs to ports
• Hardware LUN masking
Data Retention

• Non-volatile solid state disk


• Redundant internal batteries (N+1) power the system for 25 minutes after power loss
• Automatically backs up data to Flash memory modules at 1.4 GB/sec

Reliability and Availability

• Chipkill technology protects data against memory errors up to and including loss of an entire
memory chip
• Internal redundancies
o Power supplies and fans
o Backup battery power (N+1)
o RAIDed RAM Boards (RAID 3)
o Flash Memory modules (RAID 3)
• Hot-swappable components
o Five Flash Memory modules (front access)
o Power supplies
• Active BackupTM
o Active BackupTM mode (optional) backs up data constantly to internal redundant Flash
Memory modules without impacting system performance, making shutdown time
significantly shorter.
• IO2
o IO2 allows instant access to data when power is restored to the unit and while data is
synced from Flash backup.
• Soft Error Scrubbing
o When a single bit error occurs on a read, the RamSan will automatically re-write the
data to memory thus scrubbing soft errors. Following the re-write the system re-reads
to verify the data is corrected.

Backup Procedures
Supports two backup modes that are configurable per system or per LUN:

• Data Sync mode synchronizes data to redundant internal Flash Memory modules before shutdown
or with power loss
• Active BackupTM mode (optional) - backs up data constantly to internal redundant Flash Memory
modules without impacting system performance.

Size

7” (4U) x 24”

Power Consumption (peak)


650 Watts

Weight (maximum)
90 lbs
Appendix C: RamSan620 Specifications

RamSan-620 highlights:
• 2-5 TB SLC Flash storage
• 250,000 IOPS random sustained throughput
• 3 GB/s random sustained throughput
• 325 watts power consumption
• Lower cost
Features
• A Complete Flash storage system in a 2U rack
• Low overhead, low power
• High performance and high IOPS, bandwidth, and capacity
• Standard management capabilities
• Two Flash ECC correction levels
• Super Capacitors for orderly power down
• Easy installation
• Fibre Channel or Infiniband connectivity
• Low initial cost of ownership
• Easy incremental addition of performance and capacity
I/Os Per Second: 250,000 read and write
Capacity: 2-5 TB of SLC Flash
Bandwidth: 3 GB per second
Latency (writes): 80 microseconds
Latency (reads): 250 microseconds
Fibre Channel Connection
• 4-Gigabit Fibre Channel
• 2 ports standard; up to 8 ports available
• Supports point-to-point and switched fabric topologies
• Interoperable with Fibre Channel Host Bus Adaptors, switches, and operating
systems
Management
• Browser-enabled system monitoring, management, and configuration
• SNMP supported
• Telnet management capability
• SSH management capability
• Front panel displays system status and provides basic management functionality
LUN Support
• 1 to 1024 LUNs with variable capacity per LUN
• Flexible assignment of LUNs to ports
Data Retention
• Completely nonvolatile solid state disk
Reliability and Availability
• Flash Layer 1: ECC (chip)
• Flash Layer 2: board-level RAID
• Internal redundancies
- Power supplies and fans
• Hot-swappable components
- Power supplies
Size : 3.5" (2U) X 18"
Power Consumption (peak) : 325 Watts
Weight (maximum): 35 lbs
Appendix D: DataDomain DD120 Specifications
Remote Office Data Protection
> High-speed, inline deduplication
storage
> 10-30x data reduction average
> Reliable backup and rapid recovery
> Extended disk-based retention
> Eliminate tape at remote sites
> Includes Data Domain Replicator
software
Easy Integration
> Supports leading backup and
archive applications from:
Symantec, EMC, HP, IBM, Microsoft, CommVault, Atempo, BakBone, Computer Associates
> Supports leading enterprise
applications including:
> Database: Oracle, SAP, DB2
> Email: Microsoft Exchange
> Virtual environments: VMware
> Simultaneous use of NAS and
Symantec OpenStorage (OST)
Multi-Site Disaster Recovery
> 99% bandwidth reduction
> Consolidate remote office backups
> Flexible replication topologies
> Replicate to larger Data Domain
systems at central site
> Multi-site tape consolidation
> Cost-efficient disaster recovery
Ultra-Safe Storage for
Reliable Recovery
> Data Invulnerability Architecture
> Continuous recovery verification,
fault detection and healing
Operational Simplicity
> Lower administrative costs
> Power and cooling efficiencies for
green operation
> Reduced hardware footprint
> Supports any combination of
nearline applications in a
single system
SPECIFICATIONS: DD120
Capacity (raw): 750 GB
Logical Capacity (standard): 7 TB
Logical Capacity (redundant): 18 TB
Maximum Throughput 150 GB/hr
Power Dissipation 257 W
Cooling Requirement 876 BTU/hr
System Weight
23 lbs (11 kg)
System Dimensions (WxDxH)
16.92 x 25.51 x 1.7 inches (43 x 64.8 x 4.3 cm) without
rack mounting ears and bezel.
19 x 27.25 x 1.7 inches (48.3 x 69.2 x 4.3 cm) with rack
mounting ears and bezel.
Minimum Clearances
Front, with Bezel: 1” (2.5 cm)
Rear: 5” (12.7 cm)
Operating Current
115VAC/230VAC
2.2/1.1 Amps
System Thermal Rating
876 BTU/hr
Operating Temperature
5°C to 35°C (41°F to 95°F)
Operating Humidity
20% to 80%, non-condensing
Non-operating (Transportation) Temperature
-40°C to +65°C (-40°F to +149°F)
Operating Acoustic Noise
Max 7.0 BA, at typical office ambient temperature
(23 +/- 2° C)
REGULATORY APPROVALS
Safety: UL 60950-1, CSA 60950-1, EN 60950-1,
IEC 60950-1, SABS, GOST, IRAM
Emissions: FCC Class A, EN 55022, CISPR 22, VCCI,
BSMI, RRL
Immunity: EN 55024, CISPR 24
Power Line Harmonics: EN 61000-3-2
