Advanced Storage
June 2018
v1.0
OCI Storage Options
Options range from lowest latency (local NVMe SSD) to highest durability (Object Storage):

Local NVMe SSD storage (lowest latency)
• Persistent, high-performance storage, local to a compute instance
• Ideal for Big Data, OLTP, and high-performance workloads

File Service (file storage)
• Durable, scalable, enterprise-grade network file system
• Ideal for enterprise applications that need shared files (NAS)

Object Storage (highest durability)
• Internet-scale, high-performance, highly durable storage
• Ideal for storing an unlimited amount of unstructured data
Local NVMe SSD Devices
• Some instance shapes in OCI include locally attached NVMe devices
• Local NVMe SSDs can be used for workloads with high storage performance requirements
• The acronym NVMe stands for Non-Volatile Memory Express
• NVMe is a high-performance, NUMA (Non-Uniform Memory Access)-optimized, and highly scalable
storage protocol that connects the host to the non-volatile memory subsystem
• Designed for high performance and non-volatile storage media, NVMe stands out in highly
demanding, compute-intensive enterprise, cloud, and edge data ecosystems
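As a quick check, the locally attached NVMe devices on a Dense IO instance can be listed from the
operating system; a minimal sketch, assuming an Oracle Linux image (the second command needs the
nvme-cli package, which may not be installed by default):
# lsblk -d -o NAME,SIZE,TYPE,MODEL | grep nvme     (raw NVMe block devices)
# sudo nvme list                                   (controller/namespace details via nvme-cli)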
NVMe SSD Persistence: Reboot/Pause
[Figure: two instances (VM/BM) with local NVMe SSDs. On one, data is saved on instance reboot or
pause; on the other, data is deleted on instance reboot or pause and is therefore not usable for
primary data.]
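A simple, hypothetical way to observe this persistence (the device name and mount point below are
assumptions; formatting destroys any existing data on the device):
# sudo mkfs.ext4 /dev/nvme0n1
# sudo mkdir -p /mnt/nvme && sudo mount /dev/nvme0n1 /mnt/nvme
# echo "persistence test" | sudo tee /mnt/nvme/marker.txt
# sudo reboot
After the instance comes back up, remount and confirm the marker file survived:
# sudo mount /dev/nvme0n1 /mnt/nvme && cat /mnt/nvme/marker.txt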
“With Oracle Cloud Infrastructure, companies can leverage NVMe for persistent storage to host databases and
applications. However, other cloud providers typically do not offer such a capability. In cases where NVMe
storage was an option with other vendors, it was not persistent. This meant that the multi-terabyte database
that researchers loaded to this storage was lost when the server stopped.”
— Accenture
SLA for NVMe Performance

• OCI provides a service-level agreement (SLA) for NVMe performance
• Measured against 4k block sizes with a 100% random write workload on Dense IO shapes where the
drive is in a steady state of operation
• Run the test on Oracle Linux shapes with third-party benchmark suites:
https://github.com/cloudharmony/block-storage

Shape             Minimum Supported IOPS
VM.DenseIO1.4     200k
VM.DenseIO1.8     250k
VM.DenseIO1.16    400k
BM.DenseIO1.36    2.5MM
VM.DenseIO2.8     250k
VM.DenseIO2.16    400k
VM.DenseIO2.24    800k
BM.DenseIO2.52    3.0MM
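To reproduce the measurement, the CloudHarmony suite can be fetched onto the instance; a minimal
sketch, assuming git and fio are already installed:
# git clone https://github.com/cloudharmony/block-storage.git
# cd block-storage                                 (the run.sh used on the next slides lives here)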
NVMe Performance
Using a bare metal shape with 52 OCPUs (BM.DenseIO2.52), the following command from the CloudHarmony
test suite yields ~500K IOPS on a 50/50 read/write mix test for a single NVMe device:
# run.sh --target=/dev/nvme1n1 --test=iops --nopurge --noprecondition --fio_direct=1 --fio_size=10g
--skip_blocksize 512b --skip_blocksize 1m --skip_blocksize 8k --skip_blocksize 16k --skip_blocksize 32k
--skip_blocksize 64k --skip_blocksize 128k
NVMe Performance
On the same BM.DenseIO2.52 shape, running the CloudHarmony test against all NVMe devices combined
yields ~3.0MM IOPS on a 50/50 read/write mix test:
# run.sh `ls /dev/nvme[0-9]n1 | sed -e 's/\//\--target=\//'` --test=iops --nopurge --noprecondition
--fio_direct=1 --fio_size=10g --skip_blocksize 512b --skip_blocksize 1m --skip_blocksize 8k
--skip_blocksize 16k --skip_blocksize 32k --skip_blocksize 64k --skip_blocksize 128k
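The backtick expression simply rewrites each NVMe device path into a --target flag, so run.sh
receives one target per device. An equivalent, arguably clearer formulation (an illustrative
alternative, not part of the course material) uses printf with the same glob; append the same
--skip_blocksize flags as above:
# run.sh $(printf -- '--target=%s ' /dev/nvme[0-9]n1) --test=iops --nopurge --noprecondition --fio_direct=1 --fio_size=10g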
Block Volume
Block Volume Service
• The Block Volume Service lets you store data on block volumes independently of, and beyond the
lifespan of, compute instances
• Block volumes operate at the raw storage device level and manage data as a set of numbered,
fixed-size blocks, accessed over a protocol such as iSCSI (see the attach sketch after this list)
• You can create, attach, connect, and move volumes as needed to meet your storage and
application requirements
• Typical scenarios:
– Persistent and durable storage
– Expanding an instance's storage
– Instance scaling
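To illustrate the iSCSI path, the commands below sketch attaching a block volume from the instance
side. The IQN and portal address are hypothetical placeholders; the OCI Console displays the exact
commands for each attachment:
# sudo iscsiadm -m node -o new -T <volume-iqn> -p <portal-ip>:3260
# sudo iscsiadm -m node -o update -T <volume-iqn> -n node.startup -v automatic
# sudo iscsiadm -m node -T <volume-iqn> -p <portal-ip>:3260 -l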
Consistent High Performance: Get 60 IOPS per GB, up to a maximum of 25,000 IOPS per volume, backed
by Oracle's first-in-the-industry performance SLA.
Integrated Data Protection: Block and boot volumes can be backed up seamlessly to OCI Object Storage,
enabling frequent recovery points.
Easily Scale Up or Down: Dynamically detach and reattach up to 32 block storage volumes to any bare
metal (BM) or virtual machine (VM) instance in your Virtual Cloud Network. That's up to 1 petabyte of
remote block storage per instance.
Boot Volumes: Manageable and versatile boot volumes for compute instances, with all the advantages of
block volumes, including backup and clone capabilities. Custom-size large boot volumes in 1 GB
increments.
Volume Groups: Group multiple block and boot volumes, and perform crash-consistent, point-in-time
coordinated backups and clones across all the volumes in the group.
• Create a backup
# oci bv volume-group-backup create --volume-group-id ocid1.volumegroup.oc1.phx.abyhqljthc33hrlrqwnlbmoxmep74ykli7mwdr2ukjvuxlaordndl2knpxma
• Create a Clone
# oci bv volume-group create --compartment-id ocid1.compartment.oc1..aaaaaaaa22azyzvp3et7tfvp2qdwgz7mwbyo6m5h4nk3nf6i64js3byiqwxa --availability-domain
AkfI:PHX-AD-1 --source-details '{"type": "volumeGroupId", "volumeGroupId":
"ocid1.volumegroup.oc1.phx.abyhqljthc33hrlrqwnlbmoxmep74ykli7mwdr2ukjvuxlaordndl2knpxma"}'
WARNING:
Before running any tests, protect against data loss by backing up your data and operating system
environment.
Do not run FIO tests directly against a device that is already in use, such as /dev/sdX. If the device is
formatted and contains data, running FIO with a write workload (readwrite, randrw, write, trimwrite) will
overwrite the data on the disk and cause data corruption.
Run FIO only on unformatted raw devices that are not in use.
Write-only:
# sudo fio --direct=1 --ioengine=libaio --size=10g --bs=4k --runtime=60 --numjobs=8 --iodepth=64 --time_based
--rw=randwrite --group_reporting --filename=/dev/sdb --name=iops-test
Read/write Mix:
# sudo fio --direct=1 --ioengine=libaio --size=10g --bs=4k --runtime=60 --numjobs=8 --iodepth=64 --time_based
--rw=randrw --group_reporting --filename=/dev/sdb --name=iops-test
Note: In the read/write case, add the read result and the write result together to get the total for
duplex traffic; for example, ~40K read IOPS plus ~40K write IOPS means the volume is sustaining ~80K
IOPS overall.
Also note that all volumes attached to an instance share the instance's network bandwidth. If there is
heavy network traffic, or other volumes are under I/O pressure, the apparent performance of a single
volume may appear degraded.
--runtime=60       Tell fio to terminate processing after the specified period of time, interpreted in seconds.
--numjobs=8        Create the specified number of clones of this job. Each clone is spawned as an independent thread or process.
--time_based       If set, fio will run for the specified runtime even if the file(s) are completely read or written, simply looping over the same workload as many times as the runtime allows.
--group_reporting  Display statistics for groups of jobs as a whole instead of for each individual job; especially useful when numjobs is used.
--name             ASCII name of the job; may be used to override the name printed by fio for this job.
--size=10g         The total size of file I/O for each thread of this job. Fio will run until this many bytes have been transferred.
• To help with this replication, Terraform modules are available that replicate data across two Oracle
Cloud Infrastructure File Storage Service (FSS) shared file systems.
• The module launches hosts and copies the data directly from the source FSS file system (or snapshot
folder) to a destination FSS file system using a cron job in conjunction with rsync:
https://orahub.oraclecorp.com/pts-cloud-dev/terraform-modules/tree/master/terraform-oci-fss
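The cron-plus-rsync mechanism amounts to an entry along these lines (an illustrative sketch; the
mount points and schedule are assumptions, not taken from the module):
# crontab entry: every 15 minutes, mirror the source FSS mount onto the destination mount
*/15 * * * * rsync -az --delete /mnt/source-fss/ /mnt/dest-fss/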
• No specific configuration is required to use a local file system with rclone; simply choose a local
directory as your source:
# export SOURCE=/Users/flperei/Data
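That source can then be copied to OCI Object Storage through rclone's S3 backend and the S3
Compatibility API covered below; a minimal sketch in which the remote name "oci" and the bucket name
are assumptions (the remote would be configured against the S3-compatible endpoint
https://<namespace>.compat.objectstorage.<region>.oraclecloud.com):
# rclone copy $SOURCE oci:my-bucket --progress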
• Describe and validate storage performance for both NVMe and Block Volumes
• Use volume groups to manage snapshot and cloning activities for logical volumes spanning
multiple block volumes
• Understand the multi-attach block volume feature for connecting the same block volume to
multiple hosts in the same Availability Domain
• Implement data replication to increase durability of File Storage Service file systems
• Utilize the S3 Compatibility API to enable interoperability with Amazon S3