E-guide
Why RDBMS Still Rules the Database Roost
Your essential guide to RDBMS

In this e-guide
Section 1: Relational databases
Section 2: NoSQL databases
Getting more PRO+ essential content

In this e-guide:
Relational databases have enjoyed a long run as the database
mainstay across a wide variety of businesses, and for good
reasons. However, they haven't necessarily adapted well to
changes in the types and quantities of data now being
generated, such as the unstructured data that is prevalent
in big data applications. In addition, expanding traditional
databases to accommodate rapid growth is costly.
As a result, NoSQL database technologies are challenging the
monopoly of the relational database management system. Yet,
despite their modern designs and efficiency in managing large
data sets, NoSQL databases aren't the right fit for all projects.
Depending on your business goals, traditional databases,
NoSQL databases or a hybrid of the two may be best to deliver
the most value. The articles in this guide examine these
technologies from different perspectives and explore the case
for the ongoing relevance of relational databases.

Section 1: Relational databases
Large Internet companies like Facebook, Twitter, LinkedIn and Netflix are
well-known users of NoSQL database technology, as it works well with the
large data sets they need to manage. However, many organizations find that
traditional databases are still best for their business needs. In this section,
learn how relational database technologies are holding their own in the
database world by evolving to meet higher levels of efficiency as well as
specific business needs for various companies -- even Facebook.

Don't get distracted by new database technology
Joshua Greenbaum, Principal, Enterprise Applications Consulting
New database technologies are coming to market with increasing regularity,
and if these products live up to the hype as superfast and crazy cheap,
hundreds of thousands of workhorse relational databases in use today will
be put out to pasture. Who needs a 20th-century relational database when
you can have a decidedly more modern NoSQL, columnar or in-memory
database -- or even the Hadoop Distributed File System?
Most organizations, it turns out. At least for now.
While the seductive powers of the new database technology are not to be
denied, you should resist the siren song of the new, post-relational database
vendors. Not because the new database options lack merit -- on the
contrary -- but because making your company's next database move a
technology decision is the wrong way to go about it. The choice of database
should be secondary. Your business goal -- that comes first.
A very good place to start
Consider a battery of practical questions about your project: Are you
creating net new applications in support of net new business processes or
merely upgrading the ones you already have? Engaging new types of users,
data or analysis? Supporting a new line of business or reinvigorating an
existing one? Answers to these questions will provide essential criteria for
understanding which new database technology, if any, to deploy.
Only then should you look around to see whether a new database is better
for the job than something you already have.
Implementing a database of any kind isn't cheap. While many of the new
varieties are open source, they aren't free -- and even more costs enter the
equation when a project involves migrating an existing relational database to
one of the newbies. Myriad complexity issues also stand in the way.
New database technologies, particularly in-memory ones, often need new
hardware. Many of the available options promise to lower total cost of
ownership over time -- but new hardware will have to be obtained, and that
up-front cost must be taken into consideration.
The fine print
Finding people with the right skills is an even bigger issue. The new models
may require fewer administrators -- most proponents insist that their
databases are less expensive to implement and manage than old-school
relational databases are. And in many cases that's an easy argument to
make: Top-tier database administrators are some of the highest paid people
in the IT department, and their numbers -- most relational databases are
notorious for the number of admins required to keep them finely tuned --
clearly add significant costs.
But the likelihood of finding a Hadoop or columnar database expert in a
traditional relational database shop is slim, which means you'll have to go out
and hire these in-demand people or get the required skills from a consulting
company.
And, as anyone who has worked to bring a major application project to
fruition can attest, the bulk of the complexity is centered on everything but
the cost of the software license. Creating new algorithms, analytical models,
transactional components and business processes that need to be
engineered and implemented is where the real expense is. Until they're well
understood and the necessary stakeholder input and approvals have been
obtained, the choice of database technology is at best a distraction. At
worst, it's a great way to knock a project off its axis and send it spinning out
of control.
Think big
This is particularly true in the era of big data, which is driving a considerable
percentage of the new application projects in organizations. For many, big
data projects involve data types that are new, unfamiliar and often
unstructured -- time-series data, Web server logs, text. While some new
database technology might eventually need to be deployed, figuring out
what the new data sources are and what the new algorithms should look like
must be the first order of business, right after you've reached agreement on
what the new business processes are all about. To do otherwise is to march
your company down the path of cost overruns, scope creep and eventual if
not inevitable failure.

Relational databases are far from dead -- just ask Facebook
Nicole Laskowski, Senior News Writer
Hadoop is not enough! Just ask Ken Rubin, director of analytics for
Facebook Inc., who delivered what CIOs probably consider a refreshing
message at the Strata Conference + Hadoop World 2013 in New York:
Facebook needs the relational database.
"We're a young-enough company that we started by using Hadoop as our
core data technology rather than relational [databases]," he said. "As we
start thinking about big data from the perspective of business needs, we're
realizing that Hadoop isn't always the best tool for everything we need to
do."
When a Web 2.0 superstar -- and Hadoop exemplar, at that -- says there's a
time and place for relational technology, CIOs have another bit of proof, if
any were needed, that building for big data isn't the black-and-white
proposition some Hadoop zealots make it out to be. It's shades of gray,
because what matters for the business at the end of the day is solving
business problems. Thinking about big data in those terms rather than in
terms of tools or architecture "opens up the possibilities of using a much
broader range of technologies," Rubin said.
So when exactly does Facebook's analytics team use relational technology
rather than Hadoop? That depends on what they're looking for and when
and how they want to see the data. "Exploratory analysis," such as
pinpointing what metrics really matter, is done in Hadoop; "operational
analysis," such as slicing and dicing data, is done in a relational database,
Rubin said.
Particularity matters. "If we look at the granularity of the data, we keep the
lowest level of grain in our Hadoop system. So whenever you want to look at
something at the lowest level of detail, Hadoop is optimized for that," he
said. "However, if we want to look at transformed data and aggregated data,
relational is easier for doing that."
And timing is important. All of Facebook's data streams directly into Hadoop,
which can be used for real-time monitoring. But if the analytics team wants
to do trending analysis over several days, weeks, months or years, "relational
is a better technology," he said.
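
As a rough illustration of the split Rubin describes, the sketch below keeps raw event rows at the lowest grain and rolls them up into a daily summary table with SQL. It uses Python's built-in sqlite3 module purely as a stand-in for a relational database. The table names, columns and sample rows are hypothetical and say nothing about Facebook's actual schema or tooling.

```python
import sqlite3

# Hypothetical raw events at the lowest grain: (user_id, event_day, action).
RAW_EVENTS = [
    (1, "2013-10-28", "like"),
    (2, "2013-10-28", "share"),
    (1, "2013-10-29", "like"),
    (3, "2013-10-29", "like"),
    (3, "2013-10-29", "share"),
]


def build_daily_summary(events):
    """Load raw events, then aggregate them into one summary row per day."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE raw_events (user_id INTEGER, event_day TEXT, action TEXT)"
    )
    conn.executemany("INSERT INTO raw_events VALUES (?, ?, ?)", events)
    # Transformed, aggregated data: one row per day instead of one per event.
    conn.execute(
        """
        CREATE TABLE daily_summary AS
        SELECT event_day,
               COUNT(*) AS events,
               COUNT(DISTINCT user_id) AS active_users
        FROM raw_events
        GROUP BY event_day
        """
    )
    rows = conn.execute(
        "SELECT event_day, events, active_users "
        "FROM daily_summary ORDER BY event_day"
    ).fetchall()
    conn.close()
    return rows


daily_summary = build_daily_summary(RAW_EVENTS)
```

Trend queries over weeks or months would then read the small summary table rather than rescanning every raw event, which is the kind of work Rubin says is easier in a relational system.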
Social television
Not surprisingly, open data was a central theme at Strata Conference +
Hadoop World. Shawndra Hill, assistant professor at the University of
Pennsylvania, and her work on the intersection of tweets and TV was a
prime example. Social television, according to Hill, is going to be "a $256
billion business by 2017."
She's looking into how Twitter can spur viewer engagement for television
shows and advertisers. She's also using datasets from GetGlue and Viggle,
apps that let viewers "check into" a television show the same way they
would check into a location on Foursquare. Combining this kind of data with
tweets might just become the next generation of Nielsen ratings.
"Can we predict customer lifetime value for shows and the network? Can we
measure time shifting -- so, for which shows are people checking in when the
show is aired for the first time and which shows are people waiting to
watch?" she said. And -- so critical to advertisers -- can it be done "at the
individual level as opposed to the household level?" Stay tuned.
Say what!?!
"You can use science and technology and statistics to figure out what the
answers are, but it's still an art to figure out what the right questions are."
-- Ken Rubin, director of analytics, Facebook
"If you have more eyeballs working on data, you're more likely to get better
insights and better analysis." -- Michael Chui, researcher, McKinsey Global
Institute
"It took Facebook around nine months to achieve the same number of
subscribers/users as it took the radio community 40 years to achieve." --
David Parker, vice president of big data technologies, SAP
"How much investment is going into big data? Venture capital money, last
count I saw, is about $2.6 billion. That's the equivalent of a Navy destroyer
coming after your wallet." -- John Choi, director of product management,
IBM
"Big data doesn't really exist. How do I know? It is a long truth in technology
that anything that appears in the press in capital letters and surrounded by
quotes isn't real." -- Douglas Merrill, CEO and founder, ZestFinance
"When we're talking about data science -- and big data as well -- one of the
fundamental principles we should keep in mind is that data should be
thought of as an asset." -- Foster Provost, professor of information
systems, New York University's Stern School of Business
"Hadoop is 1 of the top 10 fastest growing technologies overall in terms of
job growth." -- Jack Norris, chief marketing officer, MapR Technologies
(Of course, he would say that.)

In-memory technology gets the relational treatment
Jack Vaughan, Senior News Writer
As if overnight, in-memory technology has crept out of the rare worlds of
high-performance computing and Wall Street trading and entered into the
mainstream.
In-memory technology that bypasses disk drives and resides in main
semiconductor memory got a big boost in recent years from SAP AG, which
loudly trumpeted its HANA in-memory database management system, and
its use continues to widen.
The technique is seen in analytics appliances, as well as in Hadoop, NoSQL
and NewSQL territories. The activity is hard to overlook.
Incumbent relational database makers have also taken notice -- adding
in-memory technology to their leading SQL products to improve performance.
IBM, Oracle and Microsoft have added in-memory traits to their flagship
offerings, in no small part to keep up with the high velocity of business
today.
Speed increases of 10 times or more for transaction processing have been
reported, with data warehouse analytics speed boosts going even higher.
High-performance applications are the "sweet spot" for in-memory offerings
generally, and for in-memory relational database offerings specifically,
according to William McKnight, president of Plano, Texas-based McKnight
Consulting Group. The usefulness of faster in-memory performance can
come into play in both analytical and operational applications, he said.
High performance gets the nod
Speed-sensitive applications are a good fit for in-memory relational
databases, said Andrew Mendelsohn, executive vice president of server
technologies at Oracle, especially ones that "require access to large
amounts of data in order to answer business-driving questions."
Oracle's in-memory lineage is deep. Since 2006, it has offered the TimesTen
in-memory database, which it acquired from HP Labs. Also, beginning in
2007, Oracle fielded the Coherence Java-based in-memory data grid for
middleware software object persistence. Last year at Oracle OpenWorld
2013, the company announced the Oracle Database In-Memory option for
Oracle Database 12c, which is currently in beta.
Like McKnight, Mendelsohn sees benefits for both operations and analytics.
New classes of "hybrid applications" that combine analytics with
transactions for real-time commerce can drive immediate returns, he said.
Am I BLU and in-memory too?
IBM's DB2 BLU Acceleration software also got a notable in-memory
refresher in 2013. Like Oracle, IBM has offered a variety of in-memory
technologies across its middleware and data processing portfolios. Now,
in-memory data handling is one of the many enhancements that so-called BLU
Acceleration brings to IBM's mainstay relational database.
"Anything that needs online analytical processing or [data] 'cubing' is a
beneficiary of in-memory," said Nancy Kopp, director of database software
and systems at IBM. "Reporting, data mining and data discovery all benefit."
What some viewers describe as "real-time analytics" has been something of
a holy grail, Kopp admitted, and it comes closer with the application of
in-memory methods. Often, data applications have been limited by I/O latency,
and that in turn may have limited what Kopp calls "the speed of thought" for
human analysts.
In-memory has special value where "latency is critical and the number of
users is really high," she said.
"People want to get answers as fast as they can ask the questions. With
in-memory [technology], we can get more toward operational BI [business
intelligence]."
Like others, she sees in-memory capabilities bringing a new blend of
applications to the relational database. Eventually, there will be less of a line
between the transactional world and the analytical world, she said.
Batch is out the window
People used to waiting for overnight batch jobs will quickly become
accustomed to real-time execution as in-memory finds greater use in
relational databases, according to Tiffany Wissner, director of product
marketing for SQL Server at Microsoft. Moreover, she said, such capabilities
prepare customers for a move to larger-scale, cloud-style processing.
She said Microsoft has included in-memory of sorts as part of the basic SQL
Server database offering since 2008, when PowerPivot support allowed
people to analyze billions of rows of Excel in memory. "With SQL Server
2012, we expanded the footprint with an in-memory columnar store," she
said.
This week, SQL Server 2014 became generally available, with new
in-memory transaction-processing support. Wissner emphasized that, as part
of the core offering, SQL Server 2014 jobs can be optimized for online
transactional processing (OLTP) with high numbers of read/write
operations, or can be optimized to run in a data warehouse-style column
store that is fine-tuned for high search query speed.
Placing a bet on in-memory technology
In-memory adaptations to relational database performance can reduce
stress on large-scale transactional systems, according to Wolfgang "Rick"
Kutschera, who is manager for database engineering at bwin.party in Vienna,
Austria, and whose team has gone into full production with Microsoft's latest
SQL Server incarnation.
Kutschera's data group was a beta user of "Hekaton," which was the
pre-release codename for the new version of SQL Server 2014 with in-memory
OLTP. For bwin.party -- which offers online gaming for soccer and tennis, as
well as poker and other casino games -- Microsoft's latest SQL Server
version helped meet the need for transaction scalability and data
consistency. The transition was fairly straightforward, he said.
"We started on an application that had hit an actual performance limit -- it
could not scale up or out in an easy way. With Hekaton, it took us a day or
two to convert to the in-memory technology, and once we did, we could
scale to a factor 20 times [faster] than what we had before," he said. "A lot
of performance-critical [application] parts were converted." Now that it is
established, people are finding more things to do with it.
Like other high-transaction websites, bwin.party has looked at in-memory
NoSQL alternatives to established relational systems, Kutschera said. But
there is a difference between a tweet and a bet.
"The main problem is the websites that use NoSQL in most cases have no
problem if they lose one record. If you lose, for example, one Twitter
message, nobody cares, but," he continued, "if you lose a bet that might be a
$20,000 or $30,000 return, it is a big deal."
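
To make the consistency point concrete, here is a minimal sketch of an all-or-nothing transaction: recording a bet and debiting the account either both happen or neither does. It uses Python's built-in sqlite3 module as a stand-in for a relational database. The schema, the `place_bet` helper and the amounts are hypothetical, and nothing here reflects bwin.party's systems or SQL Server's In-Memory OLTP engine.

```python
import sqlite3


def open_book():
    """Create a hypothetical betting schema with one funded account."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "id INTEGER PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.execute(
        "CREATE TABLE bets ("
        "id INTEGER PRIMARY KEY, "
        "account_id INTEGER NOT NULL, "
        "stake INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO accounts (id, balance) VALUES (1, 100)")
    conn.commit()
    return conn


def place_bet(conn, account_id, stake):
    """Record the bet and debit the account as a single transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO bets (account_id, stake) VALUES (?, ?)",
                (account_id, stake),
            )
            # An overdraft violates the CHECK constraint and raises here,
            # which also undoes the bet row inserted just above.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (stake, account_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def snapshot(conn, account_id):
    """Return (balance, number of recorded bets) for one account."""
    balance = conn.execute(
        "SELECT balance FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()[0]
    bets = conn.execute(
        "SELECT COUNT(*) FROM bets WHERE account_id = ?", (account_id,)
    ).fetchone()[0]
    return balance, bets


conn = open_book()
ok_first = place_bet(conn, 1, 60)
after_first = snapshot(conn, 1)
ok_second = place_bet(conn, 1, 60)  # would overdraw the account
after_second = snapshot(conn, 1)
```

The second call fails as a whole: the account is not debited and no orphaned bet row is left behind, which is the guarantee a betting site cannot trade away.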
The in-memory technology trend is like a catchy song heard everywhere of
late. In-memory approaches are appearing in analytics engines of all kinds.
Their appearance in relational databases may soon turn out to be one of the
most influential of these uses.

In relational database design, don't shortchange requirements stage
Jack Vaughan, Senior News Writer
In many organizations, relational database design is an afterthought or a lost
art. But Michael J. Hernandez considers it an important undertaking, one in
which core principles still bear deep consideration. Hernandez is the author
of Database Design for Mere Mortals, which was published in its third edition
in February 201
A long-time database developer, Hernandez has worked as a program
manager at Microsoft and an instructor for companies such as AppDev
Training Co. and Deep Training. Originally published in 1996, his book focuses
on database design and configuration practicalities -- from
requirements-gathering interviews on.
Hernandez champions the cause of flexible but well-structured relational
databases that can underlie quickly launched Web applications but that
support data growth and business changes. In a world that often asks if data
modeling and full-fledged database planning and design are really necessary
-- can't we just start coding? -- his message has always been: Don't
shortchange the design process. SearchDataManagement spoke recently to
Hernandez about database design best practices. Excerpts from the
interview follow.
In your book, you suggest that data professionals are often in too much
of a hurry to start coding, without doing the requirements gathering that
is part of good relational database design. Why is the requirements
gathering stage so important? And how should it be approached?
Michael Hernandez: Many times, people make a lot of assumptions and then
create the database and roll it out. Later, there is pushback from the users.
The fact is it's important to have conversations with the business users
ahead of time and get a sense of what is going on and what they need. That
informs a lot of what is going to be built.
Basically, you want to be talking to different individuals at different stages
[of a project] so that you're sure you're capturing the proper concepts [and]
that you understand the ideas they have about their business. So, I
emphasize interviews.
To do this right, you need to understand the different relationships of
aspects of the organization and its processes. As you work on the
relationships, you find the details and concepts that have to be represented
in the database. You have a conversation with the users and understand
what they need. Then that informs what is going to be built.
What are the ways toward effective interviewing?
Hernandez: Interviewing for database design isn't an exact science. But it is
a skill that can be learned. The people who do it have to have very good
analytical skills and good people skills. You ask people how they define their
daily work. You ask them what is the first task that they do in the day and
[about] what they are dealing with conceptually. Usually they're dealing with
customers. So you ask, "What is a customer to you?" The answer is different
for different companies and different departments within companies. You
have to learn their perspective, and their semantics.
High-level concepts like customers, visits, schedules, orders or tasks -- those
are main concepts that have details around them that you have to record.
And what are some things that stand in the way of efforts to do the
requirements gathering that forms the basis for good database design?
Hernandez: It's all about time. Today, people want to do it quickly. They want
to get it out there, and then they'll see what happens. They say, if we get
pushback, we'll fix it as we go. To me, that is such a bad way to do it. You
can avoid a lot of headaches and problems ahead of time if you just invest
the time to plan.
That's one of the things I tell people: This is not a waste of time. You are
investing the time to go through this process in a considered manner and to
create a quality data product that probably has a higher success rate than if
you just rushed right through it.
So, personally, I am not a big fan of Agile design, Agile computing and the
whole Agile concept. I think that is the opposite way than the one we should
be going. A lot of people try to shortchange or avoid interviewing. But it
drives what you design. It's what makes it successful, what makes it usable
by the people that are going to work with [the data].
To make sure you establish the proper specification, you need to revisit the
design with the users and [business] managers. Users and managers are
going to have different perspectives on how the data is used. That's why
discussing the evolution of the relational data structures with them is useful.
People that don't do that often expect to fix things later with coding. What
they end up with is just a mess -- a railroad wreck.

SQL Server 2014 adds In-Memory OLTP power boost, hybrid cloud support
Jessica Sirkin and Mark Fontecchio
Product of the Month: SQL Server 2014, from Microsoft
Release date: April 2014
What it does
SQL Server 2014 is the latest version of Microsoft's relational database
management system, released to general availability at the start of this
month. Among other enhancements, it offers increased processing speed,
greater cloud connectivity and higher memory limits -- all part of Microsoft's
ongoing effort to improve SQL Server's ability to handle enterprise-class
transaction processing and analytics applications.
What sets it apart
SQL Server 2014 can be run on-premises, entirely in the cloud or in hybrid
cloud environments that include on-premises data. Its most-anticipated new
feature is In-Memory OLTP, a memory-optimized online transaction
processing engine integrated into the database that lets tables stored in
memory be processed alongside disk-based tables. Microsoft boasts that
using In-Memory OLTP can boost transaction processing performance by
as much as 30 times compared to conventional approaches with data
stored on disks. SQL Server 2014 also accelerates the In-Memory
ColumnStore data warehousing technology introduced in the 2012 version,
giving the new database a powerful one-two punch on in-memory
processing.
What users say
Wolfgang "Rick" Kutschera is team leader of database engineering at
Bwin.Party Digital Entertainment, a SQL Server 2014 beta tester. The
Gibraltar-based company, which specializes in online betting, has 180
servers with a total of 4,000 SQL Server instances. Kutschera said that
using In-Memory OLTP enabled Bwin.Party to scale up its processing
capacity to support business growth without spending money on more
hardware. He described the in-memory feature as "one of the most amazing
things Microsoft has done in a while."
Organizations should go into SQL Server 2014 implementations "with open
eyes," Kutschera said. But, he added, the database "is so flexible and so
stable that we're comfortable using the beta in production."
Edgenet Inc., an Atlanta-based software and services provider for the retail
industry, has had a similar experience with SQL Server 2014. Vice President
of IT Michael Steineke said the in-memory computing capabilities allow
Edgenet to process product pricing and availability data from clients' stores
in near real time.
"We needed to leverage the in-memory functionality to do continuous
updates to live systems without having a lot of latch contention or lock
contention," Steineke said. "That way, we could update product availability
information from various retailers as quickly as they can provide it to us."
Drilldown
- Can be deployed on-premises, in the cloud or in hybrid environments.
- Adds a new in-memory transaction processing engine to boost OLTP
  performance.
- Improves on SQL Server's AlwaysOn Availability Groups high-availability
  technology.
Price
SQL Server 2014 has three main editions: Standard, Business Intelligence
and Enterprise. Each edition has a set list of features -- for example,
In-Memory OLTP is available only in the Enterprise Edition. The editions are
priced either per CPU core or by server and client access licenses.
Microsoft wouldn't disclose specific pricing, but a representative said there
are "no pricing changes to SQL Server 2014" from SQL Server 2012's
licensing costs.

Oracle Database In-Memory option something to remember
Jessica Sirkin, Associate Site Editor
The Oracle Database In-Memory option, released today, promises a 100x
speed increase for analytics, an improvement that could help customers
provide near-real-time information to their business users. That is a pretty tall
order, and Oracle customers will get to see if the add-on can live up to the
hype.
While conceptually similar to SAP HANA, the Oracle Database In-Memory
option is an add-on to the Oracle Database, and does not require alterations
to the database infrastructure or data migration for implementation.
Because of that, it is not confined to a specific platform, and can run on
non-Oracle systems.
Users of the in-memory add-on don't have to place the entire database in
memory, but can spread the database across clusters and only select
specific clusters for in-memory, according to Tim Shetler, vice president of
product management at Oracle.
The Oracle Database In-Memory option has a new fault tolerance system
also designed around Real Application Clusters, with in-memory data
distributed over multiple clusters. This way, when one cluster goes down,
there is an immediate transparent switch to another cluster. According to
‘Shetler, this will keep faults from interfering with database performance.
Christo Kutrovsky, senior consultant with the Pythian Group, emphasized the
importance of the Oracle Database In Memory option's compression
capabilities. He said it allows the Oracle Database to keep very large tables
in compressed memory. According to Kutrovsky, the add-on can compress a
100 GB table by 30x. That means a tremendous amount of compression
cover the whole database, which means more data can be loaded into the
database with inmemory.
“The compression is what makes it really worth ity he sald. "This feature
‘applies to more use cases than anything Oracle's released in a couple
years."
Real time is one of the terms Oracle has brought up again and again when
discussing the in-memory add-on. But when Oracle says real time, it doesn't
mean instantaneous. "It really means 'don't wait,'" said Shetler, "just do things
immediately. Maybe it'll take a few seconds." He explained that processes
that used to take half the day were reduced to 10 minutes, and analytics give
responses in less than a second.
Holger Mueller, vice president and principal analyst at Constellation
Research, defined real time as having no batch process, no storing of
aggregates and no intermediate steps. "You don't use other time delay
constructs," he said. "You can go back to the data." Mueller described the
difference between previous processing speeds and real time as the
difference between the telegram and the email.
"If it used to take hours and now it takes minutes, then that is real time to the
people who used to wait hours," Oracle's Shetler said. He added that with
the Oracle Database In-Memory option, analytics and transactions can be
run at the same time. In-memory also opens the possibility of running high-
performance transactions against production.
The Oracle Database In-Memory option has a dual-format architecture, which
means it keeps both a memory-optimized columnstore and a row store. The
optimizer sorts incoming work between the in-memory columnstore and the
row store. Updates go to the row format, while analytics go to the columnstore.
"The whole dual-format architecture is what's really unique," Mueller said. In
the Oracle Database In-Memory option, columnstore and row store are
synchronized, and changes to row store are reflected in columnstore. The
changes are made in the background asynchronously. However, updates are
made immediately for needed data and queries. "You're never going to see old
data," Shetler said.
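The dual-format idea can be illustrated with a toy in plain Python. This is only a sketch under stated assumptions, not Oracle's implementation: the class `DualFormatTable` and its methods are hypothetical names, and the column copy is refreshed synchronously on every write, whereas the article describes background propagation.

```python
class DualFormatTable:
    """Toy table that keeps the same data in a row store and a column store."""

    def __init__(self, columns):
        self.columns = list(columns)
        self.rows = {}                                      # row store: key -> dict
        self.column_store = {c: {} for c in self.columns}   # column -> {key: value}

    def upsert(self, key, **values):
        """Write path: update the row, then mirror each changed column value."""
        row = self.rows.setdefault(key, {c: None for c in self.columns})
        for name, value in values.items():
            if name not in self.column_store:
                raise KeyError(f"unknown column: {name}")
            row[name] = value
            self.column_store[name][key] = value

    def get_row(self, key):
        """Point lookup, served from the row store."""
        return dict(self.rows[key])

    def column_sum(self, name):
        """Analytic scan, served from the column store only."""
        return sum(v for v in self.column_store[name].values() if v is not None)


orders = DualFormatTable(["customer", "amount"])
orders.upsert(1, customer="acme", amount=120)
orders.upsert(2, customer="zenith", amount=80)
orders.upsert(1, amount=150)   # the update is visible to the next analytic scan

print(orders.get_row(1))
print(orders.column_sum("amount"))
```

The point of the sketch is the split of access paths: single-row reads and writes touch the row layout, while aggregates read one column without visiting the other fields.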
"The issues you generally have with in-memory -- columnstore, compression
-- these features let you work around these things," Kutrovsky said.
DB2 BLU Acceleration boosts IBM's flagship RDBMS
Jack Vaughan, Senior News Writer
Data managers have lately directed a lot of attention at advances in
specialized data warehouse engines and NoSQL databases, but flagship
relational databases are not standing still, as IBM's DB2 BLU Acceleration
software shows.
IBM's stalwart DB2 relational database management system (RDBMS), for
example, has added numerous capabilities, including enhanced in-memory
data handling, data skipping, improved compression, support for columnar
analytical processing and more. Some of these traits are just the kind of
thing that has given new-generation relational analytic engines and NoSQL
upstarts their allure.
Columnar processing, often coupled with compression, has become
associated with the new breed of analytical engines that arose from the likes
of Aster Data (now part of Teradata), Vertica (now part of HP), ParAccel
(now part of Actian) and others. But several mainstay relational databases
have come out with columnar enhancements.
Columnar processing focuses processing efforts more narrowly on data
sets specifically needed for common queries. It has multiple advantages,
including reduced I/O and improved use of cache.
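A rough way to see the I/O saving is to count how many values a query has to touch in each layout. The sketch below is a hypothetical illustration in plain Python; the sample rows and function names are invented for the example, and real engines count disk pages rather than individual values.

```python
def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def total_row_layout(rows, column):
    """Row layout: every field of every row is read to reach one column."""
    values_read = 0
    total = 0
    for row in rows:
        values_read += len(row)
        total += row[column]
    return total, values_read


def total_column_layout(columns, column):
    """Column layout: only the requested column is read."""
    values = columns[column]
    return sum(values), len(values)


rows = [
    {"id": 1, "region": "west", "amount": 120, "notes": "rush order"},
    {"id": 2, "region": "east", "amount": 80, "notes": "gift wrap"},
    {"id": 3, "region": "west", "amount": 200, "notes": ""},
]
columns = to_columns(rows)

print(total_row_layout(rows, "amount"))         # same total, more values read
print(total_column_layout(columns, "amount"))   # same total, fewer values read
```

Both calls return the same total; the difference is that the row layout reads all four fields of each row while the column layout reads one value per row.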
Recent updates that arrived in DB2 10.5, known collectively as BLU
Acceleration, can support sped-up "I/O bound" operations while still
capitalizing on available in-house RDBMS skills, according to Kent Collins,
who is a database engineer and architect with Burlington Northern Santa Fe
Railway (BNSF) Corp., based in Fort Worth, Texas.
Improved data compression has had an immediate helpful effect in cutting
memory requirements, he said.
"It's been very positive for us. We just moved a 400 GB database, and when
we finished it was 80 GB," he said. BNSF has also seen speed increases of
as much as a hundredfold for some queries with BLU.
Stepping down big data and turbocharging queries is important to BNSF, a
railroad that is collecting more and more types of data on far-flung
operations that saw it in 2012 haul more than 1 million carloads of
agricultural commodities, 2.2 million coal shipments, 47 million trailer or
container shipments, and 1.7 million carloads of industrial products.
Said Collins, whose data feeds include text messages, radio messages and
video, "I am up to my elbows in unstructured data." He then quickly
recalibrated the estimate. "I am up to my eyeballs." He said column-level
data processing that can be programmed using established SQL methods
has been a big step toward taming the unstructured data deluge.
Take me out to the new RDBMS game
In a way, additions to relational databases are mirroring larger changes in
data architecture, said Bernie Spang, director for strategy and marketing for
IBM Database Software and Systems.
"We've moved from the world where you defined your data problem and
then decided which relational database to use. Now the question is, 'What
data technology should I use?' And even in the RDBMSs, there is a
difference between the old generation and the new generation. It's a new
ball game."
IBM has applied some state-of-the-art data technology with DB2 BLU, said
IBM Distinguished Engineer Sam Lightstone. The compression is
"actionable," he said, meaning that the mode of compression adapts to the
kind of data being processed. It allows analytics to run on compressed data
directly without decompression steps that add processing overhead,
according to Lightstone.
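To make the idea of computing on compressed data concrete, here is a minimal sketch that uses run-length encoding as a stand-in. The article does not say which encodings BLU uses, so the scheme and the function names below are assumptions chosen only because they are easy to follow.

```python
def rle_encode(values):
    """Compress a column into (value, run_length) pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [(value, count) for value, count in runs]


def rle_sum(runs):
    """Sum the column without expanding the runs."""
    return sum(value * count for value, count in runs)


def rle_count_equal(runs, target):
    """Count matching rows by looking at one entry per run."""
    return sum(count for value, count in runs if value == target)


column = [5, 5, 5, 5, 9, 9, 2, 2, 2, 2, 2]
runs = rle_encode(column)

print(runs)
print(rle_sum(runs), sum(column))   # both totals agree
print(rle_count_equal(runs, 2))
```

The aggregate functions visit three runs instead of eleven values, which is the kind of saving that skipping a decompression step is meant to deliver.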
"BLU is compression-optimized, in-memory-optimized and it's columnar," he
said. It supports data skipping (in which irrelevant data is ignored),
parallelism and vector-processing scans too. "It is the combination of these
things that gives DB2 huge speedups," Lightstone said.
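Data skipping is commonly built on small per-block summaries such as the minimum and maximum value in each block. The sketch below assumes that approach; the article does not describe DB2's internal structures, so the block layout and the names `build_blocks` and `count_greater_than` are hypothetical.

```python
def build_blocks(values, block_size):
    """Split a column into blocks and record each block's min and max."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        blocks.append({"min": min(chunk), "max": max(chunk), "values": chunk})
    return blocks


def count_greater_than(blocks, threshold):
    """Count values above the threshold, skipping blocks that cannot match."""
    matches = 0
    blocks_scanned = 0
    for block in blocks:
        if block["max"] <= threshold:
            continue                      # whole block is irrelevant: skip it
        blocks_scanned += 1
        matches += sum(1 for value in block["values"] if value > threshold)
    return matches, blocks_scanned


values = list(range(1, 101))              # 1..100, stored in order
blocks = build_blocks(values, block_size=10)

matches, blocks_scanned = count_greater_than(blocks, threshold=85)
print(matches, blocks_scanned, len(blocks))
```

With the data stored in order, only the last two of the ten blocks are read. On unsorted data the summaries overlap more and fewer blocks can be skipped.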
Narrowing the analytics gap
Many advances in data technology in recent years have been in the realm of
specialized analytical relational database management systems, according
to industry observer Curt Monash, president of Monash Research and editor
and publisher of DBMS2 and other blogs. But in general, flagship relational
databases are "narrowing the gap," he said.
Monash said that DB2 BLU could be seen as a first step. "In its first iteration,
it is a single-server product, and 'in-memory single server' is definitely a
limitation." As well, he points out that the first version of BLU is optimized for
10 TB databases, although it is capable of ramping up to 20 TB.
Monash noted that IBM has other specialized analytical RDBMS approaches
beyond DB2, one of which is its Netezza data warehouse appliance.
IBM is far from alone in the race to enhance the major RDBMSs. As data-
related challenges grow, resurgent RDBMS technology could well be
welcomed by many.
Experts debate big data vs. SQL
development
Mark Brunelli, Former News Director
With the rise of big data technologies like Apache Hadoop, MapReduce and
the bevy of open source products growing up around them, the good old
traditional SQL database has been losing mindshare with some fairly
influential application developers.
And it's not difficult to understand why. Big data is hot, and there have been
plenty of headlines over the last few years questioning the long-term viability
of SQL in the era of unstructured data. It's no surprise that many developers
want to follow suit with the big data pioneers at Google and Facebook -- but
the desire to go big isn't always a practical one.
Just ask Tim O'Brien, an author and independent consultant who specializes
in helping companies work more effectively with developers. O'Brien, who
spoke about the future of relational databases at the recent O'Reilly Strata
Conference in Santa Clara, believes that when one looks at the history of IT
over the last several years, it's easy to understand why attitudes have
changed.
"There is a certain kind of developer that is really focused on the trends that
are being set by that group of 50 people that do big architecture at a place
like Facebook or Google," O'Brien said during a phone call after the
conference. "The conclusion that they came to in the last couple of years is,
'We would never use a relational database. Relational databases don't
[scale].'"
While well-funded startups and big data crunching organizations like the
Chicago Mercantile Exchange, NASDAQ, the Internal Revenue Service and
others will follow suit with the likes of Google, the average company will
continue to find that SQL is the right tool for most development projects for
the foreseeable future, according to O'Brien.
O'Brien offers three main reasons why organizations in general can't
escape SQL development. For starters, it's a language that has a great deal
of inertia, he said. The majority of development tools and platforms, such as
Ruby on Rails, are using SQL. Secondly, it's the best query language
available. Lastly, SQL was originally created as a way to help organizations
work more easily with multiple vendors' databases -- and O'Brien predicts
that SQL's ability to unify will continue to be important for years to come.
"I think the big data community is focused on creating this perception that
the world is changing right now, and if you continue to use that old relational
database technology, you're just going to be an old useless man working on
old useless systems," O'Brien said. "And I think that's false."
O'Brien went on to suggest that in the next few years the "traditional" SQL
database may evolve into something better and far more scalable --
something that blurs the lines between big data technology and more
familiar database management systems. He pointed to Google's Spanner
database as one possible example of things to come.
"I think Spanner points the way toward the future of big data for most
companies," he said. "The important thing about Spanner is that it's
SQL-based, it provides transactions, it is horizontally scalable -- and that's
the big difference."
Another company that offers a possible glimpse of how SQL fits into the
future is Drawn to Scale, which bills its Spire product as "the first database
for large, user-facing applications built on Hadoop." Spire supports SQL and
MongoDB queries in addition to MapReduce, and is built to power
large-scale websites, mobile deployments and other applications.
"There is no reason why you can't use SQL to query everything, right? That
is already happening. People are using SQL to query Hadoop," O'Brien said.
"Fast forward 20 years and I don't care how the database is deployed to me
as a developer. I'm just executing a SQL query and getting a result back. It's
like the difference between a cloud-based Linux machine and a real Linux
machine. It's the interface that defines the experience."
When it comes time to develop a big application or website, it's important to
avoid the hype and simply pick the right tool for the job. While there may be
temptation to discount relational altogether and go straight to big data
technologies, it's important to weigh both approaches against the needs of
the job at hand. Conference attendee Felix Giguere Villegas, a distributed
systems specialist who runs the Big Data Montreal user group, said he
agrees with that point.
"For analyzing logs, you're probably better off with a tool like Hadoop," he
said. "But for a lot of use cases, SQL does the trick quite well, especially at
the scale most of us run at and especially considering the skills that are
available in the marketplace at the moment."
Giguere Villegas went on to say that he would welcome any big data
technologies that incorporate SQL. He said some of the SQL engines that
run on top of Hadoop -- such as Cloudera Impala -- are proving that
horizontal scalability for SQL is possible. The only problem is that these
offerings do not boast the same level of maturity as the popular relational
databases of today.
"SQL is a very useful abstraction and, of course, there is momentum behind
the fact that a bunch of people know it," Giguere Villegas said. "But it's not
just momentum that is going to keep it there. It is genuinely useful to have
SQL, and if we can have a mature, working, interactive, scalable SQL
solution on top of a big data platform, that would be a big boon for
everyone."
Section 2: NoSQL databases
While myriad NoSQL database options have emerged to help businesses
address big data requirements and scalability concerns, they aren't full
replacements for traditional databases. Some companies are choosing
NoSQL systems to support big data applications in completely non-
relational environments, but others are combining them with a relational
database management system or data warehouse -- an approach that
illustrates the frequent use of NoSQL to mean "not only SQL." The articles in
this section examine the varied roles of NoSQL technologies and how they
relate to mainstream relational databases.
NoSQL databases dent relational software's data processing dominance
Jack Vaughan, Senior News Writer
In 1922, automaker Henry Ford famously wrote that his customers could
have a car painted any color they wanted -- as long as it was black. Until
recently, IT managers, application developers and business executives
faced similarly limited choices in selecting database technologies. Relational
databases built on top of the SQL programming language were the
dominant engines powering corporate IT and business systems, with no real
challengers in sight.
But things have changed. Starting in the mid-2000s, SQL's absolute
supremacy was undone by the likes of Yahoo, Google, Facebook,
Amazon.com and eBay. At those Internet giants and other companies, the
need to run colossally scalable Web applications with varied and fast-
changing data requirements prompted efforts to find alternatives to
mainstream relational databases. That ushered in first a stream, and over
the past few years a torrent, of new technologies that eschewed rigid SQL
development principles in favor of more flexible and scalable data designs.
Those databases are spread across several distinct product categories
based on different data models. But they share a pithy umbrella term with a
stake-in-the-ground sound: NoSQL.
[Sidebar: All in the NoSQL Family]
The truth is, though, that the NoSQL movement isn't really an up-against-
the-wall revolution seeking to eradicate relational databases. Yes, some
NoSQL vendors do talk like that's their ultimate goal. But the term NoSQL
has been softened to also mean "not only SQL," in recognition of the fact
that many of the databases do incorporate some elements of SQL. More
substantively, NoSQL technologies aren't positioned as wholesale
replacements for relational software -- they tend to be built for specific
uses, usually involving large data sets that need to be accessed and updated
frequently. And that's how things are playing out on the ground thus far:
NoSQL databases have become must-have items for companies with fast-
growing vaults of Web, social media, demographic and machine data, but
often they're sharing data processing and analysis workloads with SQL-
based software.
For example, Crittercism Inc. is a startup that helps organizations monitor
the performance of their mobile applications, based on real-time data
collected from more than 800 million mobile devices. In application
performance management parlance, a user interaction with an app is called
a request; Crittercism pulls in information about more than 30,000 requests
per second, a rate that adds up to nearly 3 billion a day. That has created a
pool of more than 20 terabytes of data -- and the total only keeps growing,
said Lars Kamp, vice president of business development at the San
Francisco company.
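The per-day figure follows from simple arithmetic, worked here as a quick check. The only inputs are the 30,000 requests per second quoted above, which the article gives as a lower bound, and the number of seconds in a day.

```python
SECONDS_PER_DAY = 24 * 60 * 60            # 86,400

requests_per_second = 30_000              # the "more than 30,000" floor from the text
requests_per_day = requests_per_second * SECONDS_PER_DAY

# Sustained rate that a full 3 billion requests per day would imply.
rate_for_three_billion = 3_000_000_000 / SECONDS_PER_DAY

print(f"{requests_per_day:,} requests per day at the quoted floor")
print(f"{rate_for_three_billion:,.0f} requests per second for 3 billion a day")
```

At exactly 30,000 requests per second the total is about 2.6 billion a day, so "nearly 3 billion" is consistent with a sustained rate somewhat above the quoted floor.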
Included in the mix is data on application errors, crash diagnostics and what
Crittercism calls "network breadcrumbs" documenting the trail of network
calls and other processing events leading up to app problems. That data "is
very unstructured and non-uniform, and varies widely from customer to
customer and application to application," said Mike Chesnut, the company's
director of operations engineering.
Meeting the old way halfway
The sheer amount of information involved, and its variable nature, mandated
a fresh approach to formatting the data. Using relational software would
have required substantial processing overhead to maintain a database
schema that could accommodate all of the information, plus frequent
downtime for making changes to the schema, Chesnut said; he added that
the company had to be able to modify how it collects and stores data "on
the fly, often several times a day." Kamp was even blunter: "Crittercism as a
company would not have been possible 10 years ago," when SQL was the
only choice, he said.
Enter MongoDB, a NoSQL database running on the Amazon Web Services
cloud. Like other NoSQL technologies, it offered schema design flexibility.
That made it possible for Crittercism to store the error and crash data in a
single "collection" -- the MongoDB equivalent of a relational table -- without
imposing a strict schema on the information. In turn, the lack of a fixed data
structure with uniform fields has enabled the company's performance
management service to "evolve organically" to meet the needs of different
customers, Chesnut said.
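What a collection without a strict schema means in practice can be shown with plain Python dictionaries. This sketch deliberately avoids any real database driver: the `collection` list and the `insert` and `find` helpers are hypothetical stand-ins, and the field names are invented for illustration.

```python
collection = []   # stand-in for a document collection; no columns are declared


def insert(document):
    """Store a document as-is, whatever fields it happens to carry."""
    collection.append(dict(document))


def find(**criteria):
    """Return documents whose fields equal every given criterion."""
    return [
        doc for doc in collection
        if all(doc.get(field) == value for field, value in criteria.items())
    ]


# Three records with different shapes share one collection.
insert({"app": "shop", "type": "crash", "stack": ["main", "render"]})
insert({"app": "shop", "type": "error", "code": 500, "url": "/cart"})
insert({"app": "game", "type": "crash", "device": {"os": "android", "ram_mb": 2048}})

crashes = find(type="crash")
fields_seen = sorted({field for doc in collection for field in doc})

print(len(crashes))
print(fields_seen)
```

A new field such as `device` appears simply because one document carries it. In a relational table the equivalent change would need a schema alteration first.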
Crittercism also uses Amazon.com's DynamoDB NoSQL database to store
data on a specific request path that requires particularly fast performance,
according to Chesnut. But there's SQL in the company's database
architecture, too. A PostgreSQL open source database holds highly
relational operations data, and all of the information is summarized in a SQL-
based Amazon Redshift data warehouse for analysis and reporting. Chesnut
and his colleagues aren't NoSQL purists: "We're very engaged with exploring
any and all technology offerings that can help us solve our problems and
better serve our customers," he said.
Recent surveys show that NoSQL databases are making inroads with big
data users but overall, adoption is still relatively low. For example,
TechTarget's 2013 Analytics & Data Warehousing Reader Survey found that
21% of 222 respondents with active or in-the-works big data programs were
using or planning to deploy NoSQL systems as part of the efforts. Another
survey conducted last year by Enterprise Management Associates Inc. and
9sight Consulting produced an almost identical result: In that case, 22% of
the 259 respondents said they had NoSQL platforms in place. In a third
survey, done by The Data Warehousing Institute, 32% of 189 respondents
said their organizations were using NoSQL software. Even there, though,
NoSQL technology was last on the adoption list, trailing behind relational
databases, data appliances, columnar software and big-data fellow traveler
Hadoop (see figure).
[Figure: What's in Your Big Data Environment?]
Greater penetration of data centers is expected going forward: Analyst
group Wikibon forecast last year that worldwide revenue for NoSQL
software and services would grow from $286 million in 2012 to $1.826 billion
in 2017. And venture capitalists are betting big on that kind of growth.
MongoDB Inc., which leads the development of its namesake database,
raised $150 million in new funding last fall. That came shortly after $45
million and $25 million funding rounds by DataStax Inc. and Couchbase Inc.,
two other NoSQL vendors.
Relational players hit from both sides
Even the big relational database vendors have gotten into the NoSQL game.
Oracle introduced a NoSQL database in late 2011 and was one of the lead
sponsors of last year's NoSQL Now! conference; Oracle representatives
gave a keynote speech and led two technical sessions at the event. Last
June, IBM added support for MongoDB's application programming interface
to its DB2 relational database, enabling users to store data there in the
JavaScript Object Notation (JSON) format. DB2 can also handle graph and
XML data, and IBM in March acquired Cloudant Inc., a NoSQL vendor that
runs a hosted version of the JSON-based CouchDB database. Microsoft
offers a NoSQL data store as part of its Windows Azure cloud platform.
Application-driven data needs and the growing move toward cloud
computing are creating a wider opening for NoSQL methods, said Carl
Olofson, a database analyst at market research company IDC. For IT
managers and business executives, though, he compared buying into
NoSQL with investing in a new stock that doesn't have a lot of market
history.
"Most of the NoSQL databases are new. They still need to be battle tested,"
Olofson said. "If you're constantly changing data definitions and you can't
change your relational database fast enough, you might look at NoSQL. But
there is risk."
For one thing, NoSQL technologies typically don't provide full ACID
capabilities -- atomicity, consistency, isolation and durability -- for
guaranteeing transaction integrity, as relational databases do. In addition,
they often lack enterprise-class services in areas such as disaster recovery,
security and data quality, according to Olofson. Like other analysts, he also
expects a whittling of the well-populated ranks of NoSQL vendors as the
market matures.
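Atomicity, the first of the ACID properties, is easy to see with Python's standard `sqlite3` module. The sketch below uses an in-memory database; the `accounts` table and the `transfer` helper are hypothetical examples, not something taken from the article.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, source, target, amount):
    """Move money in one transaction; both updates apply or neither does."""
    try:
        with conn:   # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, target),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False   # the CHECK constraint rejected an overdraft


def balances(conn):
    """Return {name: balance} for every account."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))


ok = transfer(conn, "alice", "bob", 30)
rejected = transfer(conn, "alice", "bob", 500)   # credit to bob is rolled back
print(ok, rejected, balances(conn))
```

In the failed transfer the credit to the target account has already been applied when the debit violates the constraint. The rollback removes it, which is the all-or-nothing guarantee the paragraph above refers to.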
"NoSQL databases are really good for handling XML and JSON data, which
includes a lot of things Java developers are working on these days," said
Wayne Eckerson, a TechTarget industry analyst and president of
consultancy Eckerson Group Inc. In particular, they're well suited to high-
performance Web applications "with a high volume of reads and writes,"
Eckerson said. But, he added, they aren't such a good fit for "long-running
queries" and other complex analytics jobs.
NoSQL software provides speed boost
That maps to the database architecture at Exelate, a marketing data
services and technology provider that uses a diverse range of tools to
supply information on household demographics and purchases to online
advertisers and publishers. "Data is what we do," said Elad Efraim, co-
founder and chief technology officer at the New York company. That makes
performance paramount, he added. And while Exelate didn't start out with
NoSQL technology when it was founded seven years ago, the need for
speed eventually led Efraim and his team to deploy Aerospike, an in-memory
NoSQL database that has helped scale the company's infrastructure to
rapidly handle as many as one trillion real-time data transactions a month.
Aerospike provides a high-performance repository for data on the user
session activity of website visitors that is constantly being updated, Efraim
said. "We're talking about a large-scale system with a very high capacity of
reads and writes that have to complete in some milliseconds. It's very
important for us to make sure we can access the data in a way so that it can
be made available [to our customers] for decision making."
The database runs on servers at four fully replicated data centers
worldwide, indexing everything to memory and holding it in the server cluster
for further processing. From there, the data can be mined and correlated to
other information in analytics and back-office systems. To make that
happen, though, Exelate's applications don't solely use NoSQL software.
One layer above the Aerospike repository is a "pretty standard" MySQL
relational database that lets customers aggregate data, Efraim said. The
company also uses an IBM Netezza appliance and relational database as a
data warehouse for analytics uses.
To put things in Henry Ford's terms, users like Exelate and Crittercism no
longer have to limit themselves to basic-black relational databases -- and
they're taking advantage of NoSQL's new color choices to drive applications
that mainstream relational software isn't suited for. But SQL black isn't going
completely out of style with IT shoppers. For now, the two technologies are
likely to share space in database garages.
Slew of disparate NoSQL databases vie to displace RDBMSs, fit by fit
Jack Vaughan, Senior News Writer
Cassandra, MongoDB, HBase -- they're just a few of the many NoSQL
databases now proliferating. These databases look to solve one problem or
another encountered by the steadfast relational database systems
(RDBMSs) that have long ruled in the enterprise. But the very variety that
makes the NoSQL sector so vibrant can make comparing different products
a challenging -- and often fruitless -- proposition for would-be users.
Before looking more at that issue, it's reasonable to ask why any of these
NoSQL things matter at all. The short answer is that large-scale distributed
processing is taking hold in more applications, thus exposing some of the
creaky flooring on which the RDBMS sits. In Web applications and
enterprise apps alike, a common theme has been emerging: The relational
database may not always be the best fit.
Examples of RDBMS misfits are common. The relational database can be
too expensive to grow out in a widely distributed version. It doesn't easily
adapt to new styles of data -- for example, the unstructured information
that's common in big data applications. It struggles with the massive data
volumes coming from in-the-field sensors or Web server activity logs.
As people have found more and more reasons to move work off of
incumbent relational databases, what has emerged is a "fit for purpose"
mentality of the kind that was a bit more prevalent in the days before the
RDBMS became the all-purpose flour in the database server pantry. And the
number of NoSQL database options developed to fit various purposes has
grown greatly.
Searching for Cassandra
Apache Cassandra is a good example. Like some other NoSQL
technologies, the Cassandra database came about because of a big Web
2.0 fish -- in this case, Facebook. The purpose for which Facebook created
Cassandra was to enable users of the social network to search their
inboxes. When the database was launched in 2008, it supported replication
across geographically distributed data centers to quickly service the
searches of as many as 100 million users.
Inside, Cassandra is a distributed key-value database that uses a row store
scheme and a peer-to-peer (or shared nothing) architecture. Its design
incorporates some of the characteristics of Google BigTable and Amazon
Dynamo, two early and influential NoSQL databases. Along the way,
Cassandra has added support for MapReduce, gained a query language and
triggers and refined its support for lightweight transactions and database
compaction.
Facebook eventually replaced the Cassandra-based search system with a
Hadoop and HBase implementation, but the company ceded the software to
open source; a community arose to carry it forward, and Cassandra became
a top-level Apache Software Foundation project in 2010.
Mapping to the problems
Cassandra represented a good fit for the needs of Internet Identity,
according to Jason Atlas, vice president of technology and engineering at
the Tacoma, Wash.-based security services company. Known as IID, the
company had a rapidly growing database of IP addresses running on a
MySQL RDBMS cluster. But for cost and other reasons, the MySQL path
didn't seem tenable going forward.
IID was harvesting and collecting 600,000 unique IPv4 addresses and host
names per week. Related metadata collections were also growing. "We
started to see that we couldn't store more than 30 days of information at
one time," Atlas said. "The problems largely revolved around scale." He
added that the IPv4 data "lent itself to a key-value approach," which
ultimately led IID to the DataStax Enterprise version of Cassandra.
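A key-value approach to IP address data, spread over several machines, might be sketched as follows. This is a simplified illustration and not how Cassandra is implemented: it assigns keys to nodes by hashing modulo the node count, whereas Cassandra uses consistent hashing over a token ring, and the `TinyCluster` class, node names and sample records are all hypothetical.

```python
import hashlib


class TinyCluster:
    """Toy key-value store that spreads keys over named nodes by hash."""

    def __init__(self, node_names):
        self.nodes = {name: {} for name in node_names}   # node -> local dict
        self._names = sorted(self.nodes)

    def _owner(self, key):
        """Pick the node for a key from a stable hash of the key."""
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        return self._names[int(digest, 16) % len(self._names)]

    def put(self, key, value):
        self.nodes[self._owner(key)][key] = value

    def get(self, key):
        return self.nodes[self._owner(key)].get(key)


cluster = TinyCluster(["node-a", "node-b", "node-c"])

# Keys are IPv4 addresses from the ranges reserved for documentation.
cluster.put("192.0.2.10", {"host": "a.example.net"})
cluster.put("198.51.100.7", {"host": "b.example.net"})
cluster.put("203.0.113.25", {"host": "c.example.net"})

print(cluster.get("192.0.2.10"))
print({name: len(data) for name, data in cluster.nodes.items()})
```

Each lookup goes straight to the one node that owns the key, which is why this access pattern spreads across commodity machines. The modulo scheme used here would reshuffle most keys whenever a node is added, a problem that consistent hashing is designed to limit.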
Cassandra is built to run on commodity clusters, as might be expected given
its Google-Amazon-Facebook lineage. Its focus on scalability bears fruit, in
Atlas's estimation: He said it's "coming as close to linear scaling" as
anything he has previously seen. He also gives points to DataStax for a
Cassandra-MapReduce integration that he expects to use going forward.
But he cautioned those who are looking to embrace Cassandra or other
NoSQL databases, offering a reminder that it is unwise to force-fit
technologies onto problems. "It's always best to map the problem onto the
solution," Atlas said.
How do I NoSQL? Let me count the ways
Sorting through the variety in the NoSQL space is nothing short of daunting.
Some NoSQL vendors are becoming household names in database circles;
for example, DataStax and a quartet of other NoSQL database makers
(Basho Technologies Inc., Couchbase Inc., MarkLogic Corp. and MongoDB
Inc.) were listed among the top vendors of operational database
management systems in a recent Gartner Inc. Magic Quadrant report. But
there are dozens of NoSQL offerings in several distinct product categories
-- and different databases in the same category were built to support
different uses. It's all a bit of a maze to navigate.
I caught up with Gartner analyst Merv Adrian on this issue in the
Twittersphere. In a tweet, he had pointed to a Linux Journal reader poll
comparing NoSQL databases. Adrian deadpanned: "In related news -- do
you prefer apples, cocktails or broccoli?" While rolling on the floor laughing, I
tweeted him that I thought I understood his point. He tweeted back: "It's
useless -- and meaningless -- to compare 'NoSQL' products that are so
wildly different in structure and intent."
Atlas made a similar point. "Mongo and Cassandra have nothing to do with
one another, but are still both called 'NoSQL.' Their use cases are very
different," he said.
Ultimately, we should expect some thinning of the NoSQL ranks. Cassandra
is showing signs that it could be one of the survivors. But despite being fit
for some specific purposes, it and others under the NoSQL umbrella may
need to find more general uses to truly thrive.
When does a NoSQL DB trump a traditional database?
Mark Whitehorn, Emeritus Professor of Analytics
We're looking at the issue of NoSQL vs. SQL databases. When does it
make sense to consider using a NoSQL DB rather than a relational
database?
Put simply, NoSQL databases are a better choice when you have data that
doesn't fit well into tables. We have worked on SQL-based relational
databases for about 40 years now; the result of all that work is that they are
very good at handling transactions involving tabular data stored in rows and
columns. We can also analyze such data very effectively in dimensional
databases.
The kind of data that fits well into relational tables is known as atomic data,
which simply means that we split the data up into the smallest components
we want to manipulate. For example, we don't usually store the complete
name of a customer in one field. If we're adding data about a customer
named "Mr. James Mason" to a relational database -- but we want to be able
to find all of the people with the title "Mr." in the database and sort
customers by both first and last name -- we would store the name data in
three distinct columns in a table.
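
The three-column idea can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table name, the column names (title, first_name, last_name) and the two extra sample customers are assumptions made up for the example.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (title TEXT, first_name TEXT, last_name TEXT)"
)
conn.executemany(
    "INSERT INTO customer (title, first_name, last_name) VALUES (?, ?, ?)",
    [
        ("Mr.", "James", "Mason"),
        ("Ms.", "Ada", "Mason"),   # invented sample row
        ("Mr.", "Alan", "Baker"),  # invented sample row
    ],
)

# Because the title lives in its own column, finding every "Mr." is a filter.
misters = conn.execute(
    "SELECT first_name, last_name FROM customer "
    "WHERE title = ? ORDER BY last_name, first_name",
    ("Mr.",),
).fetchall()

# Sorting by last name, then first name, needs no string parsing.
everyone = conn.execute(
    "SELECT title, first_name, last_name FROM customer "
    "ORDER BY last_name, first_name"
).fetchall()
```

Had the full name been stored in a single field, both queries would have required splitting strings inside SQL, which is exactly the kind of work atomic columns avoid.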
However, a great deal of the data we are now collecting doesn't tabularize
well. We're talking about images, sensor data, Word documents, Twitter
feeds and so on -- or what is often called big data. Even when we can put
such data in tables, it may not be efficient to do so. For example, you could
store every pixel of an image as a row in a relational table. But then you have
to ask yourself, "What SQL code could I write to determine if the image
includes a person?" I can't even begin to imagine what that would look like.
The good news is we have specific database engines that are built to hold
and manage big data: NoSQL databases. Relational databases require us to
impose what is called a schema on the data. Think of the schema as a way of
organizing the data: In relational databases, we have to split up data into
atomic units and then organize it as columns and rows in tables.
NoSQL database engines come in a variety of different types, so too much
generalization can be misleading. But in general they require only a very
simple schema and sometimes no schema at all. For example, in some
NoSQL database systems, we could put image files straight into the
database without altering their structure. We could also put audio files into
the same database. Getting data into the database becomes much simpler,
and there's more flexibility on how that data is structured.
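
As a rough illustration of what "no schema at all" means, here is a toy in-memory store written in plain Python. It is not modeled on any particular NoSQL product; the class name, field names and placeholder byte strings are all assumptions for the sketch.

```python
import uuid


class TinyDocumentStore:
    """Toy in-memory store: every record is a free-form dict with no shared schema."""

    def __init__(self):
        self._docs = {}

    def put(self, doc):
        """Store a copy of the document and return its generated ID."""
        doc_id = str(uuid.uuid4())
        self._docs[doc_id] = dict(doc)
        return doc_id

    def get(self, doc_id):
        return self._docs[doc_id]

    def find(self, **criteria):
        """Return documents whose fields equal every given criterion."""
        return [
            d for d in self._docs.values()
            if all(d.get(k) == v for k, v in criteria.items())
        ]


store = TinyDocumentStore()

# An image and an audio file go into the same store with different fields.
image_id = store.put(
    {"kind": "image", "format": "png", "width": 640, "height": 480,
     "payload": b"<image bytes>"}
)
audio_id = store.put(
    {"kind": "audio", "codec": "mp3", "seconds": 3.5,
     "payload": b"<audio bytes>"}
)

audio_docs = store.find(kind="audio")
```

Nothing had to be declared up front for either record, which is the flexibility described above; the trade-off is that nothing stops two records from describing the same thing in incompatible ways.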
So if you have data that can't be put into tabular form in an elegant way, or
that needs queries that can't be comfortably expressed in SQL, think about
looking at the range of NoSQL database engines that are available.
NoSQL security: Do NoSQL database security features stack up to RDBMS?
Michael Cobb, CISSP-ISSAP
NoSQL, or Not Only SQL, is an approach to data storage and retrieval that's
very fashionable with startups developing interactive Web applications and
enterprises dealing with huge quantities of data. The main reason for its
popularity is that it provides better scalability and availability, as well as
faster access to data, than traditional relational database management
systems (RDBMS), including Oracle's MySQL and Microsoft's SQL Server.
Data held in an RDBMS has to be predictable so it can be stored in organized
tables and rows, with relationships defined between different elements. Data
in a NoSQL database, on the other hand, doesn't need to be so structured or
follow a fixed schema. When performance and real-time access are more
important than consistency, such as when indexing and retrieving a large
number of records, NoSQL is a better fit than a relational database. Data
can also be more easily held across multiple servers, providing improved
fault tolerance and scalability. Companies like Google and Amazon use their
own cloud-friendly NoSQL database technologies, and there are a number
of commercial and open source NoSQL databases available, such as
Couchbase, MongoDB, Cassandra and Riak.
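
To make "held across multiple servers" concrete, the sketch below spreads keys over a small cluster by hashing each key and keeps a second copy on the next server for fault tolerance. The node names, the replica count and the simple modulo placement are assumptions for illustration; production systems typically use more elaborate schemes, such as consistent hashing.

```python
import hashlib

NODES = ["node-a", "node-b", "node-c"]  # hypothetical server names


def node_for(key, nodes=NODES):
    """Pick the primary server for a key from a stable hash of the key."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def replicas_for(key, nodes=NODES, copies=2):
    """Primary server plus the following server(s), so one failure loses no data."""
    start = nodes.index(node_for(key, nodes))
    return [nodes[(start + i) % len(nodes)] for i in range(copies)]


placement = {key: replicas_for(key) for key in ("user:1", "user:2", "user:3")}
```

Because the placement depends only on the key, any client can work out where a record lives without consulting a central table.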
For all the advantages of storing data in a NoSQL database, NoSQL security
is adversely impacted by the need to access data quickly and easily. To
store information securely, a database needs to provide confidentiality,
integrity and availability (CIA). Enterprise RDBMS databases provide CIA
through integrated security features such as role-based security, encrypted
communications, support for row and field access control, as well as access
control through user-level permissions on stored procedures. RDBMS
databases also have ACID (atomicity, consistency, isolation, durability)
properties that guarantee database transactions are processed reliably;
data replication and logging ensure durability and data integrity. These
features increase the time it takes to retrieve large amounts of data, so they
are not implemented in NoSQL databases.
In order to maintain fast access to data, NoSQL databases come with little
built-in security. They have what's called BASE (basically available, soft
state, eventually consistent) properties; rather than requiring consistency
after every transaction, the database just needs to eventually reach a
consistent state. For example, when users view data, such as the number of
items in stock, they may see the last snapshot taken of the data rather than
‘a current view. Because transactions aren't written to the database
immediately, there is a possibility that simultaneous transactions could
interfere with each other. This inherent race condition, in which users do not
necessarily see the same data at the same time, means a NoSQL database
could never be used for handling financial transactions.
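
The stock-count example can be simulated in a few lines of Python. This is a toy model rather than the behavior of any specific product: the class, its explicit sync() step and the key name are assumptions chosen to make the delay visible.

```python
class EventuallyConsistentStore:
    """Toy model: writes land on the primary; the replica catches up only on sync()."""

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self._pending = []

    def write(self, key, value):
        self.primary[key] = value
        self._pending.append((key, value))  # replication is deferred

    def read_from_replica(self, key):
        return self.replica.get(key)

    def sync(self):
        """Apply all deferred writes to the replica, in order."""
        for key, value in self._pending:
            self.replica[key] = value
        self._pending.clear()


store = EventuallyConsistentStore()
store.write("items_in_stock", 10)
store.sync()

store.write("items_in_stock", 9)                   # a sale is recorded
stale = store.read_from_replica("items_in_stock")  # still the old snapshot: 10

store.sync()
fresh = store.read_from_replica("items_in_stock")  # now consistent: 9
```

Between the second write and the second sync(), a reader served by the replica sees 10 while the primary already holds 9, which is the window in which two users can act on different views of the same data.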
NoSQL databases also lack confidentiality and integrity. As NoSQL
databases don't have a schema, permissions on a table, column or row can't
be segregated. This can also lead to multiple copies of the same data. This
can make it hard to keep data consistent, particularly as changes to multiple
tables can't be wrapped in a transaction where a logical unit of insert,
update or delete operations is executed as a whole.
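
For contrast, here is a minimal sketch of what wrapping changes to multiple tables in a transaction looks like in a relational database, using Python's built-in sqlite3 module. The stock and orders tables, their columns and the place_order helper are hypothetical names invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock (item TEXT PRIMARY KEY, "
    "quantity INTEGER NOT NULL CHECK (quantity >= 0))"
)
conn.execute(
    "CREATE TABLE orders (item TEXT NOT NULL, quantity INTEGER NOT NULL)"
)
conn.execute("INSERT INTO stock VALUES ('widget', 5)")
conn.commit()


def place_order(conn, item, quantity):
    """Record an order and reduce stock as one all-or-nothing unit."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO orders (item, quantity) VALUES (?, ?)",
                (item, quantity),
            )
            conn.execute(
                "UPDATE stock SET quantity = quantity - ? WHERE item = ?",
                (quantity, item),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def count_orders(conn):
    return conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]


def stock_level(conn, item):
    row = conn.execute(
        "SELECT quantity FROM stock WHERE item = ?", (item,)
    ).fetchone()
    return row[0]


# Ordering 10 when only 5 are in stock violates the CHECK constraint,
# so the already-inserted order row is rolled back as well.
rejected = place_order(conn, "widget", 10)
orders_after_reject = count_orders(conn)
stock_after_reject = stock_level(conn, "widget")

accepted = place_order(conn, "widget", 2)
```

The failed order leaves no trace in either table. Without a transaction, the application would have to detect the failure and undo the orders row itself.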
With more than 20 different implementations of NoSQL available, a lack of
standards also increases the complexities of keeping data secure.
Confidentiality and integrity have to be provided entirely by the application
accessing the NoSQL data. It is not a sound practice to have the last line of
defense for any valuable data at the application level. Application developers
are not renowned for implementing security features, and new code usually
means new bugs. Any requests sent to a NoSQL database need to be
escaped, filtered and validated, while the database itself needs to reside in a
hardened environment.
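
One way to picture the "escaped, filtered and validated" step is a small gatekeeper function that sits between user input and the query sent to the database. The sketch assumes a document database that accepts dictionary-style filters and treats keys such as "$ne" as query operators (as MongoDB does); the function name, field name and allowed character set are illustrative assumptions, and no database driver is called.

```python
import re

# Allowed username shape; the exact rule is an assumption for this sketch.
USERNAME_RE = re.compile(r"[A-Za-z0-9_.-]{1,64}")


def build_user_filter(raw_username):
    """Return a lookup filter only if the input is a plain, well-formed string.

    Rejecting non-string input stops a caller from smuggling in an
    operator document such as {"$ne": None} in place of a username.
    """
    if not isinstance(raw_username, str):
        raise ValueError("username must be a string")
    if not USERNAME_RE.fullmatch(raw_username):
        raise ValueError("username contains unexpected characters")
    return {"username": raw_username}


safe_filter = build_user_filter("alice")

try:
    build_user_filter({"$ne": None})  # shape of a typical operator-injection payload
    injection_blocked = False
except ValueError:
    injection_blocked = True
```

Checks like this are exactly the application-level defenses the paragraph above warns about relying on, so they belong alongside, not instead of, a hardened database environment.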
Interestingly, some NoSQL projects are now starting to add back RDBMS-
type security features. Oracle, for example, added transactional control over
data written to one node. Cassandra supports transaction logging and
automatic replication, and MongoDB supports master-slave replication.
If scalability and availability are the key database requirements for an
organization, then NoSQL may be the right choice for certain large data
sets. However, system architects should take a close look at their
requirements for security, privacy and data integrity before choosing a
NoSQL database. The lack of NoSQL security features, namely
authentication or authorization support, means that sensitive data is best
kept in a traditional RDBMS.
How non-relational database technologies free up data to create value
Nick Millman and Pankaj Sodhi
The proliferation of multiple non-relational databases is transforming the
data management landscape. Instead of having to force structures onto
their data, organisations can now choose NoSQL database architectures
that fit their emerging data needs, as well as combining these new
technologies with conventional relational databases to drive new value from
their information.
Until recently, data’s potential as a source of rich business insight has been
limited by the structures that have been imposed upon it. Without access to
the new database technologies now available, standard back-end design
practice has been to force data into rigid architectures (regardless of
variations in the structure of the actual data).
Inherently inflexible, these legacy architectures have prevented
organisations from developing new use cases for the exploitation of
structured and unstructured information.
The ongoing proliferation of non-relational database architectures marks a
watershed in data management. What is emerging is a new world of
horizontally scaling, unstructured databases that are better at solving some
problems, along with traditional relational databases that remain relevant for
others.
Technology has evolved to the extent that organisations need no longer be
limited by a lack of choice in database architectures. As frontrunners
have moved to identify the database options that match their specific data
needs, three key changes are becoming increasingly prevalent:
1. A rebalancing of the database landscape, as data architects begin to
embrace the fact that their architecture and design toolkit has
evolved from being relational database-centric to also including a
varied and maturing set of non-relational options (NoSQL database
systems).
2. The increasing pervasiveness of hybrid data ecosystems powered by
disruptive technologies and techniques (such as the Apache Hadoop
software framework for cost-effective processing of data at extreme
scale).
3. The emergence of more responsive data management ecosystems to
provide the flexibility needed to undertake prototyping-enabled
delivery (test-prove-industrialise) at lower cost and at scale.
From now on, savvy analytical leaders will be seeking to crystallise the use
cases to which platforms are best suited. Instead of becoming overly
focused on the availability of new technologies, they will identify the "sweet
spots" where relational and non-relational databases can be combined to
create value for information above and beyond its original purpose.
By taking advantage of the new world of choice in data architectures, more
organisations will be equipped to identify and exploit breakthrough
opportunities for data monetisation.
Just as communications operators have created valuable B2B revenue
streams from the wealth of customer data at their disposal, so better usage
of their existing data will empower other companies to build potent new
business models.
Implementing a rethink of how data is stored, processed and enriched
means re-evaluating the traditional world of data management. Until now,
data has been viewed as a structured asset and a cost centre that must be
maintained.
The availability of new database architectures means that this mindset will
change forever. Data management in a services-led world will require IT
leaders to think about how the business can most easily take advantage of
the data they have and the data they may previously have been unable to
harness.
Agile data services architecture
As more architecture options become available, data lifecycles will shrink
and become more agile. Rather than seeking to "over control" data,
approaches to data management will become much less rigid. One key aim
will be to open up new possibilities by encouraging and facilitating data
sharing. Amazon stands out as a pioneer in this field. By building a service-
oriented platform with an agile data services architecture, the company has
been able to offer new services around cloud storage and data management
-- as well as giving itself the flexibility needed to cope with future demand for
as yet unknown services.
Unprecedented accessibility to non-relational databases is reinvigorating the
role of conventional architectures and "traditional" data management
disciplines. From now on, analytics leaders will increasingly move to adopt
hybrid architectures that combine the best of both worlds to leverage fresh
new insights from the surging volumes of structured and unstructured
information that are now the norm. In summary, there has never been a more
exciting time to be a data management professional.
Getting more PRO+ exclusive content
This e-guide is made available to you, our member, through PRO+ Offers, a
collection of free publications, training and special opportunities specifically
gathered from our partners and across our network of sites.
PRO+ Offers is a free benefit only available to members of the TechTarget
network of sites.
Take full advantage of your membership by visiting
http://pro.techtarget.com/ProLP/