
GFT Technologies AG 2014

Big Data: Uncovering Hidden Business Value in the Financial Services Industry
Authors

Dr. Karl Rieder, Dr. Ignasi Barri and Josep Tarruella

Version 1.0
Published: September 2014


Table of Contents
1 Executive Summary
2 Introduction
2.1 So what is big data?
2.2 It's a big data world
2.3 The financial services industry today
2.4 Big data adoption by the numbers
3 Big data opportunities in the financial sector
3.1 Retail banking
3.2 Investment banking
3.3 Insurance
3.4 IT efficiency
4 The architecture of big data solutions
4.1 Big data technologies
4.2 Commercial appliances
4.3 Impact on existing systems
4.4 Cloud architectures
5 Addressing big data's challenges
5.1 Technology
5.2 Security and data protection
5.3 Data quality
5.4 Internal organisation
6 First steps
6.1 Getting started
6.2 Establishing new roles in the company
6.3 Strategies to follow: lessons learned from successful big data projects
7 GFT's Big Data practice
About the authors









This report has been published based on a number of interviews with industry experts, secondary market research, and GFT's internal
expertise. The intention of the report is to render industry trends transparent and understandable within their context and to give
readers ideas for their businesses. The content has been created with the utmost diligence; nevertheless, we are not liable for any
possible mistakes.
GFT Technologies AG
Executive Board: Ulrich Dietz (CEO), Jean-François Bodin, Marika Lulay, Dr. Jochen Ruetz.
Chairman of the Supervisory Board: Dr. Paul Lerbinger
Commercial Register of the local court (Amtsgericht): Stuttgart, Register number: HRB 727178
Copyright 2014 GFT Technologies AG. All rights reserved.


1 Executive Summary
Between now and 2020, the digital universe will double in size every two years. Businesses are projected to
increase their investment in hardware, software, and services by 40% between 2012 and 2020 to keep up with
this expansion [1], and the data-driven financial sector is likely to be strongly impacted. Banks will be looking to
use new technology to remain competitive in the face of a rapidly diversifying financial services landscape.
Investment in big data technologies will grow even faster, given the massive evolutionary leap they represent in
shifting IT focus from relational databases to more open and flexible platforms that enable the management of
huge volumes of data and provide new analytical and actionable capabilities. The key challenge will be to
employ these technologies in such a way that they meet IT needs and at the same time provide new value to
the business, delivering a real return on that investment in the form of an improved bottom line.
Big data technologies will enable broader and better data analysis than ever before, leading to targeted event-
driven, customer-centric marketing, improved fraud detection, better
risk calculation, and operational efficiencies. Agility is key for business
success in the 21st century, and the effective deployment of big data
projects is likely to be the difference between success and failure in an ever more competitive market.
This blue paper is a guide to the opportunities, requirements, and challenges of the big data revolution,
revealing the value that can be tapped from big data technologies. In
making the shift to the brave new world of big data, the financial
services sector will need to consider not only technology changes, but
new use cases, processes, and skill sets. The paper also includes a
set of recommendations that will help financial services organisations to successfully implement big data
technologies.

[1] IDC, The Digital Universe in 2020: Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East, December 2012.


2 Introduction
The computer age has brought significant changes to every industry, and the financial services sector is no
exception. It has seen wide-ranging change mainly because of its
extensive use of data: customer data, market data, and trading data
are all central to the industry. From the automation of business
processes in the 1960s through the emergence of the Internet in the
1990s, and on to the current era of mobile banking and high-frequency algorithmic trading, success in financial
services has always been about the smart use of data.
However, the technological infrastructure on which today's banking systems were built is beginning to buckle
under the strain of the sheer volume of data. Relational databases and
their associated processes cannot effectively handle such a high
volume of information; the continuous tuning of multiple environments
and the number of resources needed to keep data extraction,
transformation, and loading (ETL) processes working every day combine to make these legacy systems
extremely expensive to maintain. Insurance companies, too, are discovering that they could significantly
improve their bottom lines through more effective use of data to control fraud.
2.1 So what is big data?
Big data is the buzzword of the current technological decade. The volume of data available to organisations is
so huge that it requires entirely new technologies and processes to address it. The concept of big data is
usually described by its four dimensions of volume, variety, velocity, and value, also called the "4 Vs". [2]



[2] 1 Zettabyte = 1,000 Exabytes = 1,000,000 Petabytes = 1,000,000,000 Terabytes = 1,000,000,000,000 Gigabytes


2.2 It's a big data world
The ever-decreasing cost of storage means that there is essentially no physical barrier to the amount of data
retained; by 2020, it will cost less than $0.20 to store a gigabyte of data, down from $2.00 per gigabyte today. [3]

It is anticipated that, by 2020, the size of the digital universe will have increased by a factor of 300 over 2005's
volume; in hard numbers, this represents an increase from 130 exabytes to 40,000 exabytes, or 40 trillion
gigabytes: that's over 5,200 gigabytes of data for every man, woman,
and child on the planet. Between now and 2020, the digital universe
will double in size every two years, fuelled by the 2.5 quintillion bytes
of information we create every day. [4] As we enter the age of the
Internet of Things, in which all types of sensors, appliances, and systems are connected and controlled
remotely from smartphones and other mobile devices, we can expect the growth to increase further.
In order to support these volumes of data, IT investment in hardware, software, services, telecommunications,
and people (what we can collectively describe as the infrastructure of the digital universe) is forecast to grow by
40% between 2012 and 2020. Targeted areas that specifically apply to the storage and use of data, such as
storage management, security, big data ETL, and cloud computing, will likely grow at an even faster pace. [5]

2.3 The financial services industry today
Driven by the adoption of technology changes around mobile devices, cloud computing, social media, and big
data, the financial services landscape is evolving from a traditional model to a digital model alongside increased
competition from both traditional and non-traditional players such as telecom carriers, retailers, and electronic
payment providers.
We see four main trends that financial institutions need to address.
Regulatory compliance: In the last 10 years, the number and variety of regulations affecting the
banking sector have greatly increased. Currently, maintaining compliance directly or indirectly affects
around 50% of the discretionary IT budgets of financial services companies. In order to react to the
constant business change in the financial services environment while also meeting compliance
regulations, increased agility is required.
Customer centricity: The way customers interact with their financial institutions is changing
fundamentally for the first time in the history of the industry. New technologies are enabling a focus on
customers that both empowers them and helps meet their individual, customised needs.
Back-office modernisation: Banks, especially in mature markets, are beginning to suffer heavily from
aging IT infrastructures that still run back-office legacy systems such as core banking software. Many
major banks have already started to replace these systems with more modern technologies.

[3] IDC, The Digital Universe in 2020: Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East, December 2012.
[4] Mike Hogan, Big Data of your Own, August 2013.
[5] IDC, The Digital Universe in 2020: Big Data, Bigger Digital Shadows, and Biggest Growth in the Far East, December 2012.


Cost reduction: As with almost every industry, financial institutions are pursuing cost reduction
strategies in order to reduce operational expenses. These strategies translate into fewer branches,
fewer new hires, more cloud and outsourced services, and the rationalisation of existing resources
through shared-service centres such as
payment service hubs that are either
self-run or run by a third party.
All four major trends are shaping the future of the
industry, and all are driving big data adoption.
However, customer centricity and regulatory
compliance are particularly relevant. Customer
centricity is pushing much of the modernisation of
retail banking and insurance, while regulatory
compliance is driving changes in investment
banking itself.
2.3.1 Customer centricity
The trend toward customer centricity, while not
new, will greatly change the banking business
over the next few years. Banks are now focusing
on multiple initiatives, especially around customer empowerment and customer understanding.
Mobile banking means that customers rarely need to visit a physical bank branch or meet with an insurance
agent in order to conduct routine financial transactions. Almost everything can be done on the web or via a
mobile device, which means that banking and insurance activities
have essentially been commoditised as just another digital service or
shopping experience. This trend is growing faster in some countries
than others, but as more digital natives become bank customers, the
adoption of digital banking will continue apace. As devices and apps become ever-easier to use, even late
adopters will shift more of their routine banking activities online. This commoditisation has had a dramatic effect
on the way individuals regard their financial institutions; because there is little or no human interaction between
the institutions and their customers, there is also little or no loyalty. People can change their bank or insurance
provider with greater ease than ever before.
To differentiate themselves, banks and insurance firms must develop truly personalised customer service. This
requires a 360° view of each customer, with a complete understanding of each individual's wants and needs:
a particular challenge for the insurance industry, which continues to operate a siloed system where the
different types of insurance coverage offered (life, health, cars, houses, etc.) live in separate worlds.



2.3.2 Regulatory compliance

Over the last decade, there has been a significant push toward increased regulation of the financial services
industry, primarily in the area of risk control, but also in data
processing governance and auditing. Regulatory agencies are trying to
ensure that the industry acts responsibly and continues to provide
services regardless of market movements.
In order to meet regulatory demand, historical data must now be retained for seven years under the
requirements of the Dodd-Frank Reform Act, or for five years under the terms of the Basel Agreement.
Moreover, banks must have systems and processes in place to bring together these data to respond to
regulatory reporting requirements. In the case of a specific enquiry, the bank must be able to quickly sort
through the data to find all relevant information about a particular case; this requires data management far
beyond what the industry is currently doing.
2.4 Big data adoption by the numbers
In a study published at the end of 2013 [6], the European Information Technology Observatory (EITO) found that
financial institutions are leading the way in adopting big data-centric strategies. 92% of financial institutions
were identified as considering big data strategies (compared with only 40% across all industries); however, only
9% of financial institutions (and 4% of all industries) had actually implemented systems using these
technologies. Nevertheless, preparations do appear to be in process, with 38% of financial institutions (27%
across all industries) investing in improvements to their data storage facilities.
Clearly, the big data technology and services marketplace is set for
significant growth. At the end of 2013, IDC issued a prediction [7] that the sector will grow at 27% CAGR to
$32.4 billion by 2017, about six times the growth rate of the overall information technology market.
Also in 2013, the IBM Institute for Business Value published an Executive Report [8] that summarised research
undertaken with Oxford University's Saïd Business School across 1,144 business and IT professionals in 95
countries, including 124 respondents from the financial services sector. The research focused on the use of big
data inside organisations and found, not surprisingly at this stage, that most initiatives were being
constructed around customer centricity (55%) and risk management (34%); the latter figure was significantly
higher for the financial services sector than for other business types, where operational improvements
outranked risk management.

[6] EITO, Big Data in Europe: Evolution AND Revolution, December 2013.
[7] IDC, Worldwide Big Data Technology and Services 2013–2017 Forecast, December 2013.
[8] IBM Institute for Business Value and the Saïd Business School at the University of Oxford, Analytics: The real-world use of big data in financial services, 2013.


Insurance firms appear to be lagging somewhat in big data strategy adoption, according to research
undertaken by BearingPoint [9]. This survey found that 90% of insurance firms have yet to implement a
company-wide big data strategy, despite more than two-thirds of participants stating that big data would play
an important role in their future. The research also revealed that, while 71% said big data would be a top
priority by 2018, less than a quarter (24%) said their company's big data maturity was advanced or leading,
and only 33% had actually started a departmental or enterprise implementation process.
While the financial services sector is still
in the early stages of big data strategy
adoption, there is general understanding
that the industry is at a crucial point and
has everything to gain from moving
forward with efforts to leverage big data
to improve its public image and deliver
excellence in customer service to its
customers and clients.

[9] DataIQ News, May 2014.


3 Big data opportunities in the financial sector
The availability of technologies and systems designed specifically for dealing with large volumes of data opens
the door to the application of new approaches to common use cases, both evolutionary and revolutionary.
Evolutionary approaches might involve the development of complementary processes and architectures to
accelerate data processing performance when applied to massive stores of structured and unstructured data;
such approaches may meld elements of parallel and distributed computing to achieve the required outcomes.
A revolutionary approach, on the other hand, might see the complete restructuring of architectures and
processes to support entirely new ways of doing things. Different data
stores and warehouses could be refactored and consolidated into a
single raw golden data source, enabling the analysis of full data sets
rather than small samples or slices. By bringing together disparate
data stores, organisations could begin to derive new insights through the application of machine learning
algorithms.

3.1 Retail banking
According to the Millennial Disruption Index [10], a survey of more than 10,000 Americans aged
18–33 conducted by Scratch (a division of Viacom), today's retail banking sector is at
exceptionally high risk of disruption. The four leading banks in the United States all appear in
the top ten lowest-rated companies in the country, with 53% of respondents seeing no difference between their
bank and any other. This disturbingly low level of loyalty is further underscored by the fact that 73% of
respondents would enthusiastically welcome offers of financial services from brands outside the traditional
financial services marketplace, such as Google, Amazon, Apple, and PayPal, among others. Retail banking is
vulnerable on many fronts.

[10] Scratch (Viacom Media Networks), Millennial Disruption Index, 2014.


3.1.1 Customer centricity replaces product centricity
Historically, banks and insurance companies have focused on developing product offerings, often at the
expense of understanding what their customers actually want. In this,
the financial services industry lags behind many other user-facing
sectors such as consumer goods. Retail banks must put the customer
at the centre of their product and service development initiatives.
This process can be made much easier through the development of a single, 360° view of each customer,
achieved by gathering information from across the organisation to better understand the services each
customer is using now and those that may be needed in the future. Big data technologies provide the means by
which large volumes of disparate data (accounts, consumer credit,
credit/debit cards, mortgages, etc.) can be brought together and
synthesised into a custom package of goods and services that will
best serve each individual customer.
When everything is focused on the client, service improves, relevance improves, customer satisfaction
improves, and loyalty will return. At the same time, cross-selling increases and customer churn slows.
3.1.2 Improved best offer process
Marketing campaign success rates can improve dramatically when banks put a relevant offer in front of a
customer at the optimum time for that customer to make a positive purchasing decision. Here's what the
traditional "best offer" marketing campaign looks like:

The problem with this type of campaign is that its effectiveness is very limited, largely because there is no real
alignment between the customer's needs and the bank's offer. Here are a couple of ways this type of campaign
can be brought into the 21st century:
Unified vision: All staff in customer-facing positions (branches, call centres, etc.) are provided with a
unified history of interactions between the company and each customer. For instance, when a
customer enters the bank branch, the staff member meeting with that
customer can immediately see recent interactions such as reporting
the loss of a credit card or browsing the bank's website to test out
different mortgage scenarios. This enables the staff member to have a
relevant discussion with the customer (about credit card protection or mortgages) rather than focus on
the product the bank has determined should be promoted to customers (for example, pension plans).
Dynamic offers: By making use of web and mobile channels, and the event log of online interactions
between the customer and the bank, dynamic and highly relevant offers can also be made to each
customer. For example, if the bank knows the customer is about to reach their credit card limit, the
system can be programmed to generate a custom-created offer for an increased credit limit or higher-


level card account the next time that customer logs in. This model can be applied to any perceived
time-relevant customer need.
By aligning offerings with customer need, revenues are positively impacted. Big data makes such campaigns
possible by bringing together large amounts of data about customer behaviour, analysing them, and generating
offers that match the perceived customer need.
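To make the mechanics concrete, here is a minimal sketch, in Python, of the kind of event-driven rule described above. The event shape, thresholds, and the make_offer() helper are hypothetical and for illustration only, not part of any specific banking platform.

```python
# A minimal sketch of an event-driven "next best offer" rule; the event
# shape and the make_offer() helper are hypothetical, for illustration only.

def make_offer(customer_id, offer_type, proposed_limit):
    # A real system would enqueue this for display in the web or mobile
    # channel the next time the customer logs in.
    return {"customer": customer_id, "type": offer_type, "limit": proposed_limit}

def on_login(customer):
    """Called by the event platform whenever a customer logs in."""
    utilisation = customer["card_balance"] / customer["card_limit"]
    if utilisation > 0.9:  # close to the credit limit: a time-relevant need
        return make_offer(customer["id"], "credit_limit_increase",
                          round(customer["card_limit"] * 1.25))
    return None  # no relevant offer right now

print(on_login({"id": 42, "card_balance": 4600.0, "card_limit": 5000.0}))
# -> {'customer': 42, 'type': 'credit_limit_increase', 'limit': 6250}
```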
3.1.3 Client scoring
To effectively segment their customer base, banks apply a range of different client scoring methodologies. Of
particular interest is credit scoring, which measures the potential risk associated with lending to a particular
client. To make an accurate score, banks need to sort through large volumes of data and apply complex
algorithms to come up with a realistic risk factor for any particular individual. Some companies are even using
social network data (e.g. from Facebook) to measure potential credit risk.
Big data technologies such as in-memory databases can help with this
by enabling large volumes of data to be stored and analysed rapidly;
because the data is being held in memory rather than on a traditional
storage device (e.g. disk drive), access and processing times can be
reduced by an order of magnitude.
Client scoring methodologies can also be effective in improving up-sell and cross-sell revenues by detecting
potential opportunities and framing relevant offers for customers in the existing client base.
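As a simple illustration of the in-memory approach, the sketch below keeps pre-computed customer features in Redis and scores them on demand. It assumes the redis-py client and a running Redis instance; the feature names and weights are invented for the example, not a calibrated scoring model.

```python
# A sketch of in-memory client scoring: features are held in a Redis hash
# rather than on disk, so lookups are fast. Assumes the redis-py client.
import redis

r = redis.Redis(decode_responses=True)

# Store pre-computed customer features for low-latency access.
r.hset("features:cust:42", mapping={
    "months_active": 84, "late_payments": 1, "avg_balance": 3200.0})

def credit_score(customer_id):
    f = r.hgetall(f"features:cust:{customer_id}")
    # A toy linear score; a real model would be trained on historical data.
    return (600
            + 0.5 * float(f["months_active"])
            - 40.0 * float(f["late_payments"])
            + 0.01 * float(f["avg_balance"]))

print(credit_score(42))  # 634.0
```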
3.1.4 Customer retention and churn prevention
When banks understand why their customers are leaving, they can take appropriate steps to rectify the
situation and keep the customer in the fold. Big data also allows this type of hindsight analysis to take place
by providing insight into the customer's behaviour prior to their ending
the relationship. Structured and unstructured bank data can be
combined with external sources (social media comments, media
coverage) to better understand brand reputation issues and identify at-
risk customers. Banks are then in a position to react in real time to these customers as they navigate through
the website or call centre and adjust their behaviours to focus on customer retention.
3.1.5 Credit card fraud detection
Credit and debit cards are the de facto worldwide standard for conducting secure payments. According to the
World Payments Report 2013 [11], the use of these payment methods increased by 15.8% for debit cards (124
billion transactions) and by 12.3% for credit cards (57 billion transactions). In the rapidly expanding field of
mobile payments, an increase of 58.5% per annum is predicted for 2014, equivalent to 28.9 billion transactions.

[11] RBS and Capgemini, World Payments Report 2013.


3.1.5.1 Fraud control
The widespread use of any payment mechanism naturally also attracts widespread attempts to defraud that
mechanism. In 2013, credit card fraud represented 40% of the total number of fraud incidents in banking,
totalling approximately $5.5 billion. [12] Clearly, there is room for banks to
save significant amounts of money by improving fraud controls, and
big data can also help here. By sifting through the millions of payment
transactions made every day, combining these with data from other
internal and external sources, and analysing and understanding customer behaviour, investigators can
establish patterns and more accurately detect potential fraud quickly enough to minimise the damage. Big data
technologies allow this analysis to be done in real-time, as the transaction occurs, as opposed to batch
processing at the end of the day.
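The sketch below illustrates, in Python, the kind of per-transaction check such a real-time system might run as each payment arrives. The thresholds, profile fields, and the travel_signal flag are illustrative assumptions rather than a production rule set.

```python
# A minimal sketch of a real-time fraud check applied to one incoming
# card transaction. Thresholds and profile fields are illustrative only.
from math import radians, sin, cos, asin, sqrt

def km_between(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

def score_transaction(txn, profile):
    """Return alert reasons for one payment; an empty list means it looks normal."""
    alerts = []
    if txn["amount"] > 5 * profile["avg_amount"]:
        alerts.append("amount far above this customer's average")
    far_from_home = km_between(txn["lat"], txn["lon"],
                               profile["home_lat"], profile["home_lon"]) > 2000
    if far_from_home and not profile["travel_signal"]:
        # travel_signal would be set by recent airline/travel-agent payments
        alerts.append("distant location with no travel indication")
    return alerts

profile = {"avg_amount": 60.0, "home_lat": 52.52, "home_lon": 13.40,
           "travel_signal": False}
txn = {"amount": 900.0, "lat": 39.90, "lon": 116.40}  # a card payment in Beijing
print(score_transaction(txn, profile))
```

Note that the second rule also addresses the false-positive problem discussed next: with a richer customer picture, the travel signal would suppress the alert for a legitimate trip.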
3.1.5.2 False positives
To prevent fraud, financial institutions frequently disable credit cards at the first sign of suspicious behaviour, a
move which is likely to lose them money and customers if the prediction is incorrect. For example, a
customer of a bank who was travelling in China (his first overseas trip in several years) activated the fraud
alert system when he used his credit card there. If the bank had had access to a more complete picture of the
customer that included, for example, recent payments (to travel agents and airlines) or social network posts,
it would easily have been able to verify the legitimate use of the card because it would have known he was
travelling. Instead, the customer was embarrassed and inconvenienced, and his relationship with his bank was
negatively impacted.
3.1.6 New business models
The consolidation of customer and payment data can generate new business models and revenue
opportunities for retail banks:
The sale of non-identifiable data: Banks have a great deal of aggregate information, such as credit
card usage patterns (without identifying individual customers) that can be useful for other types of
businesses:

- Average expenditure on a given street or in a given area over a
particular period of time
- The times of day or days of the week when an area is busiest
- Where customers go when they leave a particular business location
Obviously, such data must be completely anonymous to be sellable (a minimal aggregation sketch follows this list).


[12] Bank Systems & Technology, August 2013.


Creating solutions based on data: In addition to selling the data itself, banks can also create value-
added solutions based on the data. For instance, by analysing the behaviour of customers who pay
using POS devices, banks would be able to guide businesses to prioritise where they set up their
stores. This kind of service will allow thousands of small businesses to access information that was
previously only available to large corporations.
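As a minimal sketch of the anonymised-aggregation idea, the example below uses pandas (assumed available): card payments are grouped by street, and aggregates are only released for groups large enough that no individual customer can be identified. The data and the k-anonymity threshold are invented for illustration.

```python
# A sketch of selling-grade anonymised aggregates: group card payments by
# street and suppress any group too small to be safely anonymous.
import pandas as pd

payments = pd.DataFrame({
    "street": ["High St", "High St", "High St", "Mill Rd", "Mill Rd"],
    "hour":   [10, 12, 18, 11, 12],
    "amount": [25.0, 40.0, 15.0, 300.0, 280.0],
})

K = 3  # minimum group size before an aggregate may be released
agg = payments.groupby("street")["amount"].agg(txn_count="count", avg_spend="mean")
sellable = agg[agg["txn_count"] >= K]  # Mill Rd (2 payments) is suppressed
print(sellable)
```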

3.2 Investment banking
In investment banking, the trade is the central piece of information around which all other data
is mapped. For each of the tens of millions of trades a large investment bank has open at any
one time, the bank needs to process the trade through its lifecycle and calculate the related
profit or loss and the level of market and credit risk that attach to that
trade. With an exponential increase in the number of regulations
applied to stock trading, banks are struggling to report all the
information that is required. Big data technologies allow the bank to
manage more data and process it more quickly, thus streamlining operational procedures and compliance
reporting.
3.2.1 Consolidated view of trades
Over the decades, investment banking has generated a large number of silos for different types of products
(fixed income, equities, foreign exchange, derivatives, etc.), effectively preventing a 360° view of any aspect of
the whole business. Big data changes all that. Banks can now store huge volumes of data in a single data
warehouse, permitting the creation of a bank-wide trade repository. By taking advantage of distributed storage
techniques, banks can also store historic views of the data, enabling trade and position data to be consolidated
in a single data store.
By unifying data storage in this way, banks can centralise data
functions, rationalise storage architecture, and ensure a consistent
view of banking activity for both business planning and regulatory
compliance purposes. This also represents huge cost savings for the
bank, thanks to the reduction of redundant processes, systems, and personnel.
3.2.2 Flexible formats for trade repositories
When consolidating data from across a bank, IT departments will likely encounter a major challenge: how to
unify disparate data types in a single universal data model. For the consolidation to be successful, data
received from different business units using different front-office trading systems must be extracted,
transformed, and loaded (ETL) into a central data repository that uses a common model.
But whenever the upstream or downstream data format changes, which will happen because of constantly
changing requirements, the ETL system has to be modified. Using non-structured data storage, however, data
from different systems can be stored using the same approach, without the need to artificially fit any particular
record types into a universal data model.


This not only removes the need to develop and maintain a multitude of ETL systems, but also gives the
platform flexibility to grow and evolve over time. Such a system is less costly to maintain, as fewer changes are
required to implement new functionality.
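As an illustration of this schema-less approach, the sketch below appends trades from two hypothetical front-office systems, each with its own fields, to a single JSON-lines store; no universal data model or ETL change is needed when an upstream format evolves. The file name and record layouts are assumptions for the example.

```python
# A sketch of schema-less trade storage: heterogeneous records live side by
# side in one JSON-lines file, so a format change needs no ETL rework.
import json

trades = [
    # An FX trade and an equity trade with different attributes, side by side.
    {"system": "fx_desk", "id": "FX-1", "pair": "EUR/USD", "notional": 1e6},
    {"system": "equities", "id": "EQ-7", "isin": "US0378331005", "qty": 500,
     "venue": "XETRA"},  # extra field: no schema migration required
]

with open("trade_repository.jsonl", "w") as f:
    for t in trades:
        f.write(json.dumps(t) + "\n")

# Consumers read every record and use only the fields they understand.
with open("trade_repository.jsonl") as f:
    for line in f:
        record = json.loads(line)
        print(record["system"], record["id"])
```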
3.2.3 Trade analytics and business intelligence
Once the data is centralised, the bank can execute trade analytics to prepare management reports, meet
regulatory requirements, and effectively measure business
performance. By harnessing the power of big data's parallel computing
architecture, data processing can be completed orders of magnitude
more rapidly, enabling operations and reporting to be concluded faster
than ever before. Processes that used to be run as overnight batches can now be completed in minutes,
producing an almost-real-time system.
Advanced analytical and business intelligence capability can also be applied to a broader, more complete data
set, thus enabling the identification of new insights into improved business performance. Having more data at
their disposal and having access to tools with which that data can be analysed, banks can better measure and
understand their trading activity.
3.2.4 Market and credit risk calculation
Risk management applies to a number of different aspects of the investment banking business, including
market risk and credit risk, and operates on multiple levels: individual trader, desk-, department-, or enterprise-
wide. As the scope increases, so too does the volume of underlying data that needs to be considered.
Market risk estimates the potential effect of adverse market movements on portfolios of financial
instruments.
Credit risk represents the measure of the effect on the bank's trades due to counterparty failure.
To calculate market and credit risk, banks must make a statistical estimate of the future development of their
portfolios. This is done by generating thousands of potential scenarios and evaluating the whole portfolio based
on these market changes, a process that requires an enormous
amount of computing power. Fortunately, this is a highly parallelisable
computation and one that lends itself perfectly to distributed
computing. The results of this evaluation can then be combined to estimate the market and credit risk with a
high degree of accuracy.
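The sketch below illustrates why the scenario evaluation described above parallelises so well: each scenario is independent, so the work can simply be split across workers. It uses Python's multiprocessing on a single machine with an invented one-asset portfolio and volatility; a bank would apply the same pattern across a cluster of commodity servers.

```python
# A sketch of Monte Carlo value-at-risk: thousands of independent scenarios,
# evaluated in parallel and combined into one percentile estimate.
import random
from multiprocessing import Pool

PORTFOLIO_VALUE = 1_000_000.0  # today's portfolio value (illustrative)

def scenario_loss(seed):
    """Revalue the portfolio under one simulated daily market move."""
    rng = random.Random(seed)
    daily_return = rng.gauss(0.0, 0.02)  # assumed 2% daily volatility
    return -PORTFOLIO_VALUE * daily_return  # positive number = a loss

if __name__ == "__main__":
    with Pool() as pool:
        losses = pool.map(scenario_loss, range(100_000))  # embarrassingly parallel
    losses.sort()
    var_99 = losses[int(0.99 * len(losses))]  # 99th percentile loss
    print(f"1-day 99% VaR is roughly {var_99:,.0f}")
```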
Historically, these processes have been run using grid computing, an expensive and technologically complex
approach; today, these same processes can be completed far more quickly and effectively using commodity
hardware. According to the SAP report Big Data for Finance [13], risk management is most effectively enhanced
through the adoption of big data technologies to create a system-wide trading database for oversight and
compliance, following the proposed Consolidated Audit Trail (CAT) standard.

[13] A-Team Group report for SAP (BigDataForFinance.com), 2012.


3.2.5 Rogue trade detection
Rogue trading continues to hit the front pages with depressing frequency and remains an important operational
risk for investment banks. Typically, banks place trading limits on their employees to ensure that no one
individual can increase the bank's exposure (measured by its market risk) beyond an accepted level; in one
recent high-profile case, however, the trader bypassed these limits by hiding false trades among the millions of
legitimate trades.
The only effective way to prevent this is to detect patterns in the trading data, an ideal task for big data
analytics. By analysing the huge volumes of data associated with today's hyperactive trading market, it is
possible to uncover the patterns and correlations that are the tell-tale signs of rogue trading. Big data brings
both the ability to manage huge volumes of data and the capacity to sift through that data to discover problems
before they can get out of hand.
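As one illustrative example of such a pattern check, the sketch below flags traders whose cancelled-trade ratio is a statistical outlier against their peers, one behaviour associated with hiding false trades. The data and the z-score threshold are invented for the example.

```python
# A sketch of one rogue-trading pattern check: flag traders whose rate of
# cancelled trades deviates sharply from the desk norm. Illustrative data.
from statistics import mean, stdev

cancel_rates = {  # cancelled-trade ratio per trader, from the trade repository
    "trader_a": 0.02, "trader_b": 0.03, "trader_c": 0.025,
    "trader_d": 0.21,  # books and cancels false trades to hide exposure
    "trader_e": 0.028, "trader_f": 0.022, "trader_g": 0.027,
    "trader_h": 0.024, "trader_i": 0.026,
}

mu, sigma = mean(cancel_rates.values()), stdev(cancel_rates.values())
for trader, rate in cancel_rates.items():
    z = (rate - mu) / sigma
    if z > 2.0:  # more than two standard deviations above the peer group
        print(f"review {trader}: cancel rate {rate:.1%} (z = {z:.1f})")
```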
3.2.6 Counterparty risk monitoring
As we all know, the financial crisis of 2008 began with the collapse of Lehman Brothers on September 15, a
collapse which was brought about by the bank's over-exposure to the subprime mortgage market. The knock-
on effect, however, meant that every bank doing business with Lehman was also impacted.
Banks regularly undertake counterparty risk monitoring to protect themselves against just such a situation by
analysing trading activity and measuring both direct and indirect exposure. To do this effectively, they need to
collect a broad array of market data, potentially including information
from sources such as social networks, to take the pulse of the market
and detect potential problems before they occur. Once again, big data
management tools are perfectly suited for this kind of task; they are
able to collect data from a wide variety of sources in a wide variety of formats and combine them under a single
umbrella. Once that's done, banks can develop algorithms to detect counterparty risk in order to take timely
action.
3.2.7 Regulatory reporting
One of the most onerous tasks facing investment banks today is regulation. In the last 10 years, a number of
regulations have been passed which require investment banks to measure and report activity in a consistent
and accurate way. The Markets in Financial Instruments Directive (MiFID), Sarbanes-Oxley, Basel III, FATCA, and
Dodd-Frank are just a few of the more significant regulations currently affecting bank operations in Europe and
the U.S.
These regulations require banks to report across all their asset classes, necessitating bank-wide views of
operations and activities. This entails not only the management of huge volumes of data, but the rapid and
accurate processing of that data in order to deliver timely reports to the compliance authorities. Big data
technologies enable both distributed storage and computing, which together provide an effective framework on
which to build regulatory reporting systems.
One clear example of this is the Volcker Rule. As part of the Dodd-Frank Reform Act, the Volcker Rule requires
banks to report on the inventory aging (the length of time an instrument has been held by the bank) of all their


positions. To make this calculation correctly, the bank needs to hold a view of its tens of millions of positions
for every day up to one year back: clearly a big data problem. Parallel storage and computing provide both the
consolidation of these data and the computing power to process them effectively.
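A minimal sketch of the inventory-aging calculation follows, assuming daily position snapshots drawn from the consolidated store; the group-by step is exactly the kind of operation a distributed MapReduce job would run in parallel across the full position history. Dates and position identifiers are invented.

```python
# A sketch of Volcker-style inventory aging: from daily position snapshots,
# work out how long each instrument has been held continuously.
from datetime import date

# (position_id, snapshot_date) rows, one per position per day, as they would
# come out of the consolidated position store.
snapshots = [
    ("POS-1", date(2014, 9, 1)), ("POS-1", date(2014, 9, 2)),
    ("POS-1", date(2014, 9, 3)),
    ("POS-2", date(2014, 9, 3)),
]

first_seen = {}
for pos, day in snapshots:  # a group-by that MapReduce would run in parallel
    first_seen[pos] = min(first_seen.get(pos, day), day)

report_date = date(2014, 9, 3)
for pos, start in sorted(first_seen.items()):
    print(pos, "held for", (report_date - start).days, "days")
```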
3.3 Insurance
The insurance industry, as with retail and investment banking, has much to gain from big
data. By gathering and analysing data about their customers, insurance companies can gain
significant advantage over their competitors.
3.3.1 Premium calculation
Until very recently, the insurance industry used statistics based on global, generalised variables to
calculate insurance premiums. However, this resulted in a level and quality of service which was not
customised to individual customers. Big data technologies permit the use of customer data to enable a more
complex, and personalised, mode of calculation.
Under the old system, if a customer went to buy car insurance, the company would ask for age, type of car,
annual distance driven, and a few other variables in order to calculate
insurance premiums. However, new insurance models, and big data
technologies, allow for the passive collection of data from the car itself
(driving style, speed, times and places of use, fuel consumption, roads used, etc.) which in turn enables the
application of pay-as-you-drive car insurance using real-time risk analysis. Such systems can also be used to
help drivers reduce their premiums by improving their driving skills, making them a lower insurance risk. This
data is an ideal foundation for the creation of special offers that reward drivers for reaching established safety
goals.
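To make the pricing idea concrete, the sketch below derives a monthly premium multiplier from a few telemetry features. The base premium, feature weights, and caps are illustrative assumptions, not an actuarial model.

```python
# A sketch of a pay-as-you-drive premium adjustment from car telemetry.
def monthly_premium(base, telemetry):
    multiplier = 1.0
    multiplier += 0.002 * telemetry["hard_brakes_per_100km"]  # driving style
    multiplier += 0.10 * telemetry["night_share"]             # times of use
    multiplier -= 0.05 * telemetry["motorway_share"]          # safer road mix
    multiplier = min(max(multiplier, 0.7), 1.5)  # bound discount/surcharge
    return round(base * multiplier, 2)

careful = {"hard_brakes_per_100km": 2, "night_share": 0.05, "motorway_share": 0.6}
risky   = {"hard_brakes_per_100km": 40, "night_share": 0.5, "motorway_share": 0.1}
print(monthly_premium(50.0, careful))  # 48.95 -> rewarded with a lower premium
print(monthly_premium(50.0, risky))    # 56.25 -> priced for the higher risk
```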
3.3.2 Fraud detection and prevention
Fraud has always been a major challenge for the insurance industry [14]; it's the second most costly white-collar
crime in America, and it's easy to see its impact on the market from these numbers:
The property-casualty insurance industry pays out about $20 billion a year in fraudulent claims.
At least 10 percent of all property-casualty insurance claims are either inflated or outright fraudulent.
Insurance fraud raises insurance premiums on the innocent by approximately $300 per household per year; it
affects every type of insurance, and takes many forms, from underwriting fraud to staged accidents to
conspiracy.
Every $1 invested in workers' compensation anti-fraud efforts in California in 2006-2007 returned $6.17, or
$260.3 million in total. [15]


[14] http://www.maif.net/site/insurance/insurance-fraud-faqs/
[15] California Insurance Department, annual report, 2007.


Fraud is an equal-opportunity crime, undertaken by people from all walks of life and all echelons of society, and
conducted in many different ways. There is no standard methodology
for fraud. By mining data from a multiplicity of sources, big data
analysis can uncover hitherto inaccessible patterns and connections.
For example, the relationships between claimants can reveal a network of individuals engaged in the
perpetration of a fraud. Insurance companies are able to reduce the incidence and cost of fraudulent claims by
analysing relationships among subjects and broadening the pool of source data. By applying complex anti-fraud
rules in real time and rapidly analysing and processing huge volumes of heterogeneous data, companies can
detect potential fraud early.
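The sketch below illustrates the relationship-analysis idea using the networkx graph library (assumed available): claimants who share an address, phone number, or witness are linked, and unusually large connected clusters are flagged for investigation. All data and the cluster-size threshold are invented.

```python
# A sketch of fraud-network discovery: claimants become graph nodes, shared
# attributes become edges, and large connected clusters get flagged.
import networkx as nx

G = nx.Graph()
# Edges derived from claims that share an address, phone number, or witness.
G.add_edges_from([
    ("claimant_1", "claimant_2"),  # same address
    ("claimant_2", "claimant_3"),  # same phone number
    ("claimant_3", "claimant_4"),  # same witness
    ("claimant_8", "claimant_9"),  # an ordinary shared-household pair
])

for cluster in nx.connected_components(G):
    if len(cluster) >= 4:  # threshold chosen for illustration
        print("possible fraud ring:", sorted(cluster))
```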
3.3.3 Customer Segmentation
In a business where profits are made or lost by measuring the risk associated with each customer, knowing
each customer well is absolutely critical. As with retail banking, having a 360° view of customer activity enables
companies to segment customers and create personalised services. In the insurance business, these might
include tools to enable drivers to adapt their driving style to save fuel or reward customers with a drop in their
premium if they achieve certain goals. By identifying preferred customers, insurance companies can target
them for up-selling as well.
These are the kinds of incentives that directly benefit the customer, more closely link the client to the company,
and will tend to improve customer loyalty over the long term.



3.4 IT efficiency
Across all industries, there are common use cases for big data technologies. Either by
making existing processes run more efficiently or by re-architecting processes completely,
new technologies can make important improvements to operational efficiency.
3.4.1 Improved operations
Banks and insurance companies typically operate hundreds of different systems to support hundreds of
different business processes, which spawn thousands of data integration processes. The underlying data sits for
the most part in siloed relational databases. IT departments have traditionally used ETL processes to transfer
data between those databases, but such a system is inefficient and expensive to maintain. Every single change
in a field in a table of a database can mean weeks of work to change the whole dependent chain of systems.
Systems that are built using big data technologies can greatly reduce
this problem by allowing the storage of data in heterogeneous formats,
without requiring that fields be defined or included in advance.
The integration process is therefore not affected by changes to the structure, and vice versa.
3.4.2 Database and system archiving
Regulatory bodies require that financial services companies maintain a history of their transactions for years
beyond the expiry of the initial transaction. They must not only keep the structured data from finance, risk, and
accounting systems, but also retain emails, instant messages, and other unstructured data to provide context
about those transactions. Historically, banks have done this by independently archiving the data and
documents from each system and retrieving them in response to a specific enquiry. However, this is a difficult
and time-consuming task when the required data is stored across dozens of different systems on different
platforms using different formats.
By using big data technology, banks can archive all their databases
and unstructured data sources in a single repository. This means that,
when an enquiry is received, the history of any particular transaction
can be quickly and easily retrieved by using the system's integral search capabilities: a significant
improvement over today's fragmented processes.
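As an illustration, the sketch below archives a structured transaction and a related e-mail in one Elasticsearch index and retrieves both with a single search. It assumes the elasticsearch-py client (v8-style calls) and a reachable cluster; the index name and document fields are invented.

```python
# A sketch of unified archiving and retrieval: structured and unstructured
# records about one transaction live in the same searchable index.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")

# Archive a structured transaction and a related e-mail in one repository.
es.index(index="archive", document={
    "kind": "transaction", "txn_id": "T-981", "amount": 1_500_000,
    "counterparty": "ACME Corp", "date": "2012-03-14"})
es.index(index="archive", document={
    "kind": "email", "txn_id": "T-981",
    "body": "Confirming the ACME settlement terms discussed yesterday."})

es.indices.refresh(index="archive")
# One query returns the transaction and its unstructured context together.
hits = es.search(index="archive", query={"match": {"txn_id": "T-981"}})
for hit in hits["hits"]["hits"]:
    print(hit["_source"]["kind"])
```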


4 The architecture of big data solutions
Big data technologies originated at Internet-driven companies like Google, Yahoo, Facebook, and Amazon. So
it is not surprising that these technologies seem to more naturally address the requirements of such companies
than of traditional businesses which do not have to solve Internet-scale problems as part of their core business.
However, the trends identified previously are causing this distinction to
blur, as brick-and-mortar companies search for ways to efficiently
manage the wealth of data now at their disposal.
Big data solutions are typically based around four core technology trends:

Distributed storage: The ability to store huge volumes of data across multiple servers, with essentially no cap
on the amount of data that can be managed.

Distributed computing: The ability to distribute (and speed up) the processing of that data across multiple
servers, breaking up large jobs into multiple smaller and more manageable ones without loss of integrity.

Unstructured data storage: The ability to manage data from a variety of sources and in a variety of formats:
relational databases, documents, system logs, data feeds, social media, and more.

Real-time analysis: The ability to analyse and process data as soon as it becomes available instead of waiting
for daily batch processing schedules.

4.1 Big data technologies
Three key technologies underpin all major current implementations of big data architectures: the Hadoop open
source framework, NoSQL databases, and event management platforms to support the real-time processing of
large streams of data.


4.1.1 Hadoop
Hadoop is an open source framework which employs distributed storage and processing using an adaptable,
reliable, and scalable programming model. It has been developed as an open source initiative under the
Apache umbrella that drives so much of the Internet's infrastructure. Hadoop comprises two core modules:
HDFS: the Hadoop Distributed File System, which manages the distribution and replication of data
throughout a cluster of data servers. Multiple copies of the data are held on different servers to enable
failover: if any single server goes down, there is no loss of data.
MapReduce: a programming model, originally developed by Google, that supports distributed computing
processes via a standard Java API.
Hadoop runs on commodity, low-cost hardware and offers a host of other software components, including:
HBase: columnar data store for large data sets
Hive: data warehousing and SQL-like query capability
Mahout: machine learning and data mining capability
Pig: high-level language for expressing data analysis
Flume: data collection and loading
Sqoop: data exchange for relational database connectivity
ZooKeeper: process coordination and synchronisation
Oozie: workflow management for Hadoop
By combining these software components, a wide array of technological solutions can be built to meet any big
data need.
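To give a feel for the MapReduce model, the sketch below shows a mapper and reducer in the Hadoop Streaming style, where plain scripts read stdin and write stdout; here they count trades per account in a text extract whose comma-separated layout is an assumption for the example. Hadoop runs many mapper instances in parallel and delivers their output to the reducers sorted by key.

```python
# A sketch of the MapReduce model in the Hadoop Streaming style; in practice
# the two functions below would live in two separate scripts.
import sys

def mapper():
    """mapper.py: emit "account<TAB>1" for every input trade record."""
    for line in sys.stdin:
        account = line.split(",")[0]  # assumed layout: account id first
        print(f"{account}\t1")

def reducer():
    """reducer.py: mapper output arrives sorted by key, so equal accounts
    are consecutive and can be summed in a single pass."""
    current, count = None, 0
    for line in sys.stdin:
        account, n = line.rstrip("\n").split("\t")
        if account != current:
            if current is not None:
                print(f"{current}\t{count}")
            current, count = account, 0
        count += int(n)
    if current is not None:
        print(f"{current}\t{count}")
```

Submitted via the Hadoop Streaming jar (hadoop jar hadoop-streaming.jar -mapper mapper.py -reducer reducer.py -input trades -output counts, with paths adjusted to the cluster), the same two functions scale from a local test to a multi-terabyte job.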
4.1.2 NoSQL databases
The NoSQL name originally meant simply "Not SQL", but it has evolved today to the more flexible "Not Only
SQL". These databases, unlike relational databases:
generally cannot be accessed using SQL standards
do not, for the most part, support transactionality
are schema-less
are in some cases built on top of Hadoop in order to take advantage of distributed storage and
computing
The main NoSQL databases can be classified as follows [16]:
In-memory: provides a quick response for dynamic data. Example: Redis
Document-oriented: stores structured documents in XML/JSON formats. Examples: MongoDB,
CouchDB, MarkLogic
Key-value: stores data as "key-value" pairs from sources such as weblogs. Examples: Cassandra, HBase,
Redis (see in-memory databases above)

[16] DB-Engines Ranking, 2014.


Search engines: databases used to generate indices that enable fast searches on the stored data.
Examples: Solr, Elasticsearch
NoSQL databases specialise in storing heterogeneous (both structured and unstructured) data side by side
and providing the tools to search and analyse them.
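A short sketch of the document-oriented style follows, using the pymongo client (assumed installed, with a local MongoDB instance): two differently shaped records live side by side in one collection, with no schema declared up front. The database, collection, and fields are invented for the example.

```python
# A sketch of schema-less, document-oriented storage with MongoDB.
from pymongo import MongoClient

db = MongoClient("mongodb://localhost:27017")["bank"]

db.events.insert_one({"type": "card_payment", "account": 42,
                      "amount": 19.99, "merchant": "Grocer"})
db.events.insert_one({"type": "web_click", "account": 42,
                      "page": "/mortgages/simulator"})  # different fields

# Query across both shapes to assemble a single customer view.
for doc in db.events.find({"account": 42}):
    print(doc["type"])
```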
4.1.3 Event management
Sometimes the need to respond to events in real-time is even greater than the need to store those huge
volumes of data. In those instances, event management is an essential component of an effective big data
solution. Event management platforms such as CEP (Complex Event
Processing) and RTD (Real Time Decision) systems are able to
manage multiple types of events in a split second and determine
appropriate action based on the input received by the decision engine.
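The sketch below shows one classic CEP pattern, "more than N failed logins within T seconds", evaluated over a sliding window. The thresholds and the event feed are illustrative; commercial CEP engines express such rules declaratively rather than in hand-written code like this.

```python
# A sketch of a sliding-window CEP rule: fire when failures cluster in time.
from collections import deque

class FailedLoginRule:
    def __init__(self, max_failures=3, window_seconds=60):
        self.max_failures = max_failures
        self.window = window_seconds
        self.times = deque()  # timestamps of recent failures

    def on_event(self, timestamp):
        """Feed one failed-login event; return True if the pattern fires."""
        self.times.append(timestamp)
        while self.times and timestamp - self.times[0] > self.window:
            self.times.popleft()  # drop events that slid out of the window
        return len(self.times) > self.max_failures

rule = FailedLoginRule()
for t in (0, 10, 15, 20):  # four failures within 20 seconds
    fired = rule.on_event(t)
print(fired)  # True -> the decision engine would now lock the account
```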


4.2 Commercial appliances
There are many commercial implementations of big data technology on the market already, which is not
unusual considering that the original technology was open-source. Today, the technology has matured to the
point where corporate services such as security and access control
are being added to delivered solutions, along with more formal
support, version control, and release management. This growing
maturity is reflected in the inclusion, for the first time, of a pure big data player, Cloudera, in the latest
Gartner Magic Quadrant [17] for Data Warehouse and Database Management Systems.
As we can see, all the major vendors are already offering implementations:
IBM's InfoSphere BigInsights
Oracle's Big Data Appliance, which combines relational and NoSQL databases
SAP HANA's in-memory, NoSQL database
HP's Vertica
EMC's Pivotal (acquired from Greenplum)

[17] Gartner, Magic Quadrant for Data Warehouse and Database Management Systems, 2014.
4.3 Impact on existing systems
Will this technology render existing data warehouse and business intelligence environments obsolete? No, but
some tasks may be accomplished in a different way, such as the elimination of intermediate transformation
processes developed with classic ETL tools in favour of ETL based on Hadoop MapReduce, or of no ETL at all,
with NoSQL instead used to store data in heterogeneous formats.
The data warehouse model will remain an essential part of many corporate processes that involve largely
structured data, such as financial and management reporting, but it will in future be complemented by big data
systems that will enrich those models with additional, particularly non-structured, data. Hadoop will be used to
collect all the data from the enterprise and permit analysis of the wider data set; there will still be data
warehouses, but not the need to create specific data marts for every
reporting purpose.
In this scenario, the big data environment will process all the
structured and unstructured information, remove the noise, add value,

consolidate, and deliver the clean, high-value data to the data warehouse, which will provide context to the
business analyst working with that data.
The data warehouse will also continue to respond to the needs of recurring reports and information analysis;
the big data platform will provide new data not previously addressed, and enhance the value of the output
through automated decision-making.
4.4 Cloud architectures
No discussion of big data architecture is complete without acknowledging the role of the cloud. Use of cloud computing, which provides computing and IT services as a utility, is growing rapidly across all industries.
The first wave of cloud computing focused on improving IT department efficiency and reducing the need to manage ever-expanding infrastructures in-house. This offloading of resources from the physical business environment will continue with the next area of focus, Business as a Service, which will further externalise business services.
The conservative nature of the financial services industry, however, is likely to continue to act as a drag on the expansion of cloud-based services. Gartner predicts that, while 90% of organisations across the board will store personally identifiable information in the cloud by 2019, only 60% of banks will be conducting the majority of their transactions in the cloud. Cloud service providers are now responding to banking's hesitance with new and innovative services which more closely address data security and compliance requirements.


5 Addressing big data's challenges
As one might expect, IT departments face a number of challenges when considering a shift to big data-centric systems. In this section, we highlight four of the biggest issues organisations should pay close attention to when planning their move.
5.1 Technology
The continuous evolution of technology is a challenge in and of itself; given the high rate of
innovation, many technologies never reach maturity and can quickly become obsolete. It is
important to make smart decisions about which technologies to bet on.
Big data systems are by their very nature distributed systems, normally at significant scale. This means that software architects must be prepared to deal with issues like partial failures, unpredictable communication latencies, concurrency, consistency, and replication when designing the system. These issues become increasingly challenging as systems grow to encompass thousands of processing nodes and disks, geographically distributed across multiple data centres. The probability of failure of a hardware component, for example, increases dramatically with scale.
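A quick back-of-the-envelope calculation illustrates the point: if each component fails independently with probability p over a given period, the probability that at least one of N components fails is 1 - (1 - p)^N. The 2% annual failure rate per disk used below is an assumed figure for illustration.

    # Probability that at least one of N components fails, assuming independent
    # failures. The 2% annual failure rate per disk is an illustrative assumption.
    p = 0.02
    for n in (1, 100, 1_000, 10_000):
        at_least_one = 1 - (1 - p) ** n
        print(f"{n:>6} disks -> P(at least one failure) = {at_least_one:.3f}")
    # 1 disk: 0.020; 100 disks: 0.867; 1,000 disks: ~1.000. At cluster scale,
    # component failure is a certainty the architecture must tolerate.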
Scale also impacts the economics of big data projects. Big data applications can require a huge volume of
computing and storage resources. Regardless of whether these resources are covered by capital expenditure
or hosted by a commercial cloud provider, they will be a major cost factor and thus a target for budget or scope
reductions. A straightforward resource reduction approach such as data compression is a relatively simple way
to reduce storage costs. Elasticity is another way in which resource usage can be optimised, by dynamically
deploying new servers to handle increases in load and releasing them as the load decreases.
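As a small illustration of the compression lever, the sketch below writes the same table uncompressed and with Snappy compression using the pyarrow library, then compares file sizes. The column contents are synthetic, and actual savings depend heavily on the data; repetitive business data often compresses by a large factor.

    # Sketch: compare on-disk size of the same data with and without compression.
    # Requires pyarrow (pip install pyarrow); the data here is synthetic.
    import os
    import pyarrow as pa
    import pyarrow.parquet as pq

    table = pa.Table.from_pydict({
        "account": ["ACC%04d" % (i % 500) for i in range(100_000)],
        "amount": [round(i * 0.37, 2) for i in range(100_000)],
    })

    pq.write_table(table, "txns_raw.parquet", compression="none")
    pq.write_table(table, "txns_snappy.parquet", compression="snappy")

    for path in ("txns_raw.parquet", "txns_snappy.parquet"):
        print(path, os.path.getsize(path), "bytes")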
The successful deployment of big data technologies requires
specialised staffing, which can also be expensive, especially given the
immaturity of these technologies and shortage of specialist resources.
To mitigate the potential risks associated with scale and technology, organisations should adopt a systematic,
iterative approach to ensure that initial design models and technology selections can support the long-term
scalability and analysis needs of a big data application. A relatively
modest investment in upfront design can produce a major return on
investment in terms of reduced redesign, implementation, and
operational costs over the lifetime of a large-scale big data system.
Because the scale of such systems can prevent the creation of full-fidelity prototypes, a well-structured
software engineering approach is needed to frame the technical issues, identify the architecture decision
criteria, and rapidly construct and execute relevant but focused prototypes. Without this structured approach, it
is easy to fall into the trap of chasing after a deep understanding of the underlying technology instead of
answering the key go/no-go questions about a particular technology option.


5.2 Security and data protection
It is often assumed that the primary security problem with big data is access management and control. In reality, however, the loss of context that results when large amounts of data are aggregated becomes a bigger issue, because it impacts the ease with which the bank can preserve granular rights for accessing those data. For example, a certain user may have access to certain data entities (or tables), but which data (or rows) within those entities may be accessed? This is true for relational databases as well, but combining all the data in one place exacerbates the problem.
Data access is not the only challenge. Even the most well-intentioned users can make mistakes, such as deleting huge volumes of data in seconds by accidentally executing a distributed delete. Less well-intentioned users could decide to lower the priorities of other Hadoop jobs in order to ensure their own job completes first, or worse, kill those other jobs.
So how can an organisation keep carefully constructed data ownership and data context rules in effect without killing the benefits that led it to choose a big data solution in the first place?
While a highly scalable architecture like Hadoop does make it possible to store context alongside data, checking the entire context for each piece of data is an expensive proposition. Differential privacy, which aims to provide a means by which organisations can maximise the accuracy of queries against statistical databases while minimising the likelihood of any individual records being identified, is one concept that is being explored.
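The core idea of differential privacy can be sketched in a few lines: answer aggregate queries with calibrated noise, so the result remains statistically useful while any single customer's presence in the data is masked. Below is a minimal sketch of the Laplace mechanism for a count query; the privacy budget epsilon = 0.1 and the synthetic account data are illustrative choices.

    # Minimal sketch of the Laplace mechanism for differential privacy.
    # A count query has sensitivity 1 (adding or removing one person changes
    # the count by at most 1), so Laplace noise with scale 1/epsilon suffices.
    import random

    def private_count(records, predicate, epsilon=0.1):
        true_count = sum(1 for r in records if predicate(r))
        # Difference of two exponentials is Laplace-distributed with scale 1/epsilon
        noise = random.expovariate(epsilon) - random.expovariate(epsilon)
        return true_count + noise

    # Example: noisy count of high-value accounts (synthetic data)
    accounts = [{"balance": random.uniform(0, 50_000)} for _ in range(10_000)]
    print(private_count(accounts, lambda a: a["balance"] > 40_000))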
The need for security has not been lost on the largest vendor of Hadoop software and services: Cloudera has been constantly improving the security of the platform, and this year made further acquisitions to strengthen its future security.
5.3 Data quality
Data quality is imperative in the financial services sector. In order to extract information and insight from the data, its quality has to be well understood and the data has to be trusted. When bringing a wide array of data together from across the organisation, it is especially important to understand the relationships and dependencies between them.
A clear methodology must be established which includes core data management principles:
- Data governance: true accountability and commitment to quality can only happen with an effective governance and control framework in place which identifies data owners and stewards
- Data architecture: understanding data architecture (the relationships between data) and standardising data interfaces between systems will streamline data flows and help identify and resolve the root cause of many data quality problems
- Data lifecycle: identifying data quality issues at each stage of their lifecycle is key to understanding data lineage: the sourcing, transformation, aggregation, and consumption of data
- Reference data: establishing consistent reference data will eliminate inconsistencies and ensure a common view across the organisation
- Data analytics: in order to fully empower business users and enable decision-making, the data must be fully trusted and the systems must be highly usable. Rich graphic dashboards, multi-dimensional analysis with drill-down capability, and what-if scenario analysis are also required
- Data technology: the optimum mix of technology which meets data quality business needs will enable data profiling, matching, merging, and cleansing processes
Without high-quality data which is fully trusted by the business users, no big data initiative can be successful. By establishing a clear data governance model and methodology, quality can be controlled and the maximum value can be extracted from the data.
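To ground the profiling and matching steps mentioned above, here is a minimal sketch that profiles columns for completeness and flags likely duplicate customer records via fuzzy name matching. The record layout and the 0.8 similarity threshold are assumptions for illustration; production matching would use richer features and tuned thresholds.

    # Sketch: basic data profiling (completeness) and fuzzy duplicate detection.
    # The record layout and the 0.8 similarity threshold are illustrative choices.
    from difflib import SequenceMatcher

    customers = [
        {"id": 1, "name": "John A. Smith", "postcode": "SW1A 1AA"},
        {"id": 2, "name": "Jon Smith",     "postcode": "SW1A 1AA"},
        {"id": 3, "name": "Maria Garcia",  "postcode": None},
    ]

    # Profiling: completeness per field
    for field in ("name", "postcode"):
        filled = sum(1 for c in customers if c[field])
        print(f"{field}: {filled}/{len(customers)} populated")

    # Matching: flag candidate duplicates sharing a postcode with similar names
    for i, a in enumerate(customers):
        for b in customers[i + 1:]:
            if a["postcode"] and a["postcode"] == b["postcode"]:
                score = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
                if score > 0.8:
                    print("possible duplicate:", a["id"], b["id"], round(score, 2))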

5.4 Internal organisation
The financial sector consists primarily of large organisations with complex, hierarchical structures that do not change direction easily (or cheaply). A more agile structure is required to take advantage of big data, and this may prove to be the most difficult challenge to overcome. Data silos that are scattered across the organisation must be merged, and new staffing roles created, to deliver the essential 360° view of the customer and trade data, a big change from the way most large financial services organisations function today.
Embracing this change in organisational structure will involve several key decisions. The business must:
- determine a strategy for big data deployment
- assign responsibility for the collection and ownership of data across business functions
- plan how to extract useful information from the data
- prioritise opportunities
- allocate data scientists' time appropriately
- host and maintain the IT infrastructure
- set privacy policy and access rights
- determine accountability for compliance with local data protection mandates
Organisations must plan carefully as they navigate their way towards big data readiness. Companies will likely implement one of the following four organisational models:
- Business unit led: when business units have their own data sets and scale isn't an issue, each business unit can make its own big data decisions with limited coordination. AT&T and Zynga are among the companies that use this model.
- Business unit led with central support: business units make their own decisions but collaborate on selected initiatives. Google is an example of this approach.
- Centre of Excellence (CoE): an independent specialised division oversees the company's big data initiative. Each unit pursues appropriate initiatives, guided and coordinated by the CoE. Amazon and LinkedIn rely on a CoE.
- Fully centralised: the company's central operations take direct responsibility for identifying and prioritising big data initiatives. Netflix is an example of a company that pursues this route.
Each company must consider how it wants to use its data and which model is most appropriate for its organisation.


6 First steps
Incorporating new technologies and processes into a large, hierarchical corporate structure is a major challenge, particularly when the expectations are as high as they are with big data.
6.1 Getting started
Financial institutions have already begun to explore leveraging big data to improve their business. We have identified three common approaches which are representative of how banks are incorporating these new technologies into their organisations:
IT operations project
This approach is the most common within large organisations which lack good coordination between business units. Here, the IT department that supports the various business areas creates big data infrastructure directly, in order to test the technology in non-business-critical initiatives. These test initiatives may focus on streamlining IT departmental functions and operations, but would not be related to establishing company-wide architectures or new business models. Such initiatives can be used to improve operation times (itself a direct business benefit), to understand the value the new technology brings in a low-risk environment, and to build staff competency by enabling the development of familiarity with new systems.
This creates a foundation on which more visible initiatives can be built and adopted more centrally in the future. However, this approach tries to make order from chaos, instead of planning and organising from the beginning.
Establishing a Big Data CoE
An innovation department exists within the structure of the corporation to bring new ideas to the organisation. It generally has a high-level sponsor, is provided with a technology budget, and has at its core a dedicated team that owns technological innovation within the company.
The innovation department creates groups of technologists which concentrate on specific topics; this often results in a Centre of Excellence (CoE) of key resources who can build out competence in these technologies. The technological resources used by the CoE are initially provided by the corporate IT department and vendor support teams until such time as specialist staff are hired and can take on the mantle of knowledge ownership.
The CoE staff are then 100% dedicated to big data initiatives, responsible for extracting maximum value from the technology and staffing investments. The Centre's head must be able to bridge the gap between business and technology to understand, manage, and communicate all facets of the project to all levels of the organisation.
The centre itself must have:
- a strong team with deep knowledge of all aspects of big data and good exposure to different technologies
- sufficient resources to fully focus on the initiative
- a high-level sponsor within the organisation with a clear strategy for implementing big data projects
- the ability to work with business and IT departments and to provide support to IT
- a close relationship with an external provider that can bring an additional dimension to the project
- clear goals and objectives, whether for internal (developmental) or commercial purposes
- the ability to bring in resources from outside while the in-house capacity is being built out
Creating an experimental spin-off
The CoE helps build infrastructure and capability in the technology and promotes it to the various business units in the organisation, providing them with business value. As business uses are identified, projects are proposed to and adopted by line-of-business teams (with the support of the CoE) in an organic fashion.
These may grow to the point where they can be spun off into separate business units, rather like start-ups. This type of operation will likely include teams of data scientists who will leverage the branding, capacity, and funding of the business outside the confines of the corporate structure, enabling them to remain flexible, make rapid decisions, and deploy new business models at will.
This approach of externalising the bank's technological competence and available data requires a mature organisation and a well-developed business model, but it can also provide totally new revenue streams.
In reviewing the above approaches, we can identify certain common characteristics:
- It is unrealistic to expect a fast return on investment; initial activities must be perceived more as R&D than fully rounded business plans.
- These exercises have potentially far-reaching effects across the organisation. It is therefore imperative that they have a corporate sponsor in the form of someone who can make decisions at a high level and promote and defend the project in the C-suite.
- Financial institutions are making major investments in big data, so there is clearly a strong belief within the sector that there is value to be extracted from this data that may convey a competitive advantage to the fastest movers.
- Deriving value from big data means cutting across silos to bring together and analyse diverse data from across the organisation, and using new technology that is better suited to dealing with multiple types of data.

6.2 Establishing new roles in the company
As these big data initiatives evolve, new roles need to emerge to promote cross-business communication and a common focus on meeting the corporation's broad goals. Expect to see the following job descriptions begin to appear:
Chief Data Officer (CDO)
The CDO is responsible for all the data within the organisation and plays a significant role in overseeing the whole data environment. Much emphasis is being placed on this role in investment banks today, because they understand that good-quality data and data analysis are fundamental to the business. The CDO must also ensure that all applicable regulations regarding governance, security, and privacy are adhered to, and that access controls are tightly managed.
Data Scientist
Data scientists play one of the most important roles in the deployment and management of an effective big data solution; their focus is entirely on extracting optimum value from the data they manage through mathematical modelling: identifying correlations in the data, segmenting customers, building predictive models, mining the data, and more. Data scientists need to understand the business, the data, and the technology, although in the latter instance excellent analytical capability is more important than the deep, detailed architectural skills required to design and build code. Data scientists are more likely to take advantage of end-user tools to extract meaning from the data.
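As a small illustration of the segmentation work described above, the sketch below clusters customers by monthly spend and transaction frequency using k-means. The data is synthetic, and the two features and three segments are assumptions chosen for clarity; a real exercise would involve feature engineering and validation against business outcomes.

    # Sketch: customer segmentation with k-means. Synthetic data; the features
    # (monthly spend, transaction count) and k=3 are illustrative assumptions.
    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)
    # Three synthetic behaviour groups: low, medium, and high activity
    spend = np.concatenate([rng.normal(m, 50, 200) for m in (200, 800, 2500)])
    txns = np.concatenate([rng.normal(m, 3, 200) for m in (5, 20, 60)])
    X = np.column_stack([spend, txns])

    labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)
    for k in range(3):
        seg = X[labels == k]
        print(f"segment {k}: {len(seg)} customers, "
              f"avg spend {seg[:, 0].mean():.0f}, avg txns {seg[:, 1].mean():.0f}")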
Chief eXperience Officer (CXO)
When the customer is at the centre of the corporate strategy, all customer interactions with the company through every channel must be managed globally. The CXO is responsible for developing and maintaining the 360° view of the customer, much of which will be achieved through the deployment of big data initiatives. While the CXO is not directly responsible for the data itself (that is the province of the Chief Data Officer, see above), he or she is responsible for optimising the customer experience through the appropriate analysis and application of that data.


6.3 Strategies to follow: lessons learned from successful big data projects
This paper has highlighted key use cases for big data in the financial services industry and how banks and insurance companies have begun to implement them within their organisations. GFT, working with clients across retail banking, investment banking, and insurance, has gained significant experience in designing and building big data systems. From this experience, and many successful big data projects, we have identified certain common lessons learned:


1. Bring IT and business together: IT must understand the business need and build a cross-functional team to support short-, mid-, and long-term goals. Involve business analysts who can effectively bridge the gap between end users and the core IT team.
2. Control the data: Start out with only internal data. By minimising the use of data from external sources that you cannot control, you minimise the risks involved in uncertain data quality. Bring all available data into a single location, but don't filter it initially. Build user trust in the data by directly involving users in the data collection and provision process.
3. Define the right technology: Take your time in deciding on the technology to be used both now and into the future, but don't overthink your decision. Remember that the technology is dynamic, so focus on stability and flexibility of integration with your current infrastructure when choosing your starting platform. Consult experts for an objective opinion.
4. Install the necessary infrastructure: Although Hadoop clusters offer a highly scalable infrastructure, think beyond your first project. Your use of this platform will grow rapidly and you need to ensure it will support that growth. Consult experts to ensure the infrastructure is configured appropriately for your performance requirements and expectations. Budget appropriate resource levels for monitoring and maintenance.
5. Choose your first project carefully: The first project will have a great deal of visibility. Work iteratively and build out functionality little by little, but be sure to show return on value as early in the process as possible to ensure continuous executive sponsorship. Often, the maximum value will come from merging data that is currently in separate silos: a clear case of 1 + 1 = 3. Create mockups of the user interface and design to ensure user acceptance.
6. Establish the right team: Define the scope and responsibilities of your Centre of Excellence. Complement in-house capability with third-party support. Make sure you have a team that encompasses both technology and business knowledge to collectively meet the needs of the project.
7. Gain operational buy-in: Ensure that everyone is appropriately trained and that IT operations are equipped to run the new systems from both a functional and a technological perspective. Systems with high usability which effectively meet the business need are imperative.



7 GFT's Big Data practice
With 25 years' experience in financial services technology projects, GFT helps financial services firms extract optimum value from the ever-accelerating deluge of data. We help firms meet their regulatory requirements and better serve their corporate, retail, and institutional clients by managing data in a more efficient way, which in turn reduces cost and risk.
We offer industry and business insight as well as the latest technology, tools, and techniques, designed to help clients grow, scale, innovate, and compete. Our breadth of industry knowledge, along with our rigorous approach to delivering quality, strategic planning, client collaboration, project management, and passion for innovation, allows us to continually add insight and value throughout the engagement process.
Employing big data technologies, GFT has successfully implemented a number of solutions for financial institutions, enabling them to extract greater value from the huge volumes of data they process every day:
- Debit/credit postings are calculated in real time through trade message processors, managing hundreds of millions of daily balances in different asset classes around the world
- Regulatory reports are accurately calculated on a daily basis using the age of an investment bank's trading positions, trawling through four petabytes of data in just 20 minutes
- Deep financial insight into bank customers' spending habits is provided via a highly usable web interface which allows them to sort, categorise, and visualise over 10 years of transaction data
- The incidence and cost of fraudulent insurance claims is reduced by analysing relationships among subjects and broadening the pool of source data
- 5.5 million trades are stored on a daily basis in a trade repository, with the ability to hold more than seven years of data (10 billion trades, or approximately six petabytes of data), facilitating cutting-edge deep-dive analysis
- Fraudulent credit card transactions are minimised by implementing a set of rule-based filters and adaptive pattern recognition methods through a logic engine to detect suspicious payments (a simple sketch of this filtering style follows this list)
- Millions of daily trade events are stored and managed centrally, reducing data duplication, inconsistency, and process redundancy



About the authors
This Blue Paper has been developed by specialists from our Big Data and Innovation practices:

Dr. Karl Rieder (GFT)

Karl Rieder is a Director at GFT and leads its Big Data Practice. He helps clients with
the challenges of business innovation, defining technology strategies and delivering
applications for retail and investment banking. He acts as the bridge between clients
who want to understand how new technologies can help them and the technology
experts who create engaging applications.
Karl has previously worked as a senior project manager and lead technical architect, through which he has gained deep banking industry knowledge and more than 18 years' IT management experience. He publishes a mobility and big data blog on Finextra.
Karl holds a Bachelor's degree in Mechanical Engineering and a PhD in Oceanography from the University of California, San Diego.

Dr. Ignasi Barri (GFT)

Ignasi Barri is a member of the Applied Technologies group at GFT.
His passion for business led him to study for a bachelor's degree in business administration. His fascination with technology and business helps him achieve his primary goal at GFT: to empower innovation across entire organisations. His particular focus is on the identification and promotion of outstandingly talented people with creative ideas.
He received his PhD in Computer Science from the University of Lleida, Spain, in 2012.




Josep Tarruella (freelance)

With more than 14 years of IT experience, Josep has focused his career on
the field of data management. He has expertise in the management of teams
and projects in Business Intelligence, data warehouses, and ETL.
Josep has implemented projects in large corporations in many industry
sectors such as pharmaceuticals, banking, telecommunications, and
insurance. Through his work with technology consulting firms, he has played
many roles, including technical consultant, pre-sales consultant, and services
director.
He holds a Degree in Computer Science from the Polytechnic University of Catalonia and an Executive MBA from IESE Business School.







Contact:
www.gft.com/bigdata
bigdata@gft.com
