
Copyright 2014 Splunk Inc.

Using Splunk to
Protect Students,
Faculty and the
University
Chris Kurtz
System Architect
Arizona State University

Disclaimer
During the course of this presentation, we may make forward-looking statements regarding future events or the
expected performance of the company. We caution you that such statements reflect our current expectations and
estimates based on factors currently known to us and that actual events or results could differ materially. For important
factors that may cause actual results to differ from those contained in our forward-looking statements, please review
our filings with the SEC. The forward-looking statements made in this presentation are being made as of the time
and date of its live presentation. If reviewed after its live presentation, this presentation may not contain current or
accurate information. We do not assume any obligation to update any forward-looking statements we may make. In
addition, any information about our roadmap outlines our general product direction and is subject to change at any
time without notice. It is for informational purposes only and shall not be incorporated into any contract or other
commitment. Splunk undertakes no obligation either to develop the features or functionality described or to include
any such feature or functionality in a future release.

Additional Speaker Disclaimer: While I am speaking as an employee of Arizona State University, I do not speak for the
University nor dictate policy, procedures, or purchases. Any and all statements made in this presentation are mine
alone, and do not in any way represent an official statement from ASU. The opinions and comments contained herein
are entirely my own. ASU does not endorse or represent any product mentioned, up to and including Splunk.

Agenda

Introduction to me and Arizona State University
  About ASU
  About me
  Our environment and our challenges

Use Cases and Examples
  Protecting Direct Deposit, two versions
  Phishing as a teaching tool
  Leveraging your institutional data with lookups and apps

Conclusion: Where we've been, where we're going!

Introduction

Largest single University in the US
More than 80,000 active students, and another 20,000 accounts (faculty/staff, alumni, affiliates)
Located in Tempe, Arizona, a suburb of Phoenix, the 6th largest city in the US
Not located on the surface of the sun... but you can see it from here!

Obligatory About Me: Professionally

Unix/Linux System Administrator by trade, 23 years of experience
Supported NASA/JPL Mars projects at ASU for more than 10 years:
  TES & THEMIS instruments onboard Mars Global Surveyor & Mars Odyssey
  MTES instrument on the Mars Exploration Rovers Spirit and Opportunity
ASU's Splunk Guy (System Architect) since early 2013
Splunk video interview: "Value of Higher Education and Splunk"
Author of the ISO 3166 Splunk App (more on this later!)

Obligatory About Me: Personally

Self-proclaimed Geek, what's it to ya?
Steampunk Enthusiast (I made my hat, goggles, and the gun!)
Beginning Maker (Steampunk and Arduino/Electronics)
xoff on #splunk on efnet
Little-known fact about me: Clyde Tombaugh, the discoverer of Pluto, was a personal friend growing up
http://about.me/chk

First Google Apps for Education customer
Multiple campuses with a diverse IT infrastructure
Many organic, home-grown, custom, and proprietary systems
Large number of governing requirements: FERPA, HIPAA, DARPA, DoJ, NASA, JPL, etc.
Clear separation of responsibilities inside the University Technology Office: the Information Security Office (ISO) does not have access to the systems (and more importantly the logs) run by Operations

The Power of Splunk

Splunk as ASU's universal aggregator of all machine-generated logs

Without Splunk: Logs reside in multiple locations, depending on when and where the system was installed: web logs in one location, system logs in multiple others (depending on OS); some are on a single log concentrator and some in an old, slow, and unsupported proprietary search database. ISO requests logs for an incident. Ops has to use the proprietary tools (or, just as often, grep through multiple logfiles) based on the ISO's description, then email or share the logs. ISO likely has to revise the request at least once.
Typical response time to an incident: multiple business days

With Splunk: ISO directly accesses logs in Splunk, often using pre-built dashboards, alerts, and saved searches. Ops can concentrate on Operations.
Typical response time to an incident: minutes!

Splunk and Arizona State University

Infrastructure:
  Physical indexers in a cluster
  ~14TB in hardware RAID10
  NFS for cold storage (being phased out)
  Architected for 1TB (10 indexers)
  Search head pooling: 3 virtual servers (12 CPUs, 32GB)
  NFS SSD storage for shared data
  Virtual support servers: Deployment Server, License Manager, Cluster Master

Licensing:
  750GB/day
  Started at 50GB in November 2012, to 150GB in February 2013, to 500GB in June 2013, to 750GB in July 2014
  On track to reach 1TB this FY

The value of Splunk to the Information Security Office has driven the rapid growth, but other groups are starting to see the value!


We didn't know
"To ASU, Splunk was like the invention of the microscope: we didn't know what we couldn't see."
Martin Idaszak
Security Architect, Arizona State University


Protecting Direct Deposit

Use Case: Protecting Direct Deposit

Being able to change your employee information online is a great convenience, but also a target for hackers.
Because of ASU's international students, faculty, and staff, just blocking other countries isn't acceptable.
Splunk is the solution!

How we did it... before Splunk

1. Payroll gets a call that an employee didn't get their direct deposit.
2. Payroll investigates, sees a foreign bank deposit, and contacts the Information Security Office.
3. ISO changes the user's password.
4. ISO requests webserver single sign-on and HR system logs from Operations and our HR vendor (could take days!).
5. Eventually details are discovered (compromised account) and the user is informed. Funds are long gone, and ASU has to re-issue the employee's check, eating the loss.

How we did it with Splunk, Version 1

(Diagram: web auth logs supply IP and username; DB records supply user and address; geo-tagging the IP yields the country; IP, username, and state/country feed a report of all unusual changes.)

1. Logs from webserver single sign-on and PeopleSoft now go to Splunk. No more waiting on Operations to retrieve logs! This makes both ISO and Ops very happy!
2. Splunk monitors for direct deposit changes via a scheduled search, building a transaction to link the change back to the user's webserver authentication. OK, now we have an originating IP and a username... so we run geolocation on the originating IP, which makes it easier to create reports based on the location of the change.
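The two steps above amount to a join between authentication events and change events on username, plus a geo lookup on the resulting IP. A minimal Python sketch of the idea (the real implementation is a Splunk scheduled search; the field names, timestamps, and geo table here are invented stand-ins):

```python
def link_changes_to_logins(auth_events, dd_changes):
    """For each direct deposit change, find the most recent prior
    login by the same username and attach its originating IP."""
    linked = []
    for change in dd_changes:
        candidates = [a for a in auth_events
                      if a["user"] == change["user"]
                      and a["time"] <= change["time"]]
        if candidates:
            login = max(candidates, key=lambda a: a["time"])
            linked.append({**change, "src_ip": login["ip"]})
    return linked

# Toy geolocation table standing in for Splunk's iplocation command.
GEO = {"10.1.2.3": "US", "203.0.113.9": "MY"}

auth = [{"user": "jbunbury", "ip": "203.0.113.9", "time": 100}]
changes = [{"user": "jbunbury", "time": 120, "new_bank": "XYZ"}]
for row in link_changes_to_logins(auth, changes):
    row["country"] = GEO.get(row["src_ip"], "unknown")
    print(row["user"], row["src_ip"], row["country"])
```

The "most recent prior login" choice mirrors what a Splunk transaction over the two sourcetypes produces: the change event inherits the IP of the session that made it.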

Version 1 stops here:

ISO creates a scheduled report of unusual originating IPs (Malaysia, etc.) and sends it to Payroll before the close of each payroll run.
Payroll contacts users with unusual changes for verification before payroll is run, and if it was a fraudulent change, the change is reverted, so no funds are lost.
Even at this point, Payroll is ecstatic and saves over 30 hours per payroll run reviewing direct deposit, and ASU saves tens of thousands of dollars per payroll run!

Now... how do we improve this?

We asked the question: Where do you change your direct deposit from?
1. Home
2. Work

So, let's think about it:
If your direct deposit changes from Malaysia, it's probably fraud...
but what about Ohio, if you live in Arizona?
That's likely fraud, too!

So let's leverage Splunk's geolocation features!

Version 2 (now in progress)

1. Starting with the originating IP and username from Version 1, we use custom lookup tables (more later!) to leverage HR system data, so we can look up a username's information: name, address, etc.
2. Geolocation information about the user's home is generated via their zip code.
3. Using a free Splunk app called haversine, we calculate the distance between the user's home (technically, the lat/lon of the center of their zip code) and the lat/lon of the IP the change was made from. We realize both of these are a bit vague, but we're really only looking for scale.
4. If the distance is unusual (~50 miles), the result will be flagged for Payroll review automatically.
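The distance check in step 3 is the standard haversine (great-circle) formula. A sketch in Python, with made-up coordinates standing in for a zip-code centroid and a geolocated IP (the real calculation is done by the haversine Splunk app):

```python
import math

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in miles."""
    r = 3958.8  # mean Earth radius in miles
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))

# Center of a Tempe, AZ zip code vs. an IP geolocated to Columbus, OH.
home = (33.41, -111.94)
change_ip = (39.96, -83.00)
distance = haversine_miles(*home, *change_ip)
print(f"{distance:.0f} miles")
flag_for_review = distance > 50  # the ~50 mile threshold from step 4
```

As the slide notes, both endpoints are fuzzy (zip centroid, IP geolocation), which is why a coarse threshold works: you only need the scale of the distance, not its precision.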

Lessons learned... and you can do this too!

1. GET YOUR DATA INTO SPLUNK!
2. One of the beautiful things about Splunk is that you can modify how the data appears (field extractions, etc.) once it's already in Splunk, and that applies to already-indexed data. The focus should be getting it into Splunk first, and figuring out fields later. Think of it as schema on demand!
3. When you find people who get it, use them to evangelize Splunk to others in the organization.
4. When you find people who resist, show them how much time and effort they can save, especially interacting with other departments (if appropriate), by using Splunk. We won several people over when they discovered that the number of requests from groups like ISO dropped from 3-5 per week (each taking hours to do) to zero once the data was in Splunk.
5. Don't get caught up on use cases: once you have the data in Splunk, use cases present themselves again and again! Think of it as use case on demand!

Flexibility
"It's not only schema-on-the-fly, it's use-case-on-the-fly."
- Barak Reeves
Splunk Sales Engineer, Team TK-421

Phishing as a teaching tool

Use Case: Phishing as a teaching tool

As a public University, a large amount of our information is mandated to be publicly available, including a directory of email addresses... and we have over 100,000 users, and each can have as many email addresses as they want.
This means ASU receives a lot of email. In fact, we used Splunk to determine exactly how much: in the last 12 months, ASU received more than ONE BILLION email messages, and more than 750 million of them were spam and phishing!

As usual... Splunk is the solution!

Mandatory Pie Chart

(Pie chart: inbound email volume, spam/phishing vs. legitimate mail.)

Phishing and ASU

(Diagram: inbound phishing email passes through the mail filter and firewall before being stored. The firewall blocks some, some gets through, and the user clicks on the phishing link.)

ASU is hard to protect

ASU, as an entity, is very hard to protect. We have students from all across the world, and by their nature, they are very transient: they move apartments and dorms, travel the US and abroad, and access ASU systems from almost everywhere. Unlike most corporations, we can't assume that access to ASU from Nigeria, China, or Malaysia is a hacking attempt; in fact, it's probably legitimate!

One of the very first things we saw with Splunk was logins on campus and from India for the same user on the same day. What was this? Hacking? VPN? Multiple people using the same login? It turns out Indian students often gave their passwords (gasp!) to their parents, who insisted on it, so the parents could regularly check their grades! This let another project, providing limited access to secondary accounts (just for this purpose), know that their efforts were valid and necessary!

Use the data you have

To protect ASU from spam, we use Barracuda Spam & Virus Firewalls, but there is no Splunk app (yet), so we make custom field extractions from the Barracuda logs.
...but ASU does not store user emails in Splunk, only the headers of the messages that transit our system.

Seems legit?

Do managers ever ask you if a product is worthwhile? We regularly use Splunk to show that other products are doing their jobs!

Phishing and ASU

Correlate firewall information with our mail logs to get a list of every user who clicked on a phishing link.

(Diagram: the firewall log supplies the IP and bad URL, the email log supplies the user who received the email with the link, and the CMDB supplies contact information; combined, they yield a table of users who clicked the bad link.)

...and let your data combine!


BUT... ASU also uses Palo Alto firewalls to protect our users. These firewalls very often catch phishing URLs that users click on, whether by mistake or through lack of understanding, and we correlate that Palo Alto information with our mail logs to get a list of every user who clicked on a phishing link.
The ISO can then directly contact the users who clicked on a phishing link, explain to them why they need to change their password (and probably run a virus/malware scan), and use the opportunity to explain to the user why what they did was bad. The users are thankful that the University is watching out for them, and some of the potential victims have become our best reporting sources for received phishing and spear phishing emails!
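The correlation above can be sketched as a simple join across the two log sources. This Python toy is not ASU's actual search: the field names, the IP-to-user mapping, and the CMDB table are all invented stand-ins for the real data:

```python
# Firewall log (Palo Alto style): a source IP requested a known-bad URL.
firewall_hits = [
    {"src_ip": "10.0.5.7", "url": "http://evil.example/login"},
]
# Mail log (Barracuda headers): which user received which bad URL.
mail_log = [
    {"user": "jbunbury", "bad_url": "http://evil.example/login"},
]
ip_to_user = {"10.0.5.7": "jbunbury"}        # hypothetical IP -> user map
cmdb = {"jbunbury": "jbunbury@asu.edu"}      # hypothetical contact info

# Join: a firewall hit whose user was also sent that URL = a click.
clicked = []
for hit in firewall_hits:
    user = ip_to_user.get(hit["src_ip"])
    if any(m["user"] == user and m["bad_url"] == hit["url"]
           for m in mail_log):
        clicked.append({"user": user, "contact": cmdb[user],
                        "url": hit["url"]})
print(clicked)
```

The result is exactly the table the ISO works from: who clicked, and how to reach them.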

This too is being automated! We plan to use workflows to allow ISO to easily flag a potentially compromised account in Splunk, which (via a REST API call to our authentication system) is automatically disabled, and (via another REST API) a ticket is created for the helpdesk, so they can explain the situation to the user when they call in because their password no longer works.

Version 2 (now in progress)

1. ISO actively follows phishing links (from a secure and isolated virtual machine) and enters bogus credentials. We are now using Splunk to alert on attempted logins using those honeypot credentials. These active hackers are then blocked on the Palo Alto firewalls in a quick but manual process; this protects users who might click on the phishing. Eventually, we plan to semi-automate this using Splunk workflows that let ISO directly block several different types of attackers from Splunk, using the Palo Alto APIs.
2. ASU is investigating using honeypot full email accounts that will be scraped from the public directory and then sent spam/phishing attempts just like real users. The plan is to use Splunk to index the entire email, so we will have the full body of phishing and spam emails as well as the headers. Phishing URLs identified would be blocked using a workflow to the Palo Alto APIs, as above, and the from addresses would be blocked on the Barracudas with their APIs.

Lessons learned... and you can do this too!

1. LEVERAGE YOUR DATA!
2. Combining data from multiple sources is amazing! We use data from the Barracuda Spam Firewalls as well as the Palo Alto Firewalls to provide multiple points of visibility into phishing.
3. Standardize your data! Follow Splunk's Common Information Model so that field names are consistent across data types. Once you realize that src_ip, for example, exists in multiple datasets, the possibilities just jump out at you!
4. Fill in the gaps. When you find gaps in your data models, work on how to fill them in. For us, it's the honeypot registrations and full-email indexing. Once we realized full-email indexing was possible (and easy!), all sorts of new use cases appeared!
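To see why CIM-style naming pays off, here is a toy sketch: once both sourcetypes are normalized to src_ip, a single correlation works across them. The raw vendor field names below are invented examples, not the actual Barracuda or Palo Alto extractions:

```python
def normalize(event, sourcetype):
    """Map a vendor-specific field name onto the CIM name src_ip."""
    aliases = {"barracuda": "client_ip", "pan": "source_address"}
    return {**event, "src_ip": event.get(aliases[sourcetype])}

barracuda_event = {"client_ip": "198.51.100.4", "subject": "Invoice"}
pan_event = {"source_address": "198.51.100.4",
             "url": "http://evil.example"}

a = normalize(barracuda_event, "barracuda")
b = normalize(pan_event, "pan")
print(a["src_ip"] == b["src_ip"])  # the two datasets now join on src_ip
```

In Splunk itself this is done declaratively with field aliases rather than code, but the effect is the same: one name, every dataset.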

Value of Splunk

"This is the best tool we've seen in 10 years."
- Jay Steed
AVP for UTO Operations, Arizona State University

Leveraging your own custom data

The Power of Splunk!

No schemas! This means if you need to alter your data structure (field extractions, calculated fields, etc.) you can easily do it on the fly, and it's retroactive!
No types! Splunk really doesn't care if 42 is a string or a number, so you can divide 42 by 7 and get 6, or append a string to make it "42 is the answer", just as easily, to modify a field or make a new one on the fly.
Eval is your friend!
Remember: it doesn't matter if data is from a logfile, database, textfile, script output, or anything else... combine it in any way you want, on the fly!

Why mention this? Because as a Splunk admin, always remember: the data structure is mutable! If it doesn't work for your needs, change it on the fly!

To correlate data, you have to have data to correlate

Having data from machine logs such as mailservers and firewalls is great; it's the first (and easiest) data to get into Splunk.
Without a common key, there is no way to know that two pieces of data refer to the same individual.
For ASU, the master datasource is the Data Warehouse. These databases contain the records for every student and employee.

Does the email jbunbury@asu.edu belong to John Bunbury?

Lookups from Databases

An isolated Splunk server running Database Connect (DBX) runs SQL queries on several databases, and writes a series of lookup tables (keyed by the affiliate ID) every 4 hours.
Linux inotify monitors the lookup tables, and on write-close copies the data to production systems (sanity checking applies).

Data Warehouse -> Isolated Splunk running DBX -> Production Splunk

Sample lookup rows:
100000001, jbunbury7, John Bunbury, dude2011@asu.edu, student
100000002, jbunbury, Jane Bunbury, jbunbury@asu.edu, employee

Problem is...

Splunk (and most other applications) uses the ISO 3166 standard alpha-2 country codes (US for United States, for example). This is standard for geolocation services in Splunk.
But... our Oracle databases for student data get the data from the students, often from their passports. And machine-readable passports use the ISO 3166 alpha-3 country codes... and there isn't a simple conversion!
If the country code is not in the standard geolocation format, I can't do any geolocation, which means the data is far less useful.
I looked on the Splunk Apps site (http://apps.splunk.com) but didn't find a solution.

Country        alpha-3  alpha-2
United States  USA      US
China          CHN      CN
Nigeria        NGA      NG

So, I wrote the app myself!

Very simple structure, but so useful!
I took the online ISO 3166 country codes (3 kinds: alpha-3, alpha-2, and numeric) and built a lookup table, which I call in the dbquery search before outputting the lookup table.

Lookup sample:
alpha-2,alpha-3,numeric
US,USA,840
CN,CHN,156
NG,NGA,566

| dbquery "PS PRD" "SELECT EMPLID,CITY,STATE,POSTAL,COUNTRY_CODE FROM EDS_ADDRESS"
| dedup EMPLID CITY STATE POSTAL COUNTRY_CODE
| lookup iso3166 iso3166_alpha-3 as COUNTRY_CODE
| eval city=upper(substr(CITY,1,1)).lower(substr(CITY,2))
| rename STATE as region_name EMPLID as affiliate_id POSTAL as postal_code iso3166_alpha-2 as country_code
| eval postal_code=if(country_code="US",substr(postal_code,1,5),postal_code)
| table affiliate_id,city,region_name,postal_code,country_code
| outputlookup affiliate_to_address.csv

Why bother publishing as an app? Because it might be useful to someone else, and at least 2 people have now said to me: "Wow, thanks, that solves my problem!"
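In code terms, the lookup is just a CSV-backed dictionary. A sketch using the sample rows from the slide (the real app ships the full ISO 3166 table as a Splunk lookup file):

```python
import csv
import io

# The lookup table from the slide, as CSV (abbreviated to three rows).
LOOKUP_CSV = """alpha-2,alpha-3,numeric
US,USA,840
CN,CHN,156
NG,NGA,566
"""

# Build the alpha-3 -> alpha-2 mapping, the direction the search needs
# to turn passport-style codes into geolocation-ready codes.
alpha3_to_alpha2 = {
    row["alpha-3"]: row["alpha-2"]
    for row in csv.DictReader(io.StringIO(LOOKUP_CSV))
}

print(alpha3_to_alpha2["USA"])               # a code Splunk can geolocate
print(alpha3_to_alpha2.get("XXX", "unknown"))  # unknown codes degrade safely
```

This is exactly what the `| lookup iso3166 iso3166_alpha-3 as COUNTRY_CODE` step in the search above does, event by event.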

Building an app is simple!

http://wiki.splunk.com/Community:Creating_your_first_application

1. In etc/apps, create a directory for your app, with appropriate subdirs (default is mandatory).
2. All config files go in default (nothing in local!).
3. Write an appropriate default/app.conf (look at other apps).
4. Create a README file and other appropriate documentation.
5. Package and test on a generic Splunk install for sanity (hint: .spl files are just tgz files!).
6. Upload to apps.splunk.com; if something isn't right, it'll let you know.
7. Make sure to put the docs online!

My app took me about a day to do, including an obsessive amount of research on how to do it.
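Steps 1-5 can be sketched in a few lines: build the skeleton, drop configs in default, and tar it up, since an .spl is just a gzipped tarball of the app directory. The app name and app.conf contents below are placeholder examples, not the published app's actual files:

```python
import os
import tarfile
import tempfile

app_name = "my_iso3166_app"  # hypothetical app directory name
root = tempfile.mkdtemp()
default_dir = os.path.join(root, app_name, "default")
os.makedirs(default_dir)  # default/ is mandatory; put nothing in local/

# Minimal app.conf so Splunk recognizes the package (placeholder values).
with open(os.path.join(default_dir, "app.conf"), "w") as f:
    f.write("[ui]\nis_visible = false\nlabel = ISO 3166 Lookups\n")
with open(os.path.join(root, app_name, "README"), "w") as f:
    f.write("ISO 3166 alpha-2/alpha-3/numeric lookup tables.\n")

# Package: an .spl file is just a .tar.gz with the app dir at the top.
spl_path = os.path.join(root, app_name + ".spl")
with tarfile.open(spl_path, "w:gz") as tar:
    tar.add(os.path.join(root, app_name), arcname=app_name)

print(os.path.exists(spl_path))
```

Unpacking the result into a generic Splunk install's etc/apps is the sanity test from step 5.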

#splunk
It is days like today when I am stuck with a piece
of crappy software with horrible documentation
and support that I am very thankful that I spend
the rest of my time dealing with Splunk.
- David Shpritz (automine) Splunk IRC channel


Conclusion


The past and the future

ASU has heavily invested in Splunk because it solves many of our outstanding issues, and a culture of "how can we use Splunk to solve this?" is developing.
The first round (FY14) of data onboarding concentrated on the needs of the Information Security Office. The second round (FY15) is focusing on Operations' needs, with some interesting use cases thrown in as they appear.
Splunk is expensive, but the savings in man-hours, the extreme flexibility, its use to validate other systems, and the goal of replacing antiquated systems have very much paid off.

Get some help!

Splunk Docs (http://docs.splunk.com): I use Splunk docs so much I have a Chrome shortcut to just search it. And if you do occasionally find something that is unclear, use the links at the bottom to provide feedback; the team is great at responding!
Splunk Answers (http://answers.splunk.com): I always look here (and often post in Answers) before I contact support. Just looking at what others are posting is often just what you need to rephrase the question and find the answers you need. The users who are on Answers are the true heroes of Splunk. In fact, there is only one group better...
The Splunk Wiki, specifically http://wiki.splunk.com/Things_I_wish_I_knew_then
The #splunk IRC channel on efnet (http://wiki.splunk.com/Community:IRC): OK, I admit it, I'm a Splunk IRC junkie. This group is just the best: a great mix of Splunkers (aka Splunk employees), customers, and professional services, and hysterical to boot. Props to the crew: Piebob, cgales, ^Brian^, DaGryph, Coccyx, amrit, Duckfez, Yorokobi, Madscient, automine, starcher, jtrucks, and even Trex (a fellow ASUer).
Also check out @splunk, @splunkdev, and @splunkanswers on Twitter!

"I look to the future because that's where I'm going to spend the rest of my life."
- George Burns

Questions and mentioned links

My Splunk App to do ISO 3166 translations:
http://apps.splunk.com/app/1775/

Free Splunk App to calculate distances on a globe (a Great Circle or haversine calculation):
http://apps.splunk.com/app/936/

My Splunk video:
http://www.splunk.com/view/SP-CAAAJPW

THANK YOU
