
Advanced Searching & Reporting with Splunk

Generated for mastinder singh (mastinder.singh@jpmchase.com) (C) Splunk Inc, not for distribution
Copyright © 2017 Splunk, Inc. All rights reserved | 1 June 2018
Document Usage Guidelines
• Should be used only for enrolled students
• Not meant to be a self-paced document; an instructor is needed
• Do not distribute

Course Prerequisites
• Required:
– Splunk Fundamentals 1 (eLearning)
– Splunk Fundamentals 2

• Recommended: Minimum six months' experience using the Splunk search language

Note
In order to receive credit for this
course, you must complete ALL
lab exercises.
Course Guidelines
• Hands-on lab exercises reinforce information presented in the
lecture modules
• To receive a certificate of completion for the course, you must
complete the lab exercises
• The lab exercises must be completed sequentially
– Later lab exercises often depend on steps completed in previous lab exercises

Course Goals
• Search for events and create reports using:
– Subsearches
– Statistics, data manipulation, and filtering
– Transactions
– Lookups

• Create and sort searches based on time


• Reformat the date/time field of returned events
• Search tsidx files

Course Outline
Module 1: Beyond Search Fundamentals
Module 2: Using Subsearches
Module 3: Using Advanced Statistics
Module 4: Manipulating and Filtering Data
Module 5: Additional Data Manipulation Techniques
Module 6: Using Advanced Transactions
Module 7: Working with Time
Module 8: Using Advanced Lookups
Module 9: Searching tsidx Files
Appendix A: Putting It All Together
Appendix B: Another Time Search Example
Callouts
• Scenarios
– Many of the examples in this course relate to a specific scenario
– For each example, a question is posed from a colleague or manager at Buttercup Games
Scenario
The online sales manager wants to see the action, productId, and status of customer interactions in the online store.
• Notes & Tips
– References for more information on a topic and tips for best practices
Note
Lookups are discussed in the Splunk Fundamentals 1 course.

Course Scenario
• Use cases in this course are based on Buttercup Games, a
gaming company
• Searches and reports are based on:
– Business analytics from the web access logs and lookups
– Internal operations information from mail and internal network data
– Security operations information from internal network and badge
reader data

Buttercup Games, Inc.
• Buttercup Games, Inc.
– Is a multinational company with its
HQ in San Francisco and offices in
Boston and London
– Sells product mainly through its
worldwide chain of third party
stores, but also sells through its
online store

Your Role at Buttercup Games
• You are a Splunk power user
• Your responsibility is to provide information to users throughout the
company
• You gather data and statistics and report on:
– Security
– IT operations
– Operational intelligence
– Etc.

Buttercup Games Network
Index     Description                  Sourcetype                  Host
web       Online transactions          access_combined             www1, www2, www3
security  Badge reader data            history_access              badgesv1
          AD/DNS data                  winauthentication_security  adldapsv1
          Web login data               linux_secure                www1, www2, www3
sales     Retail sales data            vendor_sales                vendorUS1
          BI data                      sales_entries               ecommsv1
network   Firewall data                cisco_firewall              cisco_router1
          Email data                   cisco_esa
          Web security appliance data  cisco_wsa_squid

Module 1:
Beyond Search Fundamentals

Module Objectives
• Review search modes
• Define search best practices
• Identify search troubleshooting aids

Reviewing Search Mode – Fast Mode
• Emphasizes performance, returning only essential and required data
• For non-transforming searches:
index=web sourcetype=access_combined
✓ Events – fields sidebar displays only those fields required for the search
✓ Patterns
✗ Statistics or visualizations
• For transforming searches:
index=web sourcetype=access_combined
| stats count by action
✗ Events
✗ Patterns
✓ Statistics or visualizations
Reviewing Search Mode – Verbose Mode
• Emphasizes completeness by returning all possible
field and event data
• For non-transforming searches:
✓ Events – fields sidebar displays all fields
✓ Patterns
✗ Statistics or visualizations

• For transforming searches:


✓ Events
✓ Patterns
✓ Statistics or visualizations

Reviewing Search Mode – Smart Mode (Default)
• Designed to give you the best results for your search
• Combination of Fast and Verbose modes
• For non-transforming searches [Verbose]:
✓ Events – fields sidebar displays all fields
✓ Patterns
✗ Statistics or visualizations

• For transforming searches:


✗ Events
✗ Patterns
✓ Statistics or visualizations
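The three mode slides above can be condensed into one lookup. The following is only a restatement of the slide content as a small Python function, not Splunk code; the function name and the string labels are assumptions made for illustration.

```python
# Which result tabs are populated for a search, per the Fast, Verbose,
# and Smart mode slides. This is a summary table, not Splunk behaviour code.
TABS = ("Events", "Patterns", "Statistics or visualizations")

def populated_tabs(mode, transforming):
    """Return the set of tabs that show results for a mode and search type."""
    mode = mode.lower()
    if mode not in {"fast", "smart", "verbose"}:
        raise ValueError(f"unknown search mode: {mode}")
    if not transforming:
        # Without a transforming command there is nothing to chart in any mode.
        return {"Events", "Patterns"}
    if mode == "verbose":
        # Verbose keeps the events and patterns alongside the statistics.
        return set(TABS)
    # Fast and Smart return only the statistics for transforming searches.
    return {"Statistics or visualizations"}

print(sorted(populated_tabs("smart", transforming=True)))
print(sorted(populated_tabs("verbose", transforming=True)))
```

The modes also differ in which fields the sidebar shows for non-transforming searches (Fast shows only required fields), which this sketch does not model.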
Search Performance – Modes
Use the most appropriate search mode:
index=web sourcetype=access_combined
| chart count by product_name
Time range: last 365 days

Mode     Returned Results  Events Scanned  Time
Fast     14                566,731         1.82
Smart    14                566,731         1.91
Verbose  14                566,731         15.21

Note
These searches were run in our lab environment, not a production environment - your results may vary.

How Splunk Searches – Buckets
• As events come in, Splunk places them into one of the indexer's hot
buckets
– Only hot buckets are writeable
• Over time, a bucket progresses from hot, to warm, to cold
• Each bucket has its own index, earliest and latest time, and raw data
• When you search, Splunk:
– Identifies the buckets for the search time range
– Searches those buckets for the requested data
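The bucket-selection step described above can be sketched as a simple interval-overlap test. This is a minimal Python model, assuming each bucket records only an earliest and latest event time; the `Bucket` class, bucket names, and timestamps are hypothetical and do not reflect Splunk's on-disk layout.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Bucket:
    name: str
    state: str      # "hot", "warm", or "cold"
    earliest: int   # epoch seconds of the oldest event in the bucket
    latest: int     # epoch seconds of the newest event in the bucket

def buckets_for_range(buckets, search_earliest, search_latest):
    """Return the buckets whose time span overlaps the search time range."""
    return [
        b for b in buckets
        if b.earliest <= search_latest and b.latest >= search_earliest
    ]

# Hypothetical buckets covering three consecutive time spans.
buckets = [
    Bucket("db_cold_1", "cold", 1_000, 1_999),
    Bucket("db_warm_1", "warm", 2_000, 2_999),
    Bucket("db_hot_1", "hot", 3_000, 3_500),
]

# Only the warm and hot buckets overlap the range 2500..3200,
# so the cold bucket never needs to be opened.
selected = buckets_for_range(buckets, 2_500, 3_200)
print([b.name for b in selected])
```

Narrowing the time range shrinks the list of buckets to read, which is why time is such an effective filter.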

Search Heads and Indexers
• A search head deploys the user search to the indexers
– When Splunk is configured to run on multiple servers (a distributed deployment), indexers are also known as search peers
• The search peers:
– Execute the search
– Send the results to the search head
• The search head merges and returns the results from the search peers
Note
This slide shows a distributed search deployment. For more information about how to deploy Splunk, refer to the Splunk Enterprise System Administration course or docs.splunk.com.
[Diagram: Search Head connected to Indexers (Search peers)]
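The division of work between search peers and the search head can be sketched as a count computed per peer and then merged. This is a simplified Python model for a `stats count by action` style search; the function names, indexer names, and events are hypothetical, and real distributed search involves far more than this.

```python
from collections import Counter

def peer_count_by(events, field):
    """Runs on one search peer: count its local events by a field."""
    return Counter(e[field] for e in events if field in e)

def search_head_merge(partials):
    """Runs on the search head: merge the partial counts from every peer."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)

# Hypothetical events held by two indexers.
peer_events = {
    "indexer1": [{"action": "purchase"}, {"action": "view"}],
    "indexer2": [{"action": "purchase"}, {"action": "addtocart"}, {"action": "view"}],
}

partials = [peer_count_by(events, "action") for events in peer_events.values()]
merged = search_head_merge(partials)
print(merged)
```

Each peer sends back a small table of counts rather than its raw events, which is the kind of saving the next slide describes.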
Data Sent to the Search Head
• Verbose mode causes the most data to be sent from the indexers
to the search heads
– Verbose mode returns all possible field and event data

• Generally, Fast and Smart modes cause much less data to be sent
from the indexers to the search heads
– Exception: For non-transforming searches, Smart mode and
Verbose mode send the same amount of data
• Using the fields command also limits data going to the search
head

General Search Practices
• As events are stored by time, _time is the most efficient filter
– Also helpful: host, source, and sourcetype
• Specify the index/indexes to improve performance
• The more you tell Splunk, the better the chance for good results
– Searching for sourcetype=x failure is better than failure
– To make searches more efficient, include as many terms as possible
– Use the fields command to extract only the fields you need
– Example: Searching the last 365 days scans 566,720 events (times in seconds):
index=web sourcetype=access_combined    15.16
index=web sourcetype=access_combined | fields clientip bytes referrer    4.49

General Search Practices – Wildcards
• Splunk only searches for whole words, but you can use wildcards
• Only trailing wildcards can make efficient use of the index
– Use fail* rather than *fail or *fail*
• Avoid leading wildcards
– *fail or *fail* scans all events within the time frame specified
• Wildcards are tested after all other terms
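One way to see why only trailing wildcards can use the index efficiently is to picture the indexed terms as a sorted list. The sketch below is a Python illustration under that assumption; the `LEXICON` list is hypothetical and the source does not describe Splunk's actual index structure.

```python
import bisect

# A hypothetical, tiny sorted list of indexed terms standing in for an index.
LEXICON = sorted(["fail", "failed", "failure", "fallback", "login", "prefail", "success"])

def match_trailing(prefix):
    """fail* : binary-search to the prefix, then read only the adjacent terms."""
    start = bisect.bisect_left(LEXICON, prefix)
    matches, examined = [], 0
    for term in LEXICON[start:]:
        examined += 1
        if not term.startswith(prefix):
            break  # sorted order guarantees no later term can match
        matches.append(term)
    return matches, examined

def match_leading(suffix):
    """*fail : sorted order does not help, so every term must be examined."""
    matches = [t for t in LEXICON if t.endswith(suffix)]
    return matches, len(LEXICON)

print(match_trailing("fail"))
print(match_leading("fail"))
```

With a prefix, the lookup stops as soon as the terms no longer match; with a leading wildcard, nothing short of a full pass finds every match.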

General Search Practices
• Inclusion is generally better than exclusion
– Searching for "access denied" is faster than NOT "access granted"
• Filter as early in your search as possible
– Removing duplicates then sorting is faster than
sorting then removing duplicates
• Know how case-sensitivity works, for example:
– Field names and Boolean operators are case-sensitive
– Search terms and field values are not case-sensitive

Transforming Search Commands
• A transforming command:
– Presents search results as a data table
– 'Transforms' specified cell values for each event into numerical values
that you can use for statistical purposes
– Is required to 'transform' search results into visualizations

• Transforming commands include:


– chart
– timechart
– stats
– top
– rare
– geostats
Troubleshooting Aids
• For complex searches, you can search or replace items in the search box
• To search: ctrl-f (Windows) cmd-f (Mac)

• To replace: ctrl-f ctrl-f (Windows) cmd-f cmd-f (Mac)

• Search options: RegExp Search, CaseSensitive Search, Whole Word Search


Troubleshooting Aids (cont.)
• Click after a bracket or parenthesis and a box encloses the
corresponding item

• Selecting an item in a search string causes all of the occurrences of that item to appear highlighted

Troubleshooting Aids (cont.)
• In Account Settings, Search auto-format automatically puts line breaks in when you type [ or a | into the search box
– Only works for searches that you type into the search box
– By default, this setting is Off
• Auto-format does not work for searches that are:
– From search history
– Copied and pasted into the search box
• Auto-format can also be done using a keyboard shortcut
– Windows: Ctrl + \
– MacOS: ⌘ + \
Troubleshooting Aids (cont.)
• Show line numbers automatically
adds line numbers to the search box
• This can be useful for long searches
• By default, this feature is Off

Troubleshooting Aids (cont.)
• A row in the search bar is not necessarily a single line
• For example:
– If you paste a long search string into the Search bar
– And the search string has not already been formatted with multiple
lines
– Then the search has one line number and spans multiple rows

Troubleshooting Aids (cont.)
• After pasting the search string, you can force the string to appear on multiple lines by either:
– Using the auto-format keyboard shortcut, such as Ctrl-\
– Positioning the cursor at each desired line break, and then pressing Shift-Enter
• In addition to rendering the string on multiple lines, each line will be
numbered

Troubleshooting Aids (cont.)
• Embed comments in a search using the comment macro
• Well-written comments can provide context for a search
• When debugging, it can also be useful to comment out particular lines in a search
• Comments begin and end with the back quote, or grave accent, character
• Syntax:
`comment("YourCommentHere")`
• Example:
index=web sourcetype=access_combined action=purchase status=200
| timechart span=2h sum(price) as sales
| trendline sma2(sales) as trend
`comment(" Display total sales and trends")`

Lab Exercise 1
Time: 5 minutes
Tasks:
• Log into Splunk on the classroom server
• Make CLASS: Advanced Searching & Reporting your default app
and change your account time zone setting to reflect your local
time
• Answer a set of questions concerning Splunk searches

Module 2:
Using Subsearches

Module Objectives
• Use subsearches to provide filtering and other information to a
main search
• Know when NOT to use subsearches
• Troubleshoot subsearches

Filtering Through Many Results
Scenario
The Security Operations manager
wants a list of all IP addresses that
might have been used by people
trying to hack into the network
during the last 4 hours.

• Most hacking attempts begin with many failures from one or more source IP addresses
• This search
– Counts the password failures by IP address (src_ip)
– Produces many results
• It would be good to filter these results to make them more
meaningful
Filtering Through Many Results (cont.)
• Assume that internal IP addresses are not likely hackers
– For training purposes, assume internal IPs are: 10.x.x.x
• Assume that repeated failures are the most likely hacking attempts
• Still, it is unclear if individuals using these src_ip values were able to obtain network access

index=security sourcetype=linux_secure "failed password" src_ip!=10.*
| stats count by src_ip
| where count > 10

A: src_ip!=10.* filters out the internal addresses
B: where count > 10 keeps only the src_ip values with more than 10 failures
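The two filtering steps on this slide (exclude internal addresses, then keep only addresses with more than 10 failures) can be sketched in Python as follows. The event dictionaries, the `raw` key, and the sample addresses are hypothetical stand-ins for linux_secure events.

```python
from collections import Counter

def suspicious_ips(events, threshold=10, internal_prefix="10."):
    """Count failed-password events per external src_ip and keep the heavy hitters."""
    counts = Counter(
        e["src_ip"]
        for e in events
        if "failed password" in e["raw"].lower()           # search term, case-insensitive
        and not e["src_ip"].startswith(internal_prefix)    # A: src_ip!=10.*
    )
    # B: where count > 10
    return {ip: n for ip, n in counts.items() if n > threshold}

# Hypothetical events: one noisy external address, one quiet one, one internal.
events = (
    [{"src_ip": "107.3.146.207", "raw": "Failed password for root"}] * 12
    + [{"src_ip": "128.241.220.82", "raw": "Failed password for admin"}] * 3
    + [{"src_ip": "10.1.1.1", "raw": "Failed password for jsmith"}] * 20
    + [{"src_ip": "107.3.146.207", "raw": "Accepted password for root"}]
)

print(suspicious_ips(events))
```

Only the external address with 12 failures survives both filters; the internal address is dropped despite having the most failures.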
Filtering Through Many Results (cont.)
• Needed: to search for src_ip values associated with:
– Repeated login failures ("failed password")
AND
– Successful access to the network ("accepted")
• To do this, the src_ip values from search B need to be supplied to search A

Search A
index=security sourcetype=linux_secure "accepted"
AND ( src_ip=107.3.146.207 OR src_ip=128.241.220.82 OR src_ip=132.55.227.221 )
| table src_ip

Search B
index=security sourcetype=linux_secure "failed password" src_ip!=10.*
| stats count by src_ip
| where count > 10
Subsearch Overview
• A subsearch
– Takes the results from an inner search (B)
– Using a Boolean AND, combines the results with the outer search (A)
• A Boolean OR is inserted between each of the inner search results

Outer Search A
index=security sourcetype=linux_secure "accepted"
AND ( src_ip=107.3.146.207 OR src_ip=128.241.220.82 OR src_ip=132.55.227.221 )
| table src_ip

Inner Search B
index=security sourcetype=linux_secure "failed password" src_ip!=10.*
| stats count by src_ip
| where count > 10
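The AND/OR combination described on this slide can be sketched as a string-building step. This is a Python model, assuming the inner results arrive as rows of field/value pairs; the exact spacing and quoting of the expression Splunk generates is an assumption here, not something the slide specifies.

```python
def expand_subsearch(rows):
    """Turn inner-search result rows into a search expression for the outer search."""
    clauses = []
    for row in rows:
        # Fields within one result row are ANDed together.
        pairs = " AND ".join(f'{field}="{value}"' for field, value in row.items())
        clauses.append(f"( {pairs} )")
    # Separate result rows are ORed together.
    return "( " + " OR ".join(clauses) + " )"

# The three src_ip values shown for inner search B.
inner_results = [
    {"src_ip": "107.3.146.207"},
    {"src_ip": "128.241.220.82"},
    {"src_ip": "132.55.227.221"},
]

outer = 'index=security sourcetype=linux_secure "accepted"'
# The expanded inner results are ANDed onto the outer search.
full_search = f"{outer} AND {expand_subsearch(inner_results)}"
print(full_search)
```

The printed search has the same logical shape as the hand-written outer search A: one "accepted" filter ANDed with an OR list of src_ip values.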
Subsearch Overview (cont.)
• Subsearches:
– Typically begin with the search command
– Are enclosed in square brackets
[search search-criteria]
– Are evaluated first, before the outer search
– Use the fields or return command to send only specified fields back to the outer search

index=security sourcetype=linux_secure "accepted"
[ search index=security sourcetype=linux_secure "failed password" src_ip!=10.*
| stats count by src_ip
| where count > 10
| fields src_ip ]
| dedup src_ip
| table src_ip

return Results from a Subsearch
• Command syntax: return [count] [alias=field…] [$field…]
• By default, only the first value for each specified field is returned

• To return multiple field values, specify a count


– For example, for the first five src_ip values: return 5 src_ip

return Field Name with Results
• Note how return src_ip causes the result to be formatted:
– The field name is returned with the field value
• To omit the field name from the search, add a $ before the field name
• return $src_ip produces this result:
– The field name is not returned with the field value

return Results Using an Alias
• The returned result can be
renamed
return ip=src_ip

• If you want to see exactly what a subsearch returns, rewrite it as an independent search and execute it
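The three forms of return covered on the last three slides (plain field, $field, and alias=field, with an optional count) can be modeled with a short Python function. This is a sketch of the behaviour as described on the slides; `model_return` is a hypothetical helper, and the exact quoting and parenthesization that Splunk emits may differ.

```python
def model_return(rows, *specs, count=1):
    """Model of `return [count] [alias=field] [$field]` applied to result rows.

    specs are strings such as "src_ip", "$src_ip", or "ip=src_ip".
    """
    clauses = []
    for row in rows[:count]:          # by default only the first row is used
        parts = []
        for spec in specs:
            if spec.startswith("$"):
                parts.append(str(row[spec[1:]]))           # value only
            elif "=" in spec:
                alias, field = spec.split("=", 1)
                parts.append(f'{alias}="{row[field]}"')    # renamed field
            else:
                parts.append(f'{spec}="{row[spec]}"')      # field name and value
        clauses.append(" ".join(parts))
    if len(clauses) == 1:
        return clauses[0]
    # Multiple rows are ORed together.
    return " OR ".join(f"({c})" for c in clauses)

rows = [{"src_ip": "107.3.146.207"}, {"src_ip": "128.241.220.82"}]
print(model_return(rows, "src_ip"))
print(model_return(rows, "$src_ip"))
print(model_return(rows, "ip=src_ip"))
print(model_return(rows, "src_ip", count=2))
```

As the slide suggests, running the real subsearch on its own is the reliable way to see exactly what it hands to the outer search.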

Subsearch Caveats
• Subsearches are limited by both time and event count
• Default time limit = 60 seconds
– If the subsearch continues to run after this time, it is finalized
– Only the events found during that time are returned to the outer search
• Default results limit = 10,000
– When the limit is met, the results are truncated (partial result set)
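The effect of the two default limits can be sketched with a small Python model. The `seconds_per_result` parameter is a hypothetical stand-in for how long each result takes to produce, so the time limit can be demonstrated without actually waiting; the status strings are also invented for illustration.

```python
def run_subsearch(results, max_results=10_000, max_seconds=60.0, seconds_per_result=0.0):
    """Collect inner-search results until either default limit is reached."""
    kept, elapsed = [], 0.0
    for item in results:
        if len(kept) >= max_results:
            # Results limit met: the outer search receives a partial result set.
            return kept, "truncated: result limit reached"
        if elapsed + seconds_per_result > max_seconds:
            # Time limit met: the subsearch is finalized with what it has so far.
            return kept, "finalized: time limit reached"
        elapsed += seconds_per_result
        kept.append(item)
    return kept, "complete"

rows, status = run_subsearch(range(25_000))
print(len(rows), status)

rows, status = run_subsearch(range(100), seconds_per_result=1.0)
print(len(rows), status)
```

In both limited cases the outer search still runs, but against an incomplete filter, so its results can be silently wrong rather than visibly failing.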

When to Use Subsearch
• Subsearches work best for small result sets
• In particular, they work well if one side of the search produces
sparse results relative to the other
– Use the "sparse" search as the inner search
– The inner search will be used to filter results from the outer search
[Diagram: Outer search and Inner search combining into Results]

When Not to Use Subsearch
• For subsearches that return many results, it is generally much
more efficient to use stats and/or eval instead
– In general, subsearches take much longer than other types of
searches
– This can be confirmed using the Job Inspector

• Therefore, where possible, use stats and/or eval instead of a subsearch
– This is especially true of searches that are executed often, such as scheduled reports or searches that are executed from a dashboard

When Not to Use a Subsearch (cont.)
Scenario
The CSO wants a list of tailgaters during the last 4 hours.
• A "tailgater" is defined as a user who did not badge into the building, but logged on to the network
• The tailgater followed another user into the building, who did badge in
• To find tailgaters, you must find two different sets of users:
– Users who logged on to the network
– Users who did not badge into the building

When Not to Use a Subsearch (cont.)
• A: The outer search finds users who logged onto the network
• B: The inner search finds users who badged in to the building
• C: The inner search is added to the outer search
• The subsearch returns ALL users who badged into the building
– This is a relatively large number

A Outer Search
index=security sourcetype=winauthentication_security (EventCode=540 OR EventCode=4624)
NOT <from the Inner Search> (C)
| stats count by User
| fields - count

B Inner Search (all the users who badged into the building)
index=security sourcetype=history_access Event_Description=Access
| dedup User
| fields User
When Not to Use a Subsearch (cont.)
• The search works!
• However, it is NOT the best way to obtain this data
• On networks with many users, this search could run for a long time
– If this search is run often, it could be resource intensive, making the problem even worse!

index=security sourcetype=winauthentication_security (EventCode=540 OR EventCode=4624) NOT
[search index=security sourcetype=history_access Event_Description=Access
| dedup User
| fields User]
| stats count by User
| fields - count

When Not to Use a Subsearch (cont.)
Scenario
The CSO wants a list of tailgaters during the last 4 hours.

(index=security sourcetype=winauthentication_security (EventCode=540 OR EventCode=4624))
OR (index=security sourcetype=history_access Event_Description=Access)
| eval badge_access = if(sourcetype="history_access", 1, 0)
| stats max(badge_access) as badged_in by User
| where badged_in = 0
| sort User
| fields - badged_in

This search finds the same data significantly faster
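The eval and stats steps of this search can be traced in Python to show why no subsearch is needed: both event types are read in one pass and a per-user flag records whether any badge event was seen. The event dictionaries and user names below are hypothetical.

```python
def find_tailgaters(events):
    """Single pass over login and badge events, mirroring the eval/stats search."""
    badged_in = {}  # User -> max(badge_access) seen so far
    for e in events:
        # eval badge_access = if(sourcetype="history_access", 1, 0)
        badge_access = 1 if e["sourcetype"] == "history_access" else 0
        # stats max(badge_access) as badged_in by User
        user = e["User"]
        badged_in[user] = max(badged_in.get(user, 0), badge_access)
    # where badged_in = 0 | sort User
    return sorted(user for user, flag in badged_in.items() if flag == 0)

# Hypothetical events: alice badged and logged on, bob and dave only logged on,
# carol only badged in.
events = [
    {"sourcetype": "history_access", "User": "alice"},
    {"sourcetype": "winauthentication_security", "User": "alice"},
    {"sourcetype": "winauthentication_security", "User": "dave"},
    {"sourcetype": "winauthentication_security", "User": "bob"},
    {"sourcetype": "history_access", "User": "carol"},
    {"sourcetype": "winauthentication_security", "User": "dave"},
]

print(find_tailgaters(events))
```

Because every user with at least one badge event ends up with a flag of 1, the users left with 0 are exactly those who appear only in the logon data.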

Troubleshooting Subsearches
• Click after a bracket or parenthesis (A) and a box encloses the corresponding item (B)
• Run both searches independently to be sure events are being returned and to gain an understanding of the data
– You should have some idea of what it is that you want
– A search can frequently return results, but if it is not constructed properly, the results may not be correct
• Use format to display results of a subsearch as a single result

Troubleshooting Subsearches (cont.)
You can also use the normalizedSearch property, found in the Search
job inspector, to see the subsearch results

Lab Exercise 2
Time: 15 minutes
Task:
• Find the average and median sales total for website customers
who have experienced problems (i.e., http_status >= 400) when
trying to complete a web order during the previous week

Module 3:
Using Advanced Statistics

Module Objectives
• Perform statistical analysis using these stats functions: min, max,
mean, median, stdev, range
• Add sub-totals to search data using appendpipe
• Use eventstats to calculate additional statistics on search
results, after the search has been completed
• Use streamstats to calculate additional statistics on search
results, as each search result is returned

Searching and Reporting Review
In the Splunk Fundamentals courses, you learned to use the following commands:
– top, rare, and fields
– stats
  • sum, avg, count, dc
  • list, values
– addtotals
– eval
  • if, round, tostring
– search, where
– chart, timechart
Note
This module discusses additional commands that allow you to create more advanced searches.
Transforming Commands – Statistical Functions
• The stats, chart, and timechart commands support the basic statistical functions mean, median, and stdev
• range provides the difference between the maximum and minimum values for the given field
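As a quick numeric check of what these functions report, the Python sketch below computes the same six statistics for a small list. The sample prices are made up, and stdev is modeled here as the sample standard deviation, which is an assumption about how the function is defined.

```python
import statistics

def summarize(values):
    """Compute the statistics named in this module's objectives for one field."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),    # sample standard deviation
        "range": max(values) - min(values),   # maximum minus minimum
    }

# Hypothetical price values.
prices = [10, 20, 30, 40]
summary = summarize(prices)
for name, value in summary.items():
    print(name, value)
```

For this list the mean and median are both 25 and the range is 30.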

appendpipe Command
Scenario
Find the number of non-business
related connections to the internet
for the last 24 hours, by user and
total attempts by usage.

• appendpipe
1. Takes the existing results and pushes them into the sub-pipeline
2. Then, appends the result of the sub-pipeline as new lines to the outer
search
• Results are displayed inline
• Example: appendpipe [stats sum(count) as count by usage]
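The two steps listed above can be sketched in Python: the current rows are fed to a sub-pipeline, and whatever it produces is appended as new rows. The row values and user names are hypothetical, and the functions are models of the behaviour described on the slide rather than Splunk's implementation.

```python
from collections import defaultdict

def stats_sum_count_by_usage(rows):
    """Model of the sub-pipeline: stats sum(count) as count by usage"""
    totals = defaultdict(int)
    for row in rows:
        totals[row["usage"]] += row["count"]
    return [{"usage": usage, "count": total} for usage, total in totals.items()]

def appendpipe(rows, subpipeline):
    """Run subpipeline on the current rows, then append its output as new rows."""
    return rows + subpipeline(rows)

# Hypothetical output of: stats count by usage, cs_username
rows = [
    {"usage": "Personal", "cs_username": "alice", "count": 3},
    {"usage": "Personal", "cs_username": "bob", "count": 5},
    {"usage": "Violation", "cs_username": "alice", "count": 2},
]

with_subtotals = appendpipe(rows, stats_sum_count_by_usage)
# Sorting by usage groups each subtotal with its rows. Python's sort is stable,
# so the appended subtotal row lands after the detail rows of its group.
with_subtotals.sort(key=lambda r: r["usage"])
for row in with_subtotals:
    print(row)
```

The subtotal rows have no cs_username value, which is why a later slide uses eval to give them a descriptive label.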
appendpipe Command – Subtotals
Scenario
Find the number of non-business related connections to the internet for the last 24 hours, by user and total attempts by usage.

index=network sourcetype=cisco_wsa_squid usage!=Business
| stats count by usage, cs_username
| appendpipe [stats sum(count) as count by usage]
| sort usage

• Look for usage other than Business
• A: Count number of connections by usage and user
• B: Below each group of usage values, add a subtotal
eval Command – Subtotals
Scenario
Find the number of non-business related connections to the internet
for the last 24 hours, by user and total attempts by usage.

index=network sourcetype=cisco_wsa_squid usage!=Business
| stats count by usage, cs_username
| appendpipe [stats sum(count) as count by usage
    | eval cs_username = "Total for usage of ".usage]
| sort usage

Use eval to provide a description for the accumulated field

appendpipe Command – Grand Total
Scenario
Find the number of non-business related connections to the internet
for the last 4 hours, by user and total attempts by usage. Include a
subtotal of total attempts by usage and a grand total.

index=network sourcetype=cisco_wsa_squid usage!=Business
| stats count by usage, cs_username
| appendpipe [stats sum(count) as count by usage
    | eval cs_username = "Total for usage of ".usage]
| appendpipe [search cs_username="Total for*"
    | stats sum(count) as count
    | eval cs_username = "GRAND TOTAL"]
| sort usage, count

• Can use multiple appendpipe commands
• The second appendpipe adds a grand total to the end of the report
• If there is more than one page, the grand total displays on the
last page
Using count and list Functions
Scenario
Sales asked for the top purchased products by host over the last 24
hours.

index=web sourcetype=access_combined action=purchase status=200
| top product_name by host showperc=0

• This search produces a table in which the repeated host values are
distracting and make the results hard to read
• showperc=0 removes the percent column from the table

[Screenshots: "This is what they GET" vs. "This is what they WANT"]

Using count and list Functions (cont.)
Scenario
Sales asked for the top purchased products by host over the last 24
hours.

index=web sourcetype=access_combined action=purchase status=200
| stats count by host, product_name
| sort -count
| stats list(product_name) as "Product Name",
    list(count) as Count by host

• count displays the number of purchases of each product by host
• list displays the product names and counts in the same row
• Sorts ascending by host, then descending by count
Using count and list Functions (cont.)
Scenario
Sales asked for the top purchased
products by host over the last 24
hours.

Sorted descending by
most active host, then
descending by count

Using count and list Functions (cont.)
Scenario
Sales asked for the top purchased products by host over the last 24
hours.

index=web sourcetype=access_combined product_name=*
| stats count by host, product_name
| sort -count
| stats list(product_name) as "Product Name",
    list(count) as Count, sum(count) as total by host
| sort -total
| fields - total

• sum(count) as total sums the count values for each host into a new
field called total
– Used to sort
– Removed from results

eventstats Command
eventstats
• Generates summary statistics
of all existing fields in your
search results
• Saves them as values in new
fields
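
A rough Python model of this behavior, assuming a single avg() aggregation with no by clause. The function name and the sample rows are made up for illustration; this is not how Splunk implements the command.

```python
def eventstats_avg(rows, field, new_field):
    """Rough model of: | eventstats avg(<field>) as <new_field>"""
    values = [r[field] for r in rows if r.get(field) is not None]
    average = sum(values) / len(values)
    # Unlike stats, every original row is kept and gains the new field
    return [{**r, new_field: average} for r in rows]

rows = [
    {"product_name": "Game A", "lostSales": 100.0},
    {"product_name": "Game B", "lostSales": 40.0},
    {"product_name": "Game C", "lostSales": 10.0},
]
with_avg = eventstats_avg(rows, "lostSales", "averageLoss")
# Because the average sits on every row, a row-by-row comparison is possible
above_average = [r["product_name"] for r in with_avg
                 if r["lostSales"] > r["averageLoss"]]
```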

eventstats Command – Example 1
Scenario
For a new campaign, the online sales manager wants a display of
products that are losing more sales than the average during the
last 24 hours.

index=web sourcetype=access_combined action=remove
| chart sum(price) as lostSales by product_name
| eventstats avg(lostSales) as averageLoss
| where lostSales > averageLoss
| sort -lostSales
| fields - averageLoss

• Find the aggregate 'lost sales' by product
• Use eventstats to calculate the average loss


eventstats Command – Example 2
Scenario
The sales team wants to know the lowest and highest sales totals
during the previous week, and on which days they occurred.

index=web sourcetype=access_combined
| timechart sum(price) as totalSales
| eventstats max(totalSales) as highest,                       (A)
    min(totalSales) as lowest                                  (B)
| where totalSales=highest OR totalSales=lowest
| eval Outcome = if(totalSales=highest,"Highest","Lowest")     (C)
| eval Day = strftime(_time,"%A")                              (D)
| table Day, Outcome, totalSales
| eval totalSales = "$".tostring(totalSales,"commas")

(A) Using max, determine the highest value
(B) Using min, determine the lowest value
(C) Label highest and lowest values
(D) Label day of the week

Note
The strftime() function is discussed in more detail later in the
course.
streamstats Command
• The streamstats command is
similar to the stats command
• Like stats, it calculates
summary statistics on search
results
• However:
– stats works on the entire results
– streamstats calculates statistics
for each event at the time it is
seen
streamstats Command – Example 1
Scenario
Sales wants to monitor a moving average of the price of a purchase
on the Buttercup Games website over the previous 20 purchases
during the last 24 hours.

index=web sourcetype=access_combined action=purchase status=200
| stats sum(price) as order_price by JSESSIONID
| sort _time
| streamstats avg(order_price) as averageOrder current=f window=20

• Calculate the average order price over the past 20 events
(window=20)
• Do not include current event in summary calculations (current=f)
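
A rough Python model of a windowed running average. It assumes that with current=f the window covers the N events that precede the current one, as the slide describes; the function name and the sample prices are made up for illustration.

```python
def streamstats_avg(values, window, current=False):
    """Rough model of: streamstats avg(x) window=<N> current=<bool>"""
    out = []
    for i in range(len(values)):
        end = i + 1 if current else i      # current=f leaves out the event itself
        start = max(0, end - window)
        chunk = values[start:end]
        out.append(sum(chunk) / len(chunk) if chunk else None)
    return out

order_prices = [10.0, 20.0, 30.0, 40.0]    # already in the order they are seen
moving = streamstats_avg(order_prices, window=2, current=False)
```

The first event has no preceding events, so in this model its average is empty.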

streamstats Command – Example 2
Scenario
SecOps manager wants the network failures for the last 4 hours by
user with the IPs, rank and number of failures.

index=security sourcetype=linux_secure fail*
| stats count by user, src_ip
| sort -count
| streamstats count as ip_count by user
| stats list(src_ip) as suspectIPs,
    list(count) as count,
    list(ip_count) as ip_order by user

Lab Exercise 3
Time: 45-60 minutes
Tasks:
• Display retail sales over the last 24 hours by category and product
name with total sales for each category
• Display the number of purchases during the previous week by host and
category, sorted ascending by host, then descending by count
• Identify the retail products with less than average total sales for the
previous week
• Report on the 3 most active network users last week by usage type.
Include user name, ranking, usage type, count, and total values
Module 4:
Manipulating and Filtering Data

Module Objectives
• Divide search results into different groups, based on values in a
specified field, using the bin command
• Regroup fields of search results, using xyseries
• Create a template for performing additional processing on a set of
related fields, using foreach
• Conditionally process search results, using case
• Filter search results, using where, isnull, like, and wildcards
• Convert epoch time data to a user-friendly time string
• Mask data at search time, using replace
bin Command
bin [bin-options] field [as newfield]
• Puts continuous numerical values into discrete sets, or bins, by
adjusting field values so that all of the items in a particular bin
share the same value
• The bins option segregates the field values into the maximum
number of bins that you specify
• The span option sets the size for each bin (span=250)
– If span creates more buckets than the max specified by bins, bins
is ignored
• bucket is an alias for bin
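
A rough Python model of the span option for a numeric field. The start-end label is an assumption about how the binned value is displayed, and the function name and sample values are made up for illustration.

```python
import math

def bin_span(value, span):
    """Rough model of: bin <field> span=<span> (numeric field)"""
    start = math.floor(value / span) * span
    # Every value inside the same bin ends up with the same label
    return f"{start}-{start + span}"

labels = [bin_span(v, 1000) for v in (250, 999, 1000, 2750)]
```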
bin Command Example
Scenario
An analyst in BizOps wants a list of products grouped by revenue
range over the last 24 hours.

index=web sourcetype=access_combined
| stats sum(price) as totalSales by product_name
| bin totalSales span=1000
| stats list(product_name) as product_name by totalSales
| eval totalSales = "$".totalSales

xyseries Command

xyseries x-field y-name-field y-data-field


• x-field is the field to use as the x-axis
• y-name-field is the field that contains the values to be used as
labels for the data series
• y-data-field is the field(s) that contains the data to be charted

xyseries vs. chart
• Generally, instead of xyseries, you would use
chart a over b by c
chart a over b by c is equivalent to:
stats a by b,c | xyseries b c a
• However, if you need to do some processing after the chart
command, use stats followed by xyseries
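
A rough Python model of the reshaping that xyseries performs on stats-style rows. The function mirrors the three arguments of the command; the sample rows are made up for illustration.

```python
def xyseries(rows, x_field, y_name_field, y_data_field):
    """Rough model of: xyseries x-field y-name-field y-data-field"""
    table = {}
    for row in rows:
        out = table.setdefault(row[x_field], {x_field: row[x_field]})
        # Each distinct value of y-name-field becomes its own column
        out[row[y_name_field]] = row[y_data_field]
    return list(table.values())

# Hypothetical output of: stats sum(bytes) as totalBytes by hour, host
stats_rows = [
    {"hour": "09:00", "host": "www1", "totalBytes": 1.5},
    {"hour": "09:00", "host": "www2", "totalBytes": 2.0},
    {"hour": "10:00", "host": "www1", "totalBytes": 0.5},
]
wide = xyseries(stats_rows, "hour", "host", "totalBytes")
```

Any processing done on the stats rows before the reshape (rounding, unit conversion) applies to one column, totalBytes, instead of one column per host.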

xyseries Command – Example
Scenario
TechOps needs to see yesterday's hourly volume in MB for each
webserver.

index=web sourcetype=access_combined
| bin _time span=1h
| stats sum(bytes) as totalBytes by _time, host
| eval totalBytes = round(totalBytes/(1024*1024),2)
| xyseries _time, host, totalBytes

foreach Command
• Another approach: use foreach
• foreach runs a templatized, streaming subsearch
– foreach replaces the <<FIELD>> token with field names that match
a provided value
– In this example, <<FIELD>> is replaced by www1, www2, and www3,
which match www*
– foreach * can be used to apply the template to all columns

index=web sourcetype=access_combined
| timechart span=1h sum(bytes) by host
| foreach www* [eval <<FIELD>> = round(<<FIELD>>/(1024*1024),2)]
Filtering Results – Review
• In previous courses, you learned that to filter results, you can use the
search and where commands at any point in the search pipeline
• search command (uses case insensitive field values)
– May be easier because you are familiar with basic search syntax
– Can use the * (asterisk) as a wildcard
– Allows searching on keywords
• where command (uses case sensitive field values)
– Can compare values from two different fields
– Can do a wildcard search on multiple characters (%) or simply on one
character (_); must use the like operator with wildcards
– eval functions are available
where Command
• where uses the same
expressions as eval to
evaluate field values
• Uses booleans to filter search
results and only keeps results
that are True
– Use where to compare two different fields, which you cannot do with
search
Note
To view all of where's functions, please see:
docs.splunk.com/Documentation/Splunk/latest/SearchReference/Where
where Command (cont.)
• Use where to perform case sensitive value searches
– sourcetype=access_combined | where action="purchase"
• Remember, search is not case sensitive
– sourcetype=access_combined action="purchase"
returns all variations of purchase, Purchase, PURCHASE,
pUrChAsE
• To use the search command to perform a case sensitive search
on keywords, use the CASE directive
– sourcetype=access_combined CASE(purchase)

where Command – Example
Scenario
SalesOps needs to know which days over the previous week have seen
more remove actions than change quantity actions.

index=web sourcetype=access_combined
| timechart count(eval(action="changequantity")) as changes,
    count(eval(action="remove")) as removals
| where removals > changes

• Compare results of calculated fields
• Keep rows where the number of removal actions exceeds the
number of change quantity actions

where Command Wildcards
• The like operator supports wildcards to find patterns:
_ matches exactly one character; repeat it to match a fixed number
of characters
% matches multiple characters
| where like(field, pattern) or
| where field like "pattern"
• Examples:
| where like(user, "%dmin%")
Matches: sysadmin sapadmin redmine itmadmin administrator admin

| where like(user, "__dmin%")
Matches: redmine

Note
__dmin% contains a double underscore.
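
The two wildcards can be modeled by translating a like pattern into a regular expression. This is a rough Python sketch of the matching rules, not Splunk's implementation; the helper name is made up, and the user list is the one from the examples above.

```python
import re

def like(value, pattern):
    """Rough model of like(field, pattern); matching is case sensitive."""
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")           # any run of characters
        elif ch == "_":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.fullmatch("".join(parts), value, flags=re.DOTALL) is not None

users = ["sysadmin", "sapadmin", "redmine", "itmadmin", "administrator", "admin"]
matches_any = [u for u in users if like(u, "%dmin%")]
matches_two = [u for u in users if like(u, "__dmin%")]
```

The pattern must match the whole value, which is why "__dmin%" keeps only names with exactly two characters before "dmin".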

where Command like Clause – Example
Scenario
Display a list of user accounts that are like admin (adm%).

index=security sourcetype=linux_secure
| where user like "adm%"
| dedup user
| table user

index=security sourcetype=linux_secure
| where user like "_dm%"
| dedup user
| table user

• %dm% and _dm% both also return a user name that is clearly not
an admin account: edmond
• _dm% misses sapadmin and sysadmin

where Command – isnull/isnotnull
Scenario
A sales campaign manager wants to know which 15 minute periods
contained no sales during the last 24 hours.

index=sales sourcetype=vendor_sales
| timechart span=15m sum(price) as sum
| where isnull(sum)

• Use isnull to find events with an empty value for a particular field
• Use isnotnull to find events that contain a non-empty value for a
particular field

eval Command – strftime
• Use strftime to convert epoch time to a readable format
strftime(X,Y)    X is the epoch time value, Y is the format string
• Example: eval Hour=strftime(_time, "%I:%p")    Result: 07:PM
• Some variables

Time                       Days                            Month
%H  24 hour (00 to 23)     %d  Day of month (01 to 31)     %b  Abbr month name (Jan)
%T  24 hour (HMS)          %w  Weekday (0 to 6)            %B  Month name (January)
%I  12 hour (01 to 12)     %F  %Y-%m-%d                    %m  Month number (01 to 12)
%M  Minute (00 to 59)      %a  Abbreviated weekday (Sun)
%p  AM or PM               %A  Weekday (Sunday)
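
Python's strftime uses the same codes for the variables shown here, so the formats can be tried outside Splunk. A minimal sketch: the epoch value is chosen only for illustration, and UTC is used so the result does not depend on the local time zone (Splunk renders times in the user's configured time zone).

```python
from datetime import datetime, timezone

epoch = 1527879600  # 2018-06-01 19:00:00 UTC, chosen for illustration
moment = datetime.fromtimestamp(epoch, tz=timezone.utc)

hour = moment.strftime("%I:%p")          # 12-hour clock plus AM/PM
day = moment.strftime("%A")              # full weekday name
label = moment.strftime("%b %d, %I %p")  # abbreviated month, day, hour, AM/PM
```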

eval Command – strftime: Example
Scenario
SalesOps needs to know when, during the past 24 hours, hourly
retail sales were more than $100.

index=sales sourcetype=vendor_sales
| timechart span=1h sum(price) as h_sales
| search h_sales > 100
| eval h_sales ="$"+tostring(h_sales,"commas")
| eval Hour=strftime(_time, "%b %d, %I %p")
| table Hour, h_sales
| rename h_sales as "Hourly Sales"

eval Command – case Function
Scenario
SecOps found a potential virus on a user's machine. Find and
classify the number of internet visits by risk during the past 24
hours.

index=network sourcetype=cisco_wsa_squid
| eval Risk = case(x_wbrs_score >= 5,"1 Very Safe",
    x_wbrs_score >= 3,"2 Safe",
    x_wbrs_score >= 0,"3 Neutral",
    x_wbrs_score >= -5,"4 Dangerous",
    x_wbrs_score < -5, "5 Very Dangerous")
| timechart count by Risk

• case takes pairs of arguments: a Boolean expression X and the
value Y to return when X is TRUE
• Expressions are evaluated from left to right, and processing on
the row stops at the first expression that evaluates to TRUE
• If none are TRUE, the function defaults to NULL
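
The first-match rule can be modeled in Python. This is a rough sketch using the thresholds from the search above; the function name is made up, and treating a missing score as "no condition is TRUE" is an assumption about how an event without the field behaves.

```python
def risk(score):
    """Rough model of the case() call above: the first TRUE condition wins."""
    if score is None:
        return None                      # assumed: no comparison is TRUE without a value
    pairs = [
        (score >= 5, "1 Very Safe"),
        (score >= 3, "2 Safe"),
        (score >= 0, "3 Neutral"),
        (score >= -5, "4 Dangerous"),
        (score < -5, "5 Very Dangerous"),
    ]
    for condition, label in pairs:
        if condition:
            return label                 # stop at the first match
    return None                          # nothing matched: NULL

labels = [risk(s) for s in (7, 4, 0, -5, -6, None)]
```

Order matters: a score of 7 also satisfies ">= 3" and ">= 0", but only the first matching pair is used.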


case Function – Setting the Default
• If you choose not to suppress the NULL value, reset it to
something meaningful
• To do this, use 1==1, "string" as the last match condition
• Since 1==1 is always true:
– All remaining events match this condition
– "string" is applied to these events

index=network sourcetype=cisco_wsa_squid
| eval Risk = case(x_wbrs_score >= 5,"1 Very Safe",
    x_wbrs_score >= 3,"2 Safe",
    x_wbrs_score >= 0,"3 Neutral",
    x_wbrs_score >= -5,"4 Dangerous",
    x_wbrs_score < -5, "5 Very Dangerous",
    1==1, "Not Known")
| timechart count by Risk

eval Command – replace Function
• Use the replace function to replace field values at search time
using a regular expression ("RegEx")
• Very useful in masking data such as account numbers, IP
addresses, etc.
• Note that replace does not alter indexed data, but it does alter the
data in tables, charts, and reports for distribution to others
• Example: mask an IP address
| eval clientip = replace(clientip,"(\d+\.\d+)\S+","\1.xx.xx")
Example output: 201.28.xx.xx
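
The same pattern and replacement string can be tried with Python's re.sub, which also supports \1 for the first capture group. A minimal sketch: the function name and the sample address are made up, chosen so the result matches the example output above.

```python
import re

def mask_ip(clientip):
    """Keep the first two octets (capture group 1) and mask the rest."""
    return re.sub(r"(\d+\.\d+)\S+", r"\1.xx.xx", clientip)

masked = mask_ip("201.28.109.162")
```

The group captures "201.28", and \S+ consumes the remaining ".109.162", which the replacement string swaps for ".xx.xx".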
eval Command – replace Function (cont.)
Scenario
The Legal department wants to be sure customer account number data
is not exposed. Find the count of sales in the last hour by account
number. Mask the account number.

index=sales sourcetype=sales_entries
| top AcctCode
| eval AcctCode = replace(AcctCode,"(\d{4}-).*","\1xxxx")

• Sourcetype "sales_entries" contains a field, AcctCode, in the
form dddd-dddd
• Using regex:
– Replace account code by retaining the first 4 digits and the hyphen
(\d{4}-)
– Replace what follows (.*) with xxxx
• Placing eval before top can be less efficient
– All events, not just the 10 values resulting from top, are masked
eval Command – replace Function Example

Note
While the field in the report is
masked, the field is not masked in
the raw event.

eval Command – replace Function Example (cont.)
Scenario
Show sales information for the 3 best-selling products of the last 24
hours. Mask the middle octets of the customer IP address.

index=web sourcetype=access_combined
| chart sum(price) as totalSales over clientip by product_name
    limit=3 useother=f
| eval clientip =
    replace(clientip, "(\d+\.)\d+\.\d+(\.\d+)","\1xxx.xxx\2")
| fillnull

• limit=x - number of products to display
– Splunk automatically uses the highest values
• Mask the middle clientip octets with xxx

Counting Specific Field Values - Review
Scenario
Report uptime percentage for the last 7 days using digital field
values.

index=network sourcetype=cisco_wsa_squid
| timechart span=1d
    count(eval(s_hierarchy="DIRECT")) as Up,
    count(eval(s_hierarchy="NONE")) as Down
| eval Up%=tostring((Up/(Up+Down))*100,"commas") + "%"
| rename _time as Day
| eval Day = strftime(Day, "%m/%d, %a")

Note
The values "DIRECT" and "NONE" are case sensitive.

Lab Exercise 4
Time: 30-40 minutes
Tasks:
• Chart the products that have sold more than 50 units during the past week by product and
category
• List the failed attempts over the last 4 hours for accounts with user names similar to admin
• Graph the percentage of HTTP server errors that occurred on the e-commerce servers
over the previous week
• Evaluate and classify the size of events on the web servers during the last 24 hours as a
pie chart
• Display a count of sales over the last 4 hours and mask the vendors’ customer account
codes
**Challenge:
• Mask the values in both the raw event and the AcctCode field at search time
Module 5:
Additional Data
Manipulation Techniques

Module Objectives
Create tables and charts using the following commands:
• addtotals
• untable
• append
• appendcols

Review: addtotals Command
Scenario
List the top 5 internal internet users over the last 24 hours by type
of usage.

index=network sourcetype=cisco_wsa_squid
| chart count over cs_username by usage
| addtotals
| sort 5 -Total

• Use addtotals to sum the numeric columns across each row
• Chart the internet connections by user, then by usage, and use
addtotals to calculate the total connections, which is used for
sorting and filtering results

Calculating Highest X Values in a Data Series
Scenario
A sales campaign manager wants to know yesterday's 5 best-selling
products, and of those products, who are the 3 most active customers.

index=web sourcetype=access_combined action=purchase
| chart sum(price) by clientip, product_name limit=5 useother=f
| addtotals
| sort 3 -Total
| fields - Total

Calculating Lowest X Values in a Data Series
Scenario
A sales campaign manager wants the 5 worst-selling products, and of
those products, who are the 3 most active customers.

index=web sourcetype=access_combined action=purchase
| chart sum(price) by product_name, clientip limit=0
| addtotals
| sort 5 Total
| fields - Total
| untable product_name, clientip, count
| xyseries clientip, product_name, count
| addtotals
| sort 3 -Total
| fields - Total

• The untable command converts results from a tabular format to a
format similar to stats command output
• It is the inverse of xyseries
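
A rough Python model of untable: each cell of the wide table becomes one row with three fields. The function mirrors the three arguments used in the search above; the sample table is made up for illustration.

```python
def untable(rows, x_field, y_name_field, y_data_field):
    """Rough model of: untable x-field y-name-field y-data-field"""
    out = []
    for row in rows:
        for column, value in row.items():
            if column == x_field:
                continue
            # The column name moves into y-name-field, the cell into y-data-field
            out.append({x_field: row[x_field],
                        y_name_field: column,
                        y_data_field: value})
    return out

# Hypothetical chart output: one row per product, one column per clientip
wide = [
    {"product_name": "Game A", "10.0.0.1": 25.0, "10.0.0.2": 5.0},
    {"product_name": "Game B", "10.0.0.1": 12.0},
]
long_rows = untable(wide, "product_name", "clientip", "count")
```

Once the data is back in this row-per-pair form, it can be re-pivoted with a different field on the rows, which is what the search above does with xyseries clientip, product_name, count.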

append Command
• Use the append command to
combine dissimilar searches
into a unified result set
• append command only runs
over historical data
– If used in a real-time search, does not produce correct results

append Command – Example
Scenario
SecOps noticed an increase in penetration attempts. Compare the
number of password failures for known users vs. unknown users over
the last 4 hours.

index=security sourcetype=linux_secure "failed password" NOT "invalid user"
| timechart count as known_users
| append [search index=security sourcetype=linux_secure "failed password"
    "invalid user"
    | timechart count as unknown_users]

• append adds the sub-search, which charts the count of unknown users over time
• Visualizations, such as Line Chart, overlay the _time-based output

append Command – Example (cont.)
Scenario
SecOps noticed an increase in penetration attempts. Compare the
number of password failures for known users vs. unknown users over
the last 4 hours.

index=security sourcetype=linux_secure "failed password" NOT "invalid user"
| timechart count as known_users
| append [search index=security sourcetype=linux_secure "failed password"
    "invalid user"
    | timechart count as unknown_users]

The Statistics tab does NOT show the timechart output with an overlay

append Command – Example (cont.)
Scenario
SecOps noticed an increase in penetration attempts. Compare the
number of password failures for known users vs. unknown users over
the last 4 hours.

index=security sourcetype=linux_secure "failed password" NOT "invalid user"
| timechart count as known_users
| append [search index=security sourcetype=linux_secure "failed password"
    "invalid user"
    | timechart count as unknown_users]
| timechart first(*) as *

timechart first(*) as * overlays the search results in the Statistics tab

Another Way to Do It: appendcols
Scenario
SecOps noticed an increase in penetration attempts. Compare the
number of password failures for known users vs. unknown users over
the last 4 hours.

index=security sourcetype=linux_secure "failed password" NOT "invalid user"
| timechart count as known_users
| appendcols
    [search index=security sourcetype=linux_secure "failed password"
    "invalid user"
    | timechart count as unknown_users ]

appendcols overlays the search results in one step

Note
When using appendcols, results are only valid if there are NO
missing values for the "by field" of both datasets.
appendcols and Visualizations
Scenario
SecOps noticed an increase in penetration attempts. Compare the
number of password failures for known users vs. unknown users over
the last 4 hours.

index=security sourcetype=linux_secure "failed password" NOT "invalid user"
| timechart count as known_users
| appendcols
    [search index=security sourcetype=linux_secure "failed password"
    "invalid user"
    | timechart count as unknown_users ]

appendcols causes the visualization to be formatted correctly

append Command – Caveat
Scenario
SecOps noticed an increase in penetration attempts. Compare the number of
password failures for known users vs. unknown users over the last 4 hours.

index=security sourcetype=linux_secure "failed password" NOT "invalid user"
| timechart count as known_users
| append [search index=security sourcetype=linux_secure "failed password" "invalid user"
    | timechart count as unknown_users ]
| rename _time as Day
| eval Day = strftime(Day, "%I")

If the _time field is modified, Splunk may no longer recognize it as _time-based,
and the visualization may not be formatted as expected
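
One possible way to avoid the caveat, offered as a sketch rather than as the course's recommended fix, is to leave _time untouched and change only how it is displayed. fieldformat changes the rendering of a field in the results table without changing the underlying value, so the field is still a real _time. The "%I %p" format string is an illustrative choice.

```spl
index=security sourcetype=linux_secure "failed password" NOT "invalid user"
| timechart count as known_users
| append [search index=security sourcetype=linux_secure "failed password" "invalid user"
    | timechart count as unknown_users]
| timechart first(*) as *
| fieldformat _time = strftime(_time, "%I %p")
```

Whether the chart axis looks the way you want should still be checked in your own environment, since the visualization works from the stored value and not from the formatted text.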

Caution!

index=web sourcetype=access_combined action=purchase
earliest=-1d@d latest=@d
| stats sum(price) as "Yesterday" by product_name
| append [search index=web sourcetype=access_combined
    action=purchase earliest=-1h latest=now
    | stats sum(price) as "Previous Hour" by product_name]
| stats first(*) as * by product_name
| table product_name, Yesterday, "Previous Hour"

vs.

index=web sourcetype=access_combined action=purchase
earliest=-1d@d latest=@d
| stats sum(price) as "Yesterday" by product_name
| appendcols [search index=web sourcetype=access_combined
    action=purchase earliest=-1h latest=now
    | stats sum(price) as "Previous Hour" by product_name]
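
A third option, sketched here as an assumption and not taken from the course, avoids the subsearch altogether: search both time ranges once, label each event with the period it falls in, and let chart build the columns. The field name period is hypothetical.

```spl
index=web sourcetype=access_combined action=purchase earliest=-1d@d latest=now
| eval period=case(_time < relative_time(now(), "@d"), "Yesterday",
                   _time >= relative_time(now(), "-1h"), "Previous Hour")
| where isnotnull(period)
| chart sum(price) over product_name by period
```

Because rows are keyed on product_name, a product missing from one period simply gets an empty cell instead of shifting other rows. Note that during the first hour after midnight the two periods overlap, and case assigns those events to "Yesterday" only.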

Lab Exercise 5
Time: 30-40 minutes
Tasks:
• List retail sales by product and category for the last 24 hours with the total
sales for each product
• Provide a line chart for the last 7 days that shows online sales and lost sales
for each day
• List the top 10 employees, who have the most non-business related Internet
connections during the previous week; report number and type of visits with
10+ connection types
**Challenge:
• For the last 24 hours, report the 3 most active product categories and, of
those 3:
– The 5 best-selling products
– The 5 worst-selling products
Module 6:
Using Advanced Transactions

Module Objectives
• Find the events logged before or after a particular event occurs
• Identify complete vs. incomplete transactions
• Analyze transactions

Transactions – Review
• A transaction is any group of conceptually related events
• The transaction command enables you to specify the criteria used
to determine how to group the events, for example:
– Ranges of time, e.g., maxspan, maxpause
– Number of events, e.g., maxevents (the resulting eventcount field reports
how many events were grouped)
– Text contained in the first and/or last event in a transaction, e.g.,
startswith, endswith
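
The criteria above can be combined in one command. The following is a minimal sketch that assumes the web data used elsewhere in this course; the 30-minute, 5-minute, and 20-event limits are illustrative values, not recommendations.

```spl
index=web sourcetype=access_combined
| transaction JSESSIONID maxspan=30m maxpause=5m maxevents=20
    startswith=(action="addtocart") endswith=(action="purchase")
| table _time, JSESSIONID, duration, eventcount
```

The duration and eventcount fields are added by the transaction command, which makes them convenient for checking that the limits behave as intended.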

Transactions – Review (cont.)
• The events can come from multiple data sources
– Events related to a single purchase from an online store can span
across an application server, database, e-commerce engine
– One email message can create multiple events as it travels through
various queues
[Figure: individual events grouped into a transaction]

Transactions – Review (cont.)
• transaction is resource intensive; stats is always faster
• Use transaction when stats is not sufficient, for
example:
– When field values are not enough to group events as desired, the
transaction command and its options can be useful
– To keep the raw data associated with each event, discarded by stats

Evaluating Events to Create a Transaction
• When a base search directly precedes the transaction command, the
search results are returned in reverse chronological order
– The transaction command requires that events be returned in reverse
chronological order

index=web sourcetype=access_combined
| transaction JSESSIONID endswith=(status=503) maxevents=5

[Figure: search results listed from later (top) to earlier (bottom)]

Evaluating Events to Create a Transaction (cont.)
• If commands that precede transaction change the event ordering,
then the transactions are not guaranteed to be formed correctly
– Note that no error message is returned
• To correct this, use the sort command – for example:
| sort -_time
• The sort command must occur immediately before the transaction
command to reorder the search results in reverse chronological order

index=web sourcetype=access_combined
<other commands>
| sort -_time
| transaction JSESSIONID endswith=(status=503) maxevents=5

Finding Events that Occur Before an Event
Scenario
SecOps is trying to track an issue with the web servers. Over the last
24 hours, find a 503 error and display 4 related events before it.

index=web sourcetype=access_combined
| transaction JSESSIONID endswith=(status=503) maxevents=5

• Form transactions based on common JSESSIONID values that end with an
event containing a status of 503
• Include up to 4 previous events leading up to the 503 error (maxevents=5)

Finding Events that Occur After an Event
Scenario
SecOps is trying to track an issue with the web servers. Over the last 24
hours, find a 200 status code and display 3 related events after it.

index=web sourcetype=access_combined
| transaction JSESSIONID startswith=(status=200) maxevents=4

Handling Common Values/Different Field Names
Scenario
A supervisor found sales ads on the printer. Find employees who are shopping
from the online store while at work during the last 60 minutes.

(index=network sourcetype=cisco_wsa_squid)
OR (index=web sourcetype=access_combined (action=addtocart OR action=purchase))
| eval uniqueIP = coalesce(clientip,c_ip)
| eval Actions=if(sourcetype="access_combined", action, null())
| eval Employee=if(sourcetype="cisco_wsa_squid", user, null())
| transaction uniqueIP
| stats list(sourcetype) as sourceType, list(Actions) as Actions,
    dc(sourcetype) as Count by Employee, uniqueIP
| search Count > 1

• Use coalesce to normalize field names: coalesce(clientip,c_ip)
– coalesce returns the first non-null value of clientip and c_ip, so both
are stored as a single field
• Use eval to store the result from coalesce as a new field, uniqueIP
• Create a transaction based on the new field: transaction uniqueIP

Best Practice
For frequently used fields, instead of the coalesce function, use field
aliases to normalize field names.
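
To see what coalesce does in isolation, the sketch below generates two test rows with makeresults. The IP addresses are made up for illustration; only the field names clientip, c_ip, and uniqueIP come from the scenario.

```spl
| makeresults count=2
| streamstats count as row
| eval clientip=if(row=1, "10.1.1.1", null()),
       c_ip=if(row=2, "10.2.2.2", null())
| eval uniqueIP=coalesce(clientip, c_ip)
| table row, clientip, c_ip, uniqueIP
```

Each row has a value in only one of the two source fields, and uniqueIP picks up whichever one is present.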

Identifying Complete vs. Incomplete Transactions
Scenario
TechOps needs to identify the success rate of transactions on their production
web servers over the last 24 hours. Evaluate transactions and provide total
attempted, total successful, and percent completed.

index=web sourcetype=access_combined
| transaction JSESSIONID startswith=(action="addtocart")
    endswith=(action="purchase") keepevicted=1
| search action="addtocart"
| stats count(eval(closed_txn=0)) as "Total Failed",
    count(eval(closed_txn=1)) as "Total Successful",
    count as "Total Attempted"
| eval "Percent Completed" = round(((('Total Attempted'
    - 'Total Failed')*100)/'Total Attempted'),0) . "%"
| table "Total Attempted","Total Successful","Percent Completed"
Identifying Complete vs. Incomplete Transactions (cont.)
• Evaluate transactions based on JSESSIONID that:
– Begin with startswith=(action="addtocart")
– Finish with endswith=(action="purchase")
• A completed transaction meets these criteria

Identifying Complete vs. Incomplete Transactions (cont.)
• An incomplete transaction is sometimes referred to as "evicted"
• To report on incomplete transactions, specify keepevicted=1
– This setting retains transactions where one or both of the beginning or ending
criteria are not satisfied
• Use the closed_txn field to determine if a transaction is complete or incomplete
Identifying Complete vs. Incomplete Transactions (cont.)
• If a transaction completes ("closes") successfully, closed_txn is set to 1
• If a transaction did not complete successfully, closed_txn is set to 0
– In the transaction command, setting keepevicted to 1 causes Splunk to keep
data from transactions that did not complete successfully
– These are sometimes referred to as failed transactions
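
As a quick way to inspect closed_txn, the sketch below labels each transaction and summarizes both groups. The outcome, transactions, and avg_duration_secs field names are hypothetical; the rest follows the scenario search.

```spl
index=web sourcetype=access_combined
| transaction JSESSIONID startswith=(action="addtocart")
    endswith=(action="purchase") keepevicted=1
| search action="addtocart"
| eval outcome=if(closed_txn=1, "complete", "incomplete")
| stats count as transactions, avg(duration) as avg_duration_secs by outcome
```

Without keepevicted=1, the "incomplete" row would not appear, because evicted transactions are discarded before the eval runs.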
Identifying Complete vs. Incomplete Transactions (cont.)
• A transaction is said to be closed if one of these conditions is met:
maxevents, maxpause, maxspan, startswith, endswith
• If no conditions are specified, all transactions are output, even when the
transactions are not closed
• A transaction can also be evicted when the memory limitations are reached
Identifying Complete vs. Incomplete Transactions (cont.)
• The | search action="addtocart" step reduces the search results to
transactions that include an addtocart event

Identifying Complete vs. Incomplete Transactions (cont.)
• Evaluate the transactions and count completed (closed_txn=1) vs
failed (closed_txn=0)
• Count all transactions and calculate percentage
Comparing Transactions – Performance
Scenario
TechOps needs to identify the success rate of transactions on their production
web servers over the last 24 hours. Evaluate transactions and provide total
attempted, total successful, and percent completed.

index=web sourcetype=access_combined (action=purchase OR action=addtocart)
| transaction JSESSIONID startswith=(action="addtocart")
    endswith=(action="purchase") keepevicted=1
| search action="addtocart"
| stats count(eval(closed_txn=0)) as "Total Failed",
    count(eval(closed_txn=1)) as "Total Successful",
    count as "Total Attempted"
| eval "Percent Completed" = round(((('Total Attempted'
    - 'Total Failed')*100)/'Total Attempted'),0) . "%"
| table "Total Attempted","Total Successful","Percent Completed"

• Make the search before transaction as efficient as possible
• In this example, only the purchase and addtocart actions are needed
• In our training environment, over 30 days:
– Previous example searched: 108,529 events in 17.7 secs
– This example searched: 31,858 events in 4.1 secs
Extrapolate for a production environment
Transactions Are Expensive
• Creating transactions is a resource-intensive process
• When the transaction command is issued, the search is forced
to execute in verbose mode
– This is because transactions require all of the event data
• Especially on very large-scale deployments, try means other than
the transaction command to correlate data

Transactions Are Expensive (cont.)
• This search executes in Smart Mode:

• This search executes in Verbose Mode:

• The execution time is the same (in our training environment, on
average, about 0.35 seconds)
stats and Transactions
Scenario
SalesOps is looking at the online transactions of a recent sale. Provide the
aggregate statistics on customer transactions during the last 24 hours.

index=web sourcetype=access_combined
(action="purchase" OR action="addtocart")
| stats range(_time) as duration by JSESSIONID
| where duration > 0
| stats count as "Number of Events",
    min(duration) as "Minimum Duration",
    max(duration) as "Maximum Duration",
    avg(duration) as "Average Duration",
    median(duration) as "Median Duration",
    perc95(duration) as "95th Percentile"
| foreach Ave*, Max* [eval <<FIELD>> =
    tostring('<<FIELD>>',"commas")]

• stats is more efficient at computing aggregate statistics on transactions
defined by data in a single field
• JSESSIONID is used here as well
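
The same idea can be applied to the completion-rate scenario from earlier in this module. The sketch below is an approximation under stated assumptions, not a drop-in replacement: it treats each JSESSIONID as at most one attempt and ignores event order, so its totals may differ from the transaction-based search. The adds and purchases field names are hypothetical.

```spl
index=web sourcetype=access_combined (action="addtocart" OR action="purchase")
| stats count(eval(action="addtocart")) as adds,
        count(eval(action="purchase")) as purchases by JSESSIONID
| where adds > 0
| stats count as "Total Attempted",
        count(eval(purchases > 0)) as "Total Successful"
| eval "Percent Completed" = round('Total Successful' * 100 / 'Total Attempted', 0) . "%"
```

If a session can contain several separate add-to-cart sequences, the transaction command (or additional grouping logic) is still needed to tell them apart.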

Lab Exercise 6
Time: 20-25 minutes
Tasks:
• Find related events that end with an HTTP status error
• Find related events that begin with an HTTP status error
• Search online sales and proxy server data yesterday for failed transaction
codes
**Challenges:
• Filter transactions to only display transactions with common failed status
codes from both source types
• Report the ratio of customer transactions that begin with an “add to cart”
action and end with a purchase compared to transactions that begin with an
“add to cart” action and do not end with a purchase
Module 7:
Working with Time

Module Objectives
• Use time modifiers
• Search for events using custom time ranges
• Search for events within a window of time
• Display and use relative dates

Reviewing Time
• Override the time picker with the earliest and latest time modifiers
• With earliest and latest, specify relative times, using:
– Character strings to indicate the amount and unit of time
– (+/-) to indicate the offset from the current time

• Optionally, you can "snap to" a unit of time:
[+|-]<time_integer><time_unit>@<snap_time_unit>
• Example: Find events containing "error" that occurred from yesterday
(snapped to midnight) to the last hour today (snapped to the hour)
error earliest=-1d@d latest=-1h@h
Reviewing Time – Snapping
• Snapping always rounds down, not up
– Splunk snaps backward to a previous time, not after the specified time
• Examples:
– If it is 11:59:00 and you snap to hours (@h), you will snap to 11:00, not
12:00
– To "snap to" a specific day of the week, use @w0 for Sunday, @w1 for
Monday, etc.
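
Snapping can be checked without searching any events. The sketch below uses makeresults and the relative_time function to show the current time next to three snapped values; the output field names are hypothetical.

```spl
| makeresults
| eval current_time = strftime(now(), "%a %Y-%m-%d %H:%M:%S")
| eval snap_to_hour = strftime(relative_time(now(), "@h"), "%a %Y-%m-%d %H:%M:%S")
| eval snap_to_sunday = strftime(relative_time(now(), "@w0"), "%a %Y-%m-%d %H:%M:%S")
| eval snap_to_monday = strftime(relative_time(now(), "@w1"), "%a %Y-%m-%d %H:%M:%S")
| table current_time, snap_to_hour, snap_to_sunday, snap_to_monday
```

Every snapped value is at or before current_time, which matches the rule that snapping rounds down.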

Reviewing Time – Values
• Current date & time: now
• Second: s, sec, secs, second, seconds
• Minute: m, min, minute, minutes
• Hour: h, hr, hrs, hour, hours
• Day: d, day, days
• Week: w, week, weeks
• Days of the week: w0 or w7 (Sunday), w1 (Monday) through w6 (Saturday)
• Month: mon, month, months
• Quarter: q, qtr, qtrs, quarter, quarters
• Year: y, yr, yrs, year, years
timewrap Overview
The timewrap command:
• Displays the output of the timechart command, so that each time
period is a separate series
• Can compare data over a specific time period, such as day-over-day or
month-over-month

Each line is a separate series

timewrap Syntax and Example
• Syntax:
timewrap timewrap-span
• timewrap-span can be second, minute, hour, day, week,
month, quarter or year
• For example:
timewrap 1w
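
Because timewrap operates on timechart output, it is placed directly after a timechart command. A minimal sketch, assuming the security index used elsewhere in this course, compares three days hour by hour:

```spl
index=security "failed password" earliest=-3d@d latest=@d
| timechart span=1h count as Failures
| timewrap 1d
```

The time range covers three whole days and the wrap span is one day, so the result contains one series per day.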

Displaying Data Week Over Week
Scenario
Compare the number of password failures over the last week to password
failures over the previous week.

index=security "failed password"
earliest=-14d@d latest=@d
| timechart span=1d count as Failures
| timewrap 1w
| rename _time as Day
| eval Day = strftime(Day, "%A")

• Earliest to latest spans 14 days, i.e., 2 weeks
• Specifying 1w with timewrap and using the line chart visualization
causes two lines to be displayed

Displaying Data Week Over Week (cont.)
Scenario
Compare the number of password failures over the last three weeks.

index=security "failed password"
earliest=-21d@d latest=@d
| timechart span=1d count as Failures
| timewrap 1w
| rename _time as Day
| eval Day = strftime(Day, "%A")

• Add additional lines to the chart by adding additional periods to the search
• For example, adding an extra week to the search that was shown earlier
adds a line to the line chart

What Is "Normal"?
Scenario
What is the pattern of password failures over the last 24 hours?

index=security "failed password"
earliest=-24h@h latest=@h
| timechart count span=1h

• Finding failures per hour is easy
• However, it is not clear if these counts of password failures are higher
or lower than normal

What Is "Normal"? (cont.)
Scenario
What is the average number of failures per hour for the last 24 hours?

index=security "failed password"
earliest=-24h@h latest=@h
| timechart count span=1h
| stats avg(count) as HourlyAverage

• Get the average hourly count
• Use this value as "normal"
• Compare actual hourly values to normal

Comparing Hourly Data to an Average
Scenario
What is the pattern of password failures over the last 24 hours, compared
to the hourly average?

index=security "failed password"
earliest=-24h@h latest=@h
| timechart count span=1h
| eventstats avg(count) as HourlyAverage

• This is better but could still be improved
• For example, suppose that today's average is not normal
• It would be better to compare to an hourly average, computed over
the last month
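
Once eventstats has added HourlyAverage to every row, each hour can be compared with it directly. The following sketch extends the search above; the Difference and AboveAverage field names are hypothetical.

```spl
index=security "failed password" earliest=-24h@h latest=@h
| timechart count span=1h
| eventstats avg(count) as HourlyAverage
| eval Difference = round(count - HourlyAverage, 1)
| eval AboveAverage = if(count > HourlyAverage, "yes", "no")
```

This works because eventstats keeps the original rows, unlike stats, which would collapse them into a single average.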
Getting a Monthly Average
Scenario
What is the number of password failures for each hour of the day, averaged
over the last 30 days?

index=security "failed password"
earliest=-30d@d latest=@d
| timechart span=1h count as HourlyCount
| eval Hour = strftime(_time,"%H")
| stats avg(HourlyCount) as AvgPerHour by Hour

• This provides a better idea of "normal"
• However, it would be helpful to overlay averages computed
over the last month with today's hourly password failures

For example, the average failures at hour 16 (4 PM) over the last month is
approximately 428.03333.
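
Long decimal averages such as the one above can be tidied with the round function. This is a small optional sketch; rounding to one decimal place is an arbitrary choice.

```spl
index=security "failed password" earliest=-30d@d latest=@d
| timechart span=1h count as HourlyCount
| eval Hour = strftime(_time,"%H")
| stats avg(HourlyCount) as AvgPerHour by Hour
| eval AvgPerHour = round(AvgPerHour, 1)
```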

Comparing Hourly Data to a Monthly Average
Scenario
What is the number of password failures for each hour of the day, averaged
over the last 30 days?

index=security "failed password"
earliest=-30d@d latest=now
| eval StartTime=relative_time(now(),"@d")
| eval Series=
    if(_time>=StartTime,"today","prior")
| timechart span=1h sum(eval(linecount>0)) by Series
| eval Hour = strftime(_time,"%H")
| stats avg(prior) as Average,
    sum(today) as Today by Hour

Comparing Hourly Data to a Monthly Average (cont.)
A • Using the relative_time
index=security "failed password"
function, the search calculates earliest=-30d@d latest=now
A | eval StartTime=relative_time(now(),"@d")
the time for 1 day ago
| eval Series=
– This value is stored as B if(_time>=StartTime,"today","prior")
StartTime
B • By comparing an event's _time to the StartTime, you can determine if the
event happened in the last 24 hours (today) or earlier this month (prior)
– For each event, this is
stored as Series

Comparing Hourly Data to a Monthly Average (cont.)
C • In the timechart, exclude hours index=security "failed password"
after the current hour earliest=-30d@d latest=now
| eval StartTime=relative_time(now(),"@d")
– linecount is a default field that | eval Series=
if(_time>=StartTime,"today","prior")
contains the number of lines for an | timechart span=1h sum(eval(linecount>0)) by Series
event C

– For hours after the current hour, the


linecount = 0
• Because timechart
specifies by Series, and
Series can have two More formatting needs to be
values, two lines are done to create a visualization
created that shows data hour-by-hour

Comparing Hourly Data to a Monthly Average (cont.)
• Recall that the goal here is to look index=security "failed password"
at data for each hour of the day earliest=-30d@d latest=now
| eval StartTime=relative_time(now(),"@d")
D • Therefore, use strftime to | eval Series=
if(_time>=StartTime,"today","prior")
calculate an Hour for each event | timechart span=1h sum(eval(linecount>0)) by Series
| eval Hour = strftime(_time,"%H") D
• Finally, calculate for each hour | stats avg(prior) as Average, E
sum(today) as Today by Hour
(by Hour) the: F

E – Average number of
failures for the month
prior to 24 hours ago
(Average)
F – Sum of today's
password failures
(Today)
Other Calculations
• Instead of calculating an average per hour for the month, use a different statistical function
• For example, to show the 80th percentile per hour, substitute the perc80 function for the avg function

index=security "failed password" earliest=-30d@d latest=now
| eval StartTime=relative_time(now(),"@d")
| eval Series=if(_time>=StartTime,"today","prior")
| timechart span=1h count by Series
| eval Hour = strftime(_time,"%H")
| stats perc80(prior) as "80th Percentile", sum(today) as Today by Hour

Searching with Custom Time Ranges
Scenario
A new campaign aimed at early morning sales is ongoing. Display early morning retail sales for 12-6 am yesterday.

index=sales sourcetype=vendor_sales earliest=-1d@d latest=-1d@d+6h
| timechart span=1h sum(price) as hourlySales
| eval Hour=strftime(_time, "%b %d, %I %p")
| eval "Hourly Sales"="$"+tostring(hourlySales,"commas")
| table Hour, "Hourly Sales"
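
The two formatting steps in this search (a readable hour label and a dollar amount with thousands separators) can be previewed in Python. This is a rough sketch under assumptions: format_row is a hypothetical helper, the sample values are made up, and Python's number formatting is only an approximation of what tostring(X,"commas") produces in Splunk.

```python
from datetime import datetime


def format_row(bucket_start, hourly_sales):
    """Build display strings for one hourly row."""
    # Same format string as the strftime call in the search.
    hour_label = bucket_start.strftime("%b %d, %I %p")
    # Thousands separators with two decimals, prefixed by a dollar sign.
    sales_label = "$" + format(hourly_sales, ",.2f")
    return hour_label, sales_label


row = format_row(datetime(2018, 5, 31, 2, 0), 1234.5)
print(row)
```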

What are Default Time Fields?
• Events with timestamp information have date_* fields, which are the time/date values taken directly from the events themselves
– date_* fields are only generated for events that include timestamps in the raw event
– Events from scripted inputs and network inputs might not have these fields
– Provide additional searchable granularity to event timestamps
• The default time fields are:
date_second date_minute date_hour
date_mday date_wday date_month
date_year date_zone

Warning
The default time fields do not represent time zone conversions or values changed at indexing time.

Searching for Events within Windows of Time
Scenario
A new campaign aimed at early morning sales is ongoing. Display early morning retail sales for 2-5 am for the previous two days.

index=sales sourcetype=vendor_sales earliest=-2d@d latest=@d date_hour>=2 AND date_hour<5
| bin span=1h _time
| stats sum(price) as "Hourly Sales" by _time
| eval Hour=strftime(_time, "%b %d, %I %p")
| table Hour, "Hourly Sales"

• earliest=-2d@d
– Begin at the start of the day two days ago; on its own, this range would also include today
• latest=@d
– Excludes events from the current day
• date_hour>=2 AND date_hour<5
– Find events between 2 am and 5 am (no time zone adjustment)
Checking Your Data
Scenario
A new campaign aimed at early morning sales is ongoing. Display early morning retail sales for 2-5 am for the previous two days.

index=sales sourcetype=vendor_sales earliest=-2d@d latest=@d date_hour>=2 AND date_hour<5
| bin span=1h _time
| stats sum(price) as "Hourly Sales" by _time
| eval Hour=strftime(_time, "%b %d, %I %p")
| table Hour, "Hourly Sales"

[Screenshots: what you see vs. what you expected to see]

Remember Time Zones!
Scenario
A new campaign aimed at early morning sales is ongoing. Display early morning retail sales for 2 to 5 am for the previous two days.

index=sales sourcetype=vendor_sales earliest=-2d@d latest=@d date_hour>=2 AND date_hour<5
| bin span=1h _time
| stats sum(price) as "Hourly Sales" by _time
| eval Hour=strftime(_time, "%b %d, %I %p")
| table Hour, "Hourly Sales"

• Remember, the date_* fields (such as date_hour) do not reflect your local time; they hold the time/date values taken directly from the raw events
• To determine the time of your server:
1. In Account Settings, set Time Zone to Default System Timezone
2. Run a search over the last 15 minutes
3. Read the event timestamps and compare with your local time
Using strftime To Work with Time Zones
• Many organizations that span multiple time zones normalize their data to UTC (Coordinated Universal Time)
• In this case, it may be useful to display data with the user's time zone preference
• Using the %H argument, the strftime function:
– Gets the hour from the event
– Converts the hour into your local time, based on your time zone setting

index=sales sourcetype=vendor_sales earliest=-d@d latest=@d
| eval my_hour =strftime(_time,"%H")
| table my_hour, date_hour
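
The difference between an hour stored as text in the event and an hour derived from the epoch timestamp can be illustrated in Python. This is a minimal sketch, not Splunk code: local_hour and in_early_morning_window are hypothetical helpers, and the user's time zone is modeled as a fixed UTC offset for simplicity (no daylight saving rules).

```python
from datetime import datetime, timedelta, timezone


def local_hour(epoch_seconds, utc_offset_hours):
    """Hour of day for an epoch timestamp, as seen from a fixed UTC offset."""
    user_tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(epoch_seconds, tz=user_tz).hour


def in_early_morning_window(epoch_seconds, utc_offset_hours):
    """True if the event falls between 2 am and 5 am in the user's time zone."""
    return 2 <= local_hour(epoch_seconds, utc_offset_hours) < 5


# An event stamped 10:30 UTC is a 3 am event for a user at UTC-7.
event_time = datetime(2018, 6, 1, 10, 30, tzinfo=timezone.utc).timestamp()
print(local_hour(event_time, 0))
print(local_hour(event_time, -7))
print(in_early_morning_window(event_time, -7))
```

The same epoch value yields a different hour for each viewer, which is why filtering on a computed hour follows the user's time zone while filtering on the raw event text does not.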
Using strftime To Work with Time Zones (cont.)
Scenario
A new campaign aimed at early morning sales is ongoing. Display early morning retail sales for 2-5 am for the previous two days.

index=sales sourcetype=vendor_sales earliest=-2d@d latest=@d
| eval my_hour=tonumber(strftime(_time,"%H"))
| search my_hour>=2 AND my_hour<5
| bin span=1h _time
| stats sum(price) as "Hourly Sales" by _time
| eval Hour=strftime(_time, "%b %d, %I %p")
| table Hour, "Hourly Sales"

This is the same search shown earlier, modified to work in any time zone

Using makeresults to Display Relative Dates
Scenario
A sales manager wants a dashboard for sales activity during the prior month, displaying relative dates.

| makeresults
| eval previous_month_begin = relative_time(now(),"-1mon@mon")
| eval "Previous Month Start" = strftime(previous_month_begin, "%B %d, %Y")
| eval previous_month_end = relative_time(now(),"@mon-1m")
| eval "Previous Month End" = strftime(previous_month_end, "%B %d, %Y")
| table "Previous Month Start", "Previous Month End"

• makeresults
– Generates one result with only the _time field
– Enables you to add new fields to the event with one or more eval commands

Using makeresults to Display Relative Dates (cont.)
Scenario
A sales manager wants a dashboard for sales activity during the prior month, displaying relative dates.

| makeresults
| eval previous_month_begin = relative_time(now(),"-1mon@mon")
| eval "Previous Month Start" = strftime(previous_month_begin, "%B %d, %Y")    A
| eval previous_month_end = relative_time(now(),"@mon-1m")
| eval "Previous Month End" = strftime(previous_month_end, "%B %d, %Y")    B
| table "Previous Month Start", "Previous Month End"

• Using eval, find the start and end of the previous month
A – relative_time(now(),"-1mon@mon") goes back one month and snaps to the start of that month
B – relative_time(now(),"@mon-1m") snaps to the start of the current month and then subtracts one minute, which lands on the last day of the previous month
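
The snap-then-offset arithmetic behind the two relative_time calls can be reproduced in Python as a sanity check. This is a sketch of the date logic only, assuming "m" means minutes and "mon" means months as in the search; previous_month_bounds is a hypothetical helper name.

```python
from datetime import datetime, timedelta


def previous_month_bounds(now):
    """Return (start, end) of the month before `now`, to the minute."""
    # "@mon": snap to the first instant of the current month.
    start_of_this_month = now.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
    # "@mon-1m": one minute earlier is 23:59 on the previous month's last day.
    previous_month_end = start_of_this_month - timedelta(minutes=1)
    # "-1mon@mon": first instant of the previous month.
    previous_month_begin = previous_month_end.replace(day=1, hour=0, minute=0)
    return previous_month_begin, previous_month_end


begin, end = previous_month_bounds(datetime(2018, 6, 1, 9, 30))
print(begin.strftime("%B %d, %Y"))
print(end.strftime("%B %d, %Y"))
```

Because the end is computed by subtracting from the start of the current month, month lengths and year boundaries need no special handling.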

Using makeresults to Display Relative Dates (cont.)
• This search, by itself, is not particularly interesting
• However, when used in a dashboard, it can be useful

Lab Exercise 7
Time: 35-40 minutes
Tasks:
• Compare the total daily online sales, week over week, for the previous four weeks, excluding the present week, for the Buttercup Games product "Puppies vs. Zombies"
• Display the non-business internet activity during business hours yesterday, in 1-hour increments
• Determine the pattern of server errors on the web server over the last week, compared to the daily average for the past four months
• Display daily online sales and the average sale amount between the hours of 9 AM and 5 PM from Monday through Friday of last week
Challenge:
• Display the number of yesterday's password failures and the daily average during the previous 30 days
Module 8:
Using Advanced Lookups

Module Objectives
• Include events based on values in a lookup table
• Exclude events based on values in a lookup table
• Build a baseline lookup table and reference the baseline values
in alerts

Lookups – Review
• Previously, you learned how to:
– Create, define, and use lookup tables
– Configure an automatic lookup
– Configure a time-based lookup
– Use the lookup in searches and reports

Using Lookup Tables to Include Values
Scenario
SecOps constantly monitors for
compromised accounts. Display
known users with more than 3
password failures during the last
60 minutes.

Using Lookup Tables to Include Values (cont.)
Scenario
SecOps constantly monitors for compromised accounts. Display known users with more than 3 password failures during the last 60 minutes.

index=security sourcetype=linux_secure fail*
((user=admin) OR (user=mailman) OR (user=root) OR
(user=sflaemmchen) OR (user=hsham) OR
(user=spahkthecah) OR (user=sscallion) OR … OR (user=fullian))
| stats values(src_ip) as src_ip,
count as failures by user
| search failures > 3

• Without a lookup file, you have to include all users in the search
• Or, you can use inputlookup to access the user lookup table and pass values

Using Lookup Tables to Include Values (cont.)
Scenario
SecOps constantly monitors for compromised accounts. Display known users with more than 3 password failures during the last 60 minutes.

index=security sourcetype=linux_secure fail*
((user=admin) OR (user=mailman) OR (user=root) OR
(user=sflaemmchen) OR (user=hsham) OR
(user=spahkthecah) OR (user=sscallion) OR … OR (user=fullian))
| stats values(src_ip) as src_ip,
count as failures by user
| search failures > 3

((user="acurry") OR (user="admin") OR (user="adombrowski") OR (user="apreusig") OR (user="apucci") OR
(user="arangel") OR (user="basselin") OR (user="bgenin") OR (user="bhussain") OR (user="blu") OR (user="bsimmel")
OR (user="cberztiss") OR (user="cfarrell") OR (user="cganttchart") OR (user="cmunson") OR (user="cquinn") OR
(user="dhale") OR (user="djohnson") OR (user="dpiazza") OR (user="dtempesti") OR (user="edutra") OR
(user="emaxwell") OR (user="ewarwick") OR (user="ewilliams") OR (user="fbryan") OR (user="fullian") OR
(user="fyards") OR (user="gbottazzi") OR (user="gbowser") OR (user="gfacello") OR (user="gnooteboom") OR
(user="gvoronoff") OR (user="gzuyeva") OR (user="hsham") OR (user="iking") OR (user="jcappelletti") OR
(user="jreistad") OR (user="kjoslin") OR (user="kosullivan") OR (user="kpeha") OR (user="kpercy") OR
(user="kperna") OR (user="lsagers") OR (user="lsagers") OR (user="lteng") OR (user="madeyemi") OR (user="mailman")
OR (user="mkemmerer") OR (user="moh") OR (user="msluis") OR (user="myavatkar") OR (user="myuan") OR (user="npearce")
OR (user="nsharpe") OR (user="pbridgland") OR (user="pbunch") OR (user="pdabbeville") OR (user="pleuchs") OR
(user="podessa") OR (user="ptoscani") OR (user="rerde") OR (user="rjayaraman") OR (user="root")
OR (user="rroberts") OR (user="sflaemmchen") OR (user="showser") OR (user="sle") OR (user="spahkthecah") OR
(user="sscallion") OR (user="svoronoff") OR (user="swrappe") OR (user="syoungin") OR (user="tcugina") OR
(user="tzielinski") OR (user="yowen") OR (user="yschonegge"))

Using Lookup Tables to Include Values (cont.)
Scenario
SecOps constantly monitors for compromised accounts. Display known users with more than 3 password failures during the last 60 minutes.

index=security sourcetype=linux_secure fail*
[inputlookup users.csv]
| stats values(src_ip) as attackerIP,
count as failures by user
| search failures > 3
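
Conceptually, the subsearch turns each row of the lookup file into a field="value" condition and joins the rows with OR, producing the long expression shown on the previous slide. The Python below is a sketch of that expansion only, not Splunk's implementation: expand_lookup is a hypothetical helper, and the sample file is assumed to have a single user column.

```python
import csv
import io


def expand_lookup(csv_text):
    """Turn lookup rows into an OR'd search expression, one clause per row."""
    rows = csv.DictReader(io.StringIO(csv_text))
    clauses = []
    for row in rows:
        # Fields within one row are combined with AND.
        pairs = ['{}="{}"'.format(field, value) for field, value in row.items()]
        clauses.append("(" + " AND ".join(pairs) + ")")
    # Rows are combined with OR, and the whole expression is parenthesized.
    return "(" + " OR ".join(clauses) + ")"


# Hypothetical three-row stand-in for users.csv.
users_csv = "user\nadmin\nmailman\nroot\n"
expression = expand_lookup(users_csv)
print(expression)
```

Prefixing the subsearch with NOT negates the whole expression, which is how the same lookup can exclude known users instead of including them.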

Using Lookup Tables to Exclude Values
Scenario
SecOps is finding an increase in penetration attempts. Find unknown users with more than 3 failed logins within the last 60 minutes.

index=security sourcetype=linux_secure fail*
NOT [inputlookup users.csv]
| stats values(src_ip) as attackerIP,
count as failures by user
| search failures > 3
| sort -failures

Using Lookups in Alerts
Scenario
SecOps is finding an increase in penetration attempts. Build an alert that fires when a user's failed login attempts within a 24-hour period exceed that user's daily average, using a 30-day sampling window.

index=security sourcetype=linux_secure "failed password" earliest=-30d
| stats count by user
| eval daily_average=round(count/30)
| fields - count
| outputlookup averages.csv createinapp=true

Step 1: Define populating search for lookup
Step 2: Define alert

Using Lookups in Alerts – Step 1
Scenario
SecOps is finding an increase in penetration attempts. Build an alert that fires when a user's failed login attempts within a 24-hour period exceed that user's daily average, using a 30-day sampling window.

index=security sourcetype=linux_secure "failed password" earliest=-30d
| stats count by user
| eval daily_average=round(count/30)
| fields - count
| outputlookup averages.csv createinapp=true

Step 1. Build a saved search that populates/updates the CSV file daily
• Create a daily average
daily_average=round(count/30,0)
• Save results to a lookup table as specified by a filename (.csv or .gz)
outputlookup averages.csv createinapp=true
• If the lookup file exists, overwrite it with the results of outputlookup
• If it does not exist, create a file in the lookups directory of the current application
• If createinapp=false, create the file in the system lookups directory
Using Lookups in Alerts – Step 2
Scenario
SecOps is finding an increase in penetration attempts. Build an alert that fires when a user's failed login attempts within a 24-hour period exceed that user's daily average, using a 30-day sampling window.

index=security sourcetype=linux_secure "failed password"
| lookup averages.csv user OUTPUT daily_average
| stats count by user, daily_average
| where count > daily_average
| sort -count
| fields - count
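
The two steps (build a per-user baseline, then compare current counts against it) can be sketched in Python to show the data flow. This is an illustration under assumptions, not the Splunk implementation: build_baseline and users_over_baseline are hypothetical names, the event lists are made up, and Python's round() resolves exact halves to the nearest even number, which may differ from Splunk's rounding.

```python
from collections import Counter


def build_baseline(failed_users_30d, days=30):
    """Step 1: map each user to a rounded daily average of failed logins."""
    counts = Counter(failed_users_30d)
    return {user: round(n / days) for user, n in counts.items()}


def users_over_baseline(failed_users_today, baseline):
    """Step 2: users whose count in the current window exceeds their baseline."""
    todays = Counter(failed_users_today)
    # Users missing from the baseline are skipped, much as grouping by
    # daily_average drops events for which the lookup returned no value.
    over = [(u, n) for u, n in todays.items() if u in baseline and n > baseline[u]]
    return sorted(over, key=lambda item: item[1], reverse=True)


# Hypothetical data: one entry per failed login event.
history = ["root"] * 90 + ["admin"] * 30
today = ["root"] * 5 + ["admin"] * 1 + ["guest"] * 9
baseline = build_baseline(history)
alerts = users_over_baseline(today, baseline)
print(baseline)
print(alerts)
```

Storing the baseline separately keeps the alert search cheap, since it reads a small table rather than recomputing 30 days of history on every run.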

Lab Exercise 8
Time: 25-30 minutes
Tasks:
• Create a lookup file of Buttercup employees
• Validate the use of lookup tables to efficiently supply numerous
key=value pairs to a search string
• Report the number of failed login attempts and source IPs by known
user and IP addresses during the last 60 minutes. Limit the results to
users who have failed attempts from more than one source IP
• Display the source IP, user, and count of failed login attempts from
more than one unknown user from the same IP within the last 15
minutes

Module 9:
Searching tsidx Files

Module Objectives
• Use the tstats command to search:
– Normal index data
– Data models
– Data model objects

What is a tsidx File?
• Indexes
– Splunk transforms raw data into events, which it places into one or more indexes, such as web, sales, security, etc.
– Each of these indexes contains a tsidx (time-series index) file
• A tsidx is an inverted time-series index file optimized for time, consisting of a lexicon and a set of postings
– The lexicon is an alphanumerically ordered list of terms within the time range, each with a pointer to its posting list
– The posting list array contains the seek address, _time, etc., and maps each term to the events in the rawdata files that contain that term
– Together, the rawdata files and their corresponding tsidx files make up the contents of an index
Index Structure
[Diagram: structure of the Security index]
• Home Path: Hot_v1 (containing *.data, *.tsidx, and rawdata), Hot_v2, …, db_lt_et, …
• Cold Path: db_lt_et, …
• Thawed Path: db_lt_et, …
• *.data files hold source/sourcetype/host metadata (et, lt, it for each entry), for example:
1. source:opt/log/www1/secure.log
2. source:opt/log/mailsv1/secure.log
• TSIDX: the lexicon (… Accepted djohnson Failed for from invalid nobody password port ssh2 sshd sysadmin …) lists terms, and each term, such as password or port, points to its posting
• rawdata:
sshd[87755]: Accepted password for djohnson from 10.3.10.46 port 2988 ssh2
sshd[3954]: Failed password for invalid user sysadmin from 10.3.10.46 port 4759 ssh2
sshd[1268]: Failed password for mail from 10.3.10.46 port 1617 ssh2
sshd[4816]: Failed password for nobody from 10.3.10.46 port 4412 ssh2
sshd[5744]: Failed password for sync from 10.3.10.46 port 4664 ssh2

Note
This is a high-level, representational view.
Search and tsidx Files
• When you run a search, Splunk:
1. Scans the tsidx lexicon for the search keywords
2. Looks up the locations in the posting list
3. Retrieves the associated events from the rawdata file
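
The three steps above can be modeled with a toy inverted index in Python. This is a conceptual sketch only: the real tsidx format, Splunk's term segmentation, and its seek addresses are far more involved, and build_lexicon and search are hypothetical names. Here each posting is simply the position of an event in a list that stands in for the rawdata file.

```python
import re
from collections import defaultdict

# Stand-in for the rawdata file: sample events from the Index Structure slide.
RAW_EVENTS = [
    "sshd[87755]: Accepted password for djohnson from 10.3.10.46 port 2988 ssh2",
    "sshd[3954]: Failed password for invalid user sysadmin from 10.3.10.46 port 4759 ssh2",
    "sshd[1268]: Failed password for mail from 10.3.10.46 port 1617 ssh2",
]


def build_lexicon(events):
    """Map each lowercase term to the set of event positions that contain it."""
    postings = defaultdict(set)
    for position, event in enumerate(events):
        for term in re.findall(r"[a-z0-9]+", event.lower()):
            postings[term].add(position)
    # Keep terms in sorted order, like an alphanumerically ordered lexicon.
    return dict(sorted(postings.items()))


def search(lexicon, events, keywords):
    """Return events containing every keyword, using only the lexicon to find them."""
    matches = None
    for keyword in keywords:
        hits = lexicon.get(keyword.lower(), set())          # 1. scan the lexicon
        matches = hits if matches is None else matches & hits  # 2. combine postings
    return [events[i] for i in sorted(matches or [])]       # 3. fetch matching events


lexicon = build_lexicon(RAW_EVENTS)
for event in search(lexicon, RAW_EVENTS, ["failed", "password"]):
    print(event)
```

Only events whose positions survive the posting intersection are read, which is the reason keyword searches do not need to scan every raw event.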

tstats Command
• Use tstats to perform statistical queries on indexed fields
in tsidx files
• You can only search those indexes to which you have access
• tstats is a generating command
– Generates events or reports from one or more indexes without
transforming them
– Must be the first command in a search
– Always preceded by a pipe
| tstats count FROM datamodel=AccButtercup_Games_Online_Sales

tstats Command (cont.)
• tstats does not support wildcard characters in field values in
aggregate functions, such as count(), sum(), etc. or in BY clauses
• You can specify:
| tstats count WHERE host=* BY source
| sort - count

• But not:
| tstats count(source*)
| tstats count WHERE host=* BY source*
tstats Command Basic Syntax
| tstats stats-function
[WHERE search-query][BY field-list]

• stats-function
– Perform a basic count or a function on a field
– Perform any number of aggregates
– Rename the result using AS

tstats – stats Functions
Scenario
ITOps wants a list of all source types by index.

| tstats values(sourcetype) as sourcetype by index

Remember, you can perform any stats function

tstats Command Syntax – by Clause
… [WHERE search-query][BY field-list]

• Can provide any number of group by fields
• If you group by _time, use span (for example, span=3m) to group into time buckets
– If you do not specify a span, the time range set by the time picker determines the default span, for example:

Search Time Range    Default Span
5 minutes            5 seconds
15 minutes           10 seconds
60 minutes           1 minute
4 hours              5 minutes
24 hours             30 minutes
7 days               1 day
etc…
Selecting Data
• The tstats command supports the following data sources:
– Normal indexed data
– Data model or data model object
– namespace: manually collected with the tscollect command
– sid: search ID of a tscollect search

Note
This class does not discuss using tscollect.

Using tstats – Normal Index Data
If you do not use a FROM clause, Splunk performs a search for normal, indexed data

| tstats count where index=* by index
| sort -count

Returned 7 results by scanning 10,364,307 events in 0.196 seconds
tstats vs stats
tstats works best with massive amounts of data using indexed fields

Scenario
The Splunk Admin is doing resource planning. He wants the event load for the security index. Display a descending count for all time by source, sourcetype, and host, with commas.

Security index, using tstats:
| tstats count as events where index=security by source, sourcetype, host
| sort -events
| eval events=tostring(events,"commas")
11 results by scanning 971,466 events in 0.08 seconds

Security index, using stats:
index=security
| stats count as events by source, sourcetype, host
| sort -events
| eval events = tostring(events, "commas")
11 results by scanning 971,016 events in 1.59 seconds

All indexes, using tstats:
| tstats count as events where index=* by source, sourcetype, host
| sort -events
| eval events=tostring(events,"commas")
101 results by scanning 12,945,032 events in 0.39 seconds

All indexes, using stats:
index=*
| stats count as events by source, sourcetype, host
| sort -events
| eval events = tostring(events, "commas")
101 results by scanning 12,944,443 events in 58.8 seconds

How Fast is tstats?
Scenario
Display a count of all events from all time.

Using tstats:
| tstats count as events where index=* OR index=_*
1 result by scanning 50,914,197 events in 0.31 seconds

Using stats:
index=* OR index=_*
| stats count as events
1 result by scanning 50,917,915 events in 70.21 seconds

Using the datamodel Command
• Use the datamodel command to
display the structure of a data
model
• Syntax:
| datamodel [datamodel_name]
• To display all data models:
| datamodel

Using the datamodel Command (cont.)
Click the + next to
objectNameList to view a list of
all the object names in the
selected data model

Using the datamodel Command (cont.)
• To get data for a particular object within the data model, explore the
object properties, under objects
• For example, for information about successful_purchase, look at the
third object, under objects

Using the datamodel Command (cont.)
• For the field you want to access, note the
owner
• For example, to access information about
the price
– Note that the owner is http_request
– Therefore, to access price, you would specify http_request.price

| tstats sum(http_request.price)

Note
You can also use the datamodel command to search.
Refer to docs.splunk.com for details.

tstats Command Basic Syntax
• Use the tstats command to search accelerated data models or
data model objects:
| tstats stats-function [summariesonly=boolean]
[FROM datamodel=data_model_name]
[WHERE search-query][BY field-list]

• stats-function
– Perform a basic count or a function on a field
– Perform any number of aggregates
– Rename the result using AS

• FROM datamodel
– tstats can be used with a data model or on a data model object
Using tstats – Accelerated Data Model
Scenario
TechOps is reconfiguring the web servers. Provide them with the web requests during the last 24 hours.

| tstats count from datamodel=AccButtercup_Games_Online_Sales by host
| sort -count

• Accelerated data models contain a high-performance store consisting of a collection of .tsidx data summaries
– This is an additional set of .tsidx files, not a replacement
• To access an accelerated data model's summaries, use FROM datamodel=data_model_name
– Administrators can enable data model acceleration
– Anyone can search using an accelerated data model
Searching Unaccelerated Data Models
• tstats can search unaccelerated data models
• However, note that searches can be substantially slower

Unaccelerated data model:
| tstats count from datamodel=Buttercup_Games_Online_Sales by host
| sort -count
3 results by scanning 272,480 events in 4.885 seconds

Accelerated data model:
| tstats count from datamodel=AccButtercup_Games_Online_Sales by host
| sort -count
3 results by scanning 272,746 events in 0.745 seconds

tstats Command – summariesonly
| tstats stats-function [summariesonly=boolean] …

• summariesonly=boolean - Applies only to an accelerated data model
– true generates results only from the TSIDX data automatically generated by the acceleration; unsummarized data is not included
– false (default) generates results from both summarized and non-summarized data
• When running a search with summariesonly set to false, you might notice a larger result count
– This can occur if some or all of the index data has not yet been added to the summary

| tstats count from datamodel=AccButtercup_Games_Online_Sales
(last 4 months) 2629 results by scanning 25,342 events in 0.279 seconds

| tstats count from datamodel=AccButtercup_Games_Online_Sales summariesonly=t
(last 4 months) 2617 results by scanning 25,320 events in 0.054 seconds

tstats Command – summariesonly (cont.)
| tstats stats-function [summariesonly=boolean] …

• summariesonly=boolean - Applies only to an accelerated data model
– When used with an unaccelerated data model, it produces no results

| tstats count from datamodel=Buttercup_Games_Online_Sales summariesonly=t

| tstats count from datamodel=AccButtercup_Games_Online_Sales summariesonly=t

Using tstats – Accelerated Data Model Object
To search a data model object, specify a path to the object
• Use dot notation (node[.object])
• For example, AccButtercup_Games_Online_Sales.http_request

| tstats sum(http_request.price) as tsales from datamodel=AccButtercup_Games_Online_Sales.http_request
where (http_request.action = purchase)
by http_request.product_name
summariesonly=t
| rename http_request.product_name as Product
| sort - tsales
| eval "Total Sales"="$".tostring(tsales,"commas")
| fields Product, "Total Sales"

Using tstats – Accelerated Data Model Object (cont.)
Scenario
The Online Sales manager launched a new campaign yesterday. Provide her with the total sales for yesterday.

| tstats sum(http_request.price) as tsales from
datamodel=AccButtercup_Games_Online_Sales.http_request
by http_request.product_name
summariesonly=t
| sort - tsales
| eval tsales="$".tostring(tsales,"commas")
| rename http_request.product_name as Product
| rename tsales as "Daily Sales"
| fields Product, "Daily Sales"
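
The search above does not state a time range, so it presumably relies on the time range picker being set to Yesterday (an assumption; the slide does not say). As an alternative sketch, the range can be written into the tstats where clause with the earliest and latest time modifiers, so that the search returns yesterday's data regardless of the picker:

```spl
| tstats summariesonly=t sum(http_request.price) as tsales
    from datamodel=AccButtercup_Games_Online_Sales.http_request
    where earliest=-1d@d latest=@d
    by http_request.product_name
| sort - tsales
| eval tsales="$".tostring(tsales,"commas")
| rename http_request.product_name as Product, tsales as "Daily Sales"
| fields Product, "Daily Sales"
```

Here -1d@d snaps to midnight at the start of yesterday and @d snaps to midnight at the start of today.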

Lab Exercise 9
Time: 20 minutes
Tasks:
• Display the number of indexed events by month for the last 365
days with the number and time formatted
• Display a listing of the APAC vendors with retail sales of more than
$200 for the previous month

YouTube: The Splunk How-To Channel
• In addition to our roster of training courses, check out the Splunk
Education How-To channel: http://www.youtube.com/c/SplunkHowTo
• This site provides useful, short videos on a variety of Splunk topics

Other Resources
• Splunk App Repository
https://splunkbase.splunk.com/
• Splunk Answers
http://answers.splunk.com/
• Splunk Blogs
http://blogs.splunk.com/
• Splunk Wiki
http://wiki.splunk.com/
• Splunk Docs
http://docs.splunk.com/Documentation/Splunk
• Splunk User Groups
http://usergroups.splunk.com/
The 9th Annual Splunk Conference
October 1st – 4th
Walt Disney World Swan & Dolphin Resort
Orlando, Florida
• 6000+ IT and Business Professionals
• 200+ Sessions
• 80+ Customer Speakers

PLUS Splunk University
• Three days: Sept 29th – October 1st
• Get Splunk Certified for FREE!
• Get CPE credits for CISSP, CAP, SSCP

conf.splunk.com
Thank You
• Complete the Class Evaluation to be in this month's drawing for a $100 Splunk Store voucher
1. Look for the invitation email, "What did you think of your Splunk Education class," in your inbox
2. Click the link or go to the specified URL in the email

Appendix A:
Putting It All Together

Putting It All Together – Lab Exercise 1
Time: 30-35 minutes
Task:
For the previous week, find the three customers with the most
purchases and, for each of those customers, display the five
products they purchased most

Putting It All Together – Lab Exercise 2
Time: 20-25 minutes
Task:
Without using a subsearch, identify users who are active on the
local network over the last four hours, but did not badge into the
building

Appendix B:
More Time Search Examples

Searching for Current and Most Recent Logons
Scenario
SecOps always monitors who is
online. Display the most recent
logon/logoff status of users on the
network during the last 24 hours.

Possible Scenarios – User is Currently Logged On…
• And has not logged off
– Keep the logon_time timestamp
– Set logoff_time from null to “Session in Progress”

• Then logged off, then logged back on and hasn't logged off yet
– Both logon and logoff events exist, BUT the logoff event is earlier than the logon event
– Keep the logon_time timestamp
– Replace the logoff_time timestamp with “Session in Progress”

• But logged on more than 24 hours ago and is still logged on
– Not reported, because there are no logon/logoff events within the search time range

Possible Scenarios – User is Currently Logged Off
• User logged on and off within the last 24 hours
– Use the logon_time and logoff_time timestamps

• User logged on more than 24 hours ago (no logon event), but has
logged off within the last 24 hours
– Set logon_time from null to “Logon time out of range”
– Keep the logoff_time timestamp
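
The logged-on and logged-off cases above can be exercised without real authentication data. The following is a sketch only: the four users and their event order are invented with makeresults, and only the stats and eval logic mirrors the search used in this appendix.

```spl
| makeresults count=6
| streamstats count as n
| eval User=case(n<=2,"alice", n<=4,"bob", n=5,"carol", n=6,"dave")
| eval EventCode=case(n=1,4624, n=2,4634, n=3,4634, n=4,4624, n=5,4634, n=6,4624)
| eval _time=now() - (7-n)*3600
| stats latest(eval(if(EventCode=4624,_time,null()))) as logon_time,
        latest(eval(if(EventCode=4634,_time,null()))) as logoff_time by User
| eval logoff_time=if(logoff_time<logon_time OR isnull(logoff_time),"Session in Progress",logoff_time)
| eval logon_time=if(isnull(logon_time),"Logon time out of range",logon_time)
```

In this made-up data, alice logs on and then off, so both timestamps should be kept. bob logs off and then back on, and dave only logs on, so both should show Session in Progress. carol has only a logoff event, so her logon_time should read Logon time out of range.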

Searching for Current and Most Recent Logons
Scenario
SecOps always monitors who is online. Display the most recent logon/logoff status of users on the network during the last 24 hours.

index=security sourcetype=winauthentication_security
(EventCode=4624 OR EventCode=4634)
| stats latest(eval(if(EventCode=4624,_time,null()))) as logon_time,
latest(eval(if(EventCode=4634,_time,null()))) as logoff_time by User
| eval logoff_time = if(logoff_time<logon_time OR
isnull(logoff_time),"Session in Progress",logoff_time)
| eval logon_time = if(isnull(logon_time),
"Logon time out of range", logon_time)
| eval duration=tostring(logoff_time-logon_time,"duration")
| eval logon_time=if(isint(logon_time),
strftime(logon_time, "%b %d, %I:%M %p"), logon_time)
| eval logoff_time=if(isint(logoff_time),
strftime(logoff_time, "%b %d, %I:%M %p"), logoff_time)

Initial search finds all logon and logoff events during the last 24 hours
Log on = EventCode 4624
Log off = EventCode 4634

Searching for Current and Most Recent Logons (cont.)
• Use stats with the latest, eval, and if functions to find and set initial values for logon_time and logoff_time

| stats latest(eval(if(EventCode=4624,_time,null()))) as logon_time,
latest(eval(if(EventCode=4634,_time,null()))) as logoff_time by User

• For each user, set:
– logon_time to the most recent logon timestamp if a logon event exists; otherwise, set as null
– logoff_time to the most recent logoff timestamp if a logoff event exists; otherwise, set as null
Searching for Current and Most Recent Logons (cont.)
• Use eval with the if and isnull functions to validate and set logon_time and logoff_time

| eval logoff_time = if(logoff_time<logon_time OR
isnull(logoff_time),"Session in Progress",logoff_time)
| eval logon_time = if(isnull(logon_time),
"Logon time out of range", logon_time)

• Use eval with the if, isint, and strftime functions to reformat the timestamps of logon_time and logoff_time

| eval logon_time=if(isint(logon_time),
strftime(logon_time, "%b %d, %I:%M %p"), logon_time)
| eval logoff_time=if(isint(logoff_time),
strftime(logoff_time, "%b %d, %I:%M %p"), logoff_time)
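
The formatting steps can be checked in isolation with a single generated event. A minimal sketch, assuming an invented session that ended now and lasted 3725 seconds:

```spl
| makeresults
| eval logoff_time=_time, logon_time=_time-3725
| eval duration=tostring(logoff_time-logon_time,"duration")
| eval logon_time=if(isint(logon_time), strftime(logon_time, "%b %d, %I:%M %p"), logon_time)
| eval logoff_time=if(isint(logoff_time), strftime(logoff_time, "%b %d, %I:%M %p"), logoff_time)
| table logon_time, logoff_time, duration
```

The "duration" format renders seconds as HH:MM:SS, so duration should show 01:02:05. The order of the eval commands matters: duration has to be calculated while both fields still hold epoch values, because strftime turns them into strings that can no longer be subtracted.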