Fundamentals 3
arithmetic operators
+ - * / %
concatenation operators
+ .
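ex (a sketch; the field names first_name, last_name, price, and tax are assumptions):

```
... | eval full_name = first_name . " " . last_name
... | eval total = price + tax
```

note: . always concatenates; + concatenates only when both operands are strings, otherwise it adds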
benefits of summary indexing
-SI searches run faster because they search a smaller dataset -amortizes search costs across different reports -enables creation of rolling reports
summary indexing
-alternative to report acceleration -only use for reports that don't qualify for acceleration
regex best practices
-avoid backtracking -use + rather than * -avoid greedy operators, use non-greedy -use simple expressions rather than complex ones
example uses of eval
-calculate expressions -place the results in a field -use field in searches or other expressions
cryptographic functions
-compute and return a secure hash value of a string value -available for several of the most popular cryptographic hash algorithms
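ex (a sketch; Splunk eval provides md5(), sha1(), sha256(), and sha512(); the username field is an assumption):

```
... | eval user_hash = sha256(username)
```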
examples of stats functions
-count -distinct_count -sum -list -values -min -max -avg -median -range
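ex (a sketch; the field names clientip, bytes, and status are assumptions):

```
sourcetype=access_combined
| stats count, distinct_count(clientip) as unique_clients, avg(bytes) as avg_bytes by status
```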
what kinds of alert actions can Splunk trigger?
-create an entry in triggered alerts -log an event -send emails -perform custom action -output results to a lookup file -use a webhook
nested macros
-create inner macro first -put inner macro name surrounded by backticks in definition of outer macro
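ex (a sketch; the macro names to_mb and mb_report and the bytes field are assumptions):

```
inner macro definition (to_mb):     eval mb = round(bytes/1024/1024, 2)
outer macro definition (mb_report): `to_mb` | stats sum(mb) as total_mb
usage in a search:                  index=web `mb_report`
```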
what do acceleration summaries do?
-efficiently reports on large volumes of data -qualifies future searches against the summary
what kind of metadata might the schema of self-describing data include?
-element names -data types -compression/encoding scheme, etc.
alerts to create new searchable events
-events are sent to the Splunk deployment for indexing -can be used alone or combined with other alert actions -requires admin privileges / the edit_tcp capability
summary index overlaps
-events in SI that share the same timestamp -can skew reports and statistics created from SI -can occur if you set report time range to be longer than frequency of report schedule
why use a webhook alert action?
-generate a ticket for BCG or other vendor ticketing systems -make an alert message pop up in a chat room -post a notification on a web page
what happens when an alert triggers a webhook?
-generates JSON formatted info about the alert -sends an HTTP POST request to the specified URL with the alert info in the body
how can gaps occur in a summary index?
-if populating reports run past the next scheduled run time -you've forced populating reports to use real-time scheduling by editing savedsearches.conf -Splunk goes down
what is acceleration?
-improves search completion times -uses automatically created acceleration summaries
persistent data model acceleration restrictions
-must be admin or have accelerate_datamodel privilege -private data models can't be accelerated -accelerated data models can't be edited -only root events can be accelerated
eval date & time functions
-now(): returns the time a search was started -time(): returns the time an event was processed by the eval command -strftime(): converts a timestamp to a string -strptime(): parses a time string into a timestamp -relative_time(): returns a timestamp relative to a supplied time
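ex (a sketch of the date & time functions):

```
... | eval started  = now()
... | eval readable = strftime(_time, "%Y-%m-%d %H:%M:%S")
... | eval epoch    = strptime("2024-01-15", "%Y-%m-%d")
... | eval last_day = relative_time(now(), "-1d@d")
```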
what kind of searches run faster without a summary?
-ones with fewer than 100K events in hot buckets -ones where summary size is projected to be too big
lookup best practices
-order fields so that it follows "key value" -for commonly used fields, make lookups automatic -use gzipped CSV files or KV Store for large lookups -use 'search.lookups' to see how long lookup took to execute
stats function
-perform a basic count or a function on a field -perform any number of aggregates -rename the result using as
what are 2 methods to perform field extraction in the field extractor?
-regex (unstructured data) -delimiter (structured data)
what are 3 data summary creation methods?
-report acceleration -summary indexing -data model acceleration
report acceleration vs summary indexing
-report acceleration is easier to use and more efficient -only use summary indexing for reports that don't qualify for acceleration -once an acceleration summary is created from a shared report, any report that can use it will use it
what does the webhook POST request's JSON data payload include?
-search ID for the saved search that triggered the alert -link to search results -search owner and app -first result row from triggering search results
what are tokens available to represent?
-search metadata -search results -server information -job information
what reports qualify for acceleration?
-searches with transforming commands -commands before transforming commands must be streaming -commands after transforming commands must be non-streaming
what are 2 characteristics of reports that span a large volume of data?
-take a long time to complete -consume a lot of system resources
geospatial lookups
-used to create choropleth map visualizations -defined by a KMZ or KML file -Splunk ships with geo_us_states and geo_countries (two geospatial lookup files)
datamodel command
-used to display the structure of a data model or to search against it -returns a description of all or a specified data model and its objects
all data models: | datamodel
specific data model: | datamodel [datamodel_name]
specified data model with object: | datamodel [datamodel_name] [object_name]
what is self-describing data?
-when schema or structure is embedded in the data itself -ex: JSON, XML
splunk alerts are based on searches that can run either....
...on a regular scheduled interval ...in real-time
what happens after you accelerate a data model?
1. Splunk builds an acceleration summary for the specified summary range 2. summary takes the form of inverted .tsidx files 3. files stored in index containing events that have fields specified in data model 4. each bucket in each index may contain multiple .tsidx files
what happens when you run a search in Splunk?
1. scans .tsidx lexicon for the search keywords 2. looks up the locations in the .tsidx posting list 3. retrieves the associated events from the rawdata file
comparison operators
< > <= >= != = == LIKE
boolean operators
AND OR NOT XOR
erex
a search time extraction command -don't have to know regex, just provide example values (extracts based on examples) -shouldn't be used in saved reports | erex <fieldname> examples="example1[,example2...]"
rex
a search time extraction command -must write regex, includes data that matches pattern -can be used in saved reports | rex [field=<field>] (<regex>)
ad hoc data model acceleration
acceleration summary built on search head after user selects dataset and enters pivot editor -temporary -works on all data set types -takes place automatically for data models that haven't been persistently accelerated -exists only for duration of user's pivot -runs over all time
persistent data model acceleration
acceleration summary can be used with pivot editor and with tstats command -exists as long as data model exists -better than ad hoc because summaries maintained on an on-going basis -defined before using -scoped to particular time ranges
what are the 2 types of data model acceleration?
ad hoc persistent
what does the eval command do?
allows you to calculate and manipulate field values in your report
when reports are accelerated, what is created?
an acceleration summary
appendpipe example
appendpipe [stats sum(count) as count by usage]
when is the time range selected for search macros?
at search time
what character is used with search macros?
backticks (`)
fieldsummary command
calculates a variety of summary stats for all or a subset of fields and displays the summary info as a results table
what are search macros?
can be a full search string or a portion of a search that can be reused in multiple places
case function
case(X1,Y1,X2,Y2...) -X1 is a boolean expression -if X1 evaluates to TRUE the result is Y1 -if X1 evaluates to FALSE, the next boolean (X2) is evaluated, etc.
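ex (a sketch; assumes an HTTP status field named status; true() acts as a catch-all):

```
... | eval status_class = case(status>=500, "server error", status>=400, "client error", status>=200, "success", true(), "other")
```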
are field values case sensitive or insensitive when used with eval?
case-sensitive
tostring options
commas duration hex
how do you check contents of search macros before they execute?
control-shift-E
what are pivot reports created based on?
data sets
what does the stats command do
enables you to calculate statistics on data
eval syntax
eval fieldname1=expression1 [, fieldname2=expression2, ...]
how long does extraction last?
extraction only exists for duration of search, it doesn't persist as a knowledge object
what fields are in a fieldsummary results table?
field count distinct_count is_exact max mean min numeric_count stdev values
multikv command
for table-formatted events, it creates an event for each row (field names derived from header row of each event) -fields: extract only specified fields -filter: include only table rows containing at least one field value from a specified list
what are search macros useful for?
frequently run searches with similar syntax
what clauses can we use to access an accelerated data model summary?
from & by
eventstats command
generates summary statistics of all existing fields in your search results and saves them as values in new fields
what are data models?
hierarchically structured datasets that generate searches and drive pivots
if function
if(X,Y,Z) -X is a boolean expression -if X evaluates to TRUE, the result evaluates to Y -if X evaluates to FALSE, the result evaluates to Z -non-numeric values must be in "" -fields are case-sensitive
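ex (a sketch; the status field is an assumption):

```
... | eval outcome = if(status == 200, "OK", "Not OK")
```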
is indexed data modified when eval is used?
no, no new data is written into the index
tstats command
performs statistical queries on indexed fields in tsidx files (generating command) -must be the first command in a search -can search unaccelerated data models (but will be much slower) | tstats stats-function [summariesonly=<bool>] [from datamodel=<data_model_name>] [where <search-query>] [by <field-list>]
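ex (a sketch; the data model name Web and the field Web.status are assumptions):

```
| tstats count from datamodel=Web by Web.status
```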
printf syntax
printf("format", [arguments])
job information tokens
provide data specifics to a search job
server tokens
provide details about your Splunk deployment -$server.version$ -$server.build$ -$server.serverName$
result tokens
provide field values from the first row returned by the search associated with the alert -$result.fieldname$
search metadata tokens
provide metadata about the alert and associated search
what are lookups useful for?
pulling data from standalone files at search time and adding them to search results
replace function
replace(X,Y,Z) -X,Y,Z are all strings -Y is regex -returns a string formed by substituting Z for every occurrence of Y in X
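ex (a sketch of the common month/day swap pattern; the date field is an assumption):

```
... | eval date_new = replace(date, "^(\d{1,2})/(\d{1,2})/", "\2/\1/")
```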
how to accelerate a saved report
reports --> edit acceleration --> save
typeof function
returns a string that represents the data type of the argument
what does a KV Store do?
saves and retrieves data in collections of key-value pairs
how do you accelerate a data model?
settings --> data models --> edit --> edit acceleration
how do you create a new lookup definition?
settings --> lookups --> lookup definitions --> new lookup definition
how do you add a geospatial lookup table file?
settings --> lookups --> lookup table files --> new lookup table file
what search mode must the search be set to for acceleration?
smart or fast (verbose is changed to smart automatically)
what command helps Splunk read XML files?
spath
spath syntax
spath [input=<field>] [output=<field>] [path=<datapath>] optional args: -input -output -path
spath function of the eval command
spath(X,Y) -X: input source field -Y: XML or JSON formatted location path to the value you want to extract from X
what is the difference between eventstats and streamstats?
streamstats calculates stats for each result row at the time the command encounters it (a running, cumulative calculation); eventstats calculates stats over the entire result set and adds them as new fields to every event
substr function
substr(X, Y, Z) returns a substring of X, starting at the index specified by Y, with a length of Z
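ex (a sketch; the phone field is an assumption; substr indexing starts at 1):

```
... | eval area_code = substr(phone, 1, 3)
```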
appendpipe command
takes existing results and pushes them into a sub-pipeline, then appends the results of the sub-pipeline as new lines to the outer search; results are displayed in-line
which command allows you to access geospatial lookups?
the geom command
what is placed in parentheses at the end of the search macro name?
the number of arguments the search macro takes ex: mymacro(2)
what happens if you delete all reports that use a summary?
the summary is automatically deleted
tonumber syntax
tonumber(numstr[,base]) numstr may be field name or literal string value
conversion functions
tostring tonumber printf: builds a string value based on a string format and optional arguments
tostring syntax
tostring(field, "option")
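ex (a sketch; bytes and duration_secs are assumed fields):

```
... | eval bytes_str = tostring(bytes, "commas")
... | eval elapsed   = tostring(duration_secs, "duration")
```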
how do you define a geospatial lookup?
upload a KML or KMZ file
how can you backfill gaps
use fill_summary_index.py
what can you do with the DB connect app?
use lookups to reference fields in an external SQL database -import database data for indexing, analysis, and visualization -export machine data to an external database -use SQL queries to build dashboards
using external lookups
uses scripts or executables to populate events with field values from an external source -must be a Python script or binary executable
when are alerts triggered?
when results of a search meet specific conditions that are defined
outputlookup command
writes search results to specified static lookup table (CSV) or KV Store collection -if a lookup file exists, it is overwritten
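ex (a sketch; the index, fields, and file name are assumptions):

```
index=web status=404 | stats count by uri | outputlookup missing_pages.csv
```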
fieldsummary syntax
| fieldsummary [maxvals=<num>] [<field-list>] (brackets indicate optional arguments) optional args: -maxvals: max distinct values to return for each field -field-list: list of fields to calculate stats for