This document discusses Splunk's data modeling capabilities and how they enable faster analytics over raw machine data. It introduces data models, which allow domain knowledge to be shared and reused. Data models map data onto hierarchical structures and enable non-technical users to build reports without using the Splunk search language. The document covers best practices for building data models and how pivot searches are generated from the underlying data model objects. It also discusses managing, securing, and accelerating analytics with data models.
Splunk Ninjas: New features, pivot, and search dojo (Splunk)
Besides seeing the newest features in Splunk Enterprise and learning the best practices for data models and pivot, we will show you how to use a handful of search commands that will solve most search needs. Learn these well and become a ninja.
Boosting Documents in Solr by Recency, Popularity and Personal Preferences - ... (lucenerevolution)
See conference video - http://www.lucidimagination.com/devzone/events/conferences/revolution/2011
Attendees will come away from this presentation with a good understanding of, and access to source code for, boosting and/or filtering documents by recency, popularity, and personal preferences. My solution improves upon the common "recipe"-based solution for boosting by document age. The framework also supports boosting documents by a popularity score, which is calculated and managed outside the index. I will present a few different ways to calculate popularity in a scalable manner. Lastly, my solution supports the concept of a personal document collection, where each user is only interested in a subset of the total number of documents in the index.
Boosting Documents in Solr by Recency, Popularity, and User Preferences (Lucidworks, Archived)
Presentation on, and access to source code for, boosting and/or filtering documents by recency, popularity, and personal preferences. My solution improves upon the common "recipe"-based solution for boosting by document age.
Building a real time big data analytics platform with Solr (Trey Grainger)
Having “big data” is great, but turning that data into actionable intelligence is where the real value lies. This talk will demonstrate how you can use Solr to build a highly scalable data analytics engine to enable customers to engage in lightning fast, real-time knowledge discovery.
At CareerBuilder, we utilize these techniques to report the supply and demand of the labor force, compensation trends, customer performance metrics, and many live internal platform analytics. You will walk away from this talk with an advanced understanding of faceting, including pivot faceting, geo/radius faceting, time-series faceting, function faceting, and multi-select faceting. You'll also get a sneak peek at some new faceting capabilities just wrapping up development, including distributed pivot facets and percentile/stats faceting, which will be open-sourced.
The presentation will be a technical tutorial, along with real-world use-cases and data visualizations. After this talk, you'll never see Solr as just a text search engine again.
Warehousing Your Hits - The Why and How of Owning Your Data (Scott Arbeitman)
These are the slides from my recent presentation at Melbourne's Web Analytics Wednesdays. I talk about transitioning from collecting your data in primary digital analytics systems to storing it in a data warehouse or data lake.
Optiq is a dynamic query planning framework. It can potentially help integrate Pentaho Mondrian and Kettle with various SQL, NoSQL, and big data sources.
SplunkLive Sydney Enterprise Security & User Behaviour Analytics (Splunk)
This session will review Splunk’s two premium solutions for information security organizations: Splunk for Enterprise Security (ES) and Splunk User Behavior Analytics (UBA). Splunk ES is Splunk's award-winning security intelligence solution that brings immediate value for continuous monitoring across SOC and incident response environments – allowing you to quickly detect and respond to external and internal attacks, simplifying threat management while decreasing risk. Splunk UBA is a new technology that applies unsupervised machine learning and data science to solving one of the biggest problems in information security today: insider threat. You’ll learn how Splunk UBA works in tandem with ES, or third-party data sources, to bring significant automated analytical power to your SOC and Incident Response teams. We’ll discuss each solution and see them integrated and in action through detailed demos.
Getting Started With Splunk IT Service Intelligence (Splunk)
Are you currently using Splunk to troubleshoot and monitor your IT environment? Do you want more out of Splunk but don’t know how? Here’s your chance to learn more about Splunk IT Service Intelligence (Splunk ITSI) and get hands-on with it for the very first time. We’ll kick off this session with a discussion on the concept of services, KPIs and entities and demonstrate how to use them in Splunk IT Service Intelligence. We’ll help you build custom visualizations and dashboards for personalized service-centric views. We’ll teach you how to navigate across multiple KPIs, entities and events with built-in visualizations and intelligently troubleshoot and resolve problems faster using Splunk ITSI. We’ll also show you how to create correlations across KPIs easily and be alerted of “notable events” to catch these emerging problems quickly. At the end of this session, you will leave with an understanding of the unique monitoring approach Splunk ITSI delivers to maximize the value of your data in Splunk and how to accelerate visibility into your critical IT services.
Supporting Enterprise System Rollouts with Splunk (Erin Sweeney)
At Cricket Communications, Splunk started as a way to correlate all of our data into one view to help our operations team keep processes humming. Then we gave secured access to our developers, and now they're addicted. In fact, Splunk is critical in helping us speed up deployment of new systems (like our recent multi-million dollar billing system implementation). Learn how we use Splunk to display key metrics for the business, track overall system health, track transactions, optimize license usage, and support capacity planning.
Julian Harty, Sr. Sales Engineer, Splunk reviews the internals of how a Splunk search is performed, use of job inspector, search log, and gives a review of where and when to use certain commands.
Webinar: What's New in Splunk Enterprise 6.5 (Splunk)
Splunk Enterprise 6.5 delivers fundamental advances in machine learning, data analytics, and platform management, making it less expensive to operate.
In our webinar we show you a product demo and you will learn how to:
- Use machine learning to predict, detect, and prevent what matters most to your business
- Use tables to prepare and analyze data without using the Splunk search language (SPL)
- Lower storage costs by offloading historical data to Hadoop
- Use free developer/test licenses to explore new data sources and use cases
- Process critical data without interruption, as the license model no longer blocks search when license limits are exceeded
The current release of Splunk Enterprise 6.5 helps you maximize the value of your data and your investment in Splunk. With the new features, big data analytics has become even more affordable and easier. See for yourself in our webinar.
Field Extractions: Making Regex Your Buddy (Michael Wilde)
This presentation was given by Michael Wilde, Splunk Ninja at Splunk's Worldwide User Conference 2011. A demonstration accompanied this presentation. Link is forthcoming.
SplunkLive! Frankfurt 2018 - Data Onboarding Overview (Splunk)
Presented at SplunkLive! Frankfurt 2018:
Splunk Data Collection Architecture
Apps and Technology Add-ons
Demos / Examples
Best Practices
Resources and Q&A
[WSO2Con USA 2018] Patterns for Building Streaming AppsWSO2
This slide deck explains how to enable digital transformation through streaming analytics and how easily streaming applications can be implemented.
Watch video: https://wso2.com/library/conference/2018/07/wso2con-usa-2018-patterns-for-building-streaming-apps/
SplunkLive! Tampa: Splunk Ninjas: New Features, Pivot, and Search Dojo (Splunk)
Besides seeing the newest features in Splunk Enterprise and learning the best practices for data models and pivot, we will show you how to use a handful of search commands that will solve most search needs. Learn these well and become a ninja.
As the world moves to an era where data is the most valuable asset, being able to efficiently process large volumes of data in real time can help businesses gain a competitive advantage. Making business decisions within milliseconds has become a mandatory need in many domains. Streaming analytics plays a key role in making these decisions and is also a vital part of the digital transformation of businesses. WSO2 Stream Processor provides a high-performance, lean, enterprise-ready streaming solution to solve data integration and analytics challenges. It provides real-time, interactive, predictive, and batch processing technologies to deal with large volumes of data and generate meaningful decisions/output from it. This session explains how to enable digital transformation through streaming analytics and how easily streaming applications can be implemented.
- The Architecture of WSO2 Stream Processor
- Understanding streaming constructs
- Patterns of processing data in real time, incremental and with intelligence
- Applying patterns when building streaming apps
- Deployment patterns
What is Splunk? At the end of this session you’ll have a high-level understanding of the pieces that make up the Splunk Platform, how it works, and how it fits in the landscape of Big Data. You’ll see practical examples that differentiate Splunk while demonstrating how to gain quick time to value.
(ATS6-APP01) Unleashing the Power of Your Data with Discoverant (BIOVIA)
In the fast-paced, high demand environment of manufacturing, it’s almost impossible to find the time to gather large amounts of data and organize it into a common context. This session presents best practices for using Discoverant hierarchies and Direct Connects to provide your end-users with on-demand and scheduled self-service access to their contextualized data, freeing their time for the more important trending and analysis activities that enable improved process performance and predictability.
.conf Go 2023 - Das passende Rezept für die digitale (Security) Revolution zu... (Splunk)
.conf Go 2023 presentation:
"Das passende Rezept für die digitale (Security) Revolution zur Telematik Infrastruktur 2.0 im Gesundheitswesen?" ("The right recipe for the digital (security) revolution toward Telematik Infrastruktur 2.0 in healthcare?")
Speaker: Stefan Stein,
Team Lead CERT | gematik GmbH, M.Eng. IT Security & Forensics,
doctoral student at TH Brandenburg & Universität Dresden
.conf Go 2023 presentation:
De NOC a CSIRT (From NOC to CSIRT)
Speakers:
Daniel Reina - Country Head of Security Cellnex (España) & Global SOC Manager Cellnex
Samuel Noval - Global CSIRT Team Leader, Cellnex
Splunk - BMW connects business and IT with data driven operations SRE and O11y (Splunk)
BMW is defining the next level of mobility - digital interactions and technology are the backbone to continued success with its customers. Discover how an IT team is tackling the journey of business transformation at scale whilst maintaining (and showing the importance of) business and IT service availability. Learn how BMW introduced frameworks to connect business and IT, using real-time data to mitigate customer impact, as Michael and Mark share their experience in building operations for a resilient future.
Data foundations building success, at city scale – Imperial College London (Splunk)
Universities have more in common with modern cities than traditional places of learning. This mini city needs to empower its citizens to thrive and achieve their ambitions. Operationalising data is key to building critical services; from understanding complex IT estates for smarter decision-making to robust security and a more reliable, resilient student experience. Juan will share his experience in building data foundations for a resilient future whilst enabling digital transformation at Imperial College London.
Splunk: How Vodafone established Operational Analytics in a Hybrid Environmen... (Splunk)
Learn how Vodafone has provided end-to-end visibility across services by building an Operational Analytics Platform. In this session, you will hear how Stefan and his team manage legacy, on premise, hybrid and public cloud services, and how they are providing a platform for complex triage and debugging to tackle use cases across Vodafone’s extensive ecosystem.
.italo operates an Essential Service by connecting more than 100 million people annually across Italy with its super fast and secure railway. And CISO Enrico Maresca has been on a whirlwind journey of his own.
Formerly a Cyber Security Engineer, Enrico started at .italo as an IT Security Manager. One year later, he was promoted to CISO and tasked with building out – and significantly increasing the maturity level – of the SOC. The result was a huge step forward for .italo.
So how did he successfully achieve this ambitious ask? Join Enrico as he reveals the key insights and lessons learned in his SOC journey, including:
Top challenges faced in improving security posture
Key KPIs implemented in order to measure success
Strategies and approaches applied in the SOC
How MITRE ATT&CK and Splunk Enterprise Security were utilised
Next steps in their maturity journey ahead
3. Analytics Big Picture
Pivot – Enables non-technical users to build complex reports without the search language
Data Model – Provides a more meaningful representation of underlying raw machine data
Analytics Store – Acceleration technology delivers up to 1000x faster analytics over Splunk Enterprise 5
6. Splunk Search Language
search and filter | munge | report | clean-up
sourcetype=access_combined source = "/home/ssorkin/banner_access.log.2013.6.gz"
| eval unique=(uid + useragent) | stats dc(unique) by os_name
| rename dc(unique) as "Unique Visitors" os_name as "Operating System"
7. Hurdles
index=main source=*/banner_access* uri_path=/js/*/*/login/* guid=* useragent!=*KTXN* useragent!=*GomezAgent* clientip!=206.80.3.67 clientip!=198.144.207.62 clientip!=97.65.63.66 clientip!=175.45.37.78 clientip!=209.119.210.194 clientip!=212.36.37.138 clientip!=204.156.84.0/24 clientip!=216.221.226.0/24 clientip!=207.87.200.162
| rex field=uri_path "/js/(?<t>[^/]*)/(?<v>[^/]*)/login/(?<l>[^/]*)"
| eval license = case(l LIKE "prod%" AND t="pro", "enterprise", l LIKE "trial%" AND t="pro", "trial", t="free", "free")
| rex field=v "^(?<vers>\d.\d)"
| bin span=1d _time as day
| stats values(vers) as vers min(day) as min_day min(eval(if(vers=="5.0", _time, null()))) as min_day_50 dc(day) as days values(license) as license by guid
| eval type = if(match(vers,"4.*"), "upgrade", "not upgrade") + "/" + if(days > 1, "repeat", "not repeat")
| search license=enterprise
| eval _time = min_day_50
| timechart count by type
| streamstats sum(*) as *
• Simple searches are easy… multi-stage munging/reporting is hard!
• You need to understand the data's structure to construct a search
• Non-technical users may not have data source domain knowledge
• Splunk admins do not have end-user search context
8. Data Model Goals
• Make it easy to share/reuse domain knowledge
• Admins/power users build data models
• Non-technical users interact with data via the pivot UI
10. What is a Data Model?
A data model is a search-time mapping of data onto a hierarchical structure
Encapsulates the knowledge needed to build a search
Pivot reports are built on top of data models
Data-independent
11. A Data Model is a Collection of Objects
19. Object Attributes
Auto-extracted – default and predefined fields
Eval expression – a new field based on an expression that you define
Lookup – leverage an existing lookup table
Regular expression – extract a new field based on a regex
Geo IP – add geolocation fields such as latitude, longitude, country, etc.
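Conceptually, each attribute type maps onto a familiar search-language operation. A rough SPL sketch for intuition only – the field names and the lookup table below are hypothetical, not taken from the deck:

```
Eval expression:     | eval is_error=if(status>=400, 1, 0)
Regular expression:  | rex field=uri_path "/js/(?<product>[^/]*)/"
Lookup:              | lookup http_status_codes status OUTPUT status_description
Geo IP:              | iplocation clientip
```

The difference is that attributes are declared once in the model and inherited by child objects, instead of being retyped in every search.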
20. Object Attributes
Set field types
Configure various flags
Note: Child object configuration can differ from parent
21. Best Practices
Use event objects as often as possible
– Benefit from data model acceleration
Resist the urge to use search objects instead of event objects!!
– Event based searches can be optimized better
Minimize object hierarchy depth when possible
– Constraint based filtering is less efficient deeper down the tree
Put the event object with the deepest tree (and the most matching results) first
– Model-wide acceleration covers only the first event object and its
descendants
22. Warnings!
Object constraints and attributes cannot contain pipes or subsearches
A transaction object requires at least one event or search object in the data model
Lookups used in attributes must be globally visible (or at least visible to the app
using the data model)
No versioning on data models (and objects)!
29. Using the Splunk Search Language
Object Search String
| datamodel <modelname> <objectID> search
Example:
| datamodel WebIntelligence HTTP_Request search
Behind the scenes:
sourcetype=access_* OR sourcetype=iis* uri=* uri_path=* status=* clientip=* referer=*
useragent=*
30. Under the hood: Pivot Search String Generation
Pivot search = object search + filters + reporting + formatting
Example:
(sourcetype=access_* OR sourcetype=iis*) status=2*
uri=* uri_path=* status=* clientip=* referer=* useragent=*
| stats count AS "Count of HTTP_Success" by "useragent"
| sort limit=0 "useragent" | fields - _span
| fields "useragent" "Count of HTTP_Success"
| fillnull "Count of HTTP_Success"
| fields "useragent" *
31. Using the Splunk Search Language
Pivot Search String
| pivot <modelname> <objectID> [statsfns, rowsplit, colsplit, filters, …]
Example:
| pivot WebIntelligence HTTP_Request count(HTTP_Request) AS "Count of HTTP_Request" SPLITROW status
AS "status" SORT 0 status
Behind the scenes:
sourcetype=access_* OR sourcetype=iis* uri=* uri_path=* status=* clientip=* referer=* useragent=*
| stats count AS "Count of HTTP_Request" by "status"
| sort limit=0 "status" | fields - _span
| fields "status", "Count of HTTP_Request"
| fillnull "Count of HTTP_Request"
| fields "status" *
32. Warnings
• | datamodel and | pivot are generating commands
– They must be at the beginning of the search string
• Use objectIDs, NOT user-visible object names
34. Data Model on Disk
Each data model is a separate JSON file
Lives in <myapp>/local/data/models (or <myapp>/default/data/models for pre-installed models)
Has associated conf stanzas and metadata
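For orientation, a heavily simplified sketch of what such a JSON file can look like. The exact schema varies by Splunk version, and the values here are illustrative rather than copied from a real model – which is one more reason hand-editing is unsupported:

```
{
  "modelName": "WebIntelligence",
  "displayName": "Web Intelligence",
  "objects": [
    {
      "objectName": "HTTP_Request",
      "parentName": "BaseEvent",
      "constraints": [ { "search": "sourcetype=access_* OR sourcetype=iis*" } ],
      "fields": [
        { "fieldName": "status", "type": "number", "required": true, "hidden": false }
      ]
    }
  ]
}
```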
35. Editing Data Model JSON
At your own risk!
Models edited via the UI are validated
Manually edited data models: NOT SUPPORTED
Exception: installing a new model by adding the file to
<myapp>/<local OR default>/data/models is probably okay
36. Deleting a Data Model
Use the UI for appropriate cleanup
Potential for bad state if manually deleting model on disk
37. Interacting With a Data Model
Use data model builder and pivot UI – safest option!
Use REST API – for developers (see docs for details)
Use | datamodel and | pivot Splunk search commands
39. Data Model Acceleration
Admin or power user: enables acceleration (backend magic happens behind the scenes)
Non-technical user: runs a pivot report
– With acceleration: run the search using on-disk acceleration
– No acceleration: kick off ad-hoc acceleration and run the search
40. Model-Wide Acceleration
Only accelerates the first event-based object and its descendants
Does not accelerate search- and transaction-based objects
Pivot search:
| tstats count AS "Count of HTTP_Success" from datamodel="WebIntelligence" where
(nodename="HTTP_Request") (nodename="HTTP_Request.HTTP_Success") prestats=true | stats count AS
"Count of HTTP_Success"
41. Ad-Hoc Object Acceleration
Kicks off acceleration on pivot page (re)load for non-accelerated models
and search/transaction objects
Amortizes the cost of ad-hoc acceleration over repeated pivoting on the
same object
Pivot search:
| tstats count AS "Count of HTTP_Success" from sid=1379116434.663 prestats=true | stats count AS
"Count of HTTP_Success"
Splunk 6 takes large-scale machine data analytics to the next level by introducing three breakthrough innovations:
Pivot – opens up the power of Splunk search to non-technical users with an easy-to-use drag-and-drop interface to explore, manipulate, and visualize data.
Data Model – defines meaningful relationships in underlying machine data, making the data more useful to a broader base of non-technical users.
Analytics Store – patent-pending technology that accelerates data models by delivering extremely high-performance data retrieval for analytical operations, up to 1000x faster than Splunk 5.
Let's dig into each of these new features in more detail.
What is a Data Model, and why do I care?
Building a Data Model
Management, Acceleration, and Beyond
The Future!
Q&A
- The Splunk search language is very expressive.
- It can perform a wide variety of tasks, ranging from filtering to data munging and reporting.
- There are various search commands for complex transformations and statistics (e.g., correlation, prediction, etc.)
What does the search do?
First, it normalizes the individual accesses, which should be representable as a model object.
Next, it aggregates by guid to create an "instance" object, which should also be representable in a DM.
It calculates a field on that instance object, "type".
Then it builds a timechart of those, using a special "_time" value.
The search language has low overhead to start, but the learning curve quickly gets steep.
Obtaining website usage metrics should not require understanding the Apache vs. IIS log format.
Admins won't know a priori what questions are being asked of the data, so they can't provide canned dashboards for all scenarios.
Backup search for example: eventtype=pageview | eval stage_2=if(searchmatch("uri=/download*"), _time, null()) | eval stage_1=if(searchmatch("uri=/product*"), _time, null()) | eval stage_3=if(searchmatch("uri=*download_track*"), _time, null()) | stats min(stage_*) as stage_* by cookie | search stage_1=* | where isnull(stage_2) OR stage_2 >= stage_1 | where isnull(stage_3) OR stage_3 >= stage_2 | eval stage = case(isnull(stage_2), "stage_1", isnull(stage_3), "stage_2", 1==1, "stage_3") | stats count by stage | reverse | accum count as cumulative_count | reverse | streamstats current=f max(cumulative_count) as stage_1_count last(cumulative_count) as prev_count
What are the important "things" in your data?
E.g., WebIntelligence might have:
HTTPAccess
HTTPSuccess
User Session
How are they related?
There's more than one "right" way to define your objects.
Constraints filter down to a subset of the data.
Attributes are the fields and knowledge associated with the object.
Both are inherited!
A child object is a type of its parent object: e.g., an HTTP_Success object is a type of HTTP_Access.
Adding a child object is essentially a way of adding a filter on the parent.
A parent-child relationship makes it easy to do queries like "What percentage of my HTTP_Access events are HTTP_Success events?"
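That percentage question can be answered with a short search over the parent object. A sketch against the deck's WebIntelligence model – assuming successes are the 2xx statuses, and noting this is just one of several ways to write it:

```
| datamodel WebIntelligence HTTP_Request search
| stats count AS total count(eval(like(status, "2%"))) AS success
| eval pct_success = round(100 * success / total, 1)
```

Because the child's constraint is just a filter on the parent's events, the numerator and denominator come from a single pass over the parent object.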
Constraints are essentially the search broken down into a hierarchy, attributes are the associated fields and knowledge
Search objects are arbitrary searches that may include transforming commands to define the dataset that they represent.
Enable the creation of objects that represent transactions.
Use fields that have already been added to the model via event or search objects.
This is how we capture knowledge
Required: only events that contain this field will be returned in Pivot.
Optional: the field doesn't have to appear in every event.
Hidden: the field will not be displayed to Pivot users when they select the object in Pivot. Use this for fields that are only being used to define another attribute, such as an eval expression.
Hidden & Required: only events that contain this field will be returned, and the field will be hidden from use in Pivot.
Be careful about lookup permissions – must be available in the context where you want to use them