Ready to send your data to Sumo Logic? Learn the details of data collection, including:
- Installed versus Hosted Collectors
- Deployment Options and Best Practices
- Creating your Sources
- Processing Rules
- Local File Configuration Management
- Collector Management API
** This webinar is intended for Administrators with access to create Data Collectors.
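As a sketch of the "Local File Configuration Management" topic above: an Installed Collector can read its Source definitions from a local JSON file instead of the UI. The source name, path, and category below are hypothetical placeholders.

```json
{
  "api.version": "v1",
  "sources": [
    {
      "sourceType": "LocalFile",
      "name": "nginx-access",
      "pathExpression": "/var/log/nginx/access.log",
      "category": "prod/web/nginx",
      "multilineProcessingEnabled": false
    }
  ]
}
```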
QuickStart your Sumo Logic service with this exclusive webinar. At these monthly live events you will learn how to capitalize on critical capabilities that can amplify your log analytics and monitoring experience while providing you with meaningful business and IT insights.
Live Webinar is found here: https://youtu.be/Q1yWlInxWVs
The document discusses various components of the ELK stack including Elasticsearch, Logstash, Kibana, and how they work together. It provides descriptions of each component, what they are used for, and key features of Kibana such as its user interface, visualization capabilities, and why it is used.
How to Design, Build and Map IT and Business Services in Splunk (Splunk)
Your IT department supports critical business functions, processes and products. You're most effective when your technology initiatives are closely aligned and measured with specific business objectives. This session covers best practices and techniques for designing and building an effective service model, using the domain knowledge of your experts and capturing and reporting on key metrics that everyone can understand. We will design a sample service model and map it to performance indicators to track operational and business objectives. We will also show you how to make Splunk service-aware with Splunk IT Service Intelligence (ITSI).
Sumo Logic - Optimizing Your Search Experience (2016-08-17) (Sumo Logic)
The document discusses optimizing searches in Sumo Logic. It covers basic search structure, setting performance expectations, and optimization tools like field extraction rules, partitions, and scheduled views. Field extraction rules extract fields during ingestion to standardize searches and simplify parsing. Partitions divide data to improve search performance by searching smaller chunks. Scheduled views pre-aggregate data to significantly improve performance for selective queries and long-term trend analysis. The document provides recommendations on when and how to use these optimization tools to improve search performance.
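As a hedged illustration of the partition idea above, a search can be scoped to a partition with the `_index` field so only that smaller slice of data is scanned; the partition name and log format here are hypothetical.

```
_index=prod_web
| parse "status=* " as status_code
| count by status_code
```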
Hive Bucketing in Apache Spark with Tejas Patil (Databricks)
Bucketing is a partitioning technique that can improve performance in certain data transformations by avoiding data shuffling and sorting. The general idea of bucketing is to partition, and optionally sort, the data based on a subset of columns while it is written out (a one-time cost), while making successive reads of the data more performant for downstream jobs if the SQL operators can make use of this property. Bucketing can enable faster joins (i.e. single stage sort merge join), the ability to short circuit in FILTER operation if the file is pre-sorted over the column in a filter predicate, and it supports quick data sampling.
In this session, you’ll learn how bucketing is implemented in both Hive and Spark. In particular, Patil will describe the changes in the Catalyst optimizer that enable these optimizations in Spark for various bucketing scenarios. Facebook’s performance tests have shown bucketing to improve Spark performance by 3-5x when the optimization is enabled. Many tables at Facebook are sorted and bucketed, and migrating these workloads to Spark has resulted in 2-3x savings compared to Hive. You’ll also hear about real-world applications of bucketing, like loading cumulative tables with daily deltas, and the characteristics that can help identify suitable candidate jobs that can benefit from bucketing.
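The core idea of bucketing can be sketched in a few lines: rows are assigned to a fixed number of buckets by hashing the bucketing column, so two tables bucketed the same way can be joined bucket-by-bucket without a shuffle. Spark and Hive use their own hash functions (e.g. Murmur3); the built-in `hash` below is a stand-in for illustration only.

```python
def bucket_for(key, num_buckets):
    """Assign a row to a bucket by hashing its bucketing column.

    This mimics the idea behind Spark's bucketBy; the real engines use
    their own hash functions, so bucket assignments will differ.
    """
    return hash(key) % num_buckets

rows = [("user_1", 10), ("user_2", 25), ("user_1", 7)]
buckets = {}
for key, value in rows:
    buckets.setdefault(bucket_for(key, 4), []).append((key, value))

# All rows with the same key land in the same bucket, so a join on `key`
# between two tables bucketed the same way needs no shuffle or sort.
```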
The document discusses a plan to leverage machine learning to define a quality index for incoming data sets and flag outliers. It details data definitions, mapping, sanity checks, missing value imputation, outlier detection methods, and rules for different datasets. Patterns identified include skewed distributions for certain metrics, correlations between variables, and outliers for specific brands or categories. The goal is to generalize the solution through active learning and integrate it with Excel to flag issues preemptively.
Deep Dive into the New Features of Apache Spark 3.0 (Databricks)
Continuing with the objectives to make Spark faster, easier, and smarter, Apache Spark 3.0 extends its scope with more than 3000 resolved JIRAs. We will talk about the exciting new developments in Spark 3.0 as well as some other major initiatives that are coming in the future.
This document discusses how Kafka handles timestamps and offsets. It explains that Kafka maintains offset-based and time-based indexes to allow fetching log data by offset or timestamp. When new log records are appended, the indexes are updated with the largest offset and timestamp. If a record's timestamp is not larger than the last timestamp already in the time index, Kafka will still append the record to the log, but no new time index entry is added.
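The time-index behavior described above can be modeled in a few lines of Python. This is a toy sketch of the idea, not Kafka's actual implementation: index entries are only appended when timestamps increase, and lookups return the offset of the first entry at or after a target timestamp.

```python
class TimeIndex:
    """Toy model of a Kafka-style time index: (timestamp, offset) entries
    are only appended when timestamps increase monotonically."""

    def __init__(self):
        self.entries = []  # list of (timestamp, offset) pairs

    def maybe_append(self, timestamp, offset):
        # The record itself is always appended to the log; only the index
        # entry is skipped when the timestamp is not larger than the last.
        if not self.entries or timestamp > self.entries[-1][0]:
            self.entries.append((timestamp, offset))

    def lookup(self, target_ts):
        """Offset of the first indexed entry with timestamp >= target_ts."""
        for ts, off in self.entries:
            if ts >= target_ts:
                return off
        return None

idx = TimeIndex()
idx.maybe_append(100, 0)
idx.maybe_append(200, 1)
idx.maybe_append(150, 2)   # out-of-order timestamp: record kept, index unchanged
```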
Presented at the Machine Learning class at Chalmers, Gothenburg.
http://www.cse.chalmers.se/research/lab/courses.php?coid=9
Trying to connect their theoretical machine learning class with industry examples.
This document summarizes a team's presentation on sentiment analysis of Twitter data. It introduces the purpose of sentiment analysis and challenges of using Twitter data. It then describes two classification algorithms - a Multinomial Naïve Bayes classifier and a Recursive Deep Model based on Recursive Neural Tensor Networks. The team contributed improvements to the Recursive Deep Model and tested both algorithms on 1400 classified tweets, finding the Recursive Deep Model achieved higher accuracy but with much longer execution time. The conclusion suggests the Recursive Deep Model could be enhanced to support multiple languages.
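The Multinomial Naive Bayes classifier mentioned above can be sketched in plain Python. The toy tweets below are invented for illustration (they are not the 1400-tweet set from the presentation), and Laplace smoothing is used so unseen words do not zero out a class's probability.

```python
import math
from collections import Counter

def train_mnb(docs):
    """Train Multinomial Naive Bayes; `docs` is a list of (tokens, label)."""
    label_counts = Counter(label for _, label in docs)
    word_counts = {label: Counter() for label in label_counts}
    vocab = set()
    for tokens, label in docs:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab, len(docs)

def predict(model, tokens):
    label_counts, word_counts, vocab, n = model
    best, best_lp = None, -math.inf
    for label, lc in label_counts.items():
        lp = math.log(lc / n)  # log prior
        total = sum(word_counts[label].values())
        for t in tokens:
            # Laplace (+1) smoothing over the shared vocabulary
            lp += math.log((word_counts[label][t] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

model = train_mnb([
    (["love", "this", "great"], "pos"),
    (["happy", "great", "day"], "pos"),
    (["hate", "this", "awful"], "neg"),
    (["sad", "awful", "day"], "neg"),
])
```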
This document describes a student project on speech-based emotion recognition. The project uses convolutional neural networks (CNN) and mel-frequency cepstral coefficients (MFCC) to classify emotions in speech into categories like happy, sad, fearful, calm and angry. The proposed system provides advantages over existing systems by allowing variable length audio inputs, faster processing, and real-time classification of more emotion categories. It achieves a test accuracy of 91.04% according to the document.
MMCF: Multimodal Collaborative Filtering for Automatic Playlist Continuation (Hojin Yang)
The slides used for the presentation at the RecSys Challenge Workshop 2018. The challenge was co-organized by Spotify. Our team ('hello world!') won 2nd place.
HIVE: Data Warehousing & Analytics on Hadoop (Zheng Shao)
Hive is a data warehousing system built on Hadoop that allows users to query data using SQL. It addresses issues with using Hadoop for analytics like programmability and metadata. Hive uses a metastore to manage metadata and supports structured data types, SQL queries, and custom MapReduce scripts. At Facebook, Hive is used for analytics tasks like summarization, ad hoc analysis, and data mining on over 180TB of data processed daily across a Hadoop cluster.
Centralized Log Management with the Elastic Stack (Rich Lee)
Centralized log management is implemented using the Elastic Stack including Filebeat, Logstash, Elasticsearch, and Kibana. Filebeat ships logs to Logstash which transforms and indexes the data into Elasticsearch. Logs can then be queried and visualized in Kibana. For large volumes of logs, Kafka may be used as a buffer between the shipper and indexer. Backups are performed using Elasticsearch snapshots to a shared file system or cloud storage. Logs are indexed into time-based indices and a cron job deletes old indices to control storage usage.
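The retention step described above (a cron job deleting old time-based indices) can be sketched as a small helper. The `logs-YYYY.MM.DD` naming and the retention window are assumptions for illustration; in practice a tool like Curator or ILM handles this.

```python
from datetime import date, timedelta

def indices_to_delete(existing, today, retention_days):
    """Return daily indices (named 'logs-YYYY.MM.DD') older than the
    retention window; a cron job would then delete these to cap storage."""
    cutoff = today - timedelta(days=retention_days)
    stale = []
    for name in existing:
        day = date(*map(int, name.removeprefix("logs-").split(".")))
        if day < cutoff:
            stale.append(name)
    return stale

names = ["logs-2023.01.01", "logs-2023.01.10", "logs-2023.01.15"]
```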
This document discusses anomaly detection techniques for intrusion detection systems. It begins by defining anomalies and explaining the principles of anomaly detection models. It then describes some key challenges in anomaly detection and different types of outputs it can provide. The document proceeds to classify anomaly detection techniques into statistical, machine learning and data mining based methods. As examples, it examines several case studies of early statistical anomaly detection systems like Haystack and IDES.
Learning to Rank in Solr: Presented by Michael Nilsson & Diego Ceccarelli, Bl... (Lucidworks)
This document summarizes Bloomberg's use of machine learning for search ranking within their Solr implementation. It discusses how they process 8 million searches per day and need machine learning to automatically tune rankings over time as their index grows to 400 million documents. They use a Learning to Rank approach where features are extracted from queries and documents, training data is collected, and a ranking model is generated to optimize metrics like click-through rates. Their Solr Learning to Rank plugin allows this model to re-rank search results in Solr for improved relevance.
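The re-ranking step described above can be sketched as a linear model over per-document features. The feature names and weights below are invented for illustration; in a real Learning to Rank setup the weights come from a model trained on signals like click-through data.

```python
def rerank(results, weights):
    """Re-rank candidate results by a linear score over their features,
    loosely mirroring how an LTR plugin re-scores the top-k documents."""
    def score(doc):
        return sum(weights[f] * v for f, v in doc["features"].items())
    return sorted(results, key=score, reverse=True)

results = [
    {"id": "a", "features": {"bm25": 1.2, "recency": 0.1}},
    {"id": "b", "features": {"bm25": 0.8, "recency": 0.9}},
]
weights = {"bm25": 1.0, "recency": 2.0}  # hypothetical trained weights
```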
This session covers how unit testing of Spark applications is done and the best way to do it, including writing unit tests with and without the Spark Testing Base package, a Spark package containing base classes to use when writing tests with Spark.
Kibana + Timelion: time series with the Elastic Stack (Sylvain Wallez)
The document discusses Kibana and Timelion, which are tools for visualizing and analyzing time series data in the Elastic Stack. It provides an overview of Kibana's evolution and capabilities for creating dashboards. Timelion is introduced as a scripting language that allows users to transform, aggregate, and calculate on time series data from multiple sources to create visualizations. The document demonstrates Timelion's expression language, which includes functions, combinations, filtering, and attributes to process and render time series graphs.
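As a hedged sketch of what a Timelion expression looks like, the following combines two series from Elasticsearch, smooths one, and labels both; the index pattern and query string are hypothetical.

```
.es(index=logs-*, q=*).label("all events"),
.es(index=logs-*, q="status:500").movingaverage(10).label("500s, smoothed")
```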
This document discusses issues with using synonyms in Solr search queries and indexing. It covers two main issues:
1. Index-time "sausagization" where multi-term synonyms are incorrectly treated as single terms during indexing, leading to unexpected phrase matches and non-matches.
2. Interactions between filters that produce token graphs like synonyms and word delimiter filters, which can result in undefined query parsing behavior.
Workarounds discussed include splitting synonyms, injecting "semantic units", and configuring filters to not produce token graphs in certain combinations. The document recommends carefully configuring synonyms and filters to avoid these issues.
This document presents the main concepts of Elasticsearch, including its document-oriented architecture, indexing, search, failover, and scalability. It also demonstrates installation, interaction via the API, and indexing documents in Elasticsearch.
This document provides an overview of Lucene scoring and sorting algorithms. It describes how Lucene constructs a Hits object to handle scoring and caching of search results. It explains that Lucene scores documents by calling the getScore() method on a Scorer object, which depends on the type of query. For boolean queries, it typically uses a BooleanScorer2. The scoring process advances through documents matching the query terms. Sorting requires additional memory to cache fields used for sorting.
The document describes Apache Hive hooks, which allow intercepting function calls or events during query execution in Hive. It provides details on the different hook points in Hive, including pre-execution, post-execution, and failure hooks. It also explains how to configure hooks by setting hook properties and the jar paths for hook implementations. Finally, it outlines the interfaces and contexts provided to hooks at each stage of query processing in Hive.
The 8 Best Examples Of Real-Time Data Analytics (Bernard Marr)
Real-time analytics are already being used in a wide range of business applications, including cracking down on fake news and helping police make cities safer. Find out more amazing examples of how companies today are using streaming analytics in real life.
Elasticsearch
A search engine
Created in 2010 by Shay Banon
Based on Apache Lucene (adding multi-node support)
Developed in Java
Open source (Apache License)
The company was founded in 2012
The current version is 2.0
Official site: https://www.elastic.co/
by Harald Steck (Netflix Inc., US), Roelof van Zwol (Netflix Inc., US) and Chris Johnson (Spotify Inc., US)
Slides of the tutorial on interactive recommender systems at the 2015 conference on Recommender Systems (RecSys).
Interactive recommender systems enable the user to steer the received recommendations in the desired direction through explicit interaction with the system. In the larger ecosystem of recommender systems used on a website, an interactive recommender system is positioned between a lean-back recommendation experience and an active search for a specific piece of content. Beyond this aspect, we will discuss several parts that are especially important for interactive recommender systems, including: the design of the user interface and its tight integration with the algorithm in the back-end; the computational efficiency of the recommender algorithm; and choosing the right balance between exploiting the user's feedback to provide relevant recommendations and enabling the user to explore the catalog and steer the recommendations in the desired direction.
In particular, we will explore the field of interactive video and music recommendations and their application at Netflix and Spotify. We outline some of the user-experiences built, and discuss the approaches followed to tackle the various aspects of interactive recommendations. We present our insights from user studies and A/B tests.
The tutorial targets researchers and practitioners in the field of recommender systems, and will give the participants a unique opportunity to learn about the various aspects of interactive recommender systems in the video and music domain. The tutorial assumes familiarity with the common methods of recommender systems.
DATE: Wednesday, Sept 16, 2015, 11:00-12:30
The document discusses techniques for storing time series data at scale in a time series database (TSDB). It describes storing 16 bytes of data per sample by compressing timestamps and values. It proposes organizing data into blocks, chunks, and files to handle high churn rates. An index structure uses unique IDs and sorted label mappings to enable efficient queries over millions of time series and billions of samples. Benchmarks show the TSDB can handle over 100,000 samples/second while keeping memory, CPU and disk usage low.
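The timestamp-compression idea above can be illustrated with delta-of-delta encoding, the technique Gorilla-style TSDBs use to get close to the cited 16 bytes per sample: for regularly spaced samples the delta-of-deltas are mostly zero, so they bit-pack into a bit or two each instead of 8 raw bytes per timestamp. This sketch shows only the arithmetic, not the bit packing.

```python
def delta_of_delta(timestamps):
    """Encode timestamps as (first, first_delta, list of delta-of-deltas)."""
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    dods = [b - a for a, b in zip(deltas, deltas[1:])]
    return timestamps[0], deltas[0], dods

def decode(first, first_delta, dods):
    """Invert the encoding by accumulating deltas back into timestamps."""
    timestamps, delta = [first, first + first_delta], first_delta
    for dod in dods:
        delta += dod
        timestamps.append(timestamps[-1] + delta)
    return timestamps

ts = [1000, 1015, 1030, 1046, 1061]   # ~15s scrape interval with jitter
```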
This document provides an overview and introduction to Elasticsearch. It discusses the speaker's experience and community involvement. It then covers how to set up Elasticsearch and Kibana locally. The rest of the document describes various Elasticsearch concepts and features like clusters, nodes, indexes, documents, shards, replicas, and building search-based applications. It also discusses using Elasticsearch for big data, different search capabilities, and text analysis.
Dashboards are fantastic, but how do I get notified of critical events? This webinar will cover how to create alerts that will allow your team to effectively monitor business-critical events. Alert channels include email or webhooks into Slack, PagerDuty, DataDog, ServiceNow, or any other webhook you want to develop. What about running custom scripts triggered from alerts? Let's do it.
Video: https://www.sumologic.com/online-training/#QuickStart
So you've got the search and parsing basics down? Ready to learn more advanced operators? Join us and learn about:
LogReduce, LogCompare, Outlier, Predict, Join, Transaction and many more.
Bring your Graphite-compatible metrics into Sumo LogicSumo Logic
If you use open source Graphite software to monitor mission-critical applications, you know well the challenges of running, managing and scaling Graphite. Graphite may be fine to get started, but it brings significant cost, complexity, and total-cost-of-ownership headaches as your environment scales.
Sumo Logic provides the industry’s first machine data analytics platform to natively ingest, index and analyze metrics and log data together in real-time.
In this webinar, we will show a live demo of how to:
Ingest Graphite-compatible metrics into the Sumo Logic service
Analyze and dashboard the metrics to get real-time insights
Correlate Graphite metrics and logs to troubleshoot issues faster
See how easy it is to migrate from Graphite to Sumo Logic.
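"Graphite-compatible" here refers to the Graphite plaintext protocol: one newline-terminated `<metric.path> <value> <unix_timestamp>` line per metric, sent over TCP (conventionally port 2003). A minimal sketch; the host, port, and metric names below are placeholders, not values from this webinar:

```python
import socket
import time

def graphite_line(path, value, timestamp=None):
    """Format one metric in the Graphite plaintext protocol."""
    ts = int(timestamp if timestamp is not None else time.time())
    return f"{path} {value} {ts}\n"

def send_metric(host, port, path, value, timestamp=None):
    # A Graphite-compatible endpoint just reads newline-terminated
    # lines over a TCP connection.
    with socket.create_connection((host, port)) as sock:
        sock.sendall(graphite_line(path, value, timestamp).encode())

# send_metric("localhost", 2003, "web01.cpu.load", 0.42)  # placeholder endpoint
```

Because the format is this simple, pointing an existing Graphite pipeline at a different backend is mostly a matter of changing the destination host and port.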
Sumo Logic exposes the Search Job API for access to resources and log data from third-party scripts and applications.
Targeting experienced Sumo Administrators, this webinar shows you how to leverage the Search Job API to interact with the Sumo Logic service. Everyone attending should be familiar with the concepts of RESTful web services and JSON. Through theory and demo, this webinar covers:
Creating a Search Job
Checking Status of a Search Job
Paging through messages and records
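The three steps above can be sketched in a few lines. Endpoint paths follow the published v1 Search Job API; the base URL varies by deployment region, and field names should be checked against the current API docs rather than taken from this sketch:

```python
import json
from urllib import request

BASE = "https://api.sumologic.com/api/v1"  # region-specific in practice

def job_payload(query, time_from, time_to, tz="UTC"):
    """Body for POST {BASE}/search/jobs, which creates the search job."""
    return {"query": query, "from": time_from, "to": time_to, "timeZone": tz}

def page_offsets(total, page_size):
    """Offsets to request when paging through messages or records."""
    return list(range(0, total, page_size))

def create_job(opener, payload):
    # `opener` is a urllib OpenerDirector carrying your credentials.
    req = request.Request(f"{BASE}/search/jobs",
                          data=json.dumps(payload).encode(),
                          headers={"Content-Type": "application/json"})
    with opener.open(req) as resp:
        return json.load(resp)["id"]

# Then poll GET {BASE}/search/jobs/{id} until the job reports it is done
# gathering results, and fetch {BASE}/search/jobs/{id}/messages?offset=O&limit=N
# for each offset from page_offsets().
```

For example, 2,500 matching messages fetched 1,000 at a time means three requests, at offsets 0, 1000, and 2000.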
How to Webinar: Monitoring through AlertsSumo Logic
How do I get notified of critical events? This webinar will cover how to create alerts that will allow your team to effectively monitor business-critical events. Alert channels include email or webhooks into Slack, PagerDuty, DataDog, ServiceNow, or any other webhook you want to develop. What about running custom scripts triggered from alerts? Let's do it.
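A webhook alert channel ultimately delivers a JSON payload via HTTP POST. As a concrete example, Slack's incoming webhooks accept a body with a `text` field; the payload wording and webhook URL below are illustrative, not the product's exact alert format:

```python
import json
from urllib import request

def alert_payload(search_name, message_count, url):
    """Build a minimal Slack-style webhook body for an alert."""
    return {"text": f"Alert: '{search_name}' matched {message_count} messages. {url}"}

def post_webhook(webhook_url, payload):
    # Slack incoming webhooks accept a JSON POST with Content-Type set.
    req = request.Request(webhook_url,
                          data=json.dumps(payload).encode(),
                          headers={"Content-Type": "application/json"})
    return request.urlopen(req)

# post_webhook("https://hooks.slack.com/services/XXX",   # placeholder URL
#              alert_payload("5xx spike", 42, "https://example.com/search"))
```

Custom scripts triggered from alerts work the same way in reverse: your endpoint receives this POST and runs whatever logic you like.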
Introduction to LogCompare - Reducing MTTI/MTTR with EaseSumo Logic
For users looking to monitor the performance of their code within any given application, log data can provide valuable insights that can help to reduce MTTI/MTTR and speed up the troubleshooting process. In this presentation, we detail how Sumo Logic's LogCompare feature allows users to quickly and easily compare baseline patterns and troubleshoot issues much more quickly.
How Netskope Mastered DevOps with Sumo LogicSumo Logic
This webinar discusses how the leader in cloud app analytics and policy enforcement uses Sumo Logic to ensure optimal performance, availability and security of their cloud platform.
Sumo Logic Co-Founder & VP of Engineering, Kumar Saurabh, joins Netskope VP of Engineering, Abhay Kulkarni, to run a LIVE demo and discuss how Netskope:
- Was able to set up the Sumo Logic service within a single day in various data centers across the world
- Rapidly identifies and troubleshoots issues across hundreds of servers and virtual machines
- Leverages real-time alerts to fix issues to deliver a reliable service
- Makes informed business decisions by analyzing core user behaviors
- Uses out-of-the-box applications such as Nginx and Apache
Marcel Kornacker, Software Engineer at Cloudera - "Data modeling for data sci...Dataconomy Media
The document discusses how relational databases are optimized for flat schemas but much real-world data uses complex schemas. It advocates for using intentional complex schemas to simplify analytic workloads. It describes how SQL engines like Impala can handle complex schemas through extensions like supporting nested data types of struct, map, and array to allow full SQL expressiveness over nested data. Columnar storage is also important for efficiently processing complex schemas.
Sumo Logic Webinar: Visibility into your Host MetricsSumo Logic
This document summarizes a Sumo Logic webinar on ingesting and querying host metrics. The webinar covers installing Sumo Logic collectors to ingest host metrics, querying metrics and building dashboards, using the out-of-the-box Host Metrics app, understanding CloudWatch and Graphite protocols, and the metrics feature roadmap. It provides an overview of Sumo Logic's data flow and how customers can currently get custom metrics via Grafana, StatsD, and CollectD integration.
QuickStart your Sumo Logic service with this exclusive webinar. At these monthly live events you will learn how to capitalize on critical capabilities that can amplify your log analytics and monitoring experience while providing you with meaningful business and IT insights.
https://www.sumologic.com/online-training/#start
QuickStart your Sumo Logic service with this exclusive webinar. At these monthly live events you will learn how to capitalize on critical capabilities that can amplify your log analytics and monitoring experience while providing you with meaningful business and IT insights.
Video: https://www.sumologic.com/online-training/#start
How Hudl and Cloud Cruiser Leverage Sumo Logic's Unified Logs and MetricsSumo Logic
In this presentation, you will learn how Hudl, a leading software company revolutionizing the way coaches and athletes prepare for and stay ahead of the competition, and Cloud Cruiser, a cloud-based financial management analytics software provider, are leveraging Sumo Logic's Unified Logs and Metrics platform to improve application health and management. We cover:
- The current challenges in managing application and infrastructure with disparate log and metrics tools at these leading IT organizations
- The benefits of adopting a unified log and metrics analytics solution
- Best practices in improving application and infrastructure availability and performance
Sumo Logic QuickStart Webinar - Dec 2016Sumo Logic
QuickStart your Sumo Logic service with this exclusive webinar. At these monthly live events you will learn how to capitalize on critical capabilities that can amplify your log analytics and monitoring experience while providing you with meaningful business and IT insights.
Video: https://www.sumologic.com/online-training/#start
Standing Up an Effective Enterprise Data Hub -- Technology and BeyondCloudera, Inc.
Federal organizations increasingly are focused on creating environments that enable more data-driven decisions. Yet ensuring that all data is considered and is current, complete, and accurate is a tall order for most. To make data analytics meaningful to support real-world transformation, agency staff need business tools that provide user-friendly dashboards, on-demand reporting, and methods to manage efficiently the rise of voluminous and varied data sets and types commonly associated with big data. In most cases, existing systems are insufficient to support these requirements. Enter the enterprise data hub (EDH), a software architecture specifically designed to be a unified platform that can economically store unlimited data and enable diverse access to it at scale. Plan to attend this discussion to understand the key considerations to making an EDH the architectural center of your agency’s modern data strategy.
The Future of Data Management: The Enterprise Data HubCloudera, Inc.
The document discusses the future of data management through the use of an enterprise data hub (EDH). It notes that an EDH provides a centralized platform for ingesting, storing, exploring, processing, analyzing and serving diverse data from across an organization on a large scale in a cost effective manner. This approach overcomes limitations of traditional data silos and enables new analytic capabilities.
How to Reduce your MTTI/MTTR with a Single ClickSumo Logic
Learn how Sumo Logic enables users to drastically reduce MTTI and MTTR with a single click. In this deck, we cover:
- The power of log analytics for faster troubleshooting and root-cause analysis
- How machine learning & pattern recognition enable faster MTTI and MTTR
- How Sumo Logic's LogReduce and LogCompare features are helping users gain better control of their applications
Enterprise Data Hub: The Next Big Thing in Big DataCloudera, Inc.
If you missed Strata + Hadoop World, you missed quite a bit. This year's event was packed with Big Data practitioners across industries who shared their experiences and how they are driving new innovations like never before. Just because you weren't there, doesn't mean you missed out.
In this session, we'll touch on a few of the key highlights from the show, including:
Key trends in Big Data adoption
The enterprise data hub
How the enterprise data hub is used in practice
Get Certified as a Sumo Power User!
Video: https://www.sumologic.com/online-training/#Start
Designed for users, this series deep-dives into every aspect of analyzing your data. Run as a "how-to" webinar, this session walks viewers through data searching, filtering, parsing, and advanced analytics. This series concludes with "how-to" details to create dashboards and alerts to monitor your data and get Sumo Logic to work for you.
Sumo Logic QuickStart Webinar - Jan 2016Sumo Logic
QuickStart your Sumo Logic service with this exclusive webinar. At these monthly live events you will learn how to capitalize on critical capabilities that can amplify your log analytics and monitoring experience while providing you with meaningful business and IT insights
Level 3 Certification: Setting up Sumo Logic - Oct 2018Sumo Logic
Get Certified as a Sumo Power Admin!
Designed for Administrators, this course shows you how to set up your data collection according to your organization’s data sources. Best practices around deployment options ensure you choose a deployment that scales as your organization grows. Because metadata is so important to a healthy environment, learn how to design and set up a naming convention that works best for your teams. Use Chef, Puppet, or the like? Learn how to automate your deployment. Test your deployment with simple searches, and learn to take advantage of optimization tools that can help you stay on top of your deployment.
Webinar: https://www.sumologic.com/online-training/#SettingUpSumo
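Automating collector setup with Chef, Puppet, or similar tools usually comes down to templating a JSON source-configuration file per host. The field names below follow Sumo Logic's local file configuration documentation, but the paths, source name, and category are illustrative assumptions, not values from this course:

```python
def local_file_source(name, path, category):
    """Build a minimal local-file source configuration (sources.json style)."""
    return {
        "api.version": "v1",
        "sources": [{
            "sourceType": "LocalFile",
            "name": name,
            "pathExpression": path,
            # _sourceCategory drives searching, so bake the team's
            # naming convention into this field:
            "category": category,
        }],
    }

cfg = local_file_source("web-access", "/var/log/nginx/access.log",
                        "prod/web/nginx/access")
# Serialized with json.dumps(cfg), this is what a config-management tool
# would write to the collector's source-configuration file.
```

This is exactly where a deliberate metadata naming convention pays off: a hierarchical category like `prod/web/nginx/access` makes later searches both readable and fast to scope.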
Sumo Logic Quickstart Training 10/14/2015Sumo Logic
QuickStart your Sumo Logic service with this exclusive webinar. At these monthly live events you will learn how to capitalize on critical capabilities that can amplify your log analytics and monitoring experience while providing you with meaningful business and IT insights
This document discusses anatomy of cloud hacks by analyzing past data breaches and vulnerabilities. It begins by looking at known attacks where compromised infrastructure was based in the cloud. Specific case studies of attacks on Code Spaces, Olindata, and Tesla are described. The document then covers techniques for enumerating cloud services and resources like storage containers. Methods for gaining an initial foothold like leaked credential hunting and exploiting server-side request forgery are also outlined.
Microsoft Sentinel provides cloud-native SIEM and SOAR capabilities powered by AI and automation. It can integrate with various components like servers, cloud servers, network devices, firewalls, and security solutions to provide global visibility of IT security. The implementation includes event analysis, automation of incident response, and creation of dashboards and reports. It also provides log retention, data integrity, fault tolerance, and integration with third-party services and APIs.
Using AWS to Build a Scalable Big Data Management & Processing Service (BDT40...Amazon Web Services
This document summarizes Christian Beedgen's presentation on using AWS to build a scalable machine data analytics service. The presentation covers the architecture of Sumo Logic's service, which ingests machine-generated log data from customers in near real-time and performs analytics. It discusses how the service is built as loosely coupled microservices deployed across AWS with automation. Challenges of scaling such a distributed system are also addressed.
Hack proof your aws cloud cloudcheckr_040416Jarrett Plante
Migrating from the data center to the cloud requires us to rethink much of what we do to secure our applications. The idea of physical security morphs as infrastructure becomes virtualized by AWS APIs. In a new world of ephemeral, auto-scaling infrastructure, you need to adapt your security architecture to meet both compliance and security threats.
In the presentation we will cover topics including:
- Minimizing attack vectors and surface area
- Perimeter assessments of your VPCs
- Internal vs. External threats
- Monitoring threats
- Re-evaluating Intrusion Detection, Activity Monitoring, and Vulnerability Assessment in AWS
This document discusses how to use Azure Sentinel and Microsoft Defender ATP to catch cyber threats. It provides an overview of the Microsoft security ecosystem and capabilities of Azure Sentinel and Defender ATP. Specifically, it outlines how to enable various data sources, design detection rules, and conduct hunting queries using these solutions.
Using AWS To Build A Scalable Machine Data Analytics ServiceChristian Beedgen
Christian Beedgen presented on using AWS to build a scalable machine data analytics service. He discussed Sumo Logic's architecture which uses loosely coupled AWS components like S3, DynamoDB, and EC2 to ingest, index, analyze and query large volumes of machine log data in real-time. Deployment is automated using tools like Jenkins, and components are deployed across availability zones for high availability. The system scales horizontally by sharding data and queries by customer account.
DEF CON 24 - workshop - Craig Young - brainwashing embedded systemsFelipe Prado
Firmware analysis often involves searching firmware images for known file headers and file systems like SquashFS to extract contained files. Automated binary analysis tools like binwalk can help extract files from images. HTTP interfaces are common targets for security testing since they are often exposed without authentication. Testing may uncover vulnerabilities like XSS, CSRF, SQLi or command injection. Wireless interfaces also require testing to check for issues like weak encryption or exposure of credentials in cleartext.
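The header-scanning step that binwalk automates can be illustrated in a few lines: slide over a firmware blob looking for known magic bytes. The magics below are real (gzip, little-endian SquashFS, U-Boot legacy image); the blob is a toy example:

```python
# Known file-header magic bytes, as a firmware carver would use them.
MAGICS = {
    b"\x1f\x8b": "gzip",           # gzip member header
    b"hsqs": "SquashFS (LE)",      # little-endian SquashFS superblock
    b"\x27\x05\x19\x56": "uImage", # U-Boot legacy image header
}

def scan(blob):
    """Return (offset, type) for every magic-byte match in the blob."""
    hits = []
    for offset in range(len(blob)):
        for magic, name in MAGICS.items():
            if blob.startswith(magic, offset):
                hits.append((offset, name))
    return hits

# Toy "firmware": padding, a SquashFS superblock at offset 4,
# more padding, then a gzip header at offset 16.
firmware = b"\x00" * 4 + b"hsqs" + b"\x00" * 8 + b"\x1f\x8b\x08"
# scan(firmware) -> [(4, 'SquashFS (LE)'), (16, 'gzip')]
```

Real tools then validate each hit (checking lengths and checksums) before extracting, since two-byte magics like gzip's produce false positives on random data.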
QuickStart your Sumo Logic service with this exclusive webinar. At these monthly live events you will learn how to capitalize on critical capabilities that can amplify your log analytics and monitoring experience while providing you with meaningful business and IT insights.
Deep Dive on Accelerating Content, APIs, and Applications with Amazon CloudFr...Amazon Web Services
This document provides an overview of Amazon CloudFront and Lambda@Edge. It discusses how CloudFront is a global content delivery network that can accelerate content delivery, including both static and dynamic content. It also introduces Lambda@Edge, which allows running code at the edge using AWS Lambda. Lambda@Edge functions can be triggered by CloudFront events to customize content delivery, such as modifying requests and responses. The document provides details on CloudFront pricing and architecture, including how it uses edge locations globally to improve performance.
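A Lambda@Edge function triggered by a CloudFront viewer-request event receives the request under `event["Records"][0]["cf"]["request"]` and returns it (possibly modified) to let processing continue. A minimal sketch under those assumptions; the injected header name is hypothetical:

```python
def handler(event, context):
    """Viewer-request Lambda@Edge handler: tag the request and pass it on."""
    request = event["Records"][0]["cf"]["request"]
    headers = request["headers"]
    # CloudFront represents each header as a list of {key, value} dicts,
    # keyed by the lowercase header name.
    headers["x-edge-processed"] = [{"key": "X-Edge-Processed", "value": "true"}]
    return request

# A stub of the CloudFront event shape, for local experimentation:
fake_event = {"Records": [{"cf": {"request": {"uri": "/index.html",
                                              "headers": {}}}}]}
# handler(fake_event, None) returns the request with the header injected.
```

Returning a response object instead of the request (with `status` and `body` fields) is how such functions short-circuit and serve content directly from the edge.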
Webinar here: https://youtu.be/MEmFFwNmLxg
Sumo Logic "How To" Webinar - Monitoring your Data: Alerting on Outliers
Dashboards are fantastic, but how do I get notified of critical events? This webinar will cover how to create alerts that will allow your team to effectively monitor business-critical events. Alert channels include email or webhooks into Slack, PagerDuty, DataDog, ServiceNow, or any other webhook you want to develop. What about running custom scripts triggered from alerts? Let's do it.
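Alerting on outliers, as the title above suggests, typically means flagging points that deviate from a trailing window by more than some number of standard deviations; Sumo Logic's outlier operator works on this principle, though the window and threshold parameters here are illustrative:

```python
from statistics import mean, stdev

def outliers(series, window=5, threshold=3.0):
    """Indices of points more than `threshold` stddevs from the
    mean of the preceding `window` points."""
    flagged = []
    for i in range(window, len(series)):
        ref = series[i - window:i]
        mu, sigma = mean(ref), stdev(ref)
        if sigma and abs(series[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged

latencies = [100, 102, 99, 101, 100, 98, 500, 101]
# outliers(latencies) -> [6], the 500 ms spike at index 6
```

An alert channel then fires whenever this list is non-empty for the most recent evaluation window.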
This document provides an overview of a Sumo Logic webinar on getting started with Sumo Logic. The webinar covers understanding data collection, searching, parsing and analyzing data, visualizing data through dashboards and alerts, and taking advantage of apps and the content library. It also discusses topics like continuous intelligence, the Sumo Logic data flow, collecting and parsing data, searching and analyzing data, dashboards, alerts, and apps.
Similar to "How to" Webinar: Sending Data to Sumo Logic (20)
This document provides an agenda and overview for a Sumo Logic webinar training session. The agenda includes sections on data collection, search and analysis, and visualizing and monitoring. It discusses Sumo Logic's analytics platform and data flow. It also provides instructions for logging into a training environment and demonstrates examples of searching log data and creating dashboards and alerts.
This document provides instructions for a Sumo Logic welcome webinar, including login information, activities to complete on the Sumo Logic platform, and an overview of advanced analytics functions like geo lookup, outlier detection, prediction, log reduction, log comparison, and creating alerts and dashboards. Participants are guided through exploring sample log data, running log searches, and setting up a dashboard and alert. Contact information is provided for attendees to get additional help or training on using Sumo Logic.
Sumo Logic Cert Jam - Advanced Metrics with KubernetesSumo Logic
This document provides an overview of a training course on using Kubernetes on Sumo Logic. The course teaches participants how to:
1. Discover and explore Kubernetes data and metadata in Sumo Logic, including hands-on labs to identify metadata and search with metadata.
2. Install apps, partner apps, and pre-built dashboards for Kubernetes monitoring.
3. Monitor, troubleshoot, and create alerts using techniques like the Explore tab and custom dashboards.
4. Get certified in Kubernetes on Sumo Logic by taking an exam at the end of the course.
Sumo Logic Cert Jam - Security & ComplianceSumo Logic
This document outlines an agenda for a Sumo Logic Security and Compliance certification course. The agenda includes a presentation and hands-on labs covering topics like building a starter SOC dashboard, exporting dashboards, and compliance. It also includes an introduction to security and compliance and a Sumo Logic certification exam. Hands-on labs will guide students through building dashboards and using features like lookup filters, compliance apps, and integrating threat intelligence from CrowdStrike. The course aims to help students detect, investigate, and respond to security threats in real-time using Sumo Logic's centralized log management platform.
Sumo Logic Cert Jam - Advanced Metrics with KubernetesSumo Logic
This document outlines an agenda for a course to become certified as a Sumo Kubernetes Analyst. The course will provide an introduction to Kubernetes and Sumo Logic's monitoring capabilities, including four different views into Kubernetes systems. Attendees will participate in hands-on labs and have the opportunity to get certified through an online exam.
This document provides an agenda for a Sumo Metrics Analyst certification course. The course covers collecting, analyzing, and monitoring metrics using Sumo Logic. It includes hands-on labs on collecting host and AWS metrics, analyzing metric formats, converting logs to metrics, and creating dashboards and alerts. The course aims to help students master metrics and earn a Sumo Logic certification by passing an online exam at the end.
Sumo Logic Cert Jam - Security AnalyticsSumo Logic
With security threats on the rise, come join our Security and Compliance experts to learn how Sumo Logic’s Threat Intelligence can help you stay on top of your environment by matching IOCs like IP addresses, domain names, URLs, email addresses, MD5 hashes and more, to increase the velocity and accuracy of threat detection. Hands-on labs help cement the knowledge learned.
Designed for all Sumo users, this series deep-dives into every aspect of analyzing your data. Run as a "how-to" webinar, this session walks viewers through data searching, filtering, parsing, and advanced analytics. This series concludes with "how-to" details to create dashboards and alerts to monitor your data and get Sumo Logic to work for you.
Brand new to Sumo Logic? Get started with these 5 easy steps and get certified!
Learn the basics for how to search, parse and analyze the logs and metrics that are important to your organization. This session will guide you through running searches, simple parsing and basic analytics on your data. Learn how to convert your queries to charts and add them to Dashboards to help you visualize trends and easily identify anomalies. Lastly, learn how Alerts can help you stay on top of your critical events.
Sumo Logic Cert Jam - Fundamentals (Spanish)Sumo Logic
This document presents the 5 steps to becoming a certified Sumo Logic Fundamentals user. It explains how Sumo Logic can help monitor data, search and analyze logs, and create alerts and dashboards. It also provides information on how to take the Sumo Logic Fundamentals certification exam.
This document outlines the agenda and logistics for a Sumo Metrics Certified Analyst training course. The course will teach students how to use a unified logs and metrics solution, collect metrics data, analyze metrics using tools and queries, and apply their knowledge through hands-on labs covering common use cases. These include monitoring host metrics, analyzing AWS metrics, working with different metric formats, and converting logs to metrics. Students will learn to visualize metrics using charts and dashboards, and configure metric monitors and alerts. Upon completing the course, students will take an online certification exam to test their mastery of the skills covered.
Security Certification: Security Analytics using Sumo Logic - Oct 2018Sumo Logic
Get Certified as a Sumo Security Power User!
With security threats on the rise, come join our Security and Compliance experts to learn how Sumo Logic’s Threat Intelligence can help you stay on top of your environment by matching IOCs like IP addresses, domain names, URLs, email addresses, MD5 hashes and more, to increase the velocity and accuracy of threat detection. Hands-on labs help cement the knowledge learned.
Level 2 Certification: Using Sumo Logic - Oct 2018Sumo Logic
This document outlines the curriculum for the Sumo Logic Level 2 Certification. It covers advanced searching, parsing, filtering, and analytics techniques using Sumo Logic. It also covers visualizing and monitoring data through dashboards and alerts. Hands-on labs reinforce these skills. The goal is to help users make Sumo Logic work for them by monitoring trends, critical events, and learning from peer use cases.
Sumo Logic QuickStart Webinar - Get CertifiedSumo Logic
Video: https://www.sumologic.com/online-training/#start
Brand new to Sumo Logic?
Get started with these 5 easy steps. Learn how to capitalize on critical capabilities that can amplify your log analytics and monitoring experience while providing you with meaningful business and IT insights.
You Build It, You Secure It: Introduction to DevSecOpsSumo Logic
In this presentation, DevOps and DevSecOps expert John Willis dives into how to implement DevSecOps, including:
- Why traditional DevOps has shifted and what this shift means
- How DevSecOps can change the game for your team
- Tips and tricks for getting DevSecOps started within your organization
Making the Shift from DevOps to Practical DevSecOps | Sumo Logic WebinarSumo Logic
In this webinar, Sumo Logic VP of Security and Compliance George Gerchow dives into how to make the shift to DevSecOps, discussing how to:
- Incorporate fundamental and high impact security best practices into your current DevOps operations
- Gain visibility into your compliance posture
- Identify potential risks and threats in your environments
Machine Analytics: Correlate Your Logs and MetricsSumo Logic
To effectively manage your application, it’s critical to have visibility into both logs and metrics. Metrics provide app and infrastructure KPIs, while logs provide context into application and infrastructure execution. Managing one without the other gives you incomplete data; you need both to troubleshoot application issues quickly and efficiently.
This webinar will feature a live demo of Sumo Logic’s Unified Logs and Metrics machine data analytics platform and show how to:
Natively ingest your logs, host metrics, AWS metrics and Graphite-compatible metrics
Proactively set alerts based on logs and metrics thresholds
Analyze and correlate logs and metrics in real-time and in a unified way to reduce mean time to problem resolution (MTTR)
Scaling Your Tools for Your Modern ApplicationSumo Logic
In this presentation, we discuss Hootsuite - a customer of Sumo Logic and the leading provider of social media management services for enterprises - and their journey off of open source tools to Sumo Logic, including:
- The challenges in running & managing solutions like ELK and Graphite
- Sumo Logic unified logs and metrics monitoring solution and its advanced analytics, dashboarding and troubleshooting capabilities
- How Hootsuite was able to leverage Sumo Logic to deliver excellent user experience to their end customers
E-Invoicing Implementation: A Step-by-Step Guide for Saudi Arabian CompaniesQuickdice ERP
Explore the seamless transition to e-invoicing with this comprehensive guide tailored for Saudi Arabian businesses. Navigate the process effortlessly with step-by-step instructions designed to streamline implementation and enhance efficiency.
DDS Security Version 1.2 was adopted in 2024. This revision strengthens support for long-running systems, adding new cryptographic algorithms, certificate revocation, and hardening against DoS attacks.
UI5con 2024 - Keynote: Latest News about UI5 and its EcosystemPeter Muessig
Learn about the latest innovations in and around OpenUI5/SAPUI5: UI5 Tooling, UI5 linter, UI5 Web Components, Web Components Integration, UI5 2.x, UI5 GenAI.
Recording:
https://www.youtube.com/live/MSdGLG2zLy8?si=INxBHTqkwHhxV5Ta&t=0
Introducing Crescat - Event Management Software for Venues, Festivals and Eve...Crescat
Crescat is industry-trusted event management software, built by event professionals for event professionals. Founded in 2017, we have three key products tailored for the live event industry.
Crescat Event for concert promoters and event agencies. Crescat Venue for music venues, conference centers, wedding venues, concert halls and more. And Crescat Festival for festivals, conferences and complex events.
With a wide range of popular features such as event scheduling, shift management, volunteer and crew coordination, artist booking and much more, Crescat is designed for customisation and ease-of-use.
Over 125,000 events have been planned in Crescat and with hundreds of customers of all shapes and sizes, from boutique event agencies through to international concert promoters, Crescat is rigged for success. What's more, we highly value feedback from our users and we are constantly improving our software with updates, new features and improvements.
If you plan events, run a venue or produce festivals and you're looking for ways to make your life easier, then we have a solution for you. Try our software for free or schedule a no-obligation demo with one of our product specialists today at crescat.io
E-commerce Development Services- Hornet DynamicsHornet Dynamics
For any business hoping to succeed in the digital age, having a strong online presence is crucial. We offer Ecommerce Development Services that are customized according to your business requirements and client preferences, enabling you to create a dynamic, safe, and user-friendly online store.
Neo4j - Product Vision and Knowledge Graphs - GraphSummit ParisNeo4j
Dr. Jesús Barrasa, Head of Solutions Architecture for EMEA, Neo4j
Discover the latest innovations from Neo4j, including the latest cloud integrations and product improvements that make Neo4j an essential choice for developers building applications with interconnected data and generative AI.
Do you want Software for your Business? Visit Deuglo
Deuglo has top Software Developers in India. They are experts in software development and help design and create custom Software solutions.
Deuglo follows a seven-step method for delivering its services to customers, called the software development life cycle process (SDLC).
Requirement — collecting the requirements is the first phase in the SDLC process.
Feasibility Study — once requirements are gathered, the team evaluates whether the project is viable before moving on to design.
Design — in this phase, they start designing the software.
Coding — once the design is complete, the developers start coding the software.
Testing — when the coding of the software is done, the testing team starts testing.
Installation — after testing is complete, the application is deployed to the live server and launched.
Maintenance — after launch, the software is supported and updated while customers use it.
Need for Speed: Removing speed bumps from your Symfony projects ⚡️Łukasz Chruściel
No one wants their application to drag like a car stuck in the slow lane! Yet it’s all too common to encounter bumpy, pothole-filled solutions that slow the speed of any application. Symfony apps are not an exception.
In this talk, I will take you for a spin around the performance racetrack. We’ll explore common pitfalls - those hidden potholes on your application that can cause unexpected slowdowns. Learn how to spot these performance bumps early, and more importantly, how to navigate around them to keep your application running at top speed.
We will focus in particular on tuning your engine at the application level, making the right adjustments to ensure that your system responds like a well-oiled, high-performance race car.
Flutter is a popular open source, cross-platform framework developed by Google. In this webinar we'll explore Flutter and its architecture, delve into the Flutter Embedder and Flutter’s Dart language, discover how to leverage Flutter for embedded device development, learn about Automotive Grade Linux (AGL) and its consortium and understand the rationale behind AGL's choice of Flutter for next-gen IVI systems. Don’t miss this opportunity to discover whether Flutter is right for your project.
Zoom is a comprehensive platform designed to connect individuals and teams efficiently. With its user-friendly interface and powerful features, Zoom has become a go-to solution for virtual communication and collaboration. It offers a range of tools, including virtual meetings, team chat, VoIP phone systems, online whiteboards, and AI companions, to streamline workflows and enhance productivity.
Using Query Store in Azure PostgreSQL to Understand Query PerformanceGrant Fritchey
Microsoft has added an excellent new extension in PostgreSQL on their Azure Platform. This session, presented at Posette 2024, covers what Query Store is and the types of information you can get out of it.
Microservice Teams - How the cloud changes the way we workSven Peters
A lot of technical challenges and complexity come with building a cloud-native and distributed architecture. The way we develop backend software has fundamentally changed in the last ten years. Managing a microservices architecture demands a lot of us to ensure observability and operational resiliency. But did you also change the way you run your development teams?
Sven will talk about Atlassian’s journey from a monolith to a multi-tenanted architecture and how it affected the way the engineering teams work. You will learn how we shifted to service ownership, moved to more autonomous teams (and its challenges), and established platform and enablement teams.
8 Best Automated Android App Testing Tool and Framework in 2024.pdfkalichargn70th171
Regarding mobile operating systems, two major players dominate our thoughts: Android and iPhone. With Android leading the market, software development companies are focused on delivering apps compatible with this OS. Ensuring an app's functionality across various Android devices, OS versions, and hardware specifications is critical, making Android app testing essential.
Hand Rolled Applicative User ValidationCode KataPhilip Schwarz
Could you use a simple piece of Scala validation code (granted, a very simplistic one too!) that you can rewrite, now and again, to refresh your basic understanding of Applicative operators <*>, <*, *>?
The goal is not to write perfect code showcasing validation, but rather, to provide a small, rough-and ready exercise to reinforce your muscle-memory.
Despite its grandiose-sounding title, this deck consists of just three slides showing the Scala 3 code to be rewritten whenever the details of the operators begin to fade away.
The code is my rough and ready translation of a Haskell user-validation program found in a book called Finding Success (and Failure) in Haskell - Fall in love with applicative functors.
Takashi Kobayashi and Hironori Washizaki, "SWEBOK Guide and Future of SE Education," First International Symposium on the Future of Software Engineering (FUSE), June 3-6, 2024, Okinawa, Japan
A Study of Variable-Role-based Feature Enrichment in Neural Models of CodeAftab Hussain
Understanding variable roles in code has been found to be helpful by students
in learning programming -- could variable roles help deep neural models in
performing coding tasks? We do an exploratory study.
- These are slides of the talk given at InteNSE'23: The 1st International Workshop on Interpretability and Robustness in Neural Software Engineering, co-located with the 45th International Conference on Software Engineering, ICSE 2023, Melbourne Australia
What is Augmented Reality Image Trackingpavan998932
Augmented Reality (AR) Image Tracking is a technology that enables AR applications to recognize and track images in the real world, overlaying digital content onto them. This enhances the user's interaction with their environment by providing additional information and interactive elements directly tied to physical images.
1. Sumo Logic Confidential
Data Collection
June 2016
How-To Webinar
Welcome. To give everyone a chance to successfully connect, we’ll start at 10:05 AM Pacific.
2.
At the completion of this webinar, you will be able to…
• Design a Sumo Logic deployment that fits your organization
• Install Collectors
• Create your Data Sources
• Understand Local File Configuration Management
5.
Enterprise Logs are Everywhere
Custom App Code, Server / OS, Virtual, Databases, Network, Open Source, Middleware, Content Delivery, IaaS, PaaS, SaaS, Security
6.
Designing Your Deployment
• Sumo Logic Data Collection is infinitely flexible.
• Design a Sumo Logic deployment that's right for your organization.
• Installed versus Hosted Collectors.
7.
Collectors and Sources
Diagram: Hosts A and B each send Apache Access and Apache Error logs through Collectors A and B; Host C sends IIS Logs and IIS W3C Logs through Collector C.
9.
Collector and Deployment Options
Diagram: Hosted Collectors provide Cloud Data Collection; Installed Collectors provide Centralized Data Collection and Local Data Collection.
10.
Cloud Data Collection
Most data is generated in the Cloud and by Cloud Services and is collected via Sumo Logic’s Cloud Integrations.
Source Types
• S3 Bucket – any data written to S3 buckets via AWS, Lambda scripts, custom apps
• HTTPS – Akamai, log appender libraries, etc.
• Google – Google API
Typical Scenarios
AWS-only customers. While it’s possible to rely on Cloud Data Collection entirely, this is not typical; these source types are normally just part of the overall collection strategy.
Benefits/Drawbacks
+ No software installation
- S3 latency issues
- HTTPS POST caching need
11.
Local Data Collection
The Sumo Logic Collector is installed on all target Hosts and, where possible, sends log data produced on those target Hosts directly to the Sumo Logic Backend via an HTTPS connection.
Source Types
• Local Files – Operating Systems, Middleware, Custom Apps, etc.
• Windows Events – local Windows Events
• Docker – logs and stats
• Syslog (dedicated Collector) – Network Devices, Snare, etc.
• Script (dedicated Collector) – Cloud APIs, database content, binary data
Typical Scenarios
Customers with large numbers of (similar) servers, using orchestration/automation, mostly OS and application logs:
- On-Premise Datacenters
- Cloud Instances
Benefits/Drawbacks
+ No hardware requirement
+ Automation (Chef/Puppet/scripting)
- Outbound internet access required
- Resource usage on target
13.
Centralized Data Collection
The Sumo Logic Collector is installed on a set of dedicated machines. These collect log data from the target Hosts via various remote mechanisms and forward the data to the Sumo Logic Backend. This can be accomplished either by using the Sumo Logic syslog source type or by running syslog servers (syslog-ng, rsyslog) that write to file, with the Collector reading from there.
Source Types
• Syslog – Operating Systems, Middleware, Custom Applications, etc.
• Windows Events – remote Windows Events
• Script – Cloud APIs, database content, binary data
Typical Scenarios
Customers with mostly Windows environments or existing logging infrastructure (syslog/logstash):
- On-Premise Datacenters
Benefits/Drawbacks
+ No outbound internet access
+ Leverage existing logging infrastructure
- Scale
- Dedicated hardware
- Complexity (failover, syslog rules)
15.
Deployment Options Summary

Collector | Benefits | Drawbacks
Local | Direct access to source logs; ease of troubleshooting; no additional HW requirements | More complex management; resource usage on target host; need for outbound internet access
Centralized | Fewer collectors and sources; simplified management; target hosts don’t need outbound internet access | Need for dedicated hardware; more complex setup (users, permissions); harder to troubleshoot; requires careful planning in order to scale
Hosted | Agentless; build it into your infrastructure (S3); direct HTTP POST | Requires local script to POST or curl messages

Resources:
Design Your Deployment
Best Practices: Local and Centralized Data Collection
17.
Collectors and Sources
Diagram: Hosts A and B each send Apache Access and Apache Error logs through Collectors A and B; Host C sends IIS Logs and IIS W3C Logs through Collector C.
18.
Defining a Source
A single Collector can have multiple Sources.
Key fields to define when configuring any Source type:
• Name
• Description
• Historical Data
• Source Host
• Source Category
• File path
– Excluding syslog
• Timestamp Parsing
19.
Source Specific: Remote File
Required for remote collection:
• Listening port
• Remote login credentials
– Username and password
– Local SSH
• Absolute file path
20.
Source Specific: Syslog
Required for Syslog collection:
• Protocol
• Listening port
21.
Source Specific: Windows Event Collection
Required for Windows Event Collection:
• Remote specific:
– Remote host name(s)
– Windows Domain
– Username / password
• Windows Event Type
22.
Source Specific: Windows Performance Collection
Required for Windows Performance Collection:
• Remote specific:
– Remote host name(s)
– Windows Domain
– Username / password
• Frequency
• Perfmon Queries
23.
Source Specific: Script
Required for script-based collection:
• Execution frequency
• Command type
• Path to script
• Script to execute
• Working directory
24.
Source Specific: HTTP
Required for HTTP Source:
• How to treat incoming POST requests
After configuration:
• Use the URL to send POST messages to the Collector
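Once an HTTP Source is configured, any process that can make an HTTPS POST can ship logs to it. A minimal Python sketch follows; the endpoint URL is a placeholder (use the unique URL Sumo Logic generates for your Source), and the newline-delimited body is simply how multiple messages are typically batched into one request:

```python
import urllib.request

def build_payload(messages):
    """Join individual log messages into one newline-delimited POST body."""
    return "\n".join(messages).encode("utf-8")

def post_logs(endpoint_url, messages):
    """POST a batch of log lines to an HTTP Source endpoint; return HTTP status."""
    req = urllib.request.Request(
        endpoint_url,
        data=build_payload(messages),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status

# Example (requires the real URL generated for your Source):
# post_logs("https://collectors.example.com/receiver/v1/http/UNIQUE_TOKEN",
#           ["2016-06-01 10:05:00 INFO app started"])
```

The same thing can be done from a cron job with curl, which is the "local script" pattern mentioned for Host C elsewhere in this deck.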
25.
Source Specific: Amazon S3 and AWS Sources
Required for Amazon S3:
• IAM
– Key ID
– Security Key
• Bucket name
• Path expression
• Scan interval
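The path expression behaves like a shell-style wildcard over S3 object keys. A rough illustration of that matching behavior (the bucket layout here is invented, and Python's `fnmatch` only approximates the real matcher):

```python
from fnmatch import fnmatchcase

# Hypothetical object keys in an S3 bucket, invented for illustration.
keys = [
    "logs/2016/06/01/access.log",
    "logs/2016/06/01/error.log",
    "backups/2016/06/01/db.dump",
]

def matching_keys(path_expression, keys):
    """Return the object keys a wildcard path expression would select."""
    return [k for k in keys if fnmatchcase(k, path_expression)]

# A wildcard captures every matching log object regardless of prefix depth;
# an exact key picks up only that one object.
print(matching_keys("logs/*.log", keys))
print(matching_keys("logs/2016/06/01/access.log", keys))
```

This is why a wildcard expression is the usual choice when new date-stamped objects keep appearing under the same prefix.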
26.
Configuration: Filtering Source Data
• Regular expressions are used to create rules to filter data sent from a Source.
• The filters affect only data sent to Sumo Logic; logs on your end remain intact.
• Filter Types
– Exclude Filter (Black List)
– Include Filter (White List)
– Hash Filter (e.g. replace a credit card number with a unique randomly generated code)
– Mask Filter (e.g. mask each character with #)
– Note
• Exclude filters override all other filter types for a specific value
• Mask and hash filters are applied after exclusion and inclusion filters
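The ordering rules above can be sketched as a toy model in Python. This is not Sumo Logic's implementation, and the hash here is a deterministic stand-in for the randomly generated code the real hash filter uses:

```python
import hashlib
import re

def apply_filters(message, exclude=None, include=None, mask=None, hash_=None):
    """Toy model of Source processing rules.

    Returns None if the message is dropped, otherwise the (possibly
    masked/hashed) message that would be sent to Sumo Logic.
    """
    # Exclude filters override all other filter types: matching messages drop.
    if exclude and re.search(exclude, message):
        return None
    # Include filters act as a white list: non-matching messages drop.
    if include and not re.search(include, message):
        return None
    # Mask filters replace each matched value with '#' characters.
    if mask:
        message = re.sub(mask, lambda m: "#" * len(m.group()), message)
    # Hash filters replace each matched value with a code (deterministic here;
    # Sumo Logic generates a unique random code).
    if hash_:
        message = re.sub(
            hash_,
            lambda m: hashlib.sha1(m.group().encode()).hexdigest()[:10],
            message,
        )
    return message

print(apply_filters("DEBUG heartbeat", exclude=r"DEBUG"))  # None: dropped
print(apply_filters("payment card=4111111111111111",
                    include=r"payment", mask=r"\d{16}"))
```

Note that mask and hash run only on messages that survived the exclude/include stage, mirroring the rule stated on the slide.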
27.
Configuration: Filtering Files (Blacklisting)
• Blacklist files or sets of files that shouldn’t be ingested
29.
Metadata Fields
Tags added to your messages when data is collected:

Name | Description
_collector | Name of the Collector this data came from
_source | Name of the Source this data came through
_sourceHost | Hostname of the server this data came from
_sourceName | Name of the log file (including path)
_sourceCategory | Category designation of source data
30.
Metadata Field Usage
Diagram: on Hosts A and B, Apache Access is tagged _sourceCategory = WS/Apache/Access and Apache Error is tagged _sourceCategory = WS/Apache/Error (via Collectors A and B); on Host C, IIS Logs are tagged _sourceCategory = WS/IIS and IIS W3C Logs are tagged _sourceCategory = WS/IIS/W3C (via Collector C).
Sample searches for _sourceCategory:
= WS/Apache/Access
= WS/Apache/*
= WS/*
31.
Source Category Best Practices
• Recommended nomenclature for Source Categories
Component1/Component2/Component3…
• From least descriptive to most descriptive
Networking/Firewall/Cisco/FWSM
Networking/Firewall/Cisco/ASA
Networking/Firewall/PAN/PA7050
Networking/Router/Cisco/2821
• Note: Not all types of logs need to have the same number of levels.
• Benefits
– Simple search scoping by using wild cards anywhere in the string
– Simple, intuitive and self-maintaining partitions/index
– Simple and self-maintaining RBAC rules
• Blog Post: Good SourceCategory, Bad SourceCategory
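The search-scoping benefit of a hierarchical nomenclature can be illustrated with shell-style wildcards. The categories below mirror the slide's examples; the matching only approximates how a _sourceCategory scope behaves in an actual search:

```python
from fnmatch import fnmatchcase

categories = [
    "WS/Apache/Access",
    "WS/Apache/Error",
    "WS/IIS",
    "WS/IIS/W3C",
    "Networking/Firewall/Cisco/ASA",
]

def scope(pattern, categories):
    """Return the _sourceCategory values a wildcard search scope selects."""
    return [c for c in categories if fnmatchcase(c, pattern)]

print(scope("WS/Apache/*", categories))  # just the Apache categories
print(scope("WS/*", categories))         # every web-server category
```

Because each component narrows from least to most descriptive, one wildcard anywhere in the string carves out exactly the slice of hosts and log types you want, with no list of sources to maintain.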
33.
Automating Deployments
• Silent installation
– Use sumo.conf
– Provide name, credentials and source file parameter for initial setup only
• Local Configuration Collector Management
– Manage configuration locally using a JSON file with Chef/Puppet
– Available for both new and existing Collectors
• Collector Management API
– Define an initial Source configuration for your Collectors using a JSON file
– Retrieve and update Collector configuration from an HTTP endpoint
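Retrieving Collector configuration from the management API can be sketched with Python's standard library. The `/api/v1/collectors` path and the basic-auth scheme using an access ID/key pair follow Sumo Logic's documented Collector Management API, but treat the endpoint and response shape as assumptions to verify against the current docs:

```python
import base64
import json
import urllib.request

def auth_header(access_id, access_key):
    """Build the HTTP Basic Authorization header from an access ID/key pair."""
    token = base64.b64encode(f"{access_id}:{access_key}".encode()).decode()
    return {"Authorization": f"Basic {token}"}

def list_collectors(base_url, access_id, access_key):
    """GET the list of Collectors from the management API (assumed endpoint)."""
    req = urllib.request.Request(
        f"{base_url}/api/v1/collectors",
        headers=auth_header(access_id, access_key),
    )
    with urllib.request.urlopen(req) as resp:
        # Assumed response shape: {"collectors": [...]}
        return json.load(resp)["collectors"]

# Example (requires real credentials):
# list_collectors("https://api.sumologic.com", "<accessId>", "<accessKey>")
```

Using an access ID/key pair here instead of a username/password matches the Access Keys tip later in the deck, and is especially relevant when the credentials end up on disk in automation scripts.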
34.
Installed Collector Deployment Tips
• Install using Collector Guidelines/Requirements
• Access Keys
– Used for Collector registration and API
– ID/Key pair instead of user/pass
– Especially important when storing credentials on disk
• Collector Logs
– Logs in: $SUMO_HOME/logs
– Current log: $SUMO_HOME/logs/collector.log
– Check for out-of-memory errors
– Increase memory if needed, as described in the Support Site post
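Checking the collector log for memory trouble can be scripted. The log path follows the slide; the "OutOfMemory" marker is the usual JVM error string, but verify it against what your collector actually writes:

```python
import os

def find_oom_lines(log_path):
    """Return log lines that mention JVM out-of-memory errors."""
    hits = []
    with open(log_path, encoding="utf-8", errors="replace") as f:
        for line in f:
            if "OutOfMemory" in line:
                hits.append(line.rstrip("\n"))
    return hits

# Example:
# find_oom_lines(os.path.join(os.environ["SUMO_HOME"], "logs", "collector.log"))
```

If this turns up hits, that is the cue to increase the collector's memory allocation per the support article referenced above.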
35.
Questions?
Additional Resources
• Search Video Library and Documentation
• Search/Post to Community Forums
– Search, post, respond
– Submit/vote for feature requests
– Submit Tips & Tricks
• Open a Support Case
• Sumo Logic Services
– Customer Success, Professional Services, Training
Welcome everyone. My name is….
I’m joined by Maisie and Ryan, who are part of our Engineering team working on Data Collection. They will be fielding questions during the Q&A session at the end of this webinar.
Housekeeping items:
Everyone is on mute to avoid distractions
If you want to ask a question, please do so using the GTW question panel
This webinar will be recorded and shared with all of you, along with the slides
Please note that this webinar is specifically for users with Admin privileges who have access to install and manage Collectors.
At the completion of this webinar, you will be able to…
Sumo Logic Data Flow is broken into 3 main areas:
Data Collection through configurable Collectors and Sources. Collectors collect, compress, cache and encrypt the data for secure transfer.
Search and Analyze – Users can run searches and correlate events in real-time across the entire application stack. We will be spending most of our time in this area during this webinar, as this is most likely what you will first be doing as a new user.
Visualize and Monitor- Users have the ability to create custom dashboards to help you easily monitor your data in real-time. Custom alerts notify you when specific events are identified across your stack.
I will cover Data Collection at a high-level, and cover the next 2 areas through a demo.
What data can we ingest?
We can ingest data from just about any source you can imagine - structured or unstructured. Here are just a few of the devices, applications and frameworks you may be using - all of which produce log data that Sumo Logic can analyze.
The left-hand side represents your technology stack – from custom application code all the way down to your network devices.
Sumo Logic Installed and Hosted Collectors are infinitely flexible.
Design a Sumo Logic deployment that's right for your organization.
<Review slide citing some examples>
At a High-level, Customers collect and send data to Sumo Logic through the use of Collectors and Sources. We’ll cover collectors first and then dive into Sources.
This is a great example of what we see at a typical customer. This customer is sending web server log files to the Sumo Logic service.
Host A and Host B are each sending a couple of log files through a locally installed Sumo Logic collector.
In the case of Host C, which is sending IIS log files, it’s using a hosted collector where a local script can send data to an HTTP endpoint (running curl and POST commands).
Hosted Collectors
Allow for seamless collection from Amazon S3 buckets and HTTP Sources.
Hosted Collectors don't require installation or activation, nor do Hosted Collectors have physical requirements, since they're hosted in AWS.
Because there are no performance issues to consider, you can configure as many S3 and HTTP Sources as you'd like for a single Hosted Collector.
Installed Collectors
Sumo Logic Installed Collectors are lightweight and efficient. You can choose to install a small number of Collectors to minimize maintenance or just because you want to keep your topology simple (Centralized).
Alternatively, you can choose to install many Collectors on many machines (Local) to distribute the bandwidth impact across your network.
Installed Collectors are deployed in your environment, either on a local machine, a machine in your organization, or even an Amazon Machine Image (AMI).
Installed Collectors require a software download and installation. Upgrades to Collector software are released regularly.
A few things to consider:
Consider having an Installed Collector on a dedicated machine if:
You are running a very high-bandwidth network with high logging levels.
You want a central collection point for many Sources.
Consider having more than one Installed Collector if:
You expect the combined number of files coming into one Collector to exceed 500.
Your hardware has memory or CPU limitations.
You expect combined logging traffic for one Collector to be higher than 15,000 events per second.
Your network clusters or regions are geographically separated.
You prefer to install many Collectors, for example, one per machine to collect local files.
IMPORTANT: For system requirement details, see Installed Collector Requirements.
The Sumo Logic Collector is installed on all target Hosts and, where possible, sends log data produced on those target Hosts directly to the Sumo Logic Backend via an HTTPS connection.
The Sumo Logic Collector is installed on a set of dedicated machines. These collect log data from the target Hosts via various remote mechanisms and forward the data to the Sumo Logic Backend. This can be accomplished either by using the Sumo Logic syslog source type or by running syslog servers (syslog-ng, rsyslog) that write to file, with the Collector reading from there.
In most cases our customers will employ a mix of the above options to account for different limitations on both the log types and source types. For example, network devices only broadcast syslog, so even if you generally employ a local file collection paradigm, you still need some syslog infrastructure to collect these logs. The same is true for any Cloud API logs (e.g. Okta events) you may want to collect via script.
Another example: an AWS-only customer will most likely still choose to install Collectors on all their EC2 instances (local collection) and collect AWS audit logs (CloudTrail, ELB, etc.) via the S3 integration.
Which strategy you choose does not depend primarily on where the data lives, but on the following:
- sensitivity in terms of outbound internet access,
- technical abilities in your team (setting up centralized infrastructure requires knowledge, hardware and a need for monitoring/scaling/fault tolerance),
- whether or not there is a logging infrastructure already in place.
At a high-level, we only recommend the Centralize method if the following are true:
- You absolutely cannot live with the internet access requirements
- You have an existing infrastructure (syslog/logstash)
- Your data volume or your number of target hosts is pretty large
As before, Host A and Host B are each sending a couple of log files through a locally installed Sumo Logic collector, while Host C, which is sending IIS log files, uses a hosted collector where a local script can send data to an HTTP endpoint (running curl and POST commands). Hosted collectors are also able to load data from AWS S3 buckets.
Name: something that is relevant to the data you are collecting
Description: a reference to understand the Source
Source Category: a custom label that you can easily use to search data gathered by this Source
Timestamp
Host
File Path or Source (Name)
Source-specific config:
Local/Remote Path
Script / File / Windows
Hosted:
S3: the path expression allows you to identify which objects to collect from S3. You can use a wildcard in the path expression to capture more files; an exact file name will only pick up the file that matches it.
Great, data is ingested into the Sumo Logic service, but something else is also happening in the background.
Every single message ingested gets tagged with metadata that makes it much easier to search for related messages.
This table shows the 5 main tags (review them all)
In particular, I want to point out the source Category metadata field, as choosing the right naming convention can make a big impact on your searching capabilities and performance.
This example will highlight the importance of defining the proper source category:
Notice I’ve added the desired SourceCategory for each Source:
= WS/Apache/Access
Searches across Apache Access logs in both Host A and Host B
= WS/Apache/*
Searches across all Apache sources in both Host A and Host B
= WS/*
Searches across all Web Servers across all hosts