We wanted to give you access to all the data coming from your VMware environment, including performance data, logs and a host of other data. The VMware environment can be very much like a black box, and visibility is key. Something unique to our VMware app is that it provides access to performance and log data all in one place. Splunk can maintain data granularity and persist data as far back as desired while still providing very good retrieval performance. This is something other solutions are not capable of doing. Finally, while the VMware environment provides a lot of data ranging from performance to logs, it can be affected by other technologies in the infrastructure. With Splunk you can correlate any data to get an end-to-end view of your environment. Again, this is a unique aspect of using Splunk.
The VMware app 2.0 has a completely new workflow, and while the data in the background is the same as in previous versions, the ability to troubleshoot has been greatly improved. The app now lets the user proactively monitor what is going wrong in their environment through health dashboards and use the workflow to drill down to the root cause of the problem. The improved workflow visualizes the key areas of the environment that are in trouble and allows for quicker investigation of issues.
For deeper analysis you can use the performance views to dig through the granular 20-second metrics from VC. VC generally expires the 20-second metrics after 2 hours, but we can persist them for as long as desired. This allows for more accurate troubleshooting and analysis of performance issues in your environment. Furthermore, you can not only identify performance issues but also delve directly into the ESX and VC logs to find events of note and correlate them with the performance issues in order to narrow down the cause.
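To put the 20-second granularity in perspective, the sample counts can be worked out in a few lines of Python. This is only back-of-the-envelope arithmetic: the function name is made up for illustration, and the two-hour window is the figure quoted above.

```python
SAMPLE_INTERVAL_S = 20  # granularity of the metrics, as quoted in the talk

def samples_per_metric(window_seconds, interval_seconds=SAMPLE_INTERVAL_S):
    """Number of data points a single metric produces over a time window."""
    return window_seconds // interval_seconds

two_hours = samples_per_metric(2 * 60 * 60)   # the window VC generally keeps
one_day = samples_per_metric(24 * 60 * 60)    # what persisting a full day means
print(two_hours, one_day)
```

Each metric on each entity therefore yields a few hundred points in the window VC keeps, and several thousand per day once the data is persisted.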
The VMware solution has three components. On the left is your VMware environment and on the right is your Splunk indexer/search head, drawn as one unit for simplicity. The three components are the VMware app, which has all the dashboards and visualizations; the Splunk_TA_vcenter, which sits on a Splunk forwarder on a Windows vCenter server and is responsible for identifying all the VC logs that need to be forwarded to the indexer; and finally the FA VM, the Forwarder Appliance Virtual Machine. This is a machine running Splunk and a component called Splunk_TA_vmware, which is responsible for making API calls to the ESX hosts and VC to collect performance, ESX log, hierarchy, inventory, task and event data. The yellow lines indicate this data. The ratio of FA VMs to hosts is around 1:30, but we will cover that later.
This is the actual FA VM. We ship it as a single .ova file (open virtual appliance) downloadable from Splunkbase. It is essentially a machine running CentOS 5.7 with a Splunk forwarder and a component called Splunk_TA_vmware that contains Perl modules that connect to VMware's Perl SDK and make all the API calls. The reason for providing a data collection VM is to make it easier for you, the user, to get up and running faster. You do have to install the VMware Perl SDK yourself. We provide root access to the OS so you can keep it upgraded as you like. We also support building your own data collection appliance on Red Hat. The docs cover what is supported and what is not.
We support vSphere 4.1 and above. It is highly recommended that you install vCenter on VMware's recommended hardware. It is likely you have already done this, but I stress it because we make a number of API calls to the VC, and the impact on the VC is minimal when it is installed on recommended hardware. If you want VC logs, we only support vCenter on Windows, not the Linux appliance. Having the Linux appliance, however, does not prevent you from getting all the other data, such as performance data, ESX logs, tasks and events. All of this data comes from the API, so it does not matter which VC you are running.
Ensure Splunk is installed on reference hardware, because the app is search heavy and is constantly performing data summarization. The app makes heavy use of summary indexing and cannot work without it. Enterprise licenses include summary indexing, but some other licenses, such as Education licenses, may not.
The first step, before the main installation steps, is to create a limited-permission service account on vCenter. This has to be done manually on a Windows machine.
On Splunkbase:
- Splunk App for VMware
- Splunk Technology Add-on for VMware vCenter
- Splunk Forwarder Virtual Appliance for VMware
The first two steps are very simple and involve unzipping apps onto the indexer/search head or the forwarder on the vCenter. I will focus most of my time today on the third step, which relates to the configuration of the FA VM.
The FA VM deployment and configuration can be broken up into three major best-practice areas. The first is to deploy and resource it correctly, the second is to create service accounts on the ESX hosts using the tools we provide, and the last is to get your engine running by configuring data collection.
The FA VM needs network access so it can send data to and receive data from your vCenter servers and ESX/i hosts, and send data to your Splunk indexer. We recommend you install the FA VM in the same subnet as the VC and hosts. Ensure DNS is configured on the FA VM. DNS configuration allows the FA VM to connect to the hosts using the list of ESX hosts provided by the VC. Make sure the time zones on your FA VM and indexer(s) are set consistently; we recommend using NTP, which is enabled by default. Also check your ESX hosts and make sure they are all in the same time zone as each other.
Increase the resources on your FA VM while deploying. Increasing the reservation ensures this machine gets the resources it needs to crawl the hosts. Doing this will save you a lot of time later if you want to monitor more hosts, because you can just add hosts without worrying about whether the FA VM can handle the load. It also reduces your management overhead, since you will need fewer FA VMs.
You can use the scripts that are provided with the FA VM to remotely create/re-permission local limited-permission service accounts on your ESX hosts. Remember the vCenter account is created manually.
The script is called enginebuilder.py. It is important to always use this script because it helps parallelize data collection by splitting the conf files efficiently, and it also does a credentials check before it creates the files. It checks the credentials on your VC and all your ESX hosts.
This is how the enginebuilder.py script splits up the conf files. It creates 4 separate conf files, one for each major data type. The inputs.conf file then uses these configuration files as parameters to a scripted input. The conf files specify where to collect what data from. E.g. perf data is only collected from hosts, hierarchy and inventory only from the VC, and so on.
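The idea that each data type is collected from a particular kind of entity can be sketched as a small lookup. This is a hypothetical illustration, not the real engine-<datatype>.conf format or the logic inside enginebuilder.py; the dictionary keys, function name and host names are invented, and only the two rules actually stated above are encoded.

```python
# Hypothetical sketch: which kind of entity each data type is collected from.
# Only the two rules stated in the talk are listed; other data types are
# deliberately left out because the talk does not say where they come from.
COLLECT_FROM = {
    "perf": "hosts",               # perf data is only collected from hosts
    "hierarchy_inventory": "vc",   # hierarchy and inventory only from the VC
}

def targets_for(data_type, vc, hosts):
    """Return the list of entities to query for the given data type."""
    source = COLLECT_FROM[data_type]
    return [vc] if source == "vc" else list(hosts)

vc = "vc01.example.com"                          # made-up names
hosts = ["esx01.example.com", "esx02.example.com"]
for data_type in sorted(COLLECT_FROM):
    print(data_type, "->", targets_for(data_type, vc, hosts))
```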
You can use enginebuilder.py to further split up data collection if you have more than one FA VM. You just tell it how many hosts there are per FA VM and it splits the work into a set of conf files. For the FA VM where enginebuilder.py is being run, it writes the conf files as plain files. For the other FA VMs it creates tarballs for you to copy to the desired FA VM, where you use enginebuilder.py again to untar them correctly.
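The splitting step can be pictured as chunking the host list into groups of a fixed size, one group per FA VM. The sketch below is an assumption about the general idea only; it is not taken from enginebuilder.py, and the function and host names are made up.

```python
def split_hosts(hosts, hosts_per_fa):
    """Split a host list into consecutive groups, one group per FA VM."""
    if hosts_per_fa < 1:
        raise ValueError("hosts_per_fa must be at least 1")
    return [hosts[i:i + hosts_per_fa]
            for i in range(0, len(hosts), hosts_per_fa)]

hosts = ["esx%02d.example.com" % n for n in range(1, 8)]  # 7 made-up hosts
groups = split_hosts(hosts, 3)
for index, group in enumerate(groups, start=1):
    print("FA VM %d: %s" % (index, ", ".join(group)))
```

The last group may be smaller than the others, which leaves headroom on that FA VM for hosts added later.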
This is an example of what a scaled environment might look like. One FA VM is dedicated to collecting data from the VC and may collect from some hosts too; the second FA VM only collects from hosts.
SplunkLive! Washington DC May 2013 - Splunk App for VMware
Monitoring VMware Environments
Provide insight into all VMware data
Persist granular data over time for analytics
Gain visibility into other infrastructure layers
Data Collected
Data type | Data description
Performance | Performance metrics for hosts, VMs and clusters
ESX/i and VC logs | Logs from your physical hosts and virtual centers
Tasks/Events | Tasks/events performed on physical hosts and virtual centers
Hierarchy/Inventory | Hierarchy of VMware environment and inventory data about hosts and VMs
Proactive Monitoring
• Visualize state of environment based on health dashboards
• Narrow down problems to sections of environment with topology tree
• Drill down workflow to find root cause
Comprehensive Operational Analytics
• Uses 20-second performance metrics for historical trends and statistical comparisons
• Analyze ESX, VC logs and exceptions with pre-packaged topology-based filters
• Track environment and user changes using tasks and events, for security and change analyses
Platform For End-to-End Visibility
• Correlate virtualization data with other technologies in your IT stack
• Harness data from large-scale distributed VMware deployments
• Scale to handle any volume of data across technologies and data centers
VMware Prerequisites
Ensure you have supported versions of VMware components
– vCenter server 4.1, 5.0, 5.0 Update 1, 5.1
– vCenter server on Windows (for VC logs)
– ESX/i hosts 4.1, 5.0, 5.0 Update 1, 5.1 on 64-bit x86 CPUs
Install vCenter on recommended VMware reference hardware
Splunk Prerequisites
We support Splunk versions 4.3.5 and above
Ensure Splunk is installed on reference hardware
Ensure license supports summary indexing
Ensure you have adequate licensing volume
– 500 MB - 1 GB per host per day for default config
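The per-host licensing guideline turns into a quick estimate for a whole environment. The sketch below is only arithmetic on the 500 MB to 1 GB per host per day range from the slide; the function name is invented, and treating 1 GB as 1024 MB is an assumption.

```python
def daily_license_range_gb(host_count, low_mb=500, high_mb=1024):
    """Return (low, high) estimated daily indexing volume in GB.

    low_mb and high_mb are the per-host, per-day figures for the default
    config; 1 GB is taken as 1024 MB here, which is an assumption.
    """
    return (host_count * low_mb / 1024.0, host_count * high_mb / 1024.0)

low, high = daily_license_range_gb(30)
print("30 hosts: %.1f to %.1f GB/day" % (low, high))
```

For 30 hosts this comes to roughly 15 to 30 GB per day, so license volume is worth checking before adding hosts.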
Create Service Account on vCenter
vCenter Server (Windows)
Deploy and Resource the FA VM
1. Deploy and resource the FA VM
2. Create service accounts
3. Configure data collection
Deploying the FA VM with Required Access
FA VM
vCenter Server (Windows)
ESX/i hosts
Network access
DNS configured
Set time zones
FA VM Resources for Best Results
Out of the box
– 2 vCPUs, 250 MHz reservation
– 4 GB memory, 128 MB reservation
– Can monitor up to 20 hosts (25 VMs per host)
Increase to
– 4 vCPUs or 2 vCPUs with 2 cores, 4 GHz reservation
– Increase memory reservation to 2 GB
– Can monitor up to 30 hosts (25 VMs per host)
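The hosts-per-FA-VM figures above translate directly into a count of FA VMs for a given environment. This is a rough sizing sketch under the slide's numbers (20 hosts per FA VM out of the box, 30 with increased resources, at about 25 VMs per host); the function name is made up, and real sizing should follow the product docs.

```python
def fa_vms_needed(host_count, hosts_per_fa=30):
    """Smallest number of FA VMs that covers host_count hosts."""
    if hosts_per_fa < 1:
        raise ValueError("hosts_per_fa must be at least 1")
    # Integer ceiling division: any remainder needs one more FA VM.
    return (host_count + hosts_per_fa - 1) // hosts_per_fa

for per_fa in (20, 30):  # out-of-the-box vs. increased resources
    print("100 hosts at %d per FA VM: %d FA VMs"
          % (per_fa, fa_vms_needed(100, per_fa)))
```

With 100 hosts, resourcing each FA VM for 30 hosts instead of 20 reduces the number of appliances to manage from 5 to 4.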
Create Service Accounts
1. Deploy and resource the FA VM
2. Create service accounts
3. Configure data collection
Create Service Accounts
FA VM
ESX/i host 1
ESX/i host 2
vCenter Server (Windows)
Create manually
Use Script to Create Service Accounts
Using the script minimizes errors and saves time
Create accounts on all hosts in a VC
– Expects the same admin account for all hosts
– Creates the same account on all hosts
Create account on one host, managed or unmanaged
Permission existing accounts on hosts
– Can be AD accounts
Configure Data Collection
1. Deploy and resource the FA VM
2. Create service accounts
3. Configure data collection
Use Script to Create Engine Files for Data Collection
Using the script minimizes errors and saves time
Script creates .conf files that are used by inputs.conf
Creates engine-<datatype>.conf files
– Specify what data to collect and what entity to collect from
– Splits the .conf files for parallelized data collection