Managing an EPM platform is not for the faint of heart – and going at it without a plan can leave you frustrated, nervous, and accountable if trouble strikes. But how do you prepare?
This presentation helps you get all of your EPM planning in one place with an EPM Punch List. We’ll talk through all the areas you should be concerned about to keep your Hyperion and Oracle EPM applications running smoothly, and give you solid, actionable strategies so that you are prepared for the worst.
2. www.datavail.com 2
Presenter
Chuck Czajkowski, Solutions Architect
Over 20 years of IT and networking experience
On the support team at Hyperion Solutions in the early 2000s, working as an Environmental Support Lead handling complex issues
Product Issues Manager for the Common Technology Group at Oracle after the acquisition
Worked closely with many dev teams (HFM, Planning, Essbase, FDM, Workspace, Shared Services, WebLogic) to facilitate worldwide support training for new releases and patches (PSEs/PSUs), and to ensure overall stability and functionality
Joined the Accelatis team at the beginning of 2014, supporting all customers running the platform
At Datavail, works with the Sales and Marketing team to identify and create solutions for new and existing customers
Presenter
Dave Shay, Practice Director – Technical Middleware, Oracle Applications
Director at Datavail, overseeing the Technical Middleware and Oracle Applications Practice.
15+ years of Oracle EPM / Hyperion product experience with leading companies like IBM and H&R Block.
22+ years of IT industry experience with complete project development and ongoing production support cycles.
Specialties include automation, utility/script development, product installation, performance tuning, and troubleshooting a variety of Hyperion modules.
1. Experience
Industry Specializations
• Professional Services
• Wholesale Distribution
• Oracle Financial Services
• Oil and Gas
• Industrial Manufacturing
• Oracle Accelerate for Midsize Companies
• Natural Resources
• Aerospace and Defense
• Chemicals
• Financial Services
• Life Science
• Travel & Transportation
Product Specializations
• Enterprise Performance Reporting Cloud Service (EPRCS)
• Financial Close & Consolidation Cloud Service (FCCS)
• Oracle Essbase
• Planning Budgeting Cloud Service (E/PBCS)
• Oracle Hyperion Data Relationship Management
• Oracle Hyperion Financial Management
• Oracle Hyperion Planning
10+ years
of delivering
100’s of
consultants
On-Prem &
Cloud Experience
2. Commitment
To The Hyperion & EPM Market
Datavail has invested millions into developing best-in-class services for the Hyperion/EPM market
Acquired the only software platform specifically developed to manage EPM & Hyperion environments
To You
We have put significant time and energy into making sure what you see today matters to YOU
We will invest in the relationship and embrace behavior that exhibits ‘Covenant over Contract’
3. Three Keys Under One Roof
Professional Services
Top quality, experienced consultants with years of experience
Best Practices leveraged by Accelerators
IP – Accelatis Software Platform
The only commercially available software product designed specifically to improve performance, reduce labor, and deliver RCA – developed for EPM & Hyperion
Managed Services
Capability that goes way beyond just a block of hours
24x7 coverage, best in class ITSM platform, ticket workflow, service management, security posture, a multi-tiered onshore/offshore delivery model
Today’s Agenda
EPM Environments – A Team Effort
Gathering Initial Information & Data
Conceptualize the Punch List
DR Consideration
RCA and Correction
The goal of today’s seminar is to provide knowledge which can and should be used as part of an overall plan for your EPM environment.
Networking
Team specific to network communications and might be responsible for the following:
• Firewalls
• Routers
• Switches
• Wide Area Network
• Remote Access VPN
Network speed can sometimes cause user performance issues
Different Layers…
Database
DBAs, especially in larger companies, have exclusive access to the databases and have to be contacted for any questions regarding operations.
Specialized expertise
Can have completely different schedules with regard to backup and retention
Probably the most important EPM aspect
Different Layers…
Servers / Hardware
Usually includes the server itself and its associated components, from Operating System management down to the actual disks in the server or SAN.
OS patching would be handled by this team
Firmware and other system-related patching
User accounts and external authentication mechanisms often fall under this team
Many times server teams are unaware of the software nuances that come with products like EPM
Different Layers…
Application
Deals with day-to-day operation
Can have completely different needs than other layers when it comes to retention and uptime
All other layers need to be working properly in order for the Application to function well
Difficult to understand the relationship between resources and application performance
Different Layers…
Start Your Rolodex
Gather the names and emails of the responsible parties for your own EPM installation
A quick introduction and a request for information goes far
Start to create your playbook for the EPM environment so that there is a place to go when things are not working correctly
Understand your user trends and what they might do to each of the teams involved so that you know when things are broken vs. busy.
Best Practice Service Framework
• Uptime Monitoring
• Root Cause Analysis
• Audit & Compliance Tracking
• Optimization & Tuning
• End User & IT Support
• Load Testing
• Intelligent Health Checks
• Business Process Monitoring
• Functional Administration
• 24x7 Incident Response
• Performance Monitoring
• Environment Replication
• Management Dashboards
• Report Development
• Log Analysis & Management
• Regression Testing
• Backup Retention
• User Experience Monitoring
• Calculation Trending
• Enhancements
Best Practice Tasks (Examples)
Monthly Tasks
• Regression test Oracle software updates in all environments to ensure patching has not had an effect on EPM
• Review close activities and how long they took. Identify places where things could have been improved
• Assess capacity / proactive growth management
Quarterly Tasks
• Respond to auditor requests for SOX audit and/or Change Management policy adherence; provide reports required to pass audits.
• Provide additional resource support for Quarter-End activities.
• Clean up accounts or objects that have gone unused in the past 90 days
Annual Tasks
• Provide additional resource support for Year-End activities.
• Review goals set last year and assess where you are.
• Create new goals for next year and create milestones to ensure movement.
• Roll the year.
Focus Areas
• System Management
• Tracking Changes
• Monitoring
• Log Management
• Root Cause Analysis
• Help Desk / Support Process
• Performance Testing
• Implementations / Upgrades / Migrations
• Continuing Optimization
• Automation
• Security
• User Trends and Tracking
• Software Management
• Project Management
Focus Area | Task | Task Description | Frequency | Priority
Automation | Execute Dense restructure | Defragment an Essbase/Planning cube identified in above task | Weekly | Med
Continuing Opt | Adjust Essbase (BSO) data and index caches | Tune the caches for Essbase to improve performance. | Monthly | Low
Security | Apply Hyperion patches | Bring Hyperion 11.1.2.4 current with the Hyperion and Oracle Middleware patches. Oracle publishes "critical" security updates quarterly. | Quarterly | High
System Mgmt | Assess new Hyperion patches & Oracle CPU | Determine which new patches would be applicable to the environment. | Quarterly | High
Monitoring | Confirm success of overnight data integration | Review logs from the overnight FDMEE process. Remediate failed data loads. | Daily | High
Continuing Opt | Essbase fragmentation assessment | Examine Essbase clustering ratio for each cube. Determine if defragmentation is needed to improve performance. | Weekly | Med
Tracking Changes | Essbase/Planning outline compare | Run a report that affirms Essbase outlines are synchronized across environments, and shows any differences found. | Monthly | Low
Automation | Execute FDMEE maintenance scripts | The tables within FDMEE do not self-prune, and data accumulates indefinitely without intervention. | Quarterly | Low
Automation | Execute targeted LCM migration | On an as-needed basis, migrate specific application artifacts from one environment to another after SOX controls have been adhered to. Provide segregation of duties (the person who edited/created the artifact should not be the same person who migrates it). | Monthly | Low
System Mgmt | FDMEE sync across environments | Synchronize FDMEE rules, scripts, and other settings from Production down to 1 lower environment. | Monthly |
Monitoring | Health check: Essbase backups complete | Verify the overnight Essbase rolling 7 day backups complete every day. This backup process would be installed by Datavail. | Daily | Med
Monitoring | Health check: LCM backups complete | Verify the overnight Hyperion Life Cycle Management rolling 7 day backups complete every day. This backup process would be installed by Datavail. | Daily | Med
Monitoring | Health check: sufficient disk space | Produce an alert if any Hyperion server is in danger of running low on disk space. Hyperion services will crash and/or become corrupt when the disk becomes completely full. | Daily | High
Monitoring | Health check: system is online | Verify all elements of the system are available to users. No services are offline. | Daily | High
System Mgmt | HFM sync across environments | Synchronize metadata, data, reports, calc scripts, CalcMgr rules, etc., from Production down to 1 lower environment. | Monthly | High
System Mgmt | Hyperion hardware capacity assessment | Determine if the hardware is sufficient for current and projected business/technical needs. | Quarterly | High
Log Management | Monitor for Essbase crashes (.XCP) | Alert when Essbase crash dump files (ess*****.xcp) are detected. | Daily | High
Log Management | Monitor for Essbase IBH (BSO) | Check for Invalid Block Headers, which are evidence of Essbase data file corruption. | Weekly | High
Log Management | Monitor HFM event log for evidence of paging | Evaluate if HFM tuning-related settings should be modified due to how the application evolves over time. | Weekly | Med
Log Management | Monitor size of FDMEE outbox folder | The FDMEE folders do not self-prune and will accumulate files indefinitely. | Weekly | Med
Performance Testing | Monitor/trend Essbase calculation times | Review Essbase calculation runtimes and determine if any calculations are starting to trend in the wrong direction. | Monthly | Med
Project Management | Monthly business review with SDM & technical team | Client and Datavail technical team meet to review open tickets, SLA/SLO metrics, etc. | Monthly | N/A
Software Management | On-premises EPM license adherence audit | Review user security provisioning counts and compare against owned licenses, for Oracle licensing compliance. | Quarterly | Low
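Two of the daily checks in the list above, the disk space health check and the watch for Essbase crash dump (.xcp) files, are simple enough to script. What follows is only a minimal sketch in Python, not the process Datavail or Accelatis actually runs. The function names, the 10% free-space threshold, and the idea of scanning a single root folder are all assumptions made for illustration.

```python
import shutil
from pathlib import Path


def low_disk_space(path: str, min_free_fraction: float = 0.10) -> bool:
    """Return True when the volume holding `path` has less free space than the threshold.

    The 10% default is an illustrative assumption, not a recommended value.
    """
    usage = shutil.disk_usage(path)
    return usage.free / usage.total < min_free_fraction


def find_crash_dumps(root: str) -> list[Path]:
    """Return every file under `root` whose extension is .xcp (case-insensitive)."""
    return sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() == ".xcp"
    )
```

A scheduled job could call both functions once a day and send an alert whenever `low_disk_space` returns True or `find_crash_dumps` returns a non-empty list.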
DR Considerations
Routine Tests of Failover
Include Patches and Changes for DR
Performance Test
Ensure Process of Cutover is well known
DR can be just a single app – include all levels needed
Wise Words….
“Give me six hours to chop down a tree and I will spend the first four sharpening the axe.” - Abraham Lincoln
“By failing to prepare, you are preparing to fail.” - Benjamin Franklin
Know what technologies are in play
Pull in expertise as needed for diagnosis
Know where to find all the information you need
Quickly Identify the Scope
Use Deductive Reasoning
Once the trigger is identified, work on solutions that will stop it from happening in the future
30. www.datavail.com 30
RCA Methodology
User Problem
Symptom: Systemic performance; Intermittent performance; Server failing; Feature failing; Process failing
Cause: User Error/Misrepresentation; Config Change; Data Change/Population; App Change; Config Sync; Network; Failed Component; Resource Bottleneck; Software Bug
Analysis Actions: Verify; Poor Config; Config Change; Data Change; Application Change; Configs out of sync; Rule Out Network; Analyze and Rule Out; Logs; Alerts; Look for reported Errors; Database; App Server; Web Server; Benchmarks; Analyze Performance
Datavail’s Oracle Application Support Solution
Comprehensive Service
Solves for technical stack, functional, and user support needs
24x7 Service Team of 500+ Oracle professionals
Accelatis APM Platform reduces labor
Fixed-price contract with customer delight guarantee
Best Practice Service Framework With Accelatis APM Platform
[Figure: the service areas of the Best Practice Service Framework, shown together with the Accelatis APM Platform]
Benefits of Accelatis
Monitors 3,000 telemetry attributes
Reduces labor by up to 50%
Automation reduces human error
Delivers more comprehensive service
Best in class made affordable; no need to compromise
You can break down your EPM environment using the basic building blocks that most shops use for almost anything sitting on the network. It’s entirely possible that these areas are all the same to you and your company, but the majority of the time this is NOT the case. If you take the time now to know these areas, what they mean to your EPM system, and who is responsible for them, it will save you time and effort when you really need to pull these pieces together. The most urgent issues require you to have this at your fingertips in order to minimize downtime for your financial organization. Every minute you save getting folks back online is revenue for your company that has not been wasted. Even in times of smooth sailing, you’ll be at the mercy of different teams changing things, whether through patches, settings, or new policies and acquisitions. You’ll still want to know as much as you can about these areas and be able to speak to any one of them with confidence.
The first layer we can talk about is Networking. These teams are traditionally responsible for everything between the network card in the server and the network card of the client. There are many different devices and technologies in this arena, and you see a few of them listed here. Networks offer their own set of possible issues and, depending on the experience and skill of your network team, these are issues that can either be quickly identified and worked around, or fester for days without anyone noticing and could be the entire reason why your end users are suffering. Before I was introduced to Hyperion, I was one of 3 systems engineers working in a world-class datacenter in the Nasdaq complex. I cannot tell you how many hardware failures we had to deal with. Luckily, we worked in an environment where literally everything in that network was redundant: 2 different internet providers, 2 different sets of PIX firewalls and heavy-duty Catalyst switches, all the way down to redundant teamed NICs in the servers. We did it like that because failures on a single port, a cable, or even a routing module were so common that we HAD to have the ability to remove one entire set of devices so that we could fix or repair anything in the network without losing connectivity. I used to think that network equipment was so reliable, and now I know that’s not really the case!
The next area we should bring up is the Database. Just like the other areas, sometimes there is no database team and the same team manages the servers, all the DBs, and the network, and that’s OK. The technology has come a long way, and while things seem to just run once they are up, the environments, the sizes of the teams and databases, and the demands on the server itself do fluctuate and change over time. You should be keeping an eye on your database and, at the very least, KNOW the team that is supporting it. If you get into the real nuts and bolts of the technology, DBAs have a VERY specialized role to play, and when it comes to performance and I/O, they can work magic with settings, caches, and indexes. Unless you and the database team understand what each other is doing, you might be setting yourself up for scenarios that can hurt the entire team and slow their momentum during a busy cycle. The database is paramount in EPM environments. You can completely remove your servers from the picture, take your existing DB, and with some know-how, completely rebuild your EPM environment without much trouble. Well, maybe a little trouble, but all things considered, that database is really important. You should work hard to understand what this team does, when they do it, and who is in charge.
Next are the servers and hardware. This is where your applications live and thrive. The actual servers and hardware that your team uses are someone’s responsibility. You’ve got lots and lots of possible things here that come into play. Like anything else, it can be a simple setup or a very complex one. Physical or virtual, each has its own set of strengths and challenges. This is also the team that would be in charge of a whole lot of patching: not just operating system patches, but also firmware updates for components, domain group policy, and yes, EPM patches usually fall under this group’s jurisdiction. The Accelatis platform was originally conceived because of an inherent disconnect between the server teams and the finance folks running the applications. We can all attest to the fact that just because the memory and CPU look good does NOT mean your application is running well. We also know that the application, while depending on many things to execute well, needs the server to give it space to do its thing. When I was working the support floor at Hyperion, I talked to many teams that didn’t have any idea about server activities like patching and backups. As a matter of fact, more than once we found server backup routines running DURING close periods. You can only imagine what that looked like when it came to EPM performance. The lesson is that you have to be joined at the hip with the server team.
And the Application, the lovely application itself. You, if that is you, have to do all the day-to-day things that keep it up and running, plus the weekly and monthly tasks that allow everyone to continue working efficiently and effortlessly as you run at peak performance: no issues, no slowdowns, no angry end users. Hopefully, that’s you. EPM is a challenging application suite to size correctly, architect well, install cleanly, configure and build solidly, and keep working at the same level as time goes on. It tends to get cranky as it gets older. It’s like the proverbial old man yelling “Get Off My Lawn!!” The relationship between you and your application is one that should never be taken for granted, and even if the boss can’t appreciate the hard work you put in to keeping it running, we certainly can. And if you are saying to yourself, “Wow, I never have any issues with my apps” or “I’ve never logged a help desk ticket with Oracle”, then consider yourself lucky! But this session will help you as well, because every EPM environment should be documented, looked at, and maintained with a punch list. That list can work as a master “to do” list, a validation to your boss of the many things you have to do in order to keep things running, or even a place to go when you have to troubleshoot and do root cause analysis.
So let’s start talking about the initial data you could, and should, gather to start your punch list off.
Quick survey: who in the room, at some point in their working life, has had to flip through a rolodex like you see in the picture? Raise your hand if you used this or any other rolodex type in the past. Think of your first task here as building the rolodex: names and emails of the people who are responsible for the different layers that your EPM system depends on. It might seem like overkill, but have a quick phone call. Chat about what you are doing and say that you wanted to introduce yourself as the app owner, or the finance director, or whatever the case may be. Build a list of friendlies that you can work with to build a complete picture of the environments. These are going to be folks that you might have to call if something seems off or is not working right. It’s a great idea to talk to them before anything goes wrong, because we all know that pressure when the chips are down and people are scrambling to find out what is going on. It’s no fun, that’s for sure. You might even ask if they have a playbook already written for the financial applications. It would be a huge help if that was the case.
So really, it could be as simple as a few introductions and some info exchange, but that should be the first part of your punch list. It’s like having the fridge magnet of your favorite pizza place right there for when you really need it!
Along with the folks who are responsible for the different layers that go into your EPM environment, there is static information that you should have at your fingertips, and the punch list SHOULD contain some periodic grab of these things for a few reasons. First, you will know what these properties and objects look like and where they are. Second, many of them are things that you might be called on to get if Oracle needs to help you troubleshoot something. The EPM system registry report can be run from any server in the EPM environment and contains a master framework of your environment. There are relationships in that file that you can use to understand which servers are running which products, what the settings are (if held in the registry), and what files might be included and stored in the database to make your application behave the way it’s configured to. This same report with some additional arguments turns into the EPM Deployment report. It’s similar information but organized in a slightly more user-friendly way. It also includes every time the config tool was run, with times, activities, and what was changed, if anything. Consulting companies were not thrilled to hear that Oracle was storing this data (I remember when they released this functionality), but it’s a good way to see the timeline of your EPM environment from first install to every change since. Some products have specific configuration files that are critical. I don’t pretend to know them all, but Essbase is a perfect example, as it has a file called ESSBASE.CFG. You can make changes in this file which have a huge effect on the product. Things ranging from log rotation to cache sizes and parallel execution can be set from here, and it’s something that should be looked at periodically for changes, or at the very least, one time for your punch list.
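One way to make "look at it periodically for changes" concrete is to record a fingerprint of each critical configuration file and compare against it on every run. The sketch below, in Python, is only an illustration of that idea. The function names and the JSON baseline file are assumptions, and it treats a file like ESSBASE.CFG as opaque bytes, so it can tell you that something changed but not what.

```python
import hashlib
import json
from pathlib import Path


def file_fingerprint(path: Path) -> str:
    """Return the SHA-256 hex digest of the file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def check_against_baseline(config_files: list[Path], baseline_file: Path) -> list[str]:
    """Return the config files whose contents differ from the recorded baseline.

    A file seen for the first time is recorded and not reported. A changed file
    keeps its old baseline entry, so it is reported on every run until someone
    deliberately resets the baseline after approving the change.
    """
    baseline = json.loads(baseline_file.read_text()) if baseline_file.exists() else {}
    changed = []
    for cfg in config_files:
        key = str(cfg)
        digest = file_fingerprint(cfg)
        if key in baseline and baseline[key] != digest:
            changed.append(key)
        baseline.setdefault(key, digest)
    baseline_file.write_text(json.dumps(baseline, indent=2))
    return changed
```

Keeping the old entry for a changed file is a deliberate choice here: the alert does not go quiet until a person has looked at the change.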
The last few things require some access that might not be possible to get if you have any security restrictions on the servers or the databases. Registry settings in Windows used to be way more important before Oracle started storing data in the EPM registry, which resides in the Foundation database. The Windows registry does still contain data in both the HKLM\Software\Hyperion and HKLM\Software\Wow6432Node\Hyperion branches. You can and should export those as point-in-time snapshots and as ongoing checks to ensure only approved changes have taken place. Finally, the database holds information on products like Planning and HFM. You should take the time to have these properties exported so that they are accounted for in the same way as registry settings might be.
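Once you have two point-in-time exports, the ongoing check is really just a comparison of names and values. Here is a rough sketch in Python of what that comparison could look like. It assumes, purely for illustration, that each export has been flattened to one `name=value` pair per line; real registry exports and database property dumps have their own formats, so the parsing step would need to be adapted.

```python
def parse_snapshot(text: str) -> dict[str, str]:
    """Parse 'name=value' lines into a dict, skipping blank lines and '#' comments."""
    settings: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        settings[name.strip()] = value.strip()
    return settings


def diff_settings(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Report which setting names were added, removed, or changed between snapshots."""
    return {
        "added": sorted(k for k in after if k not in before),
        "removed": sorted(k for k in before if k not in after),
        "changed": sorted(k for k in before if k in after and before[k] != after[k]),
    }
```

Anything that shows up in the result and does not match an approved change request is worth a conversation with the team that owns that layer.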
You have your rolodex of names and responsible parties, you have all your settings looked at, verified, exported and added as important artifacts in your punch list. Now it’s time to think about all the things you do and should be doing for the proper upkeep of your applications.
What you see here is a slide that we use to show all the different ways in which Datavail can come in and help companies achieve a best practice framework. Our services team works in all these areas, and when you are thinking about your own list of things to do, I’d like you to use this as a sounding board to get you thinking about things that you should be doing. Each section might have a number of tasks associated with it, and from here you can round out the list to ensure you are executing your own best practice and creating a playbook that others will follow as well. Take uptime monitoring, for example. It’s possible that for this task you have a simple list of pages to hit each morning, a daily task in your punch list to ensure the apps are up and running. Many companies will have some sort of monitoring happening on either services or websites, but the actual login to verify functionality might be a task that IT simply can’t do. As you move to the different areas, you might not personally have any tasks to do, and that’s OK because this is YOUR punch list.
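The "simple list of pages to hit each morning" can be partly scripted. The Python sketch below uses only the standard library and simply reports which pages did not answer with HTTP 200. The function names are made up for this example, and it is an assumption that a plain GET is a meaningful check for your pages. It does not log in, so it does not replace the functional verification mentioned above.

```python
import urllib.error
import urllib.request
from typing import Callable


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for `url`, or 0 if no response was received."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 404 or 503.
        return err.code
    except OSError:
        # Covers URLError, refused connections, DNS failures, and timeouts.
        return 0


def pages_down(urls: list[str], fetch: Callable[[str], int] = http_status) -> list[str]:
    """Return the URLs that did not answer with HTTP 200."""
    return [url for url in urls if fetch(url) != 200]
```

You would call `pages_down` with your own list of page addresses each morning and treat an empty result as "everything answered". The `fetch` parameter exists so the check can be exercised without touching the network.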
Also start to think about your tasks in a periodic manner. When our managed services team does this for customers, they will break it down into tasks that need to be done every day, every week, monthly, quarterly, and annually. By doing this, the things that don’t happen all the time, including important maintenance, will not be forgotten. In the example here, notice that we have things like regression testing to ensure that patches have not had a negative effect on your apps. Unless you spell out things like this, they are usually the first to get dropped when things get busy. We can’t stress enough the importance of putting these types of verification into your punch list so that you are never caught off guard by performance surprises. Another item which is more than pressing a button or truncating a log might be quarterly audit reporting, which in itself can be lots and lots of separate line items. Even if the punch list contains only the parent items, it’s OK. The idea is that you have an accountable list of actionable items for you and your team which is to be followed each and every period. You’ll get into some very different concepts and ideas the more people you talk to. Many companies have their own rules and regulations which might dictate the frequency of these things, and many might already have processes and automations in place to achieve them. Only by taking a wide look at what you do each period will you be able to make your punch list a great working document.
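If the punch list lives in a file rather than in someone’s head, a few lines of code can answer "what is due today?". This Python sketch is one possible way to do it, under some loud assumptions: weekly tasks fall on Mondays, monthly tasks on the 1st, quarterly tasks on the 1st of January, April, July, and October, and annual tasks on January 1. Your own calendar, especially around close, will almost certainly differ.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Task:
    name: str
    frequency: str  # "daily", "weekly", "monthly", "quarterly", or "annual"


def is_due(task: Task, today: date) -> bool:
    """Return True if the task falls due on `today` under the simplified schedule."""
    if task.frequency == "daily":
        return True
    if task.frequency == "weekly":
        return today.weekday() == 0  # Monday
    if task.frequency == "monthly":
        return today.day == 1
    if task.frequency == "quarterly":
        return today.day == 1 and today.month in (1, 4, 7, 10)
    if task.frequency == "annual":
        return today.day == 1 and today.month == 1
    raise ValueError(f"Unknown frequency: {task.frequency!r}")


def due_today(tasks: list[Task], today: date) -> list[Task]:
    """Return the tasks that are due on `today`, in their original order."""
    return [task for task in tasks if is_due(task, today)]
```

The point is less the code than the habit: once the quarterly and annual items have a date attached, they stop being the first things dropped when it gets busy.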
You can also use this list of focus areas to set up your punch list. This is actually the list of things our Accelatis platform helps with, but it lends itself very well to punch list creation. On this list are areas to think about like Log Management. To you, this might mean tracking the size of the logs on disk to ensure that you are keeping them at a usable size. Essbase is infamous for spawning huge log files. Oracle did move to their ODL format, and many of their logs rotate on their own, which is supposed to help you manage them. However, many of these log files roll so quickly that by the time you catch a problem, the log that captured it might not exist at all. There is a series of services logs for EPM that overwrite themselves each and every time a service is restarted. If you know what I’m talking about, then you know that the knee-jerk reaction when something is not working is to restart the service, but if you don’t know about these log behaviors, you just might miss a step, making RCA harder than it has to be. The other thing is that even though ODL is widely used, Oracle has still not decommissioned the old logs, like HFM and its hsveventlog.log, or Essbase and its crazy large ESSBASE.LOG. So you still have to consider these logs and how best to keep them at bay. Again, the focus areas provide yet another angle for you to think about for your punch list.
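Tracking log sizes is an easy punch list item to automate. As a minimal sketch in Python, and assuming your logs sit under a folder you can read and end in .log (matched case-insensitively so that a file like ESSBASE.LOG is picked up too), something like this would list the files over a size limit, biggest first. The function name and the limit you pass in are yours to choose.

```python
from pathlib import Path


def oversized_logs(log_dir: str, max_bytes: int) -> list[tuple[str, int]]:
    """Return (path, size in bytes) for each .log file under `log_dir` larger than `max_bytes`.

    Results are sorted from largest to smallest.
    """
    hits = []
    for path in Path(log_dir).rglob("*"):
        if path.is_file() and path.suffix.lower() == ".log":
            size = path.stat().st_size
            if size > max_bytes:
                hits.append((str(path), size))
    return sorted(hits, key=lambda item: item[1], reverse=True)
```

This only tells you which logs are growing. Deciding whether to archive, rotate, or truncate them is still a call for you and the team that owns the server.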
This is a punch list that our services team built for a very particular customer. I wanted to show it to you because it has the task, a short description of the task, the frequency at which it’s to be executed and a column for priority. I think that’s a fantastic thing to think about because, let’s face it, some things ARE more critical and need that additional weight. If time only allows a few things and one is high priority, you’d better bet that’s the one you want to work on first to ensure it’s completed. The additional benefit of this list is that you can quantify the effort involved in doing these tasks. We know from talking and working with many customers that most shops struggle with the gap between the things they are supposed to be doing and the things they are actually able to get done. It’s hard to execute all things well when you simply don’t have the staff or time. This list might double as the request for another person on the team, or maybe even a justification of your time and what it really takes to manage and care for a healthy EPM environment.
Same list here with some HIGH priority highlights. As I mentioned before, each company will have its own things that are of high value, so mark them as such, and make sure you and your team understand that although everything on the list should get done, reality sometimes does not allow for that.
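To make the priority and effort idea concrete, here is a small Python sketch. The columns mirror the ones on the slide (task, description, frequency, priority); the `effort_hours` field, the three priority levels and the "fit what you can into the hours you have" rule are assumptions added for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed priority levels; lower number means "do this first".
PRIORITY_ORDER = {"HIGH": 0, "MEDIUM": 1, "LOW": 2}


@dataclass(frozen=True)
class PunchItem:
    task: str
    description: str
    frequency: str
    priority: str        # "HIGH", "MEDIUM" or "LOW"
    effort_hours: float  # hypothetical estimate, not a column from the slide


def total_effort(items: List[PunchItem]) -> float:
    """Hours needed to do everything on the list."""
    return sum(i.effort_hours for i in items)


def plan(items: List[PunchItem],
         hours_available: float) -> Tuple[List[PunchItem], List[PunchItem]]:
    """Split items into (scheduled, deferred) for the hours you actually have.

    Items are considered in priority order (ties keep their list order);
    an item is scheduled only if it still fits in the remaining hours.
    """
    ordered = sorted(items, key=lambda i: PRIORITY_ORDER[i.priority])
    scheduled: List[PunchItem] = []
    deferred: List[PunchItem] = []
    remaining = hours_available
    for item in ordered:
        if item.effort_hours <= remaining:
            scheduled.append(item)
            remaining -= item.effort_hours
        else:
            deferred.append(item)
    return scheduled, deferred
```

Comparing `total_effort(items)` with the hours your team really has, and looking at what lands in the deferred list period after period, is the kind of number that backs up a request for more staff.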
Disaster Recovery - Remember Disaster recovery.
Like those little motivational posts that grace my Facebook feed each and every day, I wanted to add some quotes from gentlemen that I didn’t know, and that I quite frankly am not 100% sure said these things exactly, but the internet tells me it’s them, and they both did some great things, so read them and know that when you think about your EPM environment, they are both applicable. EPM is not a Ronco product and you cannot “Set It, and Forget It”. DR, in my experience, is something that is set up and rarely looked at again. Don’t get me wrong, I’ve worked with companies who absolutely build, refine, test, maintain and do all the things you should do with a DR environment, but that is completely atypical. Most DR environments I see are either architected in such a way that they cannot be tested, or simply are never tested. You should be able to, at the very least, do routine testing of your DR environment. Even if it’s only quarterly or annually, it’s something you should keep in mind, because if you ever need it to work, you don’t want to rely on prayer and warm thoughts. The routine maintenance for your PROD and DEV systems should also roll over to DR. Make sure those tasks are performed here too. Many DR environments require a cutover that involves other teams – the Network team is the one at the top of my list – so if that’s the case, maybe a section of your punch list should include the tasks to do if a DR cutover is invoked. Knowledge and documentation are key, and if you have it written down in a way that’s easy to understand and act on, then you are way ahead of the game.
Now that you have a run book, a list of contacts, settings, configs, changes and tasks that have kept you on top of your EPM environment, when it comes to issue resolution and correction, you have the tools to do it quickly and efficiently.
Through this punch list exercise you should have a great knowledge of what technologies are actually in play for you and your EPM environment. You have a list of who to contact for assistance gathering data, or even diagnosing symptoms. You have the info you need to know what’s being done and when. You might be able to narrow down the scope of the issue quickly using deductive reasoning, as you have a list of things that happen and have happened which may have had some effect on the apps. Once you identify where the issue is and what was done to address it, you can add tasks to your punch list to stop that same thing from happening in the future.
Your RCA methodology should be the same every time, and depending on what you find or what the symptom is, you can do certain things to expedite the troubleshooting process. Use the information that you get from your punch list results to help here – the methodology above was scripted using the various data points the Accelatis platform captures, but you can use the same process. We’ll walk through an example of some process that keeps failing. The first thing might be to verify that it’s actually a problem – can YOU log in and do it? If not, then, well, it’s not some self-inflicted or one-off issue for this user. OK then, we can start to rule things out because we DO have some history here – config changes? Or a change that was just made? Nope. OK, next you might look at data – did we JUST upload something? Scenarios, app changes, metadata? You can rule those out. Are the configs different on the servers? No, because you have those already and can see they are OK. Is the network OK? Is there some connection that is not able to be established? Rule that one out. Next you might start manually looking at logs and content. Fun, right? But you are looking for errors, and for alerts you might have from whatever tools you have running. This process alone is laborious for EPM, and that’s why it’s done later in the process: it takes real effort if you don’t have a tool like ours helping you out. But if there is nothing there, you can look at overall performance and benchmarks – did something change here? Is there some indication that things are taking longer than they should? If not, well, then maybe this is a software bug and you’ll have to open that ticket with the support team at Oracle. But you can hopefully see that the time you took to build your punch list is now helping you when you really could use all the insight you can get.
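The walkthrough above can be sketched as a fixed, ordered rule-out list in Python. This is only an illustration of the "same order every time" idea: the area names, the `triage` function and the shape of the `findings` input are assumptions, and this is not the scripted methodology from the Accelatis platform.

```python
from typing import Dict

# Areas are checked in the same order every time, cheapest first;
# the manual log review comes late because it takes the most effort.
RULE_OUT_ORDER = [
    "recent config change",
    "recent data or metadata load",
    "config differences between servers",
    "network connectivity",
    "errors in logs and alerts",
    "performance versus benchmarks",
]


def triage(reproduced: bool, findings: Dict[str, bool]) -> str:
    """Return the next action for a failing process.

    reproduced: True if you can log in and make the process fail yourself.
    findings:   maps an area name to True when something suspicious was
                found there; areas that are missing count as ruled out.
    """
    if not reproduced:
        return "likely user-specific or one-off: look at that user's setup"
    for area in RULE_OUT_ORDER:
        if findings.get(area, False):
            return f"investigate: {area}"
    return "nothing found: possible software bug, open a ticket with Oracle support"
```

For example, `triage(True, {"network connectivity": True})` returns `"investigate: network connectivity"`. The findings themselves would come from your punch list results: the change log, the saved configs, the contact list and the benchmarks you have been collecting each period.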
Now, I’d be remiss if I didn’t mention that of course, if you end up with a list like our managed services team executes as best practice, you might want a little helping hand. You can start to think about how to get it done with minimal effort on your end.
When we talk about our services and what we do, putting the Accelatis platform in the picture reduces the number of hours spent on repetitive or long and cumbersome tasks, and it improves the team’s ability to quickly gather information and statistics on any part of the environment, from network, to server, to database and app. This allows us to do fixed-price contracts, because although it would take a person hours and hours to do these things, we can automate them and spend our time on the higher-value tasks, like reviewing performance statistics and recommending changes to the system that will help your team be even more effective and faster in EPM activities.
Here’s that same slide from earlier – as we said, the managed services team uses this as a guide when they write the punch list for customers. Now, with the addition of the Datavail Accelatis platform, see how many areas are now covered automatically without users having to go in and do anything.
The benefits of having the platform work for you are many. We know that not every company has the ability to do all the things needed to maintain their environment, and maybe they need help just identifying the areas they should be focused on. We’re a robust company that offers many ways to help in the EPM space. If we can help you help yourself, then that’s just fantastic. If you need someone to help you completely, well, we can do that too, and everything in between. We are going to send you a template you can use to make your own punch list – just drop by our booth and we’ll be happy to send it to you so you can fill it in with all the tasks that are important to you and your company. We’d love to talk about other things that our team does so that you can understand our best practices and what we do every day to help folks keep their EPM systems running and running well.
Any remaining time we have we can use to discuss any points in the presentation or even in general when it comes to creating your punch list. Thank you very much for giving us your time – I sincerely hope it’s been informative and that it gets you thinking about your own processes and teams and what you can do to be best in class.