Tactics for Empowering Business Analysts
CITO Research
April 8, 2015
before we begin…
• You will be on mute for the duration of the event
• Please type a message in the Questions box in the Control Panel if you can’t hear us
• There will be a Q&A session at the end – please start submitting your questions via the
GoToWebinar Questions panel
• A recording of the full webinar will be available within 48 hours, and we will email the
replay link to you
meet the speakers
Dan Woods
CEO of Evolved Media & Editor of CITO Research
dwoods@citoresearch.com
I create ideas about technology products, based on a broad
technical understanding. By writing as an analyst in Forbes,
and working with Evolved Media’s clients, I see the magic in
technology and why it matters to IT buyers.
Pete Schlampp
VP Products, Platfora
pete@platfora.com
As Vice President of Products at Platfora, I am committed to designing
software that enables users to explore all dimensions of complex data through
visualizations and workflows that are simple and easy to use across all touch
points in the business. I'm responsible for driving strategy for our product
management, product marketing, UX design, and technical publications.
agenda
Big data, big dreams, big data, big challenges
The quest for data
Data, a chorus of variety
Platfora success story
Questions & answers
big data, big dreams
big data, big challenges
the hidden potential of data, big and otherwise
Distributed insights are not being harvested
• Companies analyze only 12% of the data they have, leaving roughly 88% unused
• Permanent heterogeneity: Data now lives in many places and in many forms
Relationship with data is undergoing transformation
• Used to be that “digital lint” (data that wasn’t cost-effective to store)
was thrown away
• Hadoopanomics allows storage of vast
amounts of data at low cost
• Storing data doesn’t make it accessible
Lack of access to all the data
• Silos
• Highly intermediated systems
• Analysts in handcuffs
Platfora disrupts traditional analyst workflows
• To empower analysts with few, if any,
prior restraints on their imagination
• Platfora creates a new central repository
with new practices and capabilities:
  • Data preparation
  • Integrated exploration and visualization
  • Data catalog
  • Flexible data governance
  • Shareable data objects
  • Tooling and plumbing for the tricky parts
  of the new world (manageable refresh)
the quest for data
shifting from data warehouse to data lake
Data warehouses operated based on different assumptions
• Adding data was expensive and only happened through a lengthy design process
• ETL processes often cleaned up the data before it arrived in the repository
• Trust in the data came from the process of curation
Assumptions change in the data lake
• Many sources of data
• Raw data must be analyzed and transformed
• Data lineage must be preserved to establish trust
• A catalog must be maintained so that data can be found
Platfora turns the data lake into something useful
finding the data
• Show data catalog; explain how it works; is it created automatically?
• Explain how it differs from a data dictionary, which is something most analysts
know about and no one wants to be responsible for building as it becomes a full
time job
• Explain data lineage and how the source of items in the catalog is preserved
• The value of this capability is vastly underestimated
• Explain some of what it enables
search
Oil
tracing it back to the source
• with all the data in one
location it is possible to trace
every data point back to the
source
• what calculations were
performed?
• who made the changes?
• was anything filtered out?
• where are the source files?
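
The four questions above can be answered mechanically if every derived dataset keeps a record of its parent and of the steps applied to it. The following is a minimal sketch of that idea, not Platfora's implementation: the class names, field names, dataset names, and file path are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LineageStep:
    """One recorded transformation (hypothetical fields)."""
    operation: str          # what calculation was performed
    author: str             # who made the change
    rows_filtered: int = 0  # how many rows this step dropped


@dataclass
class Dataset:
    """A catalog entry that remembers where it came from."""
    name: str
    source_files: List[str] = field(default_factory=list)
    parent: Optional["Dataset"] = None
    steps: List[LineageStep] = field(default_factory=list)


def trace(dataset: Dataset) -> List[str]:
    """Walk from a derived dataset back to its raw source, newest step first."""
    report = []
    current = dataset
    while current is not None:
        for step in reversed(current.steps):
            report.append(
                f"{current.name}: {step.operation} by {step.author} "
                f"({step.rows_filtered} rows filtered)"
            )
        if current.parent is None:
            # Reached the raw data: report where the source files live.
            report.append(f"{current.name}: source files {current.source_files}")
        current = current.parent
    return report


# Hypothetical three-level chain: raw -> cleaned -> daily summary.
raw = Dataset(name="rig_sensors_raw", source_files=["/data/rigs/2015-04.json"])
clean = Dataset(
    name="rig_sensors_clean",
    parent=raw,
    steps=[LineageStep("drop null pressure readings", "analyst_a", rows_filtered=120)],
)
daily = Dataset(
    name="rig_pressure_daily",
    parent=clean,
    steps=[LineageStep("average pressure per rig per day", "analyst_b")],
)

for line in trace(daily):
    print(line)
```

Storing lineage next to the catalog entry, rather than in a separate document, is what lets the trace stay accurate as analysts add their own derived datasets.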
challenges of preparing big data
Messy Data
• Big data often has a low signal and variable structure
Data Preparation
• Data prep involves not only cleaning but profiling, summarizing and applying analytics
• Preparation process must be recorded
• Preparing a useable form of big data is often done by a team
Data in Real Time
• Data arrives in streams (real time) and in massive volumes
• Cleaning and extraction form a pipeline that data passes through as it arrives
Cataloging Data
• Derived objects must be cataloged and lineage must be tracked
preparing big data
profiling and
summarization for data
cleansing and prep
sample data from
the data catalog
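
Profiling and summarization can be illustrated on a small scale. The sketch below assumes the sample arrives as JSON-lines text (one of the formats the editor's notes mention) and reports, per field, how often it is present and which value types appear. The function name, the record fields, and the output layout are assumptions made for this example only.

```python
import json
from collections import Counter, defaultdict


def profile(json_lines):
    """Summarize field presence and value types across JSON-lines records."""
    total = 0
    malformed = 0
    present = Counter()
    types = defaultdict(Counter)
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        total += 1
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            malformed += 1
            continue
        if not isinstance(record, dict):
            # Valid JSON, but not an object with named fields.
            malformed += 1
            continue
        for key, value in record.items():
            present[key] += 1
            types[key][type(value).__name__] += 1
    parsed = total - malformed
    fields = {
        key: {
            "fill_rate": present[key] / parsed if parsed else 0.0,
            "types": dict(types[key]),
        }
        for key in present
    }
    return {"records": total, "malformed": malformed, "fields": fields}


# Hypothetical messy sample: mixed types, a missing field, one unparseable line.
sample = [
    '{"rig": "A1", "pressure": 101.5}',
    '{"rig": "A2", "pressure": "n/a"}',
    '{"rig": "A3"}',
    'not json at all',
]
summary = profile(sample)
print(summary)
```

A summary like this tells the analyst which fields need cleansing (here, `pressure` mixes numbers and strings and is missing from one record) before any analysis starts.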
data, a chorus of variety
managing workloads
Almost all big data repositories have to manage workloads
It is always possible to overwhelm the amount of computing capacity available
The process of managing workloads must address these tasks:
• Managing a transformation pipeline
• Creating canonical objects
• Creating derived objects
• Managing streams of data
• Recomputing objects
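
The editor's notes for this slide add that not every object can be recomputed before every query, so recomputation must be configurable. One way to picture that is a per-object refresh policy, sketched below. The policy fields (`max_age`, `on_new_data`) and the object names are hypothetical; the deck does not describe how Platfora configures refresh.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RefreshPolicy:
    """Hypothetical per-object refresh settings."""
    max_age: timedelta         # recompute when the object is older than this
    on_new_data: bool = False  # also recompute as soon as source data arrives


@dataclass
class DerivedObject:
    name: str
    policy: RefreshPolicy
    last_computed: datetime
    source_updated: datetime


def needs_recompute(obj: DerivedObject, now: datetime) -> bool:
    """Decide whether an object is due for recomputation under its policy."""
    if obj.policy.on_new_data and obj.source_updated > obj.last_computed:
        return True
    return now - obj.last_computed > obj.policy.max_age


now = datetime(2015, 4, 8, 12, 0)

# Canonical object rebuilt by a daily batch: 12 hours old, still within policy.
canonical = DerivedObject(
    "customers_canonical",
    RefreshPolicy(max_age=timedelta(days=1)),
    last_computed=datetime(2015, 4, 8, 0, 0),
    source_updated=datetime(2015, 4, 8, 6, 0),
)

# Analyst-built object that should follow new data closely.
near_real_time = DerivedObject(
    "ad_response_lens",
    RefreshPolicy(max_age=timedelta(hours=1), on_new_data=True),
    last_computed=datetime(2015, 4, 8, 11, 30),
    source_updated=datetime(2015, 4, 8, 11, 45),
)

for obj in (canonical, near_real_time):
    print(obj.name, needs_recompute(obj, now))
```

Keeping the policy on the object lets expensive derived objects refresh rarely while near-real-time ones refresh often, which is the workload trade-off the slide describes.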
pulling together multiple data sources
combine datasets
from the catalog to
enrich big data
Hadoop to crunch
big data into
in-memory lenses
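
The editor's notes describe a lens as a materialized view of the data with many of the answers pre-computed in memory. The sketch below shows that general pattern in plain Python: aggregate once over the raw rows, then answer repeated questions from the small aggregate. It is an illustration of pre-aggregation under assumed names (`build_lens`, `query_average`, the sample fields), not a description of how Platfora builds lenses on Hadoop.

```python
from collections import defaultdict


def build_lens(rows, dimensions, measure):
    """Pre-aggregate raw rows into {dimension values: (sum, count)}."""
    cells = defaultdict(lambda: [0.0, 0])
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        cells[key][0] += row[measure]
        cells[key][1] += 1
    return {key: (total, count) for key, (total, count) in cells.items()}


def query_average(lens, dimensions, **filters):
    """Answer an average from the pre-aggregated cells, never the raw rows."""
    total, count = 0.0, 0
    for key, (cell_total, cell_count) in lens.items():
        values = dict(zip(dimensions, key))
        if all(values[name] == wanted for name, wanted in filters.items()):
            total += cell_total
            count += cell_count
    return total / count if count else None


# Hypothetical raw events; in practice this would be far too large to rescan.
events = [
    {"site": "a", "day": "mon", "visits": 10},
    {"site": "a", "day": "tue", "visits": 20},
    {"site": "b", "day": "mon", "visits": 30},
    {"site": "a", "day": "mon", "visits": 40},
]
dims = ("site", "day")
lens = build_lens(events, dims, "visits")

print(query_average(lens, dims, site="a", day="mon"))  # 25.0
print(query_average(lens, dims, day="mon"))
```

The expensive pass over the raw data happens once, in batch; each follow-up question touches only the aggregated cells, which is what makes interactive iteration practical.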
visualize the data
change your questions (iterate)
drag and drop data
fields to create
visualizations
collaborate and share insights
share insights across
the enterprise ad-hoc
and on schedule
handling administration and governance
• Users must have access to work with
and iterate through data in the data lake
• IT must have confidence that security is
in place and users will not draw the
wrong conclusions from the data
• Different access levels for data and objects
create an open and secure environment
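
The editor's notes name a simple permission model, Own/Edit/View/None, applied separately to data and to the objects derived from it. A minimal sketch of such a model follows. It assumes the levels are ordered so that a higher level implies the lower ones, which the deck does not state; the `Acl` class, user names, and resource names are hypothetical.

```python
from enum import IntEnum


class Access(IntEnum):
    """Access levels, assumed here to be ordered from least to most privileged."""
    NONE = 0
    VIEW = 1
    EDIT = 2
    OWN = 3


class Acl:
    """Hypothetical access-control list keyed by (user, resource)."""

    def __init__(self):
        self._grants = {}

    def grant(self, user, resource, level):
        self._grants[(user, resource)] = level

    def level(self, user, resource):
        # Anything not explicitly granted defaults to no access.
        return self._grants.get((user, resource), Access.NONE)

    def allows(self, user, resource, needed):
        return self.level(user, resource) >= needed


acl = Acl()
# IT owns the raw dataset; the analyst may edit a derived object built on it.
acl.grant("it_admin", "dataset:tv_spots", Access.OWN)
acl.grant("analyst", "lens:ad_influence", Access.EDIT)

print(acl.allows("analyst", "lens:ad_influence", Access.VIEW))  # True
print(acl.allows("analyst", "dataset:tv_spots", Access.VIEW))   # False
```

Granting on datasets and derived objects independently is what allows IT to restrict raw data while analysts still share the objects built from it.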
platfora success story
delivering customer value at rapid speed
“Platfora Big Data Analytics has enabled us to
quickly understand precisely how advertising
investments influence consumer behavior.”
- Ed Smith, CTO
The Business Challenge
• Needed a way to integrate all cross-property
data from AutoTrader.com, KellyBlueBook.com,
and thousands of local dealer websites that
reach over 40 million unique customers on a
monthly basis
• Growing desire to decrease investments in
traditional EDWs and analytic databases
The Solution
• With Platfora, they integrated and visualized TV
spot data, web site traffic metrics and car dealer
inventory to better understand the influence on
the consumer's journey
• Created Vizboards that demonstrated immediate
consumer behavior shifts influenced by OEM TV
ads during the Super Bowl
• Project to ROI in 6 weeks, significantly less time
and money than if they used legacy tools
how platfora empowers imagination
Platfora empowers analysts just as open source empowered developers
• Analysts can easily discover and share data from one environment
• Analysts build a common language with which to describe a business
• True alignment paves the way for advanced analytics to transform operations
• This is founded on the following capabilities:
  • Data preparation of the sort big data needs
  • Integrated exploration and visualization
  • A data catalog
  • Flexible data governance
  • Shareable data objects
  • Tooling and plumbing for the tricky parts of the new world (manageable refresh)
questions & answers
questions?
Thank you for joining today!
You will receive a complimentary copy of “The
Changing Role of the Business Analyst” and a link
to the webinar replay in a follow-up email.

Editor's Notes

  1. Data Warehouse Notes: The amount of such highly curated data grew slowly. Tribal knowledge was sufficient to keep track of what data was available. Data Lake Notes: ETL processes for unstructured data and big data are far different than traditional methods. Many data processing pipelines take place in the data lake. Tribal knowledge is not sufficient to track the amount of data in the warehouse.
  2. Moving to the data lake architecture is a challenge for companies. One of the potential benefits is to give access to all the data. How do you do it? How can end users find the data? If it is one dataset, that’s easy. But you want to enable users to add their own too. So you need the tools to be able to add, organize, and find datasets. Of course, you need governance and security. Platfora’s Data Catalog is an easy way to do this. Search across datasets to find what you need. In this case we’re looking at IoT data from oil rigs.
  3. With a lot of users accessing the data lake, it’s important to trust the data you’re looking at. One of the hallmarks of the data discovery movement, where business analysts have had free rein, has been the lack of control and governance over data. Many fear that big data simply takes this one step farther. It doesn’t need to be this way. With a central Data Catalog you can trace all data that is in the data lake: what calculations and transformations have taken place, and who did them.
  4. Traditional ETL tools cannot handle big data. A new generation of data prep tools has been created. Advanced machine-learning and analytics are vital when preparing big data. Data sets must be captured in the catalog as they arrive. The catalog must capture the lineage of derived data sets. The signal from the big data may need to enhance other data sets or may need to be combined with other data sets. Collaboration is required.
  5. Preparing big data takes 80% of the time. But this isn’t just spreadsheets and CSV files. Big data comes in formats like JSON, XML, log files, etc. Need to handle millions of rows. Need to be able to sample rows across multiple files, search for records, and filter. Need a system to provide intelligence to the analyst.
  6. Data arrives and then it must proceed through a transformation pipeline. These pipelines create canonical objects that are like those stored in a data warehouse. This is usually a batch process. The pipelines also create derived objects, created by analysts and designed to serve specific needs. This is somewhere between a batch and a real-time process. Some objects may be streams of data, updated in real time. You cannot recompute every object before every query. A derived object like a Platfora Lens may have a huge amount of processing behind it. Recomputation must be configurable.
  7. Hadoop is still a batch engine. But we need real-time access. How to marry these for the BA? In-memory technology can help. The lens is the answer. With the lens we take big data and process it into a materialized view of the data, with many of the answers pre-computed in memory. It leverages the power of MapReduce and Spark and can combine datasets. Platfora then manages the lenses, updates them when new data arrives, and makes sure they are available when the BA needs them.
  8. Once that data is crunched by Hadoop and held in memory, it’s time to visualize it. The modern BA works across devices and collaborates freely, so the tool must be web based and collaborative. Big data requires many types of visualization: traditional and more niche.
  9. But perhaps the most important factor is that the modern business analyst needs to iteratively ask questions of the data. One question leads to another, which leads to another. There needs to be no penalty for asking questions: this means that answers need to come back quickly and the way to express the questions should be intuitive, like dragging and dropping fields from your lens into dropzones. Platfora has this. We automatically pick the visualization that we believe is the best fit, but you can change it as you need.
  10. Once insights are locked in, you need to be able to collaborate and share. Distribute formatted versions of vizboards via PDF and email. Add collaborators to a document.
  11. All of this can give IT a headache. The data discovery movement has created a backlash within the IT department about a lack of control of data. We went from “here is the data you need and here’s how you’re going to analyze it” to the wild, wild West. There needs to be a balance. IT needs to be able to control access to certain data, and have the ability to deem what is canonical. The modern business analyst needs the freedom to swim within the data lake. Platfora provides the ability to secure data and the objects derived from data. Simple model: Own/Edit/View/None. Gives IT peace of mind.
  12. One of my favorite customer stories. Started in December of 2014. Combined data from multiple websites. No need for traditional EDWs. BAs were able to do it themselves.