Global Business Intelligence (BI) software vendor Yellowfin and Actian Corporation, pioneers of the record-breaking analytical database Vectorwise, will host a series of Big Data and BI Best Practices Webinars.
These are the slides from that presentation.
The Big Data & BI Best Practices Webinars and associated slides examine the phenomenal growth in business data and outline strategies for effectively, efficiently and quickly harnessing and exploring ‘Big Data’ for competitive advantage.
[DSC Europe 22] Lakehouse architecture with Delta Lake and Databricks - Dragan Berić (DataScienceConferenc1)
Dragan Berić will take a deep dive into Lakehouse architecture, a game-changing concept bridging the best elements of data lake and data warehouse. The presentation will focus on the Delta Lake format as the foundation of the Lakehouse philosophy, and Databricks as the primary platform for its implementation.
Delta Lake delivers reliability, security and performance to data lakes. Join this session to learn how customers have achieved 48x faster data processing, leading to 50% faster time to insight after implementing Delta Lake. You’ll also learn how Delta Lake provides the perfect foundation for a cost-effective, highly scalable lakehouse architecture.
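As a rough illustration of the transactional guarantees described above, here is a minimal PySpark sketch of writing and reading a Delta table. The session configuration follows the Delta Lake documentation; the path and data are illustrative assumptions, and the delta-spark package must be available.

```python
# Minimal Delta Lake sketch; the path and data are illustrative assumptions.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("delta-sketch")
    # Enable Delta Lake's SQL extensions and catalog (per the Delta docs).
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# ACID write: the commit becomes visible atomically, or not at all.
events = spark.range(0, 1000).withColumnRenamed("id", "event_id")
events.write.format("delta").mode("overwrite").save("/tmp/delta/events")

# Readers always see a consistent snapshot, even under concurrent writes.
print(spark.read.format("delta").load("/tmp/delta/events").count())
```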
The first step towards understanding data assets’ impact on your organization is understanding what those assets mean to one another. Metadata – literally, data about data – is a practice area required by good systems development, and yet is also perhaps the most mislabeled and misunderstood Data Management practice. Understanding metadata and its associated technologies as more than just straightforward technological tools can provide powerful insight into the efficiency of organizational practices and enable you to combine practices into sophisticated techniques supporting larger and more complex business initiatives. Program learning objectives include:
- Understand how to leverage metadata practices in support of business strategy
- Discuss foundational metadata concepts
- Apply guiding principles and lessons learned from metadata's practical uses to your strategy
Metadata strategies include:
- Metadata is a gerund, so don’t try to treat it as a noun
- Metadata is the language of Data Governance
- Treat glossaries/repositories as capabilities, not technology
How to Use a Semantic Layer to Deliver Actionable Insights at Scale (DATAVERSITY)
Learn about using a semantic layer to enable actionable insights for everyone and streamline data and analytics access throughout your organization. This session will offer practical advice based on a decade of experience making semantic layers work for Enterprise customers.
Attend this session to learn about:
- Delivering critical business data to users faster than ever, at scale, using a semantic layer
- Enabling data teams to model and deliver a semantic layer on data in the cloud
- Maintaining a single source of governed metrics and business data
- Achieving speed-of-thought query performance and consistent KPIs across any BI/AI tool, such as Excel, Power BI, Tableau, Looker, DataRobot, Databricks, and more
- Providing dimensional analysis capabilities that accelerate performance with no need to extract data from the cloud data warehouse
Who should attend this session?
Data & Analytics leaders and practitioners (e.g., Chief Data Officers, data scientists, data literacy, business intelligence, and analytics professionals).
The right architecture is key for any IT project, and especially for big data projects, where no standard architectures have yet proven their suitability over years of use. This session discusses the different big data architectures that have evolved over time, including the traditional Big Data Architecture, the Streaming Analytics architecture, and the Lambda and Kappa architectures, and presents the mapping of components from both the open-source and Oracle stacks onto these architectures.
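To make the Lambda pattern concrete, here is a hedged PySpark sketch of its batch and speed layers. The paths, schema, and in-memory serving sink are illustrative assumptions, not part of the session.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("lambda-sketch").getOrCreate()

# Batch layer: periodically recompute the authoritative view from master data.
master = spark.read.parquet("/data/master/events")
batch_view = master.groupBy("user_id").agg(F.count("*").alias("events"))
batch_view.write.mode("overwrite").parquet("/data/views/batch/user_counts")

# Speed layer: incrementally aggregate new events as they arrive.
realtime = (
    spark.readStream.schema(master.schema).parquet("/data/incoming/events")
    .groupBy("user_id").agg(F.count("*").alias("events"))
)
query = (
    realtime.writeStream.outputMode("complete")
    .format("memory").queryName("user_counts_rt")  # low-latency serving view
    .start()
)
# Serving layer: merge batch_view with SELECT * FROM user_counts_rt at query time.
```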
Overview of Data and Analytics Essentials and Foundations (NUS-ISS)
As companies increasingly integrate data across functions, the boundaries between marketing, sales, and operations have been blurring. This allows them to find new opportunities by aligning and integrating supply and demand activities to improve commercial effectiveness. Instead of conducting post-hoc analyses that allow them to correct future actions, companies generate and analyze data in near real time and adjust their operations processes dynamically. Transitioning from static analytics outputs to more dynamic, contextualized insights means analytics can be delivered with increased relevance, closer to the point of decision.
This talk covers the analytics journey across descriptive, predictive, and prescriptive analytics, deriving actionable and timely insights that improve customer experience and drive marketing, sales-force, and operations excellence.
Architect’s Open-Source Guide for a Data Mesh Architecture (Databricks)
Data Mesh is an innovative concept addressing many data challenges from an architectural, cultural, and organizational perspective. But is the world ready to implement Data Mesh?
In this session, we will review the importance of core Data Mesh principles, what they can offer, and when it is a good idea to try a Data Mesh architecture. We will discuss common challenges in implementing Data Mesh systems and focus on the role open-source projects can play. Projects like Apache Spark can be a key part of a standardized infrastructure platform for Data Mesh. We will examine the landscape of useful data engineering open-source projects to use in several areas of a Data Mesh system in practice, along with an architectural example. We will touch on what work (culture, tools, mindset) needs to be done to make Data Mesh more accessible for engineers in the industry.
The audience will leave with a good understanding of the benefits of Data Mesh architecture, common challenges, and the role of Apache Spark and other open-source projects for its implementation in real systems.
This session is targeted at architects, decision-makers, data engineers, and system designers.
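As a purely illustrative sketch (not a standard Data Mesh API), here is one way a domain team might publish a versioned "data product" on a shared Spark platform. The paths, schema, and metadata convention are all assumptions.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("orders-data-product").getOrCreate()

# The domain team owns the transformation from its operational data...
orders = spark.read.json("/domains/sales/raw/orders")
clean = orders.dropDuplicates(["order_id"]).where("status IS NOT NULL")

# ...and publishes it to an agreed, discoverable location, with product
# metadata alongside so consumers can find and trust it.
clean.write.mode("overwrite").parquet("/mesh/sales/orders/v1")
spark.createDataFrame(
    [("sales.orders", "v1", "sales-team@example.com")],
    ["product", "version", "owner"],
).write.mode("overwrite").json("/mesh/sales/orders/v1/_product_metadata")
```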
Big Data Analytics | What Is Big Data Analytics? | Big Data Analytics For Beginners (Simplilearn)
This presentation about Big Data analytics explains why Big Data analytics is required, what it is, its lifecycle, its types, the tools used, and a few Big Data application domains. Also, we'll see a use case on how Spotify uses Big Data analytics. Big Data analytics is a process for extracting meaningful insights from Big Data, such as hidden patterns, unknown correlations, market trends, and customer preferences. One of its essential benefits is supporting product development and innovation. Now, let us get started and understand Big Data analytics in detail.
Below are explained in this Big Data analytics tutorial:
1. Why Big Data analytics?
2. What is Big Data analytics?
3. Lifecycle of Big Data analytics
4. Types of Big Data analytics
5. Tools used in Big Data analytics
6. Big Data application domains
What is this Big Data Hadoop training course about?
The Big Data Hadoop and Spark developer course has been designed to impart in-depth knowledge of Big Data processing using Hadoop and Spark. The course is packed with real-life projects and case studies to be executed in the CloudLab.
What are the course objectives?
This course will enable you to:
1. Understand the different components of the Hadoop ecosystem, such as Hadoop 2.7, YARN, MapReduce, Pig, Hive, Impala, HBase, Sqoop, Flume, and Apache Spark
2. Understand Hadoop Distributed File System (HDFS) and YARN as well as their architecture, and learn how to work with them for storage and resource management
3. Understand MapReduce and its characteristics, and assimilate some advanced MapReduce concepts
4. Get an overview of Sqoop and Flume and describe how to ingest data using them
5. Create database and tables in Hive and Impala, understand HBase, and use Hive and Impala for partitioning
6. Understand different types of file formats, Avro schemas, using Avro with Hive and Sqoop, and schema evolution
7. Understand Flume, its architecture, sources, sinks, channels, and configurations
8. Understand HBase, its architecture, data storage, and working with HBase. You will also understand the difference between HBase and RDBMS
9. Gain a working knowledge of Pig and its components
10. Do functional programming in Spark
11. Understand Resilient Distributed Datasets (RDDs) in detail
12. Implement and build Spark applications
13. Gain an in-depth understanding of parallel processing in Spark and Spark RDD optimization techniques
14. Understand the common use-cases of Spark and the various interactive algorithms
15. Learn Spark SQL: creating, transforming, and querying DataFrames
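As a hedged sketch of objectives 10–11 and 15 above, here is the classic word count in both the RDD API and the DataFrame API; the input path is an illustrative assumption.

```python
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("wordcount").getOrCreate()

# RDD style (objectives 10-11): functional transformations.
counts_rdd = (
    spark.sparkContext.textFile("/data/sample.txt")
    .flatMap(lambda line: line.split())
    .map(lambda word: (word, 1))
    .reduceByKey(lambda a, b: a + b)
)
print(counts_rdd.take(5))

# DataFrame style (objective 15): the same result, declaratively via Spark SQL.
counts_df = (
    spark.read.text("/data/sample.txt")
    .select(F.explode(F.split(F.col("value"), r"\s+")).alias("word"))
    .groupBy("word").count()
)
counts_df.show(5)
```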
Learn more at https://www.simplilearn.com/big-data-and-analytics/big-data-and-hadoop-training
Data Lakehouse, Data Mesh, and Data Fabric (r1) (James Serra)
So many buzzwords of late: Data Lakehouse, Data Mesh, and Data Fabric. What do all these terms mean and how do they compare to a data warehouse? In this session I’ll cover all of them in detail and compare the pros and cons of each. I’ll include use cases so you can see what approach will work best for your big data needs.
Big data is a term that describes the large volume of data – both structured and unstructured – that inundates a business on a day-to-day basis. But it’s not the amount of data that’s important. It’s what organizations do with the data that matters. Big data can be analyzed for insights that lead to better decisions and strategic business moves.
Delta Lake is an open-source innovation that brings transactions, version control, and indexing to your data lakes. We uncover Delta Lake's benefits and why they matter to you. Through this session, we showcase some of those benefits and how they can improve your modern data engineering pipelines. Delta Lake provides snapshot isolation, which supports concurrent read/write operations and enables efficient inserts, updates, deletes, and rollbacks. It allows background file optimization through compaction and Z-order partitioning, achieving significant performance improvements. In this presentation, we will cover Delta Lake's benefits, how it solves common data lake challenges, and, most importantly, the new Delta Time Travel capability.
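A minimal sketch of these capabilities, assuming a Delta-enabled Spark session like the one shown earlier; the path and column name are illustrative, and OPTIMIZE ... ZORDER BY requires Databricks or open-source Delta Lake 2.0+.

```python
path = "/tmp/delta/events"  # illustrative Delta table path

# Snapshot isolation: this read sees one consistent table version, even while
# concurrent jobs insert, update, or delete rows.
current = spark.read.format("delta").load(path)

# Time travel: query the table as it looked at an earlier commit.
v0 = spark.read.format("delta").option("versionAsOf", 0).load(path)

# Compaction and Z-order clustering for faster scans.
spark.sql(f"OPTIMIZE delta.`{path}` ZORDER BY (event_id)")
```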
This presentation explains what data engineering is and briefly describes the data lifecycle phases. I used this presentation during my work as an on-demand instructor at Nooreed.com
Big data architectures and the data lake (James Serra)
With so many new technologies, it can get confusing choosing the best approach to building a big data architecture. The data lake is a great new concept, usually built in Hadoop, but what exactly is it and how does it fit in? In this presentation I'll discuss the four most common patterns in big data production implementations, the top-down vs. bottom-up approach to analytics, and how you can use a data lake and a RDBMS data warehouse together. We will go into detail on the characteristics of a data lake and its benefits, and how you still need to perform the same data governance tasks in a data lake as you do in a data warehouse. Come to this presentation to make sure your data lake does not turn into a data swamp!
Snowflake + Power BI: Cloud Analytics for Everyone (Angel Abundez)
Learn how Power BI and Snowflake can work together to bring a best-in-class data and analytics experience to your enterprise. You can combine Snowflake’s easy-to-use, robust, and scalable data platform with Power BI’s data visualization, built-in AI, and collaboration platform to create a data-driven culture for everyone.
Lambda Architecture in the Cloud with Azure Databricks with Andrei Varanovich (Databricks)
The term “Lambda Architecture” stands for a generic, scalable, and fault-tolerant data processing architecture. As hyperscale clouds now offer various PaaS services for data ingestion, storage, and processing, the need for a revised, cloud-native implementation of the Lambda Architecture is arising.
In this talk we demonstrate the blueprint for such an implementation in Microsoft Azure, with Azure Databricks, a PaaS Spark offering, as a key component. We go back to some core principles of functional programming and link them to the capabilities of Apache Spark for various end-to-end big data analytics scenarios.
We also illustrate the “Lambda Architecture in use” and the associated trade-offs using a real customer scenario: at the Rijksmuseum in Amsterdam, a terabyte-scale Azure-based data platform handles data from 2,500,000 visitors per year.
Apache Big Data Conference 2016, Vancouver BC: Talk by Andreas Zitzelsberger (@andreasz82, Principal Software Architect at QAware)
Abstract: On large-scale web sites, users leave thousands of traces every second. Businesses need to process and interpret these traces in real time to be able to react to the behavior of their users. In this talk, Andreas shows a real-world example of the power of a modern open-source stack. He will walk you through the design of a real-time clickstream analysis PaaS solution based on Apache Spark, Kafka, Parquet, and HDFS. Andreas explains our decision-making and presents our lessons learned.
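The shape of such a pipeline might look like the following Structured Streaming sketch. The broker, topic, and paths are illustrative assumptions, and since the original 2016 talk predates Structured Streaming, treat this as a modern analogue; it requires the spark-sql-kafka package on the classpath.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("clickstream-sketch").getOrCreate()

# Read raw click events from Kafka as they arrive.
clicks = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "broker1:9092")  # assumed broker
    .option("subscribe", "clickstream")                 # assumed topic
    .load()
    .selectExpr("CAST(value AS STRING) AS click_json", "timestamp")
)

# Land the clicks as Parquet on HDFS for downstream batch analysis.
query = (
    clicks.writeStream.format("parquet")
    .option("path", "hdfs:///data/clickstream")
    .option("checkpointLocation", "hdfs:///checkpoints/clickstream")
    .start()
)
```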
Data Architecture is foundational to an information-based operational environment. Without proper structure and efficiency in organization, data assets cannot be utilized to their full potential, which in turn harms bottom-line business value. When designed well and used effectively, however, a strong Data Architecture can be referenced to inform, clarify, understand, and resolve aspects of a variety of business problems commonly encountered in organizations.
The goal of this webinar is not to instruct you in being an outright Data Architect, but rather to enable you to envision a number of uses for Data Architectures that will maximize your organization’s competitive advantage. With that being said, we will:
- Discuss Data Architecture’s guiding principles and best practices
- Demonstrate how to utilize Data Architecture to address a broad variety of organizational challenges and support your overall business strategy
- Illustrate how best to understand foundational Data Architecture concepts based on “The DAMA Guide to the Data Management Body of Knowledge” (DAMA DMBOK)
Spark SQL Tutorial | Spark SQL Using Scala | Apache Spark Tutorial For Beginners (Simplilearn)
This presentation about Spark SQL will help you understand what Spark SQL is, its features and architecture, the DataFrame and data source APIs, the Catalyst optimizer, and running SQL queries, and it closes with a demo. Spark SQL is Apache Spark’s module for working with structured and semi-structured data. It originated to overcome the limitations of Apache Hive. Now, let us get started and understand Spark SQL in detail.
The following topics are explained in this Spark SQL presentation:
1. What is Spark SQL?
2. Spark SQL features
3. Spark SQL architecture
4. Spark SQL - DataFrame API
5. Spark SQL - Data source API
6. Spark SQL - Catalyst optimizer
7. Running SQL queries
8. Spark SQL demo
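A small hedged sketch of topics 4 and 7 above: the DataFrame API and SQL over the same data. The sample rows are invented for illustration.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("spark-sql-sketch").getOrCreate()

# Invented sample rows for illustration.
people = spark.createDataFrame(
    [("Ada", 36), ("Grace", 45), ("Alan", 41)], ["name", "age"]
)

# DataFrame API: query plans are optimized by the Catalyst optimizer.
people.where(people.age > 40).select("name").show()

# The same query expressed as SQL over a temporary view.
people.createOrReplaceTempView("people")
spark.sql("SELECT name FROM people WHERE age > 40").show()
```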
This Apache Spark and Scala certification training is designed to advance your expertise working with the Big Data Hadoop Ecosystem. You will master essential skills of the Apache Spark open source framework and the Scala programming language, including Spark Streaming, Spark SQL, machine learning programming, GraphX programming, and Shell Scripting Spark. This Scala Certification course will give you vital skillsets and a competitive advantage for an exciting career as a Hadoop Developer.
What is this Big Data Hadoop training course about?
The Big Data Hadoop and Spark developer course has been designed to impart in-depth knowledge of Big Data processing using Hadoop and Spark. The course is packed with real-life projects and case studies to be executed in the CloudLab.
What are the course objectives?
Simplilearn’s Apache Spark and Scala certification training is designed to:
1. Advance your expertise in the Big Data Hadoop Ecosystem
2. Help you master essential Apache Spark skills, such as Spark Streaming, Spark SQL, machine learning programming, GraphX programming, and Shell Scripting Spark
3. Help you land a Hadoop developer job requiring Apache Spark expertise by giving you a real-life industry project coupled with 30 demos
What skills will you learn?
By completing this Apache Spark and Scala course you will be able to:
1. Understand the limitations of MapReduce and the role of Spark in overcoming these limitations
2. Understand the fundamentals of the Scala programming language and its features
3. Explain and master the process of installing Spark as a standalone cluster
4. Develop expertise in using Resilient Distributed Datasets (RDD) for creating applications in Spark
5. Master Structured Query Language (SQL) using Spark SQL
6. Gain a thorough understanding of Spark streaming features
7. Master and describe the features of Spark ML programming and GraphX programming
Learn more at https://www.simplilearn.com/big-data-and-analytics/apache-spark-scala-certification-training
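As a hedged illustration of the machine-learning objective in the course above (shown in Python rather than Scala, with toy data and feature names as assumptions), a minimal Spark ML pipeline looks like this:

```python
from pyspark.sql import SparkSession
from pyspark.ml import Pipeline
from pyspark.ml.feature import VectorAssembler
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("spark-ml-sketch").getOrCreate()

# Toy training data; features and labels are invented for illustration.
train = spark.createDataFrame(
    [(0.0, 1.0, 0.0), (1.0, 0.0, 1.0), (0.5, 0.5, 1.0), (0.1, 0.9, 0.0)],
    ["f1", "f2", "label"],
)

# Assemble raw columns into a feature vector, then fit a classifier.
pipeline = Pipeline(stages=[
    VectorAssembler(inputCols=["f1", "f2"], outputCol="features"),
    LogisticRegression(maxIter=10),
])
model = pipeline.fit(train)
model.transform(train).select("label", "prediction").show()
```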
In this session, Sergio covered the Lakehouse concept and how companies implement it, from data ingestion to insight. He showed how you could use Azure Data Services to speed up your Analytics project from ingesting, modelling and delivering insights to end users.
DAS Slides: Building a Data Strategy - Practical Steps for Aligning with Business Goals (DATAVERSITY)
Developing a Data Strategy for your organization can seem like a daunting task. The opportunity in getting it right can be significant, however, as data drives many of the key initiatives in today’s marketplace: digital transformation, marketing, customer centricity, and more. This webinar will help de-mystify Data Strategy and Data Architecture and will provide concrete, practical ways to get started.
Driving Data Intelligence in the Supply Chain Through the Data Catalog at TJX (DATAVERSITY)
Roles and responsibilities are a critical component of every Data Governance program. Building a set of roles that are practical and that will not interfere with people’s “day jobs” is an important consideration that will influence how well your program is adopted. This tutorial focuses on sharing a proven model guaranteed to represent your organization.
Join Bob Seiner for this lively webinar where he will dissect a complete Operating Model of Roles and Responsibilities that encompasses all levels of the organization. Seiner will detail the roles and describe the most effective way to associate people with the roles. You will walk out of this webinar with a model to apply to your organization.
In this session Bob will share:
- The five levels of Data Governance roles
- A proven Operating Model of Roles and Responsibilities
- How to customize the model to meet your requirements
- Setting appropriate role expectations
- How to operationalize the roles and demonstrate value
Improving Data Literacy Around Data Architecture (DATAVERSITY)
Data Literacy is an increasing concern, as organizations look to become more data-driven. As the rise of the citizen data scientist and self-service data analytics becomes increasingly common, the need for business users to understand core Data Management fundamentals is more important than ever. At the same time, technical roles need a strong foundation in Data Architecture principles and best practices. Join this webinar to understand the key components of Data Literacy, and practical ways to implement a Data Literacy program in your organization.
This presentation, by big data guru Bernard Marr, outlines in simple terms what Big Data is and how it is used today. It covers the 5 V's of Big Data as well as a number of high-value use cases.
BI Congress 2014-5: from BI to big data - Jan Aertsen - Pentaho (BICC Thomas More)
7th BI Congress of BICC-Thomas More: 3 April 2014
A travel report from Business Intelligence to Big Data
The travel industry is changing rapidly. This presentation is a journey through classic and modern BI destinations, showing a series of snapshots of various use cases from the travel industry. During the session we highlight the capacity and flexibility a BI tool needs to guide you on your journey from classic BI implementations to modern big data challenges.
How to successfully implement Business Intelligence into your organisation.
A completely agnostic and independent view from a market leader in delivering technology transformation.
Details on how to build a strategy to execute successfully and, more importantly, how to get the business to adopt Business Intelligence into its day-to-day role.
An essential toolkit for any organisation looking to invest in Business Intelligence.
In this webinar Laura Madsen provides an overview of her new book, "Healthcare Business Intelligence: A Guide to Empowering Successful Data Reporting and Analytics."
Increasing regulatory pressures on healthcare organizations have created a national conversation on data, reporting and analytics in healthcare. Behind the scenes, business intelligence (BI), business analytics, and data warehousing (DW) capabilities are key drivers that empower these functions.
BSI Teradata: The Case of the Dropped Mobile Calls (Teradata)
The BSI team helps Intergalactic Telephone Corp., whose customers are experiencing dropped mobile phone calls. Using Teradata Aster, Teradata Hybrid Storage, Aprimo and Tableau, the team develops some novel approaches to analyzing call detail records to identify the top potential defectors and to improve customer satisfaction with a powerful combination of real-time apologies, credits on bills, software upgrades, and free femtocell boosters for selected customers. Watch the video at http://bit.ly/zDfmJH.
The Rensselaer Institute for Data Exploration and Applications is addressing new modes of data exploration and integration to enhance the work of campus researchers (and beyond). This talk outlines the "data exploration" technologies being explored.
A brief deck on the three most important steps of data exploration:
- Web Scraping (Import.io)
- Data Cleaning (Spreadsheets)
- Data Visualization (Tableau)
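As a hedged illustration of the data cleaning step (using pandas in place of a spreadsheet; the file and column names are assumptions):

```python
import pandas as pd

# Load the scraped data; the file name is an illustrative assumption.
df = pd.read_csv("scraped_listings.csv")

# Typical cleaning: drop exact duplicates, normalize text, coerce types, and
# discard rows whose key fields could not be parsed.
df = df.drop_duplicates()
df["city"] = df["city"].str.strip().str.title()
df["price"] = pd.to_numeric(
    df["price"].str.replace(r"[^\d.]", "", regex=True), errors="coerce"
)
df = df.dropna(subset=["price"])

df.to_csv("clean_listings.csv", index=False)  # ready to visualize in Tableau
```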
Communications Week 2015
TEI of IBM Information Management Solutions (IBM Analytics)
Originally Published on Dec 11, 2014
In October 2014, Forrester Consulting worked with IBM on a commissioned study to analyze the total economic impact that IBM’s Information Management solutions have on three specific big data use cases to help its customers solve important business problems. Through interviews and data aggregation, Forrester concluded that IBM Information Management solutions have the following financial impact on a representative organization: ROI – 148%; total benefit (PV) - $3.2 million.
Self Service Business Intelligence - Tech Talk (Brandix i3)
Microsoft MVP Gogula Ariyalingam covers how self-service BI can be used to mine insights from day-to-day data sets. A very practical, step-by-step guide to mining insights with self-service BI tools.
http://www.actian.com/
Watch Glen Rabie, CEO of Yellowfin, and Fred Gallagher, GM of Actian Vectorwise, take you through 7 of the Best Practices for Big Data and BI.
Architecting for Big Data: Trends, Tips, and Deployment Options (Caserta)
Joe Caserta, President at Caserta Concepts addressed the challenges of Business Intelligence in the Big Data world at the Third Annual Great Lakes BI Summit in Detroit, MI on Thursday, March 26. His talk "Architecting for Big Data: Trends, Tips and Deployment Options," focused on how to supplement your data warehousing and business intelligence environments with big data technologies.
For more information on this presentation or the services offered by Caserta Concepts, visit our website: http://casertaconcepts.com/.
Big Data brings big promise and also big challenges, the primary and most important one being the ability to deliver value to business stakeholders who are not data scientists!
Big Data, NoSQL, NewSQL & The Future of Data Management (Tony Bain)
It is an exciting and interesting time to be involved in data. More influential change has occurred in database management in the last 18 months than in the previous 18 years. New technologies such as NoSQL and Hadoop, and radical redesigns of existing technologies like NewSQL, will dramatically change how we manage data moving forward.
These technologies bring with them possibilities, both in the scale of data retained and in how that data can be utilized as an information asset. The ability to leverage Big Data to drive deep insights will become a key competitive advantage for many organisations in the future.
Join Tony Bain as he takes us through both the high level drivers for the changes in technology, how these are relevant to the enterprise and an overview of the possibilities a Big Data strategy can start to unlock.
In this presentation at DAMA New York, Joe started by asking a key question: why are we doing this? Why analyze and share all these massive amounts of data? Basically, it comes down to the belief that in any organization, in any situation, if we can get the data and make it correct and timely, insights from it will become instantly actionable for companies to function more nimbly and successfully. Enabling the use of data can be a world-changing, world-improving activity, and this session presents the steps necessary to get you there. Joe explained the concept of the "data lake" and also emphasized the role of a strong data governance strategy that incorporates seven components needed for a successful program.
For more information on this presentation or Caserta Concepts, visit our website at http://casertaconcepts.com/.
Incorporating the Data Lake into Your Analytic Architecture (Caserta)
Joe Caserta, President at Caserta Concepts presented at the 3rd Annual Enterprise DATAVERSITY conference. The emphasis of this year's agenda is on the key strategies and architecture necessary to create a successful, modern data analytics organization.
Joe Caserta presented Incorporating the Data Lake into Your Analytics Architecture.
For more information on the services offered by Caserta Concepts, visit our website at http://casertaconcepts.com/.
Miss the Webinar launch of Yellowfin 7.3+ – the latest version of our Business Intelligence (BI) and analytics platform? Don’t sweat it.
We thoughtfully recorded the session for your on-demand viewing pleasure. Discover why our governance means data you can trust with Yellowfin 7.3+ here: https://www.yellowfinbi.com/blog/2017/04/yellowfin-7-3-launch-webinar-recording-and-slides
If you’d like to get your hands on a copy, or want a personalized demonstration of Yellowfin 7.3+, simply contact sales@yellowfin.bi
Why watch?
Watch the launch video to see how Yellowfin 7.3+ will cement Yellowfin’s position as the best modern analytics platform for delivering governed data and BI throughout the enterprise.
Be among the first to see how we’re adding enhanced data governance features to drive trustworthy enterprise analytics, more data source connections to support our customer’s growing needs, and a range of new visualization options including JavaScript charts and D3 integration.
The highlights
Yellowfin 7.3+ contains major additions and improvements, including:
Better data governance:
Yellowfin’s content approval workflow has been extended to include Yellowfin Views (metadata), meaning whole data preparation layers can run through an authorized sign-off process before being published – just like any reports, dashboards or Storyboards.
A Change Management Module:
A new Change Management Module provides complete visibility over all content imported and exported within any Yellowfin instance in one place. Enjoy full control when promoting trusted data sources and BI content into your production environment.
More data source connections:
Easily connect to even more data sources than ever before. Imagine the possibilities with connectors for Snowflake, Apache HBase, JSON and direct connection to SAP BW’s BEx layer. Satisfy your Big Data and data streaming needs with Yellowfin 7.3+.
Unlimited charting options:
Boost user satisfaction and adoption with a range of new visualization options, with the ability to use your favorite JavaScript charting libraries within Yellowfin, including D3 integration.
Miss the Webinar launch of Yellowfin 7.3 – the latest version of our Business Intelligence (BI) and analytics platform? Don’t sweat it.
We thoughtfully recorded the session for your on-demand viewing pleasure.
Discover how to set your data free with Yellowfin 7.3: http://www.yellowfinbi.com/YFCommunityNews-Yellowfin-7-3-Launch-Webinar-Recording-and-Slides-245644
Yellowfin 7.3 will be generally available Friday 25 November. If you’d like to get your hands on a copy, or want a personalized demonstration of Yellowfin 7.3, simply contact sales@yellowfin.bi
Why watch?
Watch the launch of Yellowfin 7.3 and discover how to conduct deeper, faster data analysis and create more compelling BI content in less time. Be among the first to see how we’re adding Data Preparation, Set Analysis and new charting features to substantiate Yellowfin’s position as the broadest end-to-end modern BI and analytics platform on the market.
Yellowfin 7.3 contains major additions and improvements, including:
• Data preparation: Quickly and independently prepare trustworthy data for reporting and analysis, without waiting for IT. Be empowered to integrate, access and act on more data sources in less time with Yellowfin’s new Data Preparation Module.
• Set Analysis: Enjoy the flexibility to answer complex business questions faster, and with greater ease. Build sophisticated charts, to create comparative analysis, in a few simple steps.
• New Content Creation Canvas: Create a culture of data-based decision-making and engage users throughout your organization with BI content built specifically for their particular needs. Experience the freedom and flexibility to quickly create customized analytic content and infographics to suit any user type or situation – no coding required.
• Connectors: See how Yellowfin’s latest range of connectors and pre-built dashboards for third-party Web applications are connecting people and their data.
BI Dashboard Best Practices Webinar 2016 (Slides) (Yellowfin)
Miss our BI Dashboard Best Practices Webinar series?
Well, don’t sweat it! We thoughtfully recorded the Webinar for your on-demand viewing pleasure: https://youtu.be/MTZpOrhuwms
Why watch?
Dashboards were rated as the most important Business Intelligence (BI) technology for implementing business-driven BI and analytics in TDWI’s best practices report, ‘Business-Driven Business Intelligence and Analytics’.
But, data visualization expert, Stephen Few, has declared that the majority of today’s BI dashboards fail.
Watch this Webinar recording to learn how to deliver best practice dashboards, achieve high ROI and create BI success; not failure.
Watch this Webinar recording to learn how to:
• Design dashboards that provide instant insight (and nothing more!)
• Create dashboards that are relevant to job function, appropriate for mode of consumption, relate to business strategy and prompt accurate, timely action
• Deliver the right dashboards, to the right people, at the right time, to drive faster and better fact-based decision-making
Data Visualization Best Practice Webinar presentation slides (Yellowfin)
Business Intelligence (BI) investment is booming. And, the amount of data available for reporting and analysis is skyrocketing.
So, it’s never been harder, or more important, to quickly uncover and communicate the actionable insights within your data. But, how do you separate the gold from the guff, and deliver value from your BI deployment?
View Yellowfin's Data Visualization Best Practices presentation and learn how to choose, design and deliver the best visualizations to effectively communicate the significance of your metrics and trends to your BI users.
And, watch the on-demand version of Yellowfin's Data Visualization Best Practices Webinar here: http://www.yellowfinbi.com/YFCommunityNews-Data-Visualization-Best-Practices-Webinar-Recording-and-slides-230726
Discover how to:
• Absorb more information quickly
• Uncover new relationships, patterns and business opportunities
• Identify and act on emerging trends fast
• Empower more people to make smarter decisions
• Unlock the value of your data and BI deployment
Making healthcare analytics fast, easy and flexible (Yellowfin)
Why watch?
Does your company operate in the healthcare system? But, are you struggling to measure and improve the quality and value of the services you deliver?
Watch Yellowfin and Proskriptive’s joint webinar – Making healthcare analytics fast, easy and flexible – to find out how: http://www.yellowfinbi.com/YFCommunityNews-Healthcare-Analytics-Webinar-Recording-and-Slides-198119
Discover how predictive and prescriptive analytics can easily and affordably reveal opportunities for quality improvement, operational efficiency, cost savings and enhanced patient care.
Experience a new cloud-based healthcare analytics platform designed for you: Payers, providers and those that provide products and services to the healthcare industry.
“Proskriptive healthcare domain expertise and predictive modelling – delivered and visualized via Yellowfin’s business-user-friendly reports, charts, dashboards and interface – will provide healthcare workers with interactive summaries of crucial cost, operations and quality metrics that deliver a complete patient view.”
– Proskriptive CTO, Justin Richie
View this Webinar to learn how to:
• See how easy it is to track, analyze and act on data to ensure quality improvement, operational efficiency, cost savings and enhanced patient care
• Discover how to identify emerging problems and develop preventative care plans to boost resource efficiencies and patient care
• Learn how to quickly combine and assess data from a range of source systems to improve overall patient care and resource management
• Benefit from hearing about real-world customer success stories
• Understand how this dedicated healthcare analytics platform overcomes the complexity, high costs and lengthy implementation times of the legacy BI products that have plagued the industry
“Together, Proskriptive and Yellowfin make it easier for healthcare professionals to manage patient care via an accessible, highly flexible and scalable solution that works in conjunction with a clients’ existing data, analysis, and management systems.”
– Yellowfin General Manager for North America, Justin Wright
Data-driven Storytelling Best Practices Webinar (presentation slides) (Yellowfin)
View these slides from Yellowfin’s Webinar series, Data-driven Storytelling Best Practices, and learn how to leverage the power of storytelling to unlock the insights in your data and instigate change.
To watch this Webinar, search for "Yellowfin Data-driven Storytelling Best Practices Webinar" on YouTube.
Watch this Webinar to:
• Learn how turning insights into stories can help drive decision-making in your organization
• Discover the different storytelling mediums and when to use them
• Learn how to make your storytelling more interactive and compelling
Yellowfin: A data-driven Storytelling pioneer
Yellowfin was the first Business Intelligence (BI) vendor to market (November 2012) with a fully integrated and interactive PowerPoint-like presentation module for BI content, Storyboard.
Storyboard’s presentation-oriented User Interface enables the fact-based insight gleaned from BI to be naturally incorporated into organizational decision-making processes, while empowering a wider array of people from non data-centric backgrounds to share and benefit from BI.
To find out more about Storyboard, GO HERE:
http://www.yellowfinbi.com/YFCommunityNews-Yellowfin-launches-Storyboard-redefines-information-delivery-paradigm-for-BI-122371
Like to find out more?
Please don’t hesitate to contact us for additional information.
To get started with Yellowfin today, simply visit www.yellowfinbi.com and click “Try It Free”.
Discover how we’re striving to deliver you the most accessible and engaging BI and data-driven storytelling capabilities possible.
Stay in touch via the Yellowfin LinkedIn Group, and for regular news and updates, follow Yellowfin (@YellowfinBI) on Twitter.
Why is storytelling important?
Data analysis is more than just data and charts. It’s about finding and telling a story.
Data-driven storytelling inspires action by connecting logic with emotion. As data volumes grow, and people are overloaded with information, the power of storytelling with data is becoming more important. Effective storytelling can help you understand the significance of your data and make better decisions.
Embedded BI Best Practices: Webinar slides (Yellowfin)
If you haven’t already embedded Business Intelligence (BI) functionality into your application, chances are, your competitors have an advantage.
BI technology is becoming a common component of business applications used across all major job functions, departments and industries.
Gartner predicts that within two years, 25 percent of analytics capabilities will be embedded in business applications, up from only five percent in 2010. If you haven’t already added embedded BI to your software, the time is now.
Watch this Embedded BI Best Practices Webinar recording (https://www.youtube.com/watch?v=8BxPdco9ab8&feature=youtu.be), and view this slide deck, to discover how to successfully add an analytics module to your product suite, and dramatically enhance its salability and market value – minus the development stress you might expect.
Delight your customers and build your brand – all while avoiding the strain of independently developing, maintaining and supporting an in-house BI module.
Big Data Analytic with Hadoop: Customer StoriesYellowfin
Why watch?
Looking to analyze your growing data assets to unlock real business benefits today? But are you sick of all the Big Data hype and hoopla?
Watch this on-demand Webinar from Actian and Yellowfin – Big Data Analytics with Hadoop – to discover how we’re making Big Data Analytics fast and easy:
Learn how a telecommunications provider has already transformed its business using Big Data Analytics with Hadoop.
Hold on as we go from data in Hadoop to predictive analytics in just 40 minutes.
Learn how to combine Hadoop with the most advanced Big Data technologies, and the world’s easiest BI solution, to quickly generate real business value from Big Data Analytics.
What will you learn?
Discover how Actian’s market-leading Big Data Analytics technologies, combined with Yellowfin’s consumer-oriented platform for reporting and analytics, makes generating value from Big Data Analytics faster and easier than you thought possible.
Join us as we demonstrate how to:
• Connect to, prepare and optimize Big Data in Hadoop for reporting and analytics (a minimal sketch of this step follows this list).
• Perform predictive analytics on streaming Big Data: Learn how to empower all your analytics stakeholders to move from historical reports to predictive analytics and gain a sustainable competitive advantage.
• Communicate insights attained from Big Data: Optimize the value of your Big Data insights by learning how to effectively communicate analytical information to defined user groups and types.
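By way of a hedged illustration only (this sketch is not from the Webinar), “connecting to and preparing Big Data in Hadoop for reporting” often boils down to querying it through Hive and pre-summarizing before it reaches the BI layer. The hostname, credentials, and the cdr database and calls table below are hypothetical placeholders.

```python
# Minimal sketch: query CDR data stored in Hadoop via HiveServer2 and
# hand a small, pre-aggregated result set to pandas for reporting.
# Connection details and the calls table/columns are hypothetical.
import pandas as pd
from pyhive import hive  # pip install 'pyhive[hive]'

conn = hive.Connection(
    host="hadoop-gateway.example.com",  # placeholder host
    port=10000,
    username="analyst",
    database="cdr",
)

# Aggregating in Hive keeps the heavy scan on the cluster, so the BI
# tool only receives one row per day instead of raw call records.
df = pd.read_sql(
    """
    SELECT to_date(call_start)  AS call_day,
           COUNT(*)             AS calls,
           AVG(duration_sec)    AS avg_duration_sec
    FROM calls
    GROUP BY to_date(call_start)
    ORDER BY call_day
    """,
    conn,
)
print(df.head())
```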
This Webinar is ideal if…
• You want to act on more data and data types in shorter timeframes
• You want to understand the steps involved in achieving Big Data success – both front and back end
• You want to see how market leaders are leveraging Big Data to become data-driven organizations today
Looking to analyze and exploit Big Data assets stored in Hadoop? Then this Webinar is a must.
Data Sourcing Best Practices for Reporting (Webinar slides)Yellowfin
Why watch?
Are you trapped in reporting hell?
Do you spend hours struggling to manually produce the reports management demands? Are you working with disparate islands of outdated data? And, after all that hard work, are the reports produced inaccurate and untrustworthy?
Watch this on-demand Webinar from SolveXia and Yellowfin – Data Sourcing Best Practices for Reporting – to discover how to build reliable supply chains of data in just 30 minutes. Learn how to quickly and easily go from source data to killer report – every time.
Only dependable and repeatable processes can produce quality data and reports. Ensure your reporting generates the business insights you need. Let SolveXia and Yellowfin show you how.
What will you learn?
Think the ability to deliver world-class, up-to-date and accurate reports that anyone can access, analyze and act on is important? Then this Webinar is a must.
Watch the on-demand version to learn how to:
•Create business critical reports on which you and your organization can rely
•Deliver sleek, sexy and intuitive charts, reports and dashboards to anyone, anywhere, anytime on any device
•Become the information Superhero you were meant to be!
The data that underpins any reporting system must be managed properly to make sure it’s clean, relevant and delivered in a timely manner to maximize the ability of enterprise BI solutions to produce actionable insights. Do you know how?
Making Big Data Analytics with Hadoop fast & easy (webinar slides)Yellowfin
Looking to analyze your Big Data assets to unlock real business benefits today? But are you sick of all the theories, hype and hoopla?
View these slides from Actian and Yellowfin’s "Big Data Analytics with Hadoop" Webinar to discover how we’re making Big Data Analytics fast and easy.
Hold on as we go from data in Hadoop to dashboard in just 40 minutes.
Learn how to combine Hadoop with the most advanced Big Data technologies, and the world’s easiest BI solution, to quickly generate real business value from Big Data Analytics.
Watch as we use live CDR data stored in Hadoop – quickly connecting, preparing, optimizing and analyzing this data in a tangible real-world use case from the telecommunications industry – to easily deliver actionable insights to anyone, anywhere, anytime.
To learn more about Yellowfin, and to try its intuitive Business Intelligence platform today, go here: http://www.yellowfinbi.com
To learn more about Actian, and its next generation suite of Big Data technologies, go here: http://www.actian.com/
Real-world state of the BI market: Webinar presentation slidesYellowfin
Thinking about implementing a Business Intelligence (BI) solution, but aren’t sure how to navigate your way through an increasingly noisy business analytics software industry? Relax. We’ve got you covered.
Check out the recording of our recent Webinar series – The real-world state of the BI market. Discover the inside knowledge you need to ensure BI and analytics success.
And the best bit? It’ll only take 30 minutes of your time.
What will you learn?
Listen in as we dissect the results of 2013’s Wisdom of Crowds Business Intelligence Market Study – the BI industry’s most in-depth research report into major global implementation, usage and technology developments.
The real-world state of the BI market Webinar, and associated slides, outline:
•How to take advantage of the latest trends shaping the BI marketplace
•Why new generation consumer-oriented BI is set to dominate at the expense of ‘traditional’ BI
•Why Yellowfin was rated No.1 in DAS’ competitive ranking of the world’s foremost BI vendors
If you’re thinking of deploying BI, you need to see beyond the hype. Take advantage of our expert analysis and explanation of the industry’s most prevalent, tangible and relevant trends.
Global Business Intelligence (BI) and Analytics vendor, Yellowfin, officially released the latest version of its BI software, Yellowfin 6.3, during a series of launch Webinars held across Tuesday 28 and Wednesday 29 May 2013.
Analysts and customers have described Yellowfin’s new ‘Timeline’ module, the centerpiece of Yellowfin 6.3, as “Facebook for BI”.
The major additions and improvements within Yellowfin 6.3 focus on three key product areas: collaboration, usability and data visualization.
View these - the Yellowfin 6.3 Webinar launch presentation slides - to find out more.
Watch the recording of the Webinar launch of Yellowfin 6.3 to discover a better way to collaborate with your data: http://www.yellowfinbi.com/YFCommunityNews-Business-Intelligence-Analytics-vendor-Yellowfin-launches-Facebook-for-BI-131677
SaaS data access & integration best practices for Business IntelligenceYellowfin
Chances are, you’re using more SaaS applications and services than ever before. In fact, Gartner has identified Cloud as one of 2013’s four biggest technology trends.
According to a recent survey by Dimensional Research, 92% of IT executives agree that cloud solutions and services offer clear business benefits – saving time and money for capturing vital enterprise data, while enhancing flexibility.
But, what happens when you want to collate and analyze that data for competitive advantage?
The same research report found that 88% of IT executives are experiencing significant challenges with managing different SaaS business applications or services. And, 67% report issues integrating data between SaaS applications.
The point? An overwhelming majority of enterprise IT executives are concerned that their SaaS data is trapped in silos, unable to be accessed and integrated to provide business users with the unified and timely analytics they need to make crucial business decisions.
Sound familiar?
This presentation - SaaS data access & integration best practices for Business Intelligence – demonstrates how to easily access your SaaS data in the Cloud for advanced reporting, from any source, in real-time.
Provide your business users with the right insights, at the right time, via a self-service BI environment – minus the complex, lengthy and expensive data migration and integration processes.
Wisdom of crowds business intelligence market study findings overviewYellowfin
The latest edition – based on 859 responses from professionals with first-hand experience using vendor products and services – analyzes marketplace trends throughout 2011 and assesses user perceptions of BI for the coming year. The study also compares and ranks 17 of the world’s foremost BI vendors, their solutions and associated services. Yellowfin achieved the equal highest overall ranking (4.57 out of five), as well as the best outright score in the study’s “Emerging Business Intelligence Vendors” sub-group.
Yellowfin outscored traditional big name players, including Microsoft, IBM, SAP Business Objects, MicroStrategy, SAS Institute and Oracle. Yellowfin also outperformed other high profile vendors, including Information Builders, Actuate, Qliktech, Tibco Spotfire, Dimensional Insight, Arcplan, Pentaho and Jaspersoft.
Vendors are ranked on a five-point scale, across 33 different criteria, based on seven categories, including: Sales experience, value, quality and usefulness of product, quality of technical support, quality and value of consulting services, integrity and whether existing clients would recommend the vendor and its product to others.
Yellowfin 6.1 - the latest release of Yellowfin's Business Intelligence (BI) software - is focused on making BI even easier. Yellowfin 6.1 gives non-technical business users the ability to independently receive, share, explore and act on different types and volumes of data from an expansive range of data sources.
2. Your presenters
Yellowfin CEO, Glen Rabie
VP Sales & Services in APAC, Actian Corporation, Jason Leonidas
3. Your presenters
Yellowfin CEO, Glen Rabie
General Manager, Actian Vectorwise, Fred Gallagher
4. About Actian and Yellowfin
Making Business Intelligence easy | Taking Action on Big Data
[Chart: History of 100GB TPC-H performance benchmarks – composite queries per hour (QphH@100GB, non-clustered) for non-Vectorwise databases vs. Vectorwise]
10. Big Data for Everyone
• Big Data is not just for data scientists and bespoke projects
• It’s for decision makers and data consumers
• It needs to be anchored in the real world
[Diagram: user spectrum from Analysts to Consumers]
13. Why bother with Big Data?
60% of organizations collect more data than they can effectively use (MIT Sloan Management Review)
14. Why bother with Big Data?
70% of organizations see Big Data as a big business opportunity (Harris Interactive)
70% of organizations investing in Big Data initiatives expect ROI within 1 year (Harris Interactive)
15. Why bother with Big Data?
84% of organizations that actively leverage Big Data say they can now make better decisions (Avanade)
27. Big Data Eco-system
[Diagram: the Big Data ecosystem – Hadoop, analytic databases, operational databases, storage, search, social media, “as-a-service” offerings, NewSQL, and NoSQL (document, key-value, BigTable, graph)]
29. Slow Query Performance is the #1 issue in BI
BI Survey 10, “Why BI Projects Fail?”: #1 – query performance too slow
TDWI Best Practices Report: 45% cite poor query response as the top problem that will eventually drive users to replace their current data warehouse platform
Gartner Magic Quadrant for Data Warehousing: 70% of data warehouses experience performance-constrained issues of various types
30. User Expectations
Web-based Business Intelligence: users expect results in less than 10 seconds
Mobile BI: users expect results in less than 3 seconds
31. Use a fast database
[Diagram: Traditional Database vs. Analytical Database vs. Clustered Database]
32. Consider the hidden costs
Spend less on hardware: get faster results on smaller hardware configurations.
Spend less time on database tuning: faster deployment and BI projects; no more aggregates, cubes, complex schemas, etc.
38. Give your audience what they want
[Diagram: Demographics, Interactive Reports, Statistics, KPIs, Maps, Collaboration]
39. Visualization is powerful
[Chart: pie chart of survey responses – “Looks like Pac-man” (169) vs. “Does not look like Pac-man” (41)]
40. Big Data Visualization Tips
• More data requires more focus
• Interactivity is essential
• Select the right metrics
• Provide context
• Support and prompt action
42. Big Data and BI Best Practices
1. Focus on what you want to achieve
2. Identify the data you have vs. the data you need
3. Use the right Big Data tool for the job
4. Use a fast database
5. Plan for a mixed architecture
6. Ensure mass distribution of your data
7. Tailor data delivery to each audience
44. More Information
Yellowfin: www.yellowfinbi.com | @YellowfinBI
Vectorwise: www.actian.com/products/vectorwise | @ActianCorp
Feedback & Questions: pr@yellowfin.bi
Yellowfin LinkedIn User Group | Vectorwise LinkedIn User Group
Editor's Notes
Point of slide – introduce the presenters. Glen introduces himself, then introduces Jason.
Point of slide – introduce the presenters. Glen introduces himself, then introduces Fred.
Point of slide – Introduce our companies and why we can talk about this topic (some attendees will not have heard of us). A little bit about our two companies. Yellowfin – mention awards: #1 BI vendor in the global Wisdom of Crowds survey; #1 in Mobile BI by Dresner Advisory Services; #1 in Location Intelligence by Ventana Research. Actian – mention record-breaking benchmarks: broke performance and price/performance TPC-H benchmarks by the largest margins ever recorded for every benchmark they have entered. And today we are going to talk about best practices for Big Data and BI.
Why Big Data? Data is the new oil. A big opportunity.
Point of slide – Establish how quickly data is growing. If you don’t have Big Data now, you might soon. Data only grows, and Big Data is growing exponentially. Why? Growth of existing data sources, with the increasing sophistication of computer tracking of shipments, sales, suppliers and customers, as well as e-mail and web traffic; and growth of new data sources and types, such as geospatial, social media comments, mobile, etc.
Point of slide – Communicate that Big Data didn’t suddenly appear, but that the technology now exists to leverage it. We’ve always had big data, but now we have the tools, and the cost has come down enough, to harvest and make value from it. Why is Big Data a Big Deal Now…
Point of slide – Don’t confuse your Big Data problem with Google’s; they are not the same. Not all Big Data is created equal, and Big Data is relative. Google, Facebook and Twitter are outliers in a class of their own, and their requirements are significantly different from those of large enterprises, let alone the typical enterprise business or SME. You don’t need petabytes of data to have a Big Data problem.
Point of slide – Define Big Data and what to look for to see if you have a Big Data problem. Gartner’s 3 V’s is probably the most accepted definition of Big Data because it addresses the pain points: Volume – people think terabytes or petabytes; Variety – structured and unstructured data such as…; Velocity – fast query times as well as streaming data, and for BI this is by far the most important, which we will focus on a lot today. These points matter because you can suffer from Big Data problems without having much data at all – it’s all relative to your hardware and the tools you are using. So if you have any of these pain points, your data is too big – hence, Big Data.
Point of slide – Framing slide; we are talking about Big Data for consumers, not analysts. This webinar is about Big Data and BI, and therefore the focus is on assisting decision makers. Too much of the Big Data discussion focuses on data scientists with bespoke projects (hypotheses, Hadoop, partitioning, etc.). Today we want to focus on data consumers using this in the real world. There are more data consumers than analysts – how can we empower the masses to get value from Big Data? Big Data for everyone.
Point of slide – Where Big Data can add value; it’s mostly marketing. What is the opportunity? Over 45% of Big Data deployments are spent on marketing, with spending on digital marketing set to grow from $34B to $76B by 2016. This slide is about use cases.
Point of slide – Show what industries are using Big Data, and how easy it is for these industries to do it.
Point of slide – 60% are already collecting more data than they can effectively use.
Point of slide – Big Data is an opportunity, not a burden, and 70% of businesses see it that way. 70% also expect an ROI within 1 year of investing in Big Data initiatives – hmmm, is that a bit optimistic if it takes years to build a data warehouse?
Point of slide – And most importantly, 84% of organizations using Big Data today say they can now make better decisions – which is what it is all about.
So what is Big Data? Why the hype? This tongue-in-cheek sketch highlights the point that there is hype around Big Data. Roman Stanek, founder and CEO of Good Data: “Today, the difference between success and failure is the ability to monetize a new class of data. It’s ironic that, despite billions of dollars spent on business intelligence systems, we are still data-bankrupt.” What that tells us is that current skills and technologies are unable to deliver on the business opportunities that Big Data presents. With such high stakes, it’s no wonder there is hype.
The success of any Big Data project hinges on delivering greater business value. Many focus on the monetization of Big Data, which means driving greater revenue or creating new revenue opportunities. But, depending on the industry sector, it can also deliver operational efficiencies, increased service levels and customer satisfaction. The potential trap for new entrants into the Big Data arena is the temptation to build a Big Data infrastructure for every possibility or contingency. As we heard from Glen earlier, the ROI window is 12 months. We must maintain a strong focus on delivering against specific business objectives and not let the technology drive the direction of the project.
To realize the potential of Big Data, what are the levers available to you? Which will you use? Personalization – offering a better, more targeted service. Social – allowing users to communicate and share with other community members. Search – making it easier for customers to find what they are looking for (saving time and improving customer satisfaction). Finding opportunities – how do you exploit the data and drive opportunities in the business? By understanding what customers, competitors and the market are doing, we can find new opportunities to exploit. Actionable insights – as Glen stated earlier, making it easy for your customers, suppliers and staff to make better decisions – traditional BI. This is the lever we are focusing on today. Badoo collects 10 million records each day. They had no way of quickly identifying which campaigns were converting customers; they now run standard queries in 10 to 30 seconds, which helps them determine which marketing campaigns are converting customers.
Does the data you have match what you want to achieve? There is typically a huge gap between what we have and what we need. What additional data is required outside the normal corporate data? External feeds can make a critical difference to monetizing Big Data. Then there are governance issues to consider: who owns the data vs. who needs the data? Are there security and privacy issues at play? Then there are the physical considerations – are there documents, email, images or video, which might take a lot of space but aren’t traditionally used for analysis? What proportion is structured vs. unstructured? How much of your data is just indexing to improve performance? Again, the focus must be on collecting the data you need to answer the specific business questions you have.
Every industry has very specific use cases that drive Big Data success, in areas such as: transportation and logistics, detecting fraud before it happens (Timocom); driving sales by combining environmental data such as weather with point-of-sale data (Sheets); and web traffic monitoring to determine customer behavior (GSI Commerce). When you know your goals and fully understand your data requirements, then you know what data you need to collect. Only then can you make a decision on what infrastructure you need.
Again, some more humor to get the message across. Hadoop is one of the most well-known Big Data solutions, and many of you in the audience today will be using it or considering it for future projects. Hadoop makes a fantastic data store for web traffic and machine data because of its unmatched scalability, speed and fault tolerance. However, it isn’t always the best for business intelligence, where the majority of use cases are SQL or relational-database-type applications. When deploying BI for the masses, you don’t want to ask users to learn a new skill set or have deep technical know-how. One important lesson we have learned is that creating reports from Hadoop was quite time consuming, and query performance was actually quite slow. Today most BI tools do connect to Hadoop (through Hive). So the key takeaway from this best practice is that the Big Data ecosystem is much bigger than just Hadoop.
So what does this ecosystem look like? It’s a huge ecosystem, with many varied solutions that don’t necessarily address all of the 3 V’s – Volume, Variety and Velocity. And obviously it’s impossible to accommodate everything in a single product. With today’s webinar focused on Big Data for BI and analytics, we will focus on the analytical database space: it’s built specifically for business intelligence and tackles the velocity (speed) issue better than any of the others. Hadoop makes a fantastic Big Data store, and there are many other Big Data solutions outside of Hadoop in the NoSQL and NewSQL areas which solve different pain points, but again are not best practice for BI. Actian has many customers who started with Hadoop and have incorporated Vectorwise because of its speed – designed, if you like, for the 3 V’s.
Performance is the number 1 issue in BI today, and with rapidly growing data it’s only worsening. There is a lot of evidence to support this. The BI Survey says that every year slow query performance is the number 1 reason why BI projects fail. The TDWI Best Practices Report found that almost half (45%) said poor query response was the top problem that will drive them to replace their current data warehouse. And Gartner says 70% of data warehouses experience performance-constrained issues.
Just why is it so important to have fast performance? We are the Google generation, blessed with instant answers, and we have become impatient. Today, user expectations are very demanding. Studies show that BI project value and adoption drop off dramatically when queries take longer than 10 seconds to run, and on mobile devices that drops to just 3 seconds. The bigger your data, the slower your reports will run – a huge concern.
The solution is to use the right tool for the job – it can make a dramatic difference. 1. Traditional databases can perform a wide variety of workloads, but they were never designed for the challenges and complexities of Big Data, particularly the job of slicing and dicing data. With the amount of additional hardware and tuning required to get better performance, you’d be much better served getting a fast, purpose-built database. 2. Analytical databases ARE purpose-built for slicing and dicing data. They are quick, agile and easy to get started with. 3. Clustered databases are an option as data volumes grow, but they aren’t as agile and require substantially more resources and expertise to manage and implement.
What other benefits do you gain from using a fast database? 1. Slash the cost of hardware – in many recent tests and proofs of concept, Vectorwise consistently outperforms other databases on very small servers compared with much larger racks of servers. The hardware you see on this slide is from the 1 TB TPC-H benchmark – Oracle used the large server and Vectorwise used the small 2U Dell server. 2. Dramatically less maintenance – take out the cost and burden of having teams of DBAs to tune the database. 3. Time – deliver usable BI in much less time, without the need for deep technical know-how.
Planning for a mixed architecture will allow you to bring in a variety of data sources but still deliver on user expectations of fast BI. This is an example of how Hadoop might be a part of it. You will certainly need to include your operational data and data from external feeds – all critical components in the Big Data recipe. However, without the underlying database performance to support the BI tool, even the most brilliant tool will struggle to deliver satisfactory end-user adoption. IsCool Entertainment use Hadoop and Vectorwise. They are a European leader in social gaming on Facebook (number 1 in France) with 1.2 million active monthly users. Their gaming platform is built on Hadoop, and they use Vectorwise to analyze user experience. From the press release: “We’re using Vectorwise to investigate consumer behavior to better understand what makes our users play, interact and recommend. Fast and actionable business analytics from Vectorwise will allow us to deliver tailored offers to customers and advertising partners, and thus improve monetization of the games we develop.” – Florian Douetteau, CTO of IsCool Entertainment. Badoo is a global dating site with 150 million members; they use Hadoop for the web application and Vectorwise for analysis. Before Vectorwise they had hard-coded a custom-built analytics solution that was limited in functionality and unable to provide the level of detail their marketing and finance teams needed. “Vectorwise gives us unfettered access to our data and the ability to run ad hoc analyses without the need to have thought of the question before we asked it. This means we can now ask anything of our data and our users’ activity and get answers in just seconds.” – Ian Broadhead, BI lead at Badoo. NK is a social media site that has more users in Poland than Facebook does there, with 14 million active monthly users. They use Hadoop for click-stream data, such as POST and GET requests, and AdServer logs; ad hoc queries sometimes took days to design, build and execute. They use Vectorwise for 50-90 of their largest daily queries, such as banner optimization (advertising based on user/friend preference) and gaming usage (moving buttons, colors, etc. around to see how changes affect users). “We looked at solutions from other vendors with analytic databases, but selected Vectorwise for its superior performance and cost-effective model.”
Now that you have the data you need, you need to get it into the hands of the people who really need it. Big Data is a big investment, and there is no point giving only a few people in your organization access to the data. The more you share data, the more value you get from it (it doesn’t lose its value). It needs to be fast, agile, drag-and-drop – requiring very little training. And then finally you can get to the next slide (the magic art of making sure your data tells a story).
If we are going to ensure mass distribution, then we need delivery tailored to each audience’s needs: a farmer sees a map of the farm (agriculture), marketing sees market segmentation, transportation, etc. So when thinking about visualization, we need it to make sense for them – multiple use cases. Data visualization is critical for people to consume data; nobody sends people on a course to understand a graph. Build for the skill level of the audience. Pac-Man – visualization best practice points (last webinar). Powerful visualization is the best way to express data – the more data you have, the more focus you need. Yellowfin to list best practices for visualizing huge amounts of data. Storytelling.
The following slides make these points: if we are going to ensure mass distribution, we need delivery tailored to each audience’s needs. Marketers will want to see segmentation of demographics. Managers will want to interactively drill down into their reports. Data scientists will want to do statistical analysis. Executives will want to keep a close eye on KPIs. Demographers and people in agriculture will want to see things on maps. So when thinking about visualizations, always keep the audience in mind. Data visualization is critical for people to consume data; nobody sends people on a course to understand a graph. So build for the skill level of the audience.
But when you visualize it, you can get your point across much better. Should re-do this in Yellowfin.
More data requires more focus: link to clearly defined business objectives and only include actionable information.
Interactivity is essential: start big, drill to detail. More data doesn’t mean more reports and visualizations; it means deeper insight.
Select the right metrics: it’s not enough just to decide which aspects of your business Big Data analytics allows you to monitor. You need to decide how you’re going to track and measure those chosen aspects, and communicate them to end-users via an agreed form of measurement.
Provide context: without additional contextual information to help users understand data visualizations, it’s impossible for a user to understand the true meaning of the results presented, what action they require, or whether they demand any action at all.
Effectively highlight the most important information: draw the user’s attention to the most pertinent pieces of information first; the most important data should occupy the most screen real estate.
Select the best, not the best-looking, visualization: the data, not the visualizations, should always be made the center of attention. Never use flashy visuals and chart types when simple alternatives are capable of conveying the same message – does the third dimension on that pie chart really add to its meaning? Avoid all design aspects that are unconnected to the task of analytic communication. “Perfection is achieved, not when there is nothing left to add, but when there is nothing left to remove” – Antoine de Saint-Exupery.
Use color appropriately and sparingly to achieve maximum impact and contrast: if all the colors chosen to represent different metrics or values within a chart are eye-catching, no single point will stand out above the others. Select colors based on a clear understanding of their inherent or commonly accepted symbolic or metaphoric meaning (red = bad, etc.). Be consistent: for example, if data relating to second-quarter sales is displayed in purple in one chart, all other charts that display second-quarter sales data should also use purple. Avoid visual clutter and visually gratuitous chart types.
Select the right visualization for the data and the context: choosing the most context-appropriate visualization for a particular metric or measure requires the judicious application of a little common sense. For example, if you’re attempting to monitor or track the change in something over time, a line graph will almost always work best. Likewise, if tracking several metrics of similar proportions – a potential example might be new leads generated for the current year by marketing category (Google Ads, LinkedIn, print media, banner advertising, etc.) – a column chart or bar graph would be an effective way to visualize the minor differences in performance between each marketing channel. Conversely, a pie chart would deliver a poor user experience as, at first glance, all the portions would seem equal. (An illustrative sketch of this point follows this note.)
Layered maps are critical: they display large volumes of data efficiently and help explain the relationships between different types of data.
Consider the unique informational requirements of each defined user group: what information are they already aware of? What information would enable them to make more efficient and effective decisions?
Support and prompt action: users must be enabled with a range of options to share the new information and their associated thoughts with others, in order to drive appropriate resultant action. Such information collaboration and decision-making options should include, but are certainly not limited to, the ability to: email the relevant report to pertinent and affected stakeholders; add contextual knowledge to the reports in question via annotations and comments (discussion threads), and have relevant users with access to those reports notified; add decision widgets to discussion threads to facilitate voting and polling for fast and effective collective decision-making; and embed fully interactive dashboards and reports outside the BI tool, on any third-party Web-based platform, to allow external stakeholders to understand and act on the emergent issue.
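To make the chart-selection point above concrete, here is a small illustrative sketch (not part of the original deck; all figures are invented): the same kind of leads data rendered as a line chart for change over time and a bar chart for channels of similar magnitude, where a pie chart would make the portions look equal.

```python
# Illustrative only: choosing the right chart for the data.
# All numbers below are invented for demonstration.
import matplotlib.pyplot as plt

# Change over time -> a line graph almost always works best.
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
leads_over_time = [120, 135, 128, 160, 172, 190]

# Several metrics of similar proportions -> bars reveal small
# differences that a pie chart would hide.
channels = ["Google Ads", "LinkedIn", "Print", "Banner"]
leads_by_channel = [310, 280, 265, 295]

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(months, leads_over_time, marker="o")
ax1.set_title("New leads over time (line)")
ax2.bar(channels, leads_by_channel)
ax2.set_title("New leads by channel (bar)")
plt.tight_layout()
plt.show()
```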
Glen does the demo… Drive the point home – let’s assume we have built a dashboard for a user. The real value is that I can browse the data un-aggregated. If it were traditional, then (show comparison).