BDVe Webinar Series: DataBench – Benchmarking Big Data. Gabriella Cattaneo. T... (Big Data Value Association)
This webinar presents the DataBench project. Gabriella Cattaneo (IDC) will provide ideas on how big data benchmarking could help organizations get better business insights and make informed decisions.
How manufacturers are evolving toward Industry 4.0 with virt... (Denodo)
Watch full webinar here: https://bit.ly/3cbpipB
Manufacturing is one of the sectors where digital transformation is having the most disruptive effect. Leaders in the manufacturing sector are betting on Big Data, cloud computing, artificial intelligence and the Internet of Things (IoT), among other technologies, and are also anticipating the arrival of 5G, in order to:
- Automate processes efficiently, enabling higher output in less time
- Create added value in manufactured products
- Connect the industrial plant with the point of sale
- Drive real-time analysis of data coming from different production lines
However, to achieve these goals and carry out this technological revolution, also known as Industry 4.0, manufacturers must face a series of non-negligible challenges. The industrial sector generates more data than any other in the world, and in the digital era the velocity, variety and exponentially growing volume of data can overwhelm traditional IT architectures. In addition, most manufacturers face data silos, which make data processing slow and costly. They therefore need a reliable IT platform that can integrate, centralize and analyze data from different sources and in different formats in an agile and secure way, putting information at the service of the business.
The experts at Enki and Denodo offer this online seminar to explain what data virtualization is, and why industry leaders are betting on this innovative technology to optimize their IT strategy and achieve a significant ROI through faster, simpler and unified access to industrial data.
Watch full webinar here: [https://buff.ly/2R4JjBX]
Organizations today are data rich and insights poor. There is data everywhere: ERP systems, CRM systems, external data, data lakes and ponds. The real question to ask is “Are the users getting the insights they need, when and where they need them, to drive successful business outcomes?” Data Integration is a core pillar of the “Data to Value” journey. In this session you will hear how enterprises across industries are grappling with data and insights challenges, and how organizations have adopted data virtualization to accelerate their "data to value" journeys.
Watch this Denodo DataFest 2018 session to learn:
How to reduce effort to get from data to value
How to gain faster time to insights
How to reduce overall cost of ownership
Petabytes to Personalization - Data Analytics with Qubit and Looker (Rittman Analytics)
How do you turn petabytes of customer data into a personalized retail and e-commerce experience? With Qubit, the customer personalization platform that (with the help of Google Cloud Platform and Looker) gives customers the power of real-time ad-hoc analytics. Because of the scale of data enabled by GCP and the abstraction layer of Looker, Qubit customers are able to use their Live Tap product to make every visitor experience relevant and engaging.
At EA Connect Days 2018 in Bonn, Kati Gholam, Enterprise Architect at TUI, explored how TUI transformed their Enterprise Architecture across different geographies with differing approaches to EA. She explained why GDPR was helpful with this project and the value of communication.
Drive Business Outcomes for Big Data Environments (Cisco Services)
Bob Eve, Director of Cisco Data Virtualization Business Unit, highlights big data business opportunities and the big data integration challenge in his recent presentation from Cisco Live 2014.
TIBCO Spotfire: Data Science in the Enterprise (TIBCO Spotfire)
From Data to Insights in Internet Time
Eric Novik, Internal Analytics Group, TIBCO Spotfire
ANALYTICS AND VISUALIZATION FOR THE FINANCIAL ENTERPRISE CONFERENCE
June 25, 2013 The Langham Hotel Boston, MA
Modernizing the Enterprise Monolith: EQengineered Consulting Green Paper (Mark Hewitt)
Are you an enterprise that recognizes the business liability inherent in the monolithic or otherwise dated enterprise software applications you have built? Does your technology represent an impediment to the needed agility and flexibility required to meet the needs of today’s business environment?
Historically, enterprise software development focused on an approach that incorporated all functionality into a single process, and replicated it across servers as additional capacity was required. Today, these large applications have become bloated and unmanageable as new features and functionality are added. And, as small changes are made to existing functionality, the requirement to update and redeploy the server-side application becomes an intractable juggernaut.
Forward-thinking organizations like Amazon and Netflix led the way toward agile processes, deconstructed software stacks, and efficient APIs. Both large and small organizations serious about embracing modern practices have followed by decoupling the front and back end of their enterprise applications, employing microservices and cloud technologies, and adopting agile methodologies. These very steps can serve to highlight additional technical deficits in old solutions and codebases, which in turn become stumbling blocks to modern development practices.
As these technology trends continue to evolve, how can your company keep pace and remain viable?
In this green paper, we discuss how CIOs, CTOs, and VPs of Engineering can lead the needed modernization with their counterparts in marketing and the business to ensure that their organizations remain competitive in today’s customer-driven and technology-led economy.
Key questions addressed include:
• Why is technical modernization vital for the business?
• What types of modernization projects are there?
• How does modernization fit into your organization?
This presentation was given by Prof. Chiara Francalanci from Politecnico di Milano during the second Virtual BenchLearning organised by the H2020 DataBench project.
Building the DataBench Workflow and Architecture (t_ivanov)
In the era of Big Data and AI, it is challenging to know all technical and business advantages of the emerging technologies. The goal of DataBench is to design a benchmarking process helping organizations developing Big Data Technologies (BDT) to reach for excellence and constantly improve their performance, by measuring their technology development activity against parameters of high business relevance. This paper focuses on the internals of the DataBench framework and presents our methodological workflow and framework architecture.
Real life use cases from across Europe (Walid Aoudi - Cognizant)
This presentation shares lessons learned from some of Cognizant's Big Data clients in continental Europe and the UK. The main focus is on use cases, presented through the business drivers behind these projects. Key highlights of the big data architectures and solution approaches will be presented. Finally, the business outcomes, in terms of the ROI delivered by the implemented solutions, will be discussed.
DAMA Webinar: Turn Grand Designs into a Reality with Data Virtualization (Denodo)
Watch full webinar here: https://buff.ly/2HMdbUp
What started to evolve as the most agile and real-time enterprise data fabric, data virtualization is proving to go beyond its initial promise and is becoming one of the most important enterprise big data fabrics.
Attend this session to learn:
• What data virtualization really is,
• How it differs from other enterprise data integration technologies
• Real-world examples of data virtualization in action from companies such as Logitech, Autodesk and Festo.
At the heart of this DataBench webinar is the goal to share a benchmarking process helping European organisations developing Big Data Technologies to reach for excellence and constantly improve their performance, by measuring their technology development activity against parameters of high business relevance.
The webinar aims to provide the audience with a framework and tools to assess the performance and impact of Big Data and AI technologies, by providing real insights coming from DataBench. In addition, representatives from other projects that are part of the BDV PPP, such as DeepHealth and They-Buy-for-You, will share the challenges and opportunities they have identified in the use of Big Data, Analytics and AI. The perspective of other projects that have also looked into benchmarking, such as Track&Now and I-BiDaaS, will be introduced.
The Briefing Room with Dr. Robin Bloor and RedPoint Global
Live Webcast Jan. 13, 2015
Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=c847c54220dfb80841f3e0c63664fd08
Context is king in the realm of Big Data. With enough perspective on a customer or prospect, organizations can fine-tune their offerings in game-changing ways. Today's cutting-edge companies are viewing their customers within the context of a decade or more of interactions, and across multiple channels. How so? Real-time integration with social media and other customer channels can now result in actionable insights with serious potential.
Register for this episode of The Briefing Room to hear veteran Analyst Dr. Robin Bloor, as he describes the changing landscape of data flow, and how that impacts enterprise responsiveness. He'll be briefed by George Corugedo of RedPoint Global, who will explain how companies are leveraging Hadoop's YARN architecture to deliver a whole new array of highly responsive, data-driven enterprise applications. He'll demonstrate how RedPoint's platform running inside Hadoop can enable a wide range of both real-time and strategic data management functionality, all of which can be applied to any number of critical business processes.
Visit InsideAnalysis.com for more information.
BDVe Webinar Series: DataBench – Benchmarking Big Data. Arne Berre. Tue, Oct ... (Big Data Value Association)
This webinar presents the DataBench project. Arne Berre (SINTEF) will explain the efforts to characterise and reuse big data benchmarking frameworks from a technical perspective, and share details of the degree of support that DataBench will provide to other projects and big data practitioners to benchmark big data tools and applications.
Summary of three national webinars: the three V's, the market, functional areas showing most traction, hot revenue/ROI areas, architecture options, and using use cases to overcome objections.
DataBench is an EU H2020 Research & Innovation Action providing EU organisations with evidence based Big Data Benchmarks to improve Business Performance. DataBench will investigate existing Big Data benchmarking tools and projects, identify the main gaps and provide a robust set of metrics to compare technical results coming from those tools. The project will liaise closely with the BDVA, ICT-14 and 15 projects to build consensus and to reach out to key industrial communities, to ensure that benchmarking responds to real needs and problems, and will bring together Research, Academia and industry with the aim to establish the Big Data Benchmarking Community.
Rethink Your 2021 Data Management Strategy with Data Virtualization (ASEAN) (Denodo)
Watch full webinar here: https://bit.ly/2O2r3NP
In the last several decades, BI has evolved from large, monolithic implementations controlled by IT to orchestrated sets of smaller, more agile capabilities that include visual-based data discovery and governance. These new capabilities provide more democratic analytics accessibility that is increasingly being controlled by business users. However, given the rapid advancements in emerging technologies such as cloud and big data systems and the fast changing business requirements, creating a future-proof data management strategy is an incredibly complex task.
Catch this on demand session to understand:
- BI program modernization challenges
- What is data virtualization and why is its adoption growing so quickly?
- How data virtualization works and how it compares to alternative approaches to data integration
- How modern data virtualization can significantly increase agility while reducing costs
DataBench Virtual BenchLearning "Big Data - Benchmark your way to Excellent B... (DataBench)
This presentation was given by Gabriella Cattaneo and Erica Spinoni from IDC during the first Virtual BenchLearning organised by the H2020 DataBench project.
DataBench Virtual BenchLearning "Big Data - Benchmark your way to Excellent B... (DataBench)
This presentation is the introduction to the first DataBench Virtual BenchLearning organised by the H2020 DataBench project and held on April 29, 2020.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... (John Andrews)
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Analysis insight about a Flyball dog competition team's performance (roli9797)
Insight of my analysis about a Flyball dog competition team's last year performance. Find more: https://github.com/rolandnagy-ds/flyball_race_analysis/tree/main
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ... (Subhajit Sahu)
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
Adjusting primitives for graph : SHORT REPORT / NOTES (Subhajit Sahu)
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
Enhanced Enterprise Intelligence with your personal AI Data Copilot.pdf (GetInData)
Recently we have observed the rise of open-source Large Language Models (LLMs) that are community-driven or developed by the AI market leaders, such as Meta (Llama3), Databricks (DBRX) and Snowflake (Arctic). On the other hand, there is a growth in interest in specialized, carefully fine-tuned yet relatively small models that can efficiently assist programmers in day-to-day tasks. Finally, Retrieval-Augmented Generation (RAG) architectures have gained a lot of traction as the preferred approach for LLMs context and prompt augmentation for building conversational SQL data copilots, code copilots and chatbots.
In this presentation, we will show how we built upon these three concepts a robust Data Copilot that can help to democratize access to company data assets and boost performance of everyone working with data platforms.
Why do we need yet another (open-source) Copilot?
How can we build one?
Architecture and evaluation
ViewShift: Hassle-free Dynamic Policy Enforcement for Every Data Lake (Walaa Eldin Moustafa)
Dynamic policy enforcement is becoming an increasingly important topic in today’s world where data privacy and compliance is a top priority for companies, individuals, and regulators alike. In these slides, we discuss how LinkedIn implements a powerful dynamic policy enforcement engine, called ViewShift, and integrates it within its data lake. We show the query engine architecture and how catalog implementations can automatically route table resolutions to compliance-enforcing SQL views. Such views have a set of very interesting properties: (1) They are auto-generated from declarative data annotations. (2) They respect user-level consent and preferences (3) They are context-aware, encoding a different set of transformations for different use cases (4) They are portable; while the SQL logic is only implemented in one SQL dialect, it is accessible in all engines.
#SQL #Views #Privacy #Compliance #DataLake
Relating Big Data Business and Technical Performance Indicators, Barbara Pernici, itAIS 2018, 12/10/2018
1. Relating Big Data Business and Technical Performance Indicators
Barbara Pernici, Chiara Francalanci, Angela Geronazzo, Lucia Polidori, Stefano Ray, Leonardo Riva, Politecnico di Milano, Italy
Arne Jørgen Berre, SINTEF, Norway
Todor Ivanov, University of Frankfurt, Germany
itAIS Conference, Pavia, October 12, 2018
12/12/2018 DataBench Project - GA Nr 780966 1
2. Main Activities
• Classify the main use cases of BDT by industry
• Compile and assess technical benchmarks
• Perform economic and market analysis to assess industrial needs
• Evaluate business performance in selected use cases
Expected Results
• A conceptual framework linking technical and business benchmarks
• European industrial and performance benchmarks
• A toolbox measuring optimal benchmarking approaches
• A handbook to guide the use of benchmarks
Building a bridge between technical and business benchmarking
5. How to link technical and business benchmarking
WP2 – ECONOMIC MARKET AND BUSINESS ANALYSIS (Top-down)
• Focus on economic and industry analysis and the EU Big Data market
• Classify leading Big Data technologies use cases by industry
• Analyse industrial users' benchmarking needs and assess their relative importance for the EU economy and the main industries
• Demonstrate the scalability, European significance (high potential economic impact) and industrial relevance (responding to primary needs of users) of the benchmarks
USE CASES = Typologies of technology adoption in specific application domains and/or business processes
WP4 – EVALUATING BUSINESS PERFORMANCE (Bottom-up)
• Focus on data collection and identification of use cases to be monitored and measured
• Evaluation of business performance of specific Big Data initiatives
• Leverage DataBench toolbox
• Provide the specific industrial benchmarks to WP”
• Produce the DataBench Handbook, a manual supporting the application of the DataBench toolbox
6. Results from the first year of the project
• Modeling business indicators
• Modeling technical indicators
• Relating business and technical indicators
• Two surveys:
  • With BDVa Benchmarking group
  • DataBench survey
7. Business indicators
Each dimension below lists the values it can take.
• Industry: Finance; Manufacturing; Retail & Wholesale; Telecom/Media; Transport/Accommodation; Utility/Oil&Gas/Energy; Professional services; Governmental/Education; Healthcare
• Big Data Maturity: Currently using; Piloting or implementing; Considering or evaluating for future use; Not using and no plan to do so
• KPI: Cost reduction; Time efficiency; Product/service quality; Revenue growth; Customer satisfaction; Business model innovation; Launch of new products and/or services; Product innovation
• Scope of Big Data & Analytics: Decision optimization task; Data driven business processes; Data oriented digital transformation
• Data User: Data Entrepreneurs; Vendors in the ICT industry; User companies
• DB & Analytics Application: Sales; Customer service & support; IT & data operation; Governance risk & compliance; Product management; Marketing; Maintenance & logistics; HR & Legal; R&D; Finance
• Size of Business: 5000 or more; 2500 to 4999; 1000 to 2499; 250 to 999; 50 to 249; 10 to 49; less than 10
• Data size: Gigabytes; Terabytes; Petabytes; Exabytes
• Datasource: Distributed; Centralized
8. BDVa (Big Data Value Association) Architecture
• Towards technical indicators
• Tech areas
• Types of data
10. Technical indicators
Each dimension below lists the values it can take.
• Metrics: Execution time/Latency; Throughput; Cost; Energy consumption; Accuracy; Precision; Availability; Durability; CPU and Memory Utilization
• Data Types: Business Intelligence (Tables, Schema…); Graphs, Linked Data; Time Series, IoT; Geospatial, Temporal; Text (incl. Natural Language text); Media (Images, Audio and Video)
• Benchmark Data Usage: Synthetic data; Real data; Hybrid (mix of real and synthetic) data
• Storage Type: Distributed File System; Databases/RDBMS; NoSQL; NewSQL/In-Memory; Time Series
• Processing Type: Batch; Stream; Interactive/(near) Real-time; Iterative/In-memory
• Analytics Type: Descriptive; Diagnostic; Predictive; Prescriptive
• Architecture Patterns: Data Preparation; Data Pipeline; Data Lake; Data Warehouse; Lambda Architecture; Kappa Architecture; Unified Batch and Stream architecture
• Platform Features: Fault-tolerance; Privacy; Security; Governance; Data Quality; Veracity; Variability; Data Management; Data Visualization
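The technical indicator dimensions on this slide (metrics, data types, processing type, analytics type and so on) can be read as a schema for describing a benchmark. The following is a minimal Python sketch under that reading, not the DataBench toolbox itself: the `BenchmarkProfile` class, the `matching` helper and both catalogue entries are hypothetical names invented for illustration, and only four of the eight dimensions are modelled to keep it short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchmarkProfile:
    """Technical indicators of one benchmark (hypothetical schema)."""
    name: str
    metrics: frozenset
    data_types: frozenset
    processing_types: frozenset
    analytics_types: frozenset


def matching(profiles, *, processing_type, metric):
    """Return the names of benchmarks that cover both the given
    processing type and the given metric, in catalogue order."""
    return [
        p.name
        for p in profiles
        if processing_type in p.processing_types and metric in p.metrics
    ]


# Hypothetical catalogue entries, invented purely for illustration.
# The indicator values themselves are taken from the slide.
CATALOGUE = [
    BenchmarkProfile(
        name="ExampleStreamBench",
        metrics=frozenset({"Throughput", "Execution time/Latency"}),
        data_types=frozenset({"Time Series, IoT"}),
        processing_types=frozenset({"Stream"}),
        analytics_types=frozenset({"Descriptive"}),
    ),
    BenchmarkProfile(
        name="ExampleBatchBench",
        metrics=frozenset({"Execution time/Latency", "Cost"}),
        data_types=frozenset({"Business Intelligence (Tables, Schema…)"}),
        processing_types=frozenset({"Batch"}),
        analytics_types=frozenset({"Descriptive", "Predictive"}),
    ),
]

print(matching(CATALOGUE, processing_type="Stream", metric="Throughput"))
```

Storing each dimension as a set keeps the lookup a simple membership test, which suits indicators that are lists of categorical values rather than numbers.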
11. Big Data Benchmarks related to the BDVA Big Data Reference model (ongoing work)
[Figure: BDVA Reference Model.
Big data types & semantics: Struct data/BI; Time series, IoT; Geo Spatio Temp; Media Image Audio; Text NLP; Web Graph.
Big Data priority tech areas: Data Visualisation and User Interaction; Data Analytics; Data Processing Architectures; Data Management; Infrastructure; Data Privacy.
Sectors: Manufacturing, Health, Energy, Media, Telco, Finance, EO, ..
Horizontal and vertical benchmarks placed on the model: BigBench, BigBench 2.0, BigDataBench, ALOJA, TPC, Hobbit-I+III, Hobbit-II, Hobbit-IV, LDBC-1 SemanticPub, LDBC-2 SocialNet, LDBC-3 Graphalytics, YStreamB, DeepMark, DeepBench, RIoTBench, SenseMark, ABench, SparkBench, YCSB, StreamBench.]
12. Early Results from the BDVa Questionnaire
BDVa Benchmarking group (EU projects Hobbit and DataBench)
14. Relating indicators
An example: business KPI (x-axis) vs Analysis type (y-axis)
[Chart: percentage of respondents (scale 0% to 90%) per business KPI. KPI categories: We do not use them; We target revenue growth; We target margin growth; We target cost reduction; We target time efficiency; We target customer satisfaction; We target product/service quality. Series: Descriptive, Inferential, Predictive, Prescriptive.]
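Relating a business KPI to an analysis type in this way amounts to a cross-tabulation of survey answers. The sketch below shows one possible way to compute, for each KPI, the share of respondents using each analysis type. It is an illustration only: the `crosstab_shares` function and the four sample responses are hypothetical and are not taken from the questionnaire data.

```python
from collections import Counter, defaultdict


def crosstab_shares(responses):
    """Map each KPI to {analysis_type: share of that KPI's respondents}.

    `responses` is an iterable of (kpi, analysis_types) pairs, one per
    respondent; a respondent may report several analysis types.
    """
    per_kpi = defaultdict(Counter)
    totals = Counter()
    for kpi, analysis_types in responses:
        totals[kpi] += 1
        counts = per_kpi[kpi]
        # Count each analysis type at most once per respondent.
        for analysis_type in set(analysis_types):
            counts[analysis_type] += 1
    return {
        kpi: {a: n / totals[kpi] for a, n in counts.items()}
        for kpi, counts in per_kpi.items()
    }


# Hypothetical answers, invented purely for illustration.
RESPONSES = [
    ("Cost reduction", ["Descriptive", "Predictive"]),
    ("Cost reduction", ["Descriptive"]),
    ("Revenue growth", ["Predictive", "Prescriptive"]),
    ("Revenue growth", ["Descriptive", "Predictive"]),
]

shares = crosstab_shares(RESPONSES)
for kpi, by_type in shares.items():
    print(kpi, {a: f"{s:.0%}" for a, s in sorted(by_type.items())})
```

Because a respondent can use more than one analysis type, the shares within one KPI need not sum to 100%, which is consistent with a grouped bar chart of this kind.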
16. Early Results from the DataBench Business Users Survey
17. Survey
• DataBench Survey, IDC, Interim results, 401 interviews
• Aiming at 800 (ongoing) + case studies
• October 2018
• 11 EU countries (7.7% in Italy)
• Final results to be presented at the European Big Data Value Forum and in the DataBench report due in December 2018
18. Users recognize the relevance of business benchmarking…
[Chart: Respondents by Type of Use of BDA. Categories: Using, Evaluating, Piloting. Values shown: 41%, 37%, 22%.]
Source: DataBench Survey, IDC, Interim results, 401 interviews, October 2018
22. Goals of DataBench
• Provide methodologies and tools to help assess and maximise the business benefits of BDT adoption
• Provide criteria for the selection of the most appropriate BDT solutions
• Provide benchmarks of European and industrial significance
• Provide a questionnaire tool comparing your choices and your KPIs with your peers
Interested in participating?
• Expression of interest to become a case study and monitoring your Big Data KPIs
• Answer a survey on your Big Data experiences