This presentation is the introduction to the first DataBench Virtual BenchLearning organised by the H2020 DataBench project and held on April 29, 2020.
My work brings me into contact with the cutting edge of scientific research and with developments in public policy. I find one common thread when interacting with people in these fields: the question of how we can make more use of large public data sets while ensuring the privacy of the individuals involved. Large amounts of specific types of data can be useful to both scientists and policy makers in myriad ways. But there is currently no policy framework that one could use to take the public into confidence and simultaneously leverage data without conflicts of interest. This problem is underrated.
Steps taken towards such a policy initiative could be one of the defining intellectual exercises of the 21st century, reaping huge benefits for the science and policy spheres. Eventually these benefits can be put to use for many purposes. Building this framework requires grounded knowledge of data science, computer science and public policy to make sure it is robust in every sense. I am an engineer who has built machine learning systems at scale, and I am now transitioning into a data journalism and policy career. My past and current experience gives me a unique vantage point from which to look at this problem. It would be an honour to express myself at the Data_Science Conference.
Mike Turner - Digital technology transformation of outpatient services | Innovation Agency
Presentation by Mike Turner, Programme Director, Salford Royal NHS FT: Salford Royal Foundation Trust Global Digital Exemplar, Digital technology transformation of outpatient services, 2 July 2018, Haydock Park Racecourse
6th International Conference on Software Engineering (SOENG 2020) | ijseajournal
6th International Conference on Software Engineering (SOENG 2020) will provide an excellent international forum for sharing knowledge and results in theory, methodology and applications of Software Engineering and Applications. The goal of this conference is to bring together researchers and practitioners from academia and industry to focus on understanding Modern software engineering concepts and establishing new collaborations in these areas.
Text Analytics can be used in business for various purposes. Business managers and students should have a clear idea of the use cases and a sound general understanding of the technical basics in order to be competent in business innovation and development. This set of slides (excerpts) is my approach to teaching the subject. Comments welcome.
Introduction – OPEN DEI Webinar "The role of the Reference Architectures in D... | OPEN DEI
Introduction – OPEN DEI Webinar "The role of the Reference Architectures in Data-oriented Digital Platforms"
28 May 2020
Angelo Marguglio (Head of Smart Industry & Agri-food, Engineering)
BigDataStack Connected Consumer Pilot Demo
BigDataStack will provide retailers with optimal insights into consumer preferences and increase the effectiveness of marketing strategies to improve the consumer shopping experience. Led by Worldline, a roadmap for a major Spanish food retailer has been defined, allowing them to offer predictive shopping lists, and tailored recommendations and promotions, improving consumers’ experiences.
Webinar takeaways
The Connected Consumer use case utilizes the BigDataStack environment to implement and offer a recommender system for the grocery market.
All of the data used to train the analytic algorithms of the use case are corporate data provided by one of the top food retail companies in Spain.
BigDataPilotDemoDays - I BiDaaS Application to the Manufacturing Sector Webinar | Big Data Value Association
The new data-driven industrial revolution highlights the need for big data technologies to unlock the potential in various application domains. To this end, BDV PPP projects I-BiDaaS, BigDataStack, Track & Know and Policy Cloud deliver innovative technologies to address the emerging needs of data operations and applications. To fully exploit the sustainability and take full advantage of the developed technologies, the projects onboarded pilots that exhibit their applicability in a wide variety of sectors. In the Big Data Pilot Demo Days, the projects will showcase the developed and implemented technologies to interested end-users from the industry as well as technology providers, for further adoption.
Presentation of the Big Data Europe project at the EIP Water Conference 2016 ... | Martin Kaltenböck
Presentation of the Big Data Europe project (http://www.big-data-europe.eu) at the EIP Water Conference 2016 in Leeuwarden, the Netherlands. It took place on 09/02/2016 at the Wetsus Campus as part of an ICT4Water workshop.
One of the main goals of the I-BiDaaS project is to provide a Big Data as a self-service solution that will empower the actual employees of European companies in targeted sectors (banking, manufacturing, telecom), i.e., the true decision-makers, with the insights and tools they need in order to make the right decisions in an agile way. In this big data pilot webinar, we will demonstrate in a step by step fashion the I-BiDaaS self-service solution and its application to the banking sector. In more detail, we will present an overview of the I-BiDaaS project focusing on the requirements of the CaixaBank pilot study, the I-BiDaaS architecture with its core technologies, and a step by step demo of the I-BiDaaS solution. Last but not least, we will show through CaixaBank's success story how I-BiDaaS can resolve data availability, data sharing, and breaking silos challenges in the banking domain.
DataBench is an EU H2020 Research & Innovation Action providing EU organisations with evidence based Big Data Benchmarks to improve Business Performance. DataBench will investigate existing Big Data benchmarking tools and projects, identify the main gaps and provide a robust set of metrics to compare technical results coming from those tools. The project will liaise closely with the BDVA, ICT-14 and 15 projects to build consensus and to reach out to key industrial communities, to ensure that benchmarking responds to real needs and problems, and will bring together Research, Academia and industry with the aim to establish the Big Data Benchmarking Community.
Insights beyond Human Intuition: Comprehensively Mining Survey Data | Inspirient
Presented at GOR 22 on 8 September 2022 together with Kantar Public. We discuss how AI technology was used to automate survey analytics at Kantar Public Germany, and detail key business benefits such as improved efficiency, quick deliverables to the client, and advanced data validation and analysis.
This work is part of Kantar Public's Public Data Innovation Hub (https://www.kantarpublic.com/de/Unsere-Expertise/daten-und-fakten/public-data-innovation-hub) and was presented in the session on 'Practical Application of AI for Better Insights' at the General Online Research Conference 2022 (https://www.conftool.org/gor22/index.php?page=browseSessions&form_session=73). Supporting write-ups are available at https://www.inspirient.com/case-studies/survey_analysis_automation.php and https://www.inspirient.com/case-studies/survey_quality_assurance.php
SC6 Workshop 1: Big Data Europe platform requirements and draft architecture:... | BigData_Europe
Presentation by Martin Kaltenböck, Semantic Web Company, at the first workshop of Societal Challenge 6 in the BigDataEurope project, which took place in Luxembourg on 18 November 2015.
http://www.big-data-europe.eu/social-sciences/
This presentation was given by Prof. Chiara Francalanci from Politecnico di Milano during the second Virtual BenchLearning organised by the H2020 DataBench project.
DataBench Virtual BenchLearning "Big Data - Benchmark your way to Excellent B... | DataBench
This presentation was given by Gabriella Cattaneo and Erica Spinoni from IDC during the first Virtual BenchLearning organised by the H2020 DataBench project.
Analysis insight about a Flyball dog competition team's performance | roli9797
Insights from my analysis of a Flyball dog competition team's performance over the last year. Find more: https://github.com/rolandnagy-ds/flyball_race_analysis/tree/main
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... | John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Learn SQL from basic queries to Advance queries | manishkhaire30
Dive into the world of data analysis with our comprehensive guide on mastering SQL! This presentation offers a practical approach to learning SQL, focusing on real-world applications and hands-on practice. Whether you're a beginner or looking to sharpen your skills, this guide provides the tools you need to extract, analyze, and interpret data effectively.
Key Highlights:
Foundations of SQL: Understand the basics of SQL, including data retrieval, filtering, and aggregation.
Advanced Queries: Learn to craft complex queries to uncover deep insights from your data.
Data Trends and Patterns: Discover how to identify and interpret trends and patterns in your datasets.
Practical Examples: Follow step-by-step examples to apply SQL techniques in real-world scenarios.
Actionable Insights: Gain the skills to derive actionable insights that drive informed decision-making.
Join us on this journey to enhance your data analysis capabilities and unlock the full potential of SQL. Perfect for data enthusiasts, analysts, and anyone eager to harness the power of data!
#DataAnalysis #SQL #LearningSQL #DataInsights #DataScience #Analytics
Adjusting primitives for graph : SHORT REPORT / NOTES | Subhajit Sahu
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
The Building Blocks of QuestDB, a Time Series Database | javier ramirez
Talk Delivered at Valencia Codes Meetup 2024-06.
Traditionally, databases have treated timestamps as just another data type. However, when performing real-time analytics, timestamps should be first-class citizens, and we need rich time semantics to get the most out of our data. We also need to deal with ever-growing datasets while staying performant, which is as fun as it sounds.
It is no wonder time-series databases are now more popular than ever before. Join me in this session to learn about the internal architecture and building blocks of QuestDB, an open source time-series database designed for speed. We will also review a history of some of the changes we have gone through over the past two years to deal with late and unordered data, non-blocking writes, read-replicas, or faster batch ingestion.
Techniques to optimize the pagerank algorithm usually fall in two categories. One is to try reducing the work per iteration, and the other is to try reducing the number of iterations. These goals are often at odds with one another. Skipping computation on vertices which have already converged has the potential to save iteration time. Skipping in-identical vertices, with the same in-links, helps reduce duplicate computations and thus could help reduce iteration time. Road networks often have chains which can be short-circuited before pagerank computation to improve performance. Final ranks of chain nodes can be easily calculated. This could reduce both the iteration time, and the number of iterations. If a graph has no dangling nodes, pagerank of each strongly connected component can be computed in topological order. This could help reduce the iteration time, no. of iterations, and also enable multi-iteration concurrency in pagerank computation. The combination of all of the above methods is the STICD algorithm. [sticd] For dynamic graphs, unchanged components whose ranks are unaffected can be skipped altogether.
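The first idea above, skipping vertices that have already converged, can be illustrated with a minimal sketch. This is not the STICD algorithm itself; it only shows the per-vertex skipping heuristic under stated assumptions: the graph has no dangling nodes, every link target is also a key of the input mapping, and a vertex counts as converged once its rank changes by less than `skip_tol` in two consecutive iterations. The function name, parameter names and the two-iteration rule are hypothetical choices made for this illustration.

```python
def pagerank_skip_converged(out_links, damping=0.85, tol=1e-10,
                            skip_tol=1e-12, max_iter=500):
    """Pull-based PageRank that stops recomputing converged vertices.

    out_links maps each vertex to a non-empty list of its targets
    (assumption: no dangling nodes, all targets are keys of out_links).
    Setting skip_tol=0.0 disables skipping and gives plain PageRank.
    """
    vertices = list(out_links)
    n = len(vertices)

    # Build the reverse adjacency so each vertex can pull from its in-links.
    in_links = {v: [] for v in vertices}
    for u, targets in out_links.items():
        for v in targets:
            in_links[v].append(u)

    rank = {v: 1.0 / n for v in vertices}
    stable_rounds = {v: 0 for v in vertices}  # consecutive tiny changes
    frozen = set()                            # vertices we no longer update

    for _ in range(max_iter):
        new_rank = dict(rank)
        total_change = 0.0
        for v in vertices:
            if v in frozen:
                continue  # skip work for vertices treated as converged
            pulled = sum(rank[u] / len(out_links[u]) for u in in_links[v])
            new_rank[v] = (1.0 - damping) / n + damping * pulled
            change = abs(new_rank[v] - rank[v])
            total_change += change
            # Hypothetical rule: freeze after two consecutive tiny changes,
            # so a coincidentally unchanged first step does not freeze a vertex.
            stable_rounds[v] = stable_rounds[v] + 1 if change < skip_tol else 0
            if stable_rounds[v] >= 2:
                frozen.add(v)
        rank = new_rank
        if total_change < tol:
            break
    return rank


# Small example graph without dangling nodes: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
example = {0: [1, 2], 1: [2], 2: [0]}
ranks = pagerank_skip_converged(example)
```

Freezing a vertex trades a small amount of accuracy for less work per iteration, so in this sketch `skip_tol` should stay well below the precision expected of the final ranks.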
Unleashing the Power of Data_ Choosing a Trusted Analytics Platform.pdf | Enterprise Wired
In this guide, we'll explore the key considerations and features to look for when choosing a Trusted analytics platform that meets your organization's needs and delivers actionable intelligence you can trust.
DataBench Virtual BenchLearning "Big Data - Benchmark your way to Excellent Business Performance" - Introduction
1.
2. About DataBench
29/04/2020 DataBench Project - GA Nr 780966 2
DataBench Outcomes
DataBench Framework
Including a complete set of metrics for Big Data Technologies assessment
Multiple Analysis
Assessing the European and industrial significance of the Big Data Technologies examined by the project
DataBench Toolbox
A tool to connect and evaluate external initiatives
DataBench Handbook
Providing guidelines to the use of the project’s results, Framework & Toolbox, describing metrics implementation and benchmarks
3. Virtual BenchLearning
DataBench results
Engagement with the community
Test results
2nd Virtual BenchLearning
May 28th, 2020
11:00 CET
Chiara Francalanci – Politecnico di Milano
Tomás Pariente – Atos Research and Innovation
4. Keep in mind
Q&A after the presentation
Ask your questions in the chat box
Answer the poll questions
We will send you the presentation and other material in the upcoming days
5. Our speakers
Erica Spinoni
Research Analyst, IDC European Software - Big Data Analytics and Digital Transformation Strategies
Gabriella Cattaneo
Associate Vice President, IDC4EU European Government Consulting
Representative of IDC @BDVA
6. This project has received funding from the European Horizon 2020 Programme for research, technological development and demonstration under grant agreement n° 780966