Detecting eCommerce Fraud with Neo4j and Linkurious (Neo4j)
Last year, the global eCommerce market was worth $1.9 trillion. As the market expands worldwide, the opportunity for fraud keeps growing, with fraudsters constantly refining their tactics to outsmart anti-fraud frameworks. From chargeback fraud to re-shipping scams or identity fraud, numerous types of fraud can impact your organization. While collecting data is essential to enable real-time risk assessment, many organizations don’t have the necessary tools to find the insights needed to block fraud attempts.
Neo4j and Linkurious offer a solution to tackle the eCommerce fraud challenge. Their combined technologies provide a 360° overview of an organization’s data and allow real-time analysis and detection of eCommerce fraud patterns and activities.
In this webinar, you will learn about:
- The current trends in eCommerce fraud and the risks for organizations;
- The challenges of detecting fraud attempts in real time and the advantages of the graph approach;
- How to use Linkurious’ graph visualization and analysis software to prevent and investigate eCommerce fraud.
Top Big Data Analytics Tools: Emerging Trends and Best Practices (SpringPeople)
For many IT experts, big data analytics tools and technologies are now a top priority. Let's find out the top big data analytics tools in this slide to initialize and advance the process of big data analysis.
Presented by Michelle Hirsch, Head of MATLAB Product Management, MathWorks, on 28th April in Bangalore at a joint languages meetup @ Walmart.
Companies are scrambling to get insight from the massive quantities of data they collect but are struggling to find employees who combine the deep expertise in computer science, statistics and machine learning, and the domain expertise to truly understand the data. In this talk, Dr. Hirsch discusses how MATLAB enables engineers and scientists to apply their domain expertise to big data analytics.
Highlights:
* Accessing data in large text files, databases, or from the Hadoop Distributed File System (HDFS)
* Using virtual “tall” arrays to process out-of-core data with natural mathematical syntax
* Developing machine learning models
* Integrating MATLAB analytics into production systems
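The "tall" array idea above (process data larger than memory in chunks, while keeping ordinary mathematical syntax) can be sketched outside MATLAB as well. Below is a minimal Python analogue of the out-of-core pattern; the sample data and column index are illustrative assumptions, not from the talk:

```python
import csv
import io

def chunked_mean(lines, column, chunk_size=1000):
    """Compute the mean of one CSV column without loading the whole
    file into memory, mirroring the out-of-core idea behind tall arrays."""
    total, count, chunk = 0.0, 0, []
    for row in csv.reader(lines):
        chunk.append(float(row[column]))
        if len(chunk) >= chunk_size:      # flush a full chunk
            total += sum(chunk)
            count += len(chunk)
            chunk = []
    total += sum(chunk)                   # flush the final partial chunk
    count += len(chunk)
    return total / count if count else float("nan")

# In practice `lines` would be open("huge.csv"); here a small in-memory sample:
sample = io.StringIO("1.0,10\n2.0,20\n3.0,30\n")
print(chunked_mean(sample, column=1))  # 20.0
```

The point of the pattern, as with tall arrays, is that the caller writes "compute the mean of this column" while the chunking detail stays hidden inside the implementation.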
About the speaker: Michelle Hirsch, Ph.D. is responsible for driving strategy and direction for MATLAB, the leading programming platform for engineers and scientists. Based outside of Boston, Massachusetts, Michelle is joining our meetup during a trip to meet with MATLAB users across India.
Supporting data: https://www.slideshare.net/CodeOps/flight-test-analysis-final
Tools and Methods for Big Data Analytics by Dahl Winters (Melinda Thielbar)
Research Triangle Analysts October presentation on Big Data by Dahl Winters (formerly of Research Triangle Institute). Dahl takes her viewers on a whirlwind tour of big data tools such as Hadoop and big data algorithms such as MapReduce, clustering, and deep learning. These slides document the many resources available on the internet, as well as guidelines of when and where to use each.
What is graph all about, and why should you care? Graphs come in many shapes and forms, and can be used for different applications: Graph Analytics, Graph AI, Knowledge Graphs, and Graph Databases.
Talk by George Anadiotis. Connected Data London Meetup June 29th 2020.
Up until the beginning of the 2010s, the world was mostly running on spreadsheets and relational databases. To a large extent, it still does. But the NoSQL wave of databases has largely succeeded in instilling the “best tool for the job” mindset.
After relational, key-value, document, and columnar, the latest link in this evolutionary proliferation of data structures is graph. Graph analytics, Graph AI, Knowledge Graphs and Graph Databases have been making waves, included in hype cycles for the last couple of years.
The Year of the Graph marked the beginning of it all before the Gartners of the world got in the game. The Year of the Graph is a term coined to convey the fact that the time has come for this technology to flourish.
The eponymous article that set the tone was published in January 2018 on ZDNet by domain expert George Anadiotis. George has been working with, and keeping an eye on, all things Graph since the early 2000s. He was one of the first to note the continuing rise of Graph Databases, and to bring this technology in front of a mainstream audience.
The Year of the Graph has been going strong since 2018. In August 2018, Gartner started including Graph in its hype cycles. Ever since, Graph has been riding the upward slope of the Hype Cycle.
The need for knowledge on these technologies is constantly growing. To respond to that need, the Year of the Graph newsletter was released in April 2018. In addition, a constant flow of graph-related news and resources is being shared on social media.
To help people make educated choices, the Year of the Graph Database Report was released. The report has been hailed as the most comprehensive of its kind in the market, consistently helping people choose the most appropriate solution for their use case since 2018.
The report, articles, news stream, and the newsletter have been reaching thousands of people, helping them understand and navigate this landscape. We’ll talk about the Year of the Graph, the different shapes, forms, and applications for graphs, the latest news and trends, and wrap up with an ask me anything session.
Adding Open Data Value to 'Closed Data' Problems (Simon Price)
Drawing on cutting edge examples from the University of Bristol and the City of Bristol, Simon will discuss innovative applications of data science that derive business value from open data through enriching and integrating with confidential 'closed data'. He also highlights recent technological advances that are enabling open data science on highly sensitive closed data.
Data Science as a Service: Intersection of Cloud Computing and Data Science (Pouria Amirian)
Dr. Pouria Amirian explains data science and the steps in a data science workflow, and shows some experiments in AzureML. He also discusses big data issues in data science projects and solutions to them.
In this presentation, we take a look at what Data Science is and its applications, and discuss the most common use cases of Data Science.
I presented this at the LSPE-IN meetup held on 10th March 2018 at Walmart Global Technology Services.
At the Data-centric Architecture Forum 2020, Thomas Cook, our Sales Director of AnzoGraph DB, gave his presentation "Knowledge Graph for Machine Learning and Data Science". These are his slides.
Graph applications were once considered “exotic” and expensive. Until recently, few software engineers had much experience putting graphs to work. However, the use cases are now becoming more commonplace.
This talk explores a practical use case, one which addresses key issues of data governance and reproducible research, and depends on sophisticated use of graph technology.
Consider: some academic disciplines such as astronomy enjoy a wealth of data — mostly open data. Popular machine learning algorithms, open source Python libraries, and distributed systems all owe much to those disciplines and their history of big data.
Other disciplines require strong guarantees for privacy and security. Datasets used in social science research involve confidential details about human subjects: medical histories, wages, home addresses for family members, police records, etc.
Those cannot be shared openly, which impedes researchers from learning about related work by others. Reproducibility of research and the pace of science in general are limited. Nonetheless, social science research is vital for civil governance, especially for evidence-based policymaking (US federal law since 2018).
Even when data may be too sensitive to share openly, often the metadata can be shared. Constructing knowledge graphs of metadata about datasets, along with metadata about authors, their published research, methods used, data providers, data stewards, and so on, provides effective means to tackle hard problems in data governance.
Knowledge graph work supports use cases such as entity linking, discovery and recommendations, axioms to infer about compliance, etc. This talk reviews the Rich Context AI competition and the related ADRF framework used now by more than 15 federal agencies in the US.
We’ll explore knowledge graph use cases, use of open standards and open source, and how this enhances reproducible research. Social science research for the public sector has much in common with data use in industry.
Issues of privacy, security, and compliance overlap, pointing toward what will be required of banks, media channels, etc., and what technologies apply. We’ll look at comparable work emerging in other parts of industry: open source projects, open standards emerging, and in particular a new set of features in Project Jupyter that support knowledge graphs about data governance.
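The metadata-graph idea described above can be sketched as plain triples. The dataset names, predicates, and helper below are invented for illustration; real systems would typically use an RDF store and SPARQL rather than Python sets:

```python
# A toy metadata knowledge graph as (subject, predicate, object) triples.
triples = {
    ("dataset:wages_2019", "usedBy", "paper:p1"),
    ("dataset:wages_2019", "steward", "agency:A"),
    ("paper:p1", "author", "person:alice"),
    ("paper:p2", "author", "person:alice"),
    ("dataset:health_panel", "usedBy", "paper:p2"),
}

def related_datasets(person, triples):
    """Datasets reachable via the person's papers: a 2-hop traversal
    of the kind used for discovery and recommendations."""
    papers = {s for s, p, o in triples if p == "author" and o == person}
    return {s for s, p, o in triples if p == "usedBy" and o in papers}

print(sorted(related_datasets("person:alice", triples)))
# ['dataset:health_panel', 'dataset:wages_2019']
```

Note that only metadata appears in the graph: the sensitive dataset contents themselves never need to be shared for this kind of discovery query to work.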
Best Data Analytics Certification Course Training Institute in Malaysia: 360DigiTMG is a Data Analytics using Python training institute in Malaysia, providing Data Analytics training classes by real-time faculty with course material.
Fireside Chat with Bloor Research: State of the Graph Database Market 2020 (Cambridge Semantics)
Sean Martin, CTO of Cambridge Semantics, Philip Howard, Research Director at Bloor Research and co-author of “Graph Database Market Update 2020”, and Steve Sarsfield, VP of Product at Cambridge Semantics, hold a fireside chat on the State of the Graph Database Market.
This PPT, Programming for Data Science in Python, focuses on the importance of the Python programming language: it explains the language's characteristic features, its pros and cons, and its applications.
Multisoft Systems is a renowned training organization that focuses on providing quality training programs to the candidates. Their “Data Science with R” training program is designed for Data/Business Analysts and anyone who has an interest in the field of Data Science. You will learn to explore R data structures and syntaxes, work with data and transform them to fit your needs, create functions and use control flow, etc.
https://www.multisoftsystems.com
4 Ways Telecoms are Using GIS & Location Intelligence (CARTO)
In this webinar Helen McKenzie and Carmen de la O Millán explain how geospatial solutions can optimize telco operations and create new revenue streams, drawing on CARTO and Google Cloud’s collaborations with customers such as Telus and Vodafone.
Big Data Tools: A Deep Dive into Essential Tools (FredReynolds2)
Today, practically every firm uses big data to gain a competitive advantage in the market. With this in mind, freely available big data tools for analysis and processing are a cost-effective and beneficial choice for enterprises. Hadoop is the sector's leading open-source initiative and a driving force in big data, and it is not the only one: numerous other projects follow Hadoop's free and open-source path.
Strategizing Big Data in Telco
Big data is a very hot topic nowadays. Some industries depend on it completely, some have opportunities to roll out their strategies and execute, and some are just considering when the right time is to hop in.
To my mind, Big Data is not about technology. Big data is about people generating data, and data used for the benefit of people.
Big data is a pool of activities aimed at processing the data a company owns (internal and external) to open new revenue opportunities, minimize costs, and enhance UX.
I had some ideas and thoughts on where telecommunication companies may start in formulating a Big Data strategy, and packed some of the most important of them into a small presentation.
What is the difference between Small Data and Big Data?
What kind of data is used currently, and which should the new paradigm rely on?
What kind of products are expected from telcos?
My personal ranking of operators in terms of their Big Data execution
What are the stages telcos should pass through to become a Big Data operator?
Prerequisites for Big Data transformation
Please take a look at the presentation to find answers to these questions and feel free to share your opinion.
Thanks!
Integration of Big Data Analytics with IoT and OT Systems to Turn Insights in... (Alaa Mahjoub)
Presentation Main Points:
A- The Role of OT & IoT Systems in Digital Business Transformation
1- What is digital business
2- Digital business platform reference architecture
3- How to use the enterprise architecture to plan and implement digital business transformation
4- Use case: transportation industry digital business platform
B- How to Integrate Big Data Analytics with IoT and OT Systems
1- Basic definitions related to big data analytics
2- Essentials of big data strategy
3- Use cases of integrating big data analytics with IoT and OT systems (in transportation and petroleum industries)
4- Big data platform integration options and their cost benefit trade-offs
Data Pioneers - Roland Haeve (Atos Nederland) - Big data in organisaties (Multiscope)
Roland Haeve is cross-competence manager Big Data for Atos Nederland. Roland has over 18 years of ICT experience in delivering complete solutions within, among other areas, Business Intelligence (BI) and Big Data (Analytics). For many companies, Big Data still means pioneering and finding out what the possibilities are. In his presentation, Roland will discuss successful Big Data cases, zooming in not only on the Netherlands but also including broader, European examples.
How to Become a Big Data Professional (Careervira)
If you follow the recommendations in this guide, you will be on the right track to becoming a qualified big data professional. To learn more about building a successful career in this field and the many courses available for gaining the necessary skills and competence, explore our in-depth guide, "How to Become a Big Data Professional". It covers all the knowledge you need to launch your career, including the necessary skills and how to pick them up.
Watch here: https://bit.ly/3i2iJbu
You will often hear that "data is the new gold". In this context, data management is one of the areas that has received the most attention from the software community in recent years. From Artificial Intelligence and Machine Learning to new ways to store and process data, the landscape for data management is in constant evolution. From the privileged perspective of an enterprise middleware platform, we at Denodo have the advantage of seeing many of these changes happen.
Join us for an exciting session that will cover:
- The most interesting trends in data management.
- Our predictions on how those trends will change the data management world.
- How these trends are shaping the future of data virtualization and our own software.
LeasePlan Realizes its Next-Gen Data Strategy with a Logical Data Fabric (Denodo)
More info here: https://bit.ly/3HsGM28
LeasePlan, a global leader in Car-as-a-Service (with approximately 1.8 million vehicles under management in 29 countries), is transforming from an analog business model to one that is fully digital. LeasePlan worked with Denodo to create a logical data fabric across all its key data sources – modern, cloud-based, and legacy – with a single point of access. Read further to understand how the logical data fabric provides a foundation for data self-service, simplifies migrations to the cloud, and helps LeasePlan to create transformational new use cases in customer experience and convenience, such as predictive vehicle maintenance programs.
Very many IT experts know of Big Data, or at least have an idea of it. In practice, however, only a few in Germany currently work with it. Yet Big Data brings a whole new momentum to modern software solutions and is indispensable in the context of mobile, cloud, and social change. Big Data makes software intelligent, letting users experience it in an entirely new way. With Big Data, new software architectures emerge, because information is processed completely differently: faster, in a more differentiated way, and often with the goal of drawing conclusions and making predictions.
This talk explains how modern software architectures are designed so that you can successfully implement Big Data paradigms, and which advantages arise for increasingly mobile software solutions. We also take a look at the potential and options in industries such as banking, insurance, and retail.
Levelwise PageRank with Loop-Based Dead End Handling Strategy: SHORT REPORT ... (Subhajit Sahu)
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components and processes them in topological order, one level at a time. This enables calculation of ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It does, however, come with a precondition: the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. The slowdown on the GPU is likely caused by a large submission of small workloads, and is expected to be a non-issue when the computation is performed on massive graphs.
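For reference, a minimal sketch of the standard ("monolithic") power-iteration PageRank that the report uses as its baseline: every vertex is updated in every iteration. The toy graph, damping factor, and tolerance below are illustrative assumptions; dead ends are handled here by spreading their rank uniformly, one common strategy among several:

```python
def pagerank(graph, damping=0.85, tol=1e-10, max_iter=100):
    """Power-iteration PageRank over an adjacency dict
    {node: [out-neighbours]} - the 'monolithic' scheme."""
    nodes = list(graph)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(max_iter):
        new = {v: (1.0 - damping) / n for v in nodes}
        for v, outs in graph.items():
            if outs:                      # distribute rank along out-edges
                share = damping * rank[v] / len(outs)
                for u in outs:
                    new[u] += share
            else:                         # dead end: spread rank uniformly
                for u in nodes:
                    new[u] += damping * rank[v] / n
        done = sum(abs(new[v] - rank[v]) for v in nodes) < tol
        rank = new
        if done:
            break
    return rank

# Tiny 3-node cycle: symmetric, so all ranks converge to 1/3.
r = pagerank({"a": ["b"], "b": ["c"], "c": ["a"]})
print(round(r["a"], 6))  # 0.333333
```

Levelwise PageRank runs this same update, but restricted to one level of the component DAG at a time, which is why the no-dead-ends precondition matters: a dead end's rank would otherwise need to flow back to vertices in already-finished levels.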
Opendatabay - Open Data Marketplace (Opendatabay)
Opendatabay.com unlocks the power of data for everyone. The Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
It is the first open hub for data enthusiasts to collaborate and innovate: a platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, Opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. It leverages cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay's AI-driven features streamline the data workflow. Finding the data you need shouldn't be complex: Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay also breaks new ground with dedicated, AI-generated synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits, Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay: the marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... (John Andrews)
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. for more details visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
Data & Analytics Framework: how public sector can profit from its immense asset, data. Raffaele Lillo, Team per la Trasformazione Digitale
1. Data & Analytics Framework: how public sector can profit from its immense asset, data
RAFFAELE LILLO
Chief Data Officer
@ Digital Transformation Team
—
2. Data & Analytics Framework (DAF)
● Vision & Strategy
● Ok, but what is DAF?
● Challenges
● Our Strategy
● Some Architectural Highlights
● What are we doing right now? Use cases
Q&A
—
3. Information is a fundamental asset to interpret social and economic phenomena, make informed decisions, improve services to citizens, and compete in the international arena.
New technologies make it possible to extract knowledge from the immense amount of data owned by the State.
Vision
4. Extracting value from data requires a solid technological platform, a team of experts, and proper governance of the generation, integration, standardization and use of data.
Strategy
7. Data & Analytics Framework (DAF) is a combination of:
● A Big Data Platform to centralize and store (data lake), manipulate and standardize (data engine), and re-distribute (API & Data Applications) data and insights.
● A Data Team (data scientists + data engineers) which uses and evolves the Big Data Platform to analyze data, create ML models, and build data applications and data visualizations.
● Laws and regulations to make this activity possible.
Give us data and a platform...
8. Interoperability (aka Get out of the Silos!)
Public data is… public, and all PP.AA. should have access to it
Democratizing Data (aka Open Data, API & Data Viz)
Data should be open (when legally possible), accessible by anyone (and anything), and insightful
Data Products (aka deliver value & insights)
Machine Learning in interconnected software applications
Crowdsourcing (aka data is everywhere, let's help each other out)
Citizens (esp. civic hackers) contribute to the surfacing of knowledge
… and we shall move the PA
9. Organizational and Managerial Challenge
Central Data Office and federated analytics teams
Human Resources
Data Scientists & Data Engineers to extract knowledge from data
Technology
This is the least complicated one, but still fundamental.
Legislative Challenge
Balancing Privacy and Public Interest
Data-driven policy needs… data (and Data Scientists)
10. Introduction of DAF in Piano Triennale 2017-2019
DAF is one of the building blocks of the official document setting the strategy for
digitalization of the PA, and signed by the Prime Minister
DAF prototype development
TD started the development of the platform from scratch around March '17, and
released an Alpha version in the first week of October '17
Experimental phase
We started working with a selected number of PAs to showcase DAF, test it, and
listen to their needs so as to fine-tune the platform before the final release
Institutionalization of DAF
Introduce by law the role of a central data office for the entire PA
Our Strategy
11. Mission: Data-driven decision making in efficient ways
Support PAs at all levels to implement informed policies, both ex ante (policy formulation) and ex post (policy monitoring and fine-tuning).
Centralize common & non-domain-specific tasks
Provide a general-purpose data platform once and for all, bring efficiency to standard data processes, and let PAs focus on domain-specific tasks and analysis
Economy of scope towards a center of excellence
Reach the scale needed to develop and acquire expensive and idiosyncratic capabilities, and share them with all PAs
Design and coordinate implementation of Data Policies
Help interoperability and the use of state-of-the-art standards and processes in data management and analysis. Stimulate research and collaboration.
End Goal: Chief Data Office for the PA
12. High-Level Architectural Design
● Hadoop cluster for distributed persistence and processing
● Kubernetes cluster manages dockerized microservices and external applications
● Core Managers: microservices managing core functionalities of DAF
● External applications natively integrated in DAF
● Unique identity management system, integrated with HDFS
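The architecture above pairs a Hadoop cluster for storage with dockerized microservices ("Core Managers") that expose DAF functionality over APIs. As a purely illustrative sketch of that pattern, not the actual DAF code, with endpoint names and catalog contents invented, a minimal catalog microservice using only Python's standard library might look like:

```python
# Hypothetical sketch of a DAF "Core Manager": a small, dockerizable HTTP
# microservice exposing dataset metadata from the data lake over a REST-ish
# API. Endpoints and catalog contents are invented for illustration.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Toy in-memory catalog standing in for the platform's metadata store.
CATALOG = {
    "public-contracts": {"owner": "ANAC", "format": "csv"},
    "neighborhood-map": {"owner": "Comune di Torino", "format": "geojson"},
}

class CatalogHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # GET /datasets      -> sorted list of dataset ids
        # GET /datasets/<id> -> metadata for one dataset
        parts = [p for p in self.path.split("/") if p]
        if parts and parts[0] == "datasets":
            if len(parts) == 1:
                body = json.dumps(sorted(CATALOG))
            elif len(parts) == 2 and parts[1] in CATALOG:
                body = json.dumps(CATALOG[parts[1]])
            else:
                self.send_error(404)
                return
            data = body.encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)
        else:
            self.send_error(404)

    def log_message(self, fmt, *args):
        pass  # silence per-request logging

def serve(port=8080):
    """Block forever serving the catalog on the given port."""
    HTTPServer(("127.0.0.1", port), CatalogHandler).serve_forever()
```

In the DAF design the Kubernetes cluster would run many such containers side by side, with the unique identity management system handling authentication in front of them.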
15. Machine Learning Based Applications (aka Data Products)
Lex Datafication & Citizen Assistant, Fraud Detection, Citizen
Recommendation Engine, Spending Check, Leading Indicators, etc.
Data Visualization
Thematic dashboards and infographics for citizens and firms
API for Interoperability and Open Data
Easy and standard access to data for PP.AA. and citizens
And much more… The limit is imagination
Smart city, analysis for data driven policy making, etc.
What can be done? (examples)
16. Platform, Platform, Platform
Enhance UX/UI and functionality of the dataportal; API & bulk download; security and role management; ingestion & standardization procedures; scale the cluster.
Data quality, standards & Open Data
Implement the concept of a "standard dataset"; fight the entropy of Open Data; Open Data in SaaS for all PAs; ontologies & controlled vocabularies in the Big Data platform.
Data Hackathon
We are organizing a hackathon to showcase DAF and the value of open data with civic hackers in solving business and social problems. Stay tuned!
What are we doing right now?
17. Multi-Event Hackathon for Data Science and Social Good
● Online Hackathon: July 8th to September 23rd
● Onsite Hackathon: October, 20th - 21st
Two types of challenge
● Data Science Challenge: machine-learning model building to solve a real business problem → prize: > €5,000
● Civic Challenge: challenge focused on social good and data economy topics
Hack.Data - Save the Date!
hack.data.italia.it
18. Relations with PAs & Use Cases
Onboarding partner PAs; revision of their open data policies and procedures; data stories; data science prototypes
→ Use Case: Neighborhood Map
Data application that shows on a map the services, facts and a synthetic quality-of-life index for each neighborhood. Working with Turin, Milan, Rome.
→ Use Case: Analysis of public contracts & "enterprise suggestor"
Analysis of the public contracts dataset managed by ANAC. This led, among other things, to a data application that suggests the companies most compatible with a given contract a PA may want to award. Our initiative.
What are we doing right now?
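The "enterprise suggestor" ranks companies by compatibility with a contract. As a toy illustration of the underlying idea, not the actual ANAC-based implementation, with company names and keywords invented, compatibility can be reduced to keyword overlap between a contract description and each company's past-contract profile:

```python
# Hypothetical sketch of an "enterprise suggestor": rank companies by how
# compatible they are with a contract a PA wants to award. Compatibility is
# reduced to Jaccard similarity between keyword sets; all data is invented.
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two keyword sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def suggest_companies(contract_keywords, company_profiles, top_n=3):
    """Return the top_n (company, score) pairs by keyword overlap."""
    scored = [
        (name, jaccard(contract_keywords, keywords))
        for name, keywords in company_profiles.items()
    ]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))  # score desc, name asc
    return scored[:top_n]

# Invented profiles: keywords extracted from each company's past contracts.
profiles = {
    "Alfa Costruzioni": ["strade", "asfalto", "manutenzione", "ponti"],
    "Beta Informatica": ["software", "cloud", "manutenzione", "sistemi"],
    "Gamma Verde": ["giardini", "potatura", "manutenzione", "parchi"],
}
```

A real system would replace the keyword sets with richer features mined from the ANAC contracts dataset, but the ranking structure stays the same.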
19. → Use Case: Document Classifier
Data application based on a trained neural network for automatic classification of documents, normally done manually by the "ufficio protocollo". Requested by Regione Toscana.
→ Use Case: Social Media Monitor and sentiment analysis
Data application to understand public sentiment on specific fields/topics of interest. Requested by Regione Toscana.
What are we doing right now?
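The document classifier use case relies on a trained neural network. As an illustrative stand-in under much simpler assumptions, with invented routing categories and training snippets and no relation to the Regione Toscana system, the same routing task can be sketched with a tiny multinomial naive Bayes classifier on bag-of-words features:

```python
# Hypothetical sketch of automatic document routing (the "ufficio protocollo"
# task) using a minimal multinomial naive Bayes classifier with Laplace
# smoothing. Categories and training texts are invented for illustration.
import math
from collections import Counter, defaultdict

def tokenize(text):
    return text.lower().split()

class NaiveBayesClassifier:
    def __init__(self):
        self.word_counts = defaultdict(Counter)  # class -> word frequencies
        self.class_counts = Counter()            # class -> document count
        self.vocab = set()

    def train(self, labeled_docs):
        for text, label in labeled_docs:
            self.class_counts[label] += 1
            for w in tokenize(text):
                self.word_counts[label][w] += 1
                self.vocab.add(w)

    def predict(self, text):
        words = tokenize(text)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            score = math.log(n_docs / total_docs)  # log prior
            n_words = sum(self.word_counts[label].values())
            for w in words:
                # Laplace-smoothed log likelihood
                score += math.log(
                    (self.word_counts[label][w] + 1) / (n_words + len(self.vocab))
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Invented training data: two illustrative routing categories.
docs = [
    ("richiesta permesso di costruire terreno edilizia", "urbanistica"),
    ("autorizzazione cantiere edilizia comunale", "urbanistica"),
    ("pagamento fattura bilancio tributi", "ragioneria"),
    ("rimborso tributi pagamento imposta", "ragioneria"),
]
clf = NaiveBayesClassifier()
clf.train(docs)
```

A production system would swap this for the trained neural network the slide mentions, but the interface — train on labeled documents, predict an office for each incoming one — is the same.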
20. Raffaele Lillo
Chief Data Officer
raffaele@teamdigitale.governo.it
Twitter, Medium: @lilloraffa
—
Grazie!
Collaborate with us, please :)
Website
http://teamdigitale.governo.it
Forum
https://forum.italia.it/c/daf
Twitter
#DatiPubblici #DAF