© Cloudera, Inc. All rights reserved.
Data for Good
Cloudera Cares &
DataKind Meetup
7 May 2015
Cloudera Cares:
An employee-led and employee-driven organization
• Launched in January 2014
• 1,400 employee hours donated in 2014
• $70k+ donated in 2014
• 20+ organizations to date
Doug Cutting participating in the
BORP Revolution Ride to help raise
funds for adaptive sports gear for
the physically challenged.
Pax Data
Doug Cutting | Chief Architect & Co-Founder
Hadoop started a revolution
Now we’re winning the war
How shall we govern the peace?
We must not be tyrants
We should use our power for good
Good: Education
Good: Healthcare
Good: Climate
How can we be trusted?
Trust: Transparency
Trust: Best practices
Trust: Define abuses
Trust: Oversight
Global effort
Our duty as professionals
Thank you!
@cutting
Cloudera Academic Partnership
Amr Awadallah | CTO & VP of Engineering
@awadallah
Cloudera + Higher Education
Cloudera Academic
Partnership: Overview
Impact:
Curriculum Provided
“We were able to jumpstart an Introduction to Big
Data Analytics course thanks to the support of
Cloudera. The materials provided, including the
lab setup, are integral to the class.”
Impact:
Enterprise Grade
Cloudera Manager
Legacy systems were preventing our labs from
mapping their genome sequences in a timely
manner. Our partnership with Cloudera will cut
the time required by scientists to deliver data
from weeks to days and, eventually, to hours.
Thank You
Get involved with the Cloudera Academic Partnership:
academic_partnerships@cloudera.com
DOING GOOD WITH DATA
@duncan3ross @DataKindUK
• DataKind UK is a charity that believes we can make the world better
by using data
• We work by linking data volunteers (you) with charities
COME AND JOIN DATAKIND
DATAKIND UK TODAY
[Infographic: figures shown are 808, 2, £850K, 6,850, 25 and 6; their labels were not preserved]
WHO HAVE WE WORKED WITH?
Children
Education
Health
Young people
Advice and support
International and community
We are hiring!
London DataDive
17-19 July
Volunteers wanted
Join us: http://www.meetup.com/DataKind-UK/
THANK YOU
CITIZENS ADVICE &
Ian Ansell, Peter Passaro,
Henry Simms & Billy Wong
318 member bureaux in England and Wales (F2F,
phone, web-chat, email/letter)
2,500+ regular community locations
1,000+ ad-hoc locations
Consumer advice service (phone, email/letter)
in England, Wales and Scotland
Our website ‘Adviceguide’ providing extensive
self-help information on a wide range of topics.
2013/14
Our services
Lots of delicious data
1. Bureau Statistics
2. Bureau Evidence Forms (BEFs)
3. Web data on the Adviceguide
BUREAU ISSUE STATS
ADVICEGUIDE STATS
BUREAU ISSUE &
PROFILE STATS
The Problem
Could data science enable Citizens Advice to anticipate or
even predict changes in the issues affecting people
every day, to act sooner to prevent problems escalating?
Identifying spikes and new issues - where are the next payday loans?
The Project
1. To design a tool to harness Citizens Advice’s data so
they could better identify and react to emerging social
issues in the UK.
2. To build awareness among Citizens Advice staff of new
methods for mining and using data, and opening up the
data to staff and others.
● Original brief: Develop an Issues Early Warning
System to find the next “payday loans”
● Run two DataDives to explore the data and find
different approaches to the problem
● Run longer-term DataCorps to make sense of the
DataDive findings and develop a solution
The DataDive Experience Day 1:
I can solve all the problems
of the world with my
AWESOME DATA SCIENTIST POWERS!
The DataDive Experience Day 2:
Why are all these null values here?!?!
DataDive 1: What do we do with all
this delicious data?
● Bureau Statistics (Visitors and their Issues)
● Bureau Evidence Forms
● Google Analytics
What is the central theme across the organisation?
Issue Codes!
Bureau Statistics
● Timestamp
● Issue Code
● Bureau ID
● Client ID
~2M visits/yr
~6M issues/yr
Trends & Issues Exploration
Evidence Forms
● Timestamp
● Issue Code
● Bureau ID
● Client ID
● 6 Text Fields
● ~40 Demographic Fields
~50K Forms/yr
Topic Analysis & Issues Exploration
Google Analytics
● Timestamp
● NO ISSUE CODE!
● Sessions
● Users
● New Users
~16M Unique Users
Issue Code Labelling & Data Pipelining
CAB DataCorps Project: How do we take the DataDive
work forward?
● Grand Ambition - build a prediction engine
● Needed trends across all three data types
● Evidence Forms - Better Topic Modelling
● Bureau Statistics - Look for emerging issues
● Google Analytics Data - Issue code labelling and pipeline
completion
● User Interface
DataDive 2
Citizens Advice shares their data with:
● St Mungo’s Broadway
● Northeast Child Poverty Action Committee
Elasticsearch and Kibana Save the
Day
- Struggling to get good predictions because of a
lack of contextual data
- Trend analysis was difficult because of changes
in data collection
- We already had all the evidence forms in
Elasticsearch for topic analysis
- Volunteer Ian Huston (Pivotal) started using
Kibana to explore the data
Focus Becomes the Dashboard
Final data clean up and normalisation
● Put everything into Elasticsearch
● Normalise issue codes across all 3 data types
● Other minor field normalisation
● Enrich geo data for bureau visits and evidence forms
● Evidence forms - full topic modelling
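The normalisation step might look roughly like the sketch below. It is only an illustration: the timestamp, issue code and bureau ID fields come from the slides, but the `page_path` field, the prefix table, the source names and the shared document shape are all assumptions made for the example, and no Elasticsearch client is used.

```python
# Hypothetical prefix-to-code table: Google Analytics rows carry no issue
# code, so one has to be derived, here from an assumed page path field.
PATH_PREFIX_TO_ISSUE = {
    "/debt/": "DEBT",
    "/benefits/": "BENEFITS",
}


def normalise_issue_code(raw):
    """Bring an issue code into one canonical form (trimmed, upper case)."""
    return raw.strip().upper()


def label_page(page_path):
    """Return the issue code for a web page path, or None if no prefix matches."""
    for prefix, code in PATH_PREFIX_TO_ISSUE.items():
        if page_path.startswith(prefix):
            return code
    return None


def to_document(source, row):
    """Map a row from any of the three sources onto one shared document shape."""
    if source == "web":
        issue = label_page(row["page_path"])
    else:
        issue = normalise_issue_code(row["issue_code"])
    return {
        "source": source,
        "timestamp": row["timestamp"],
        "issue_code": issue,
        "bureau_id": row.get("bureau_id"),  # absent for web data
    }


# Made-up rows, one per data type.
docs = [
    to_document("bureau_stats",
                {"timestamp": "2014-03-01", "issue_code": " debt ", "bureau_id": 12}),
    to_document("evidence_form",
                {"timestamp": "2014-03-02", "issue_code": "Benefits", "bureau_id": 7}),
    to_document("web",
                {"timestamp": "2014-03-03", "page_path": "/debt/payday-loans"}),
]
```

Once every record shares the same `issue_code` field, a single dashboard filter can cut across bureau visits, evidence forms and web traffic.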
The Dashboard
Demo of the dashboard
https://drive.google.com/file/d/0B0X-Agv6DH0GZGJMbEtQdE5qUTQ/view?usp=sharing
Relationships between Issues
Motivation
● At least 30% of the CAB’s usage is by repeat
clients
● If we can offer preventive advice, we can reduce
cost and provide better service
Modelling the problem...
● Lift(B => A)
o Given B, how much more likely is A?
o = P(A|B)/P(A)
o = P(A and B)/(P(A)*P(B))
● All of the probabilities can be estimated* from case
history for each client
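As a rough illustration of the lift formula above, the sketch below estimates the probabilities as simple client frequencies. The client IDs, issue names and the `lift_table` helper are invented for the example; the project's actual estimation method is not described in the slides.

```python
from collections import Counter
from itertools import permutations


def lift_table(client_issues):
    """Estimate Lift(B => A) = P(A and B) / (P(A) * P(B)) for every issue pair.

    client_issues maps a client ID to the list of issues in that client's
    case history. Probabilities are fractions of clients.
    """
    n_clients = len(client_issues)
    single = Counter()  # clients who raised each issue at least once
    pair = Counter()    # clients who raised both issues of an ordered pair
    for issues in client_issues.values():
        unique = sorted(set(issues))
        single.update(unique)
        pair.update(permutations(unique, 2))

    lifts = {}
    for (b, a), n_ab in pair.items():
        p_a = single[a] / n_clients
        p_b = single[b] / n_clients
        p_ab = n_ab / n_clients
        lifts[(b, a)] = p_ab / (p_a * p_b)
    return lifts


# Made-up case histories for four clients.
history = {
    "c1": ["debt", "benefits"],
    "c2": ["debt", "benefits"],
    "c3": ["debt"],
    "c4": ["housing"],
}
lifts = lift_table(history)
debt_to_benefits = lifts[("debt", "benefits")]
```

A lift above 1 means the two issues co-occur more often than independence would predict. Computed this way the measure is symmetric in A and B, because the order of events is ignored.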
Time matters
● There is a temporal element to the issue counts (i.e. A must
follow B)
● If two issues happen two years apart, intuitively we would think
that the link between them is not as strong as that between two
issues that are two weeks apart
o Use exponential decay to model the “aging” of the count
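One way to sketch the decay idea: each "A follows B" pair contributes a weight that halves with every half-life of elapsed time, instead of a full count of 1. The 90-day half-life, the function names and the sample dates are assumptions for illustration, not values from the project.

```python
import math
from datetime import date


def decayed_weight(date_b, date_a, half_life_days=90.0):
    """Weight for 'A follows B': 1.0 on the same day, halving every half-life.

    Returns 0.0 when A happened before B, since A must follow B.
    """
    gap_days = (date_a - date_b).days
    if gap_days < 0:
        return 0.0
    return math.exp(-math.log(2) * gap_days / half_life_days)


def decayed_pair_count(events, issue_b, issue_a, half_life_days=90.0):
    """Sum the decayed weights over one client's (B, A) event pairs.

    events is a list of (issue, date) tuples. Assumes issue_a != issue_b.
    """
    total = 0.0
    for code_b, when_b in events:
        if code_b != issue_b:
            continue
        for code_a, when_a in events:
            if code_a == issue_a:
                total += decayed_weight(when_b, when_a, half_life_days)
    return total


# Made-up history: one follow-up two weeks later, another two years later.
events = [
    ("debt", date(2014, 1, 1)),
    ("benefits", date(2014, 1, 15)),
    ("benefits", date(2016, 1, 1)),
]
count = decayed_pair_count(events, "debt", "benefits")
```

In this example the two-week follow-up contributes most of `count`, while the two-year follow-up adds very little. Using decayed counts in place of raw counts makes the resulting lift directional, so Lift(B => A) and Lift(A => B) can differ.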
Demo
Tools used - all open source
● Programming language - Python
● Statistics - Scipy
● Graph analysis - Networkx
● Web framework - Spyre
● Graph visualisation - D3.js
The Future
Dashboard and app
● gives us a comprehensive view of all our data
● helps to spot emerging issues and explore our
hunches
Implementation
● being integrated into the Citizens Advice system
New insights already discovered
● Adviceguide Consumer section hiding key details
o just how big an issue fuel and utilities are
● Bipolar keeps cropping up in BEFs around the issues of
debt
So much more than a dashboard
New analysis techniques learnt & new technologies
introduced
Excitement about data
● Kibana dashboard showcased and loved
● Could be replacing core systems, watch this space...
● Democratised our data - staff can access and play with
it
● Now, how about delivering data to the bureaux?
Citizens Advice is in love with data!
display-screen.cab-alpha.org.uk
Project Credits
DataKind:
● Emma Prest - General Manager
● Duncan Ross - Founder UK Branch
Original Data Ambassadors:
● Iago Martinez
● Arturo Sanchez Correa
● Peter Passaro
Volunteers:
● Henry Simms
● Billy Wong
● Sam Leach
● Emmanuel Lazardis
CAB Support:
● Laura Bunt
● Pete Watson
● Ian Ansell
About 30 additional volunteers who contributed at various stages!
Elasticsearch and General Data Hosting:
Google Analytics Pipelining:
Advice and Support:
Funding:
(Alan Hardy & Livia Froelicher)

More Related Content

What's hot

What's new in apache hive
What's new in apache hive What's new in apache hive
What's new in apache hive
DataWorks Summit
 
LLAP: long-lived execution in Hive
LLAP: long-lived execution in HiveLLAP: long-lived execution in Hive
LLAP: long-lived execution in Hive
DataWorks Summit
 
Kylin Engineering Principles
Kylin Engineering PrinciplesKylin Engineering Principles
Kylin Engineering Principles
Xu Jiang
 
Lessons Learned from Migration of a Large-analytics Platform from MPP Databas...
Lessons Learned from Migration of a Large-analytics Platform from MPP Databas...Lessons Learned from Migration of a Large-analytics Platform from MPP Databas...
Lessons Learned from Migration of a Large-analytics Platform from MPP Databas...
DataWorks Summit
 
Managing Apache HAWQ with Apache AMBARI
Managing Apache HAWQ with Apache AMBARIManaging Apache HAWQ with Apache AMBARI
Managing Apache HAWQ with Apache AMBARI
Mithun (Matt) Mathew
 
Combining Machine Learning frameworks with Apache Spark
Combining Machine Learning frameworks with Apache SparkCombining Machine Learning frameworks with Apache Spark
Combining Machine Learning frameworks with Apache Spark
DataWorks Summit/Hadoop Summit
 
Hadoop Summit Europe 2015 - YARN Present and Future
Hadoop Summit Europe 2015 - YARN Present and FutureHadoop Summit Europe 2015 - YARN Present and Future
Hadoop Summit Europe 2015 - YARN Present and Future
Vinod Kumar Vavilapalli
 
The Oracle Autonomous Database
The Oracle Autonomous DatabaseThe Oracle Autonomous Database
The Oracle Autonomous Database
Connor McDonald
 
Apache Kylin Use Cases in China and Japan
Apache Kylin Use Cases in China and JapanApache Kylin Use Cases in China and Japan
Apache Kylin Use Cases in China and Japan
Luke Han
 
Running Non-MapReduce Big Data Applications on Apache Hadoop
Running Non-MapReduce Big Data Applications on Apache HadoopRunning Non-MapReduce Big Data Applications on Apache Hadoop
Running Non-MapReduce Big Data Applications on Apache Hadoop
hitesh1892
 
Empower Hive with Spark
Empower Hive with SparkEmpower Hive with Spark
Empower Hive with Spark
DataWorks Summit
 
Getting Spark ready for real-time, operational analytics
Getting Spark ready for real-time, operational analyticsGetting Spark ready for real-time, operational analytics
Getting Spark ready for real-time, operational analytics
airisData
 
Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It!
Cécile Poyet
 
What’s new in Apache Spark 2.3
What’s new in Apache Spark 2.3What’s new in Apache Spark 2.3
What’s new in Apache Spark 2.3
DataWorks Summit
 
Spark Technology Center IBM
Spark Technology Center IBMSpark Technology Center IBM
Spark Technology Center IBM
DataWorks Summit/Hadoop Summit
 
Real Time Machine Learning Visualization with Spark
Real Time Machine Learning Visualization with SparkReal Time Machine Learning Visualization with Spark
Real Time Machine Learning Visualization with Spark
DataWorks Summit/Hadoop Summit
 
Analyzing the World's Largest Security Data Lake!
Analyzing the World's Largest Security Data Lake!Analyzing the World's Largest Security Data Lake!
Analyzing the World's Largest Security Data Lake!
DataWorks Summit
 
6Reinventing Oracle Systems in a Cloudy World (Sangam20, December 2020)
6Reinventing Oracle Systems in a Cloudy World (Sangam20, December 2020)6Reinventing Oracle Systems in a Cloudy World (Sangam20, December 2020)
6Reinventing Oracle Systems in a Cloudy World (Sangam20, December 2020)
Lucas Jellema
 
Triple C - Centralize, Cloudify and Consolidate Dozens of Oracle Databases (O...
Triple C - Centralize, Cloudify and Consolidate Dozens of Oracle Databases (O...Triple C - Centralize, Cloudify and Consolidate Dozens of Oracle Databases (O...
Triple C - Centralize, Cloudify and Consolidate Dozens of Oracle Databases (O...
Lucas Jellema
 
How to Boost 100x Performance for Real World Application with Apache Spark-(G...
How to Boost 100x Performance for Real World Application with Apache Spark-(G...How to Boost 100x Performance for Real World Application with Apache Spark-(G...
How to Boost 100x Performance for Real World Application with Apache Spark-(G...
Spark Summit
 

What's hot (20)

What's new in apache hive
What's new in apache hive What's new in apache hive
What's new in apache hive
 
LLAP: long-lived execution in Hive
LLAP: long-lived execution in HiveLLAP: long-lived execution in Hive
LLAP: long-lived execution in Hive
 
Kylin Engineering Principles
Kylin Engineering PrinciplesKylin Engineering Principles
Kylin Engineering Principles
 
Lessons Learned from Migration of a Large-analytics Platform from MPP Databas...
Lessons Learned from Migration of a Large-analytics Platform from MPP Databas...Lessons Learned from Migration of a Large-analytics Platform from MPP Databas...
Lessons Learned from Migration of a Large-analytics Platform from MPP Databas...
 
Managing Apache HAWQ with Apache AMBARI
Managing Apache HAWQ with Apache AMBARIManaging Apache HAWQ with Apache AMBARI
Managing Apache HAWQ with Apache AMBARI
 
Combining Machine Learning frameworks with Apache Spark
Combining Machine Learning frameworks with Apache SparkCombining Machine Learning frameworks with Apache Spark
Combining Machine Learning frameworks with Apache Spark
 
Hadoop Summit Europe 2015 - YARN Present and Future
Hadoop Summit Europe 2015 - YARN Present and FutureHadoop Summit Europe 2015 - YARN Present and Future
Hadoop Summit Europe 2015 - YARN Present and Future
 
The Oracle Autonomous Database
The Oracle Autonomous DatabaseThe Oracle Autonomous Database
The Oracle Autonomous Database
 
Apache Kylin Use Cases in China and Japan
Apache Kylin Use Cases in China and JapanApache Kylin Use Cases in China and Japan
Apache Kylin Use Cases in China and Japan
 
Running Non-MapReduce Big Data Applications on Apache Hadoop
Running Non-MapReduce Big Data Applications on Apache HadoopRunning Non-MapReduce Big Data Applications on Apache Hadoop
Running Non-MapReduce Big Data Applications on Apache Hadoop
 
Empower Hive with Spark
Empower Hive with SparkEmpower Hive with Spark
Empower Hive with Spark
 
Getting Spark ready for real-time, operational analytics
Getting Spark ready for real-time, operational analyticsGetting Spark ready for real-time, operational analytics
Getting Spark ready for real-time, operational analytics
 
Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It! Boost Performance with Scala – Learn From Those Who’ve Done It!
Boost Performance with Scala – Learn From Those Who’ve Done It!
 
What’s new in Apache Spark 2.3
What’s new in Apache Spark 2.3What’s new in Apache Spark 2.3
What’s new in Apache Spark 2.3
 
Spark Technology Center IBM
Spark Technology Center IBMSpark Technology Center IBM
Spark Technology Center IBM
 
Real Time Machine Learning Visualization with Spark
Real Time Machine Learning Visualization with SparkReal Time Machine Learning Visualization with Spark
Real Time Machine Learning Visualization with Spark
 
Analyzing the World's Largest Security Data Lake!
Analyzing the World's Largest Security Data Lake!Analyzing the World's Largest Security Data Lake!
Analyzing the World's Largest Security Data Lake!
 
6Reinventing Oracle Systems in a Cloudy World (Sangam20, December 2020)
6Reinventing Oracle Systems in a Cloudy World (Sangam20, December 2020)6Reinventing Oracle Systems in a Cloudy World (Sangam20, December 2020)
6Reinventing Oracle Systems in a Cloudy World (Sangam20, December 2020)
 
Triple C - Centralize, Cloudify and Consolidate Dozens of Oracle Databases (O...
Triple C - Centralize, Cloudify and Consolidate Dozens of Oracle Databases (O...Triple C - Centralize, Cloudify and Consolidate Dozens of Oracle Databases (O...
Triple C - Centralize, Cloudify and Consolidate Dozens of Oracle Databases (O...
 
How to Boost 100x Performance for Real World Application with Apache Spark-(G...
How to Boost 100x Performance for Real World Application with Apache Spark-(G...How to Boost 100x Performance for Real World Application with Apache Spark-(G...
How to Boost 100x Performance for Real World Application with Apache Spark-(G...
 

Viewers also liked

Rimsa_phd_thesis_2013
Rimsa_phd_thesis_2013Rimsa_phd_thesis_2013
Rimsa_phd_thesis_2013
Vadim Rimsa
 
Metodos na geo fisica
Metodos na geo fisicaMetodos na geo fisica
Metodos na geo fisica
Cleyciane Rodrigues
 
Sidney Matos Portifolio 2010
Sidney Matos   Portifolio 2010Sidney Matos   Portifolio 2010
Sidney Matos Portifolio 2010
Sidney Matos
 
Bipolar
BipolarBipolar
Bipolar
Emma Smith
 
Compiled Python UDFs for Impala
Compiled Python UDFs for ImpalaCompiled Python UDFs for Impala
Compiled Python UDFs for Impala
Cloudera, Inc.
 
Troubleshooting Using Cloudera Manager #cwt2015
Troubleshooting Using Cloudera Manager #cwt2015Troubleshooting Using Cloudera Manager #cwt2015
Troubleshooting Using Cloudera Manager #cwt2015
Cloudera Japan
 
Risk Management for Data: Secured and Governed
Risk Management for Data: Secured and GovernedRisk Management for Data: Secured and Governed
Risk Management for Data: Secured and Governed
Cloudera, Inc.
 
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data HubCloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
Cloudera, Inc.
 
SAQ by KR
SAQ by KRSAQ by KR
Desmayo... ¿Cuándo es peligroso?
Desmayo... ¿Cuándo es peligroso?Desmayo... ¿Cuándo es peligroso?
Desmayo... ¿Cuándo es peligroso?
UNESAR - Unidad Especializada de Arritmias
 
Prostatic artery embolization
Prostatic artery embolizationProstatic artery embolization
Prostatic artery embolization
PAIRS WEB
 
TRAUMATOLOGIA del Hombro Dr miguel Mite
TRAUMATOLOGIA del Hombro Dr miguel MiteTRAUMATOLOGIA del Hombro Dr miguel Mite
TRAUMATOLOGIA del Hombro Dr miguel Mite
tatiigomez1
 
Cuidados de enfermería en el tratamiento de ablación por radiofrecuencia del ...
Cuidados de enfermería en el tratamiento de ablación por radiofrecuencia del ...Cuidados de enfermería en el tratamiento de ablación por radiofrecuencia del ...
Cuidados de enfermería en el tratamiento de ablación por radiofrecuencia del ...
Clínica Universidad de Navarra
 
Hygiene Theory
Hygiene TheoryHygiene Theory
Hygiene Theory
Anuj Gandhi
 
Presentation1, radiological imaging of hyperparathyroidism.
Presentation1, radiological imaging of hyperparathyroidism.Presentation1, radiological imaging of hyperparathyroidism.
Presentation1, radiological imaging of hyperparathyroidism.
Abdellah Nazeer
 
Lesikar's Business communication presentation
Lesikar's Business communication presentationLesikar's Business communication presentation
Lesikar's Business communication presentation
Picard Bangladesh Limited
 
Lesikar's Business Communication
Lesikar's Business CommunicationLesikar's Business Communication
Lesikar's Business Communication
Picard Bangladesh Limited
 
Chapter 1,2,3,4 notes
Chapter 1,2,3,4 notesChapter 1,2,3,4 notes
Chapter 1,2,3,4 notes
Aruna M
 

Viewers also liked (18)

Rimsa_phd_thesis_2013
Rimsa_phd_thesis_2013Rimsa_phd_thesis_2013
Rimsa_phd_thesis_2013
 
Metodos na geo fisica
Metodos na geo fisicaMetodos na geo fisica
Metodos na geo fisica
 
Sidney Matos Portifolio 2010
Sidney Matos   Portifolio 2010Sidney Matos   Portifolio 2010
Sidney Matos Portifolio 2010
 
Bipolar
BipolarBipolar
Bipolar
 
Compiled Python UDFs for Impala
Compiled Python UDFs for ImpalaCompiled Python UDFs for Impala
Compiled Python UDFs for Impala
 
Troubleshooting Using Cloudera Manager #cwt2015
Troubleshooting Using Cloudera Manager #cwt2015Troubleshooting Using Cloudera Manager #cwt2015
Troubleshooting Using Cloudera Manager #cwt2015
 
Risk Management for Data: Secured and Governed
Risk Management for Data: Secured and GovernedRisk Management for Data: Secured and Governed
Risk Management for Data: Secured and Governed
 
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data HubCloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
Cloudera Federal Forum 2014: The Building Blocks of the Enterprise Data Hub
 
SAQ by KR
SAQ by KRSAQ by KR
SAQ by KR
 
Desmayo... ¿Cuándo es peligroso?
Desmayo... ¿Cuándo es peligroso?Desmayo... ¿Cuándo es peligroso?
Desmayo... ¿Cuándo es peligroso?
 
Prostatic artery embolization
Prostatic artery embolizationProstatic artery embolization
Prostatic artery embolization
 
TRAUMATOLOGIA del Hombro Dr miguel Mite
TRAUMATOLOGIA del Hombro Dr miguel MiteTRAUMATOLOGIA del Hombro Dr miguel Mite
TRAUMATOLOGIA del Hombro Dr miguel Mite
 
Cuidados de enfermería en el tratamiento de ablación por radiofrecuencia del ...
Cuidados de enfermería en el tratamiento de ablación por radiofrecuencia del ...Cuidados de enfermería en el tratamiento de ablación por radiofrecuencia del ...
Cuidados de enfermería en el tratamiento de ablación por radiofrecuencia del ...
 
Hygiene Theory
Hygiene TheoryHygiene Theory
Hygiene Theory
 
Presentation1, radiological imaging of hyperparathyroidism.
Presentation1, radiological imaging of hyperparathyroidism.Presentation1, radiological imaging of hyperparathyroidism.
Presentation1, radiological imaging of hyperparathyroidism.
 
Lesikar's Business communication presentation
Lesikar's Business communication presentationLesikar's Business communication presentation
Lesikar's Business communication presentation
 
Lesikar's Business Communication
Lesikar's Business CommunicationLesikar's Business Communication
Lesikar's Business Communication
 
Chapter 1,2,3,4 notes
Chapter 1,2,3,4 notesChapter 1,2,3,4 notes
Chapter 1,2,3,4 notes
 

Similar to Cloudera Cares + DataKind | 7 May 2015 | London, UK

Webinar on Big Data Challenges : Presented by Raj Kasturi
Webinar on Big Data Challenges : Presented by Raj KasturiWebinar on Big Data Challenges : Presented by Raj Kasturi
Webinar on Big Data Challenges : Presented by Raj Kasturi
oGuild .
 
¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?
Denodo
 
Advanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationAdvanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data Virtualization
Denodo
 
Self-Service Analytics with Guard Rails
Self-Service Analytics with Guard RailsSelf-Service Analytics with Guard Rails
Self-Service Analytics with Guard Rails
Denodo
 
SC6 Workshop 1: Big Data Europe platform requirements and draft architecture:...
SC6 Workshop 1: Big Data Europe platform requirements and draft architecture:...SC6 Workshop 1: Big Data Europe platform requirements and draft architecture:...
SC6 Workshop 1: Big Data Europe platform requirements and draft architecture:...
BigData_Europe
 
Big Data Everywhere Chicago: Platfora - Practices for Customer Analytics on H...
Big Data Everywhere Chicago: Platfora - Practices for Customer Analytics on H...Big Data Everywhere Chicago: Platfora - Practices for Customer Analytics on H...
Big Data Everywhere Chicago: Platfora - Practices for Customer Analytics on H...
BigDataEverywhere
 
Unit-I_Big data life cycle.pptx, sources of Big Data
Unit-I_Big data life cycle.pptx, sources of Big DataUnit-I_Big data life cycle.pptx, sources of Big Data
Unit-I_Big data life cycle.pptx, sources of Big Data
RajendraKankrale1
 
Put Alternative Data to Use in Capital Markets

Put Alternative Data to Use in Capital Markets
Put Alternative Data to Use in Capital Markets

Put Alternative Data to Use in Capital Markets

Cloudera, Inc.
 
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
Denodo
 
The LCG Digital Transformation Maturity Model
The LCG Digital Transformation Maturity ModelThe LCG Digital Transformation Maturity Model
The LCG Digital Transformation Maturity Model
Lima Consulting Group
 
Big Data Ecosystem @ LinkedIn
Big Data Ecosystem @ LinkedInBig Data Ecosystem @ LinkedIn
Big Data Ecosystem @ LinkedIn
Minh-Hoang Nguyen
 
Big-Data-Seminar-6-Aug-2014-Koenig
Big-Data-Seminar-6-Aug-2014-KoenigBig-Data-Seminar-6-Aug-2014-Koenig
Big-Data-Seminar-6-Aug-2014-Koenig
Manish Chopra
 
Building a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business GoalsBuilding a Data Strategy – Practical Steps for Aligning with Business Goals
Building a Data Strategy – Practical Steps for Aligning with Business Goals
DATAVERSITY
 
Think Big - How to Design a Big Data Information Architecture
Think Big - How to Design a Big Data Information ArchitectureThink Big - How to Design a Big Data Information Architecture
Think Big - How to Design a Big Data Information Architecture
Inside Analysis
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
SpringPeople
 
Knowledge Matters Issue 15 - Technology at Concern
Knowledge Matters Issue 15 - Technology at ConcernKnowledge Matters Issue 15 - Technology at Concern
Knowledge Matters Issue 15 - Technology at Concern
Ellen Ward
 
The Path to Data and Analytics Modernization
The Path to Data and Analytics ModernizationThe Path to Data and Analytics Modernization
The Path to Data and Analytics Modernization
Analytics8
 
Creating your Center of Excellence (CoE) for data driven use cases
Creating your Center of Excellence (CoE) for data driven use casesCreating your Center of Excellence (CoE) for data driven use cases
Creating your Center of Excellence (CoE) for data driven use cases
Frank Vullers
 
Your Data Nerd Friends Need You!
Your Data Nerd Friends Need You!Your Data Nerd Friends Need You!
Your Data Nerd Friends Need You!
DataKitchen
 
Breed data scientists_ A Presentation.pptx
Breed data scientists_ A Presentation.pptxBreed data scientists_ A Presentation.pptx
Breed data scientists_ A Presentation.pptx
GautamPopli1
 

Similar to Cloudera Cares + DataKind | 7 May 2015 | London, UK (20)

Webinar on Big Data Challenges : Presented by Raj Kasturi
Webinar on Big Data Challenges : Presented by Raj KasturiWebinar on Big Data Challenges : Presented by Raj Kasturi
Webinar on Big Data Challenges : Presented by Raj Kasturi
 
¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?¿En qué se parece el Gobierno del Dato a un parque de atracciones?
¿En qué se parece el Gobierno del Dato a un parque de atracciones?
 
Advanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data VirtualizationAdvanced Analytics and Machine Learning with Data Virtualization
Advanced Analytics and Machine Learning with Data Virtualization
 
Self-Service Analytics with Guard Rails
Self-Service Analytics with Guard RailsSelf-Service Analytics with Guard Rails
Self-Service Analytics with Guard Rails
 
SC6 Workshop 1: Big Data Europe platform requirements and draft architecture:...
SC6 Workshop 1: Big Data Europe platform requirements and draft architecture:...SC6 Workshop 1: Big Data Europe platform requirements and draft architecture:...
SC6 Workshop 1: Big Data Europe platform requirements and draft architecture:...
 
Big Data Everywhere Chicago: Platfora - Practices for Customer Analytics on H...
Big Data Everywhere Chicago: Platfora - Practices for Customer Analytics on H...Big Data Everywhere Chicago: Platfora - Practices for Customer Analytics on H...
Big Data Everywhere Chicago: Platfora - Practices for Customer Analytics on H...
 
Unit-I_Big data life cycle.pptx, sources of Big Data
Unit-I_Big data life cycle.pptx, sources of Big DataUnit-I_Big data life cycle.pptx, sources of Big Data
Unit-I_Big data life cycle.pptx, sources of Big Data
 
Put Alternative Data to Use in Capital Markets

Put Alternative Data to Use in Capital Markets
Put Alternative Data to Use in Capital Markets

Put Alternative Data to Use in Capital Markets

 
Implementar una estrategia eficiente de gobierno y seguridad del dato con la ...
Cloudera Cares + DataKind | 7 May 2015 | London, UK

  • 1. 1© Cloudera, Inc. All rights reserved. Data for Good
  • 2. Cloudera Cares & DataKind Meetup, 7 May 2015
  • 3. Cloudera Cares: An employee-led and driven organization • Launched in January 2014 • 1,400 employee hours donated in 2014 • $70k+ donated in 2014 • 20+ organizations to date (Photo caption: Doug Cutting participating in the BORP Revolution Ride to help raise funds for adaptive sports gear for the physically challenged.)
  • 4. Pax Data. Doug Cutting | Chief Architect & Co-Founder
  • 5. Hadoop started a revolution
  • 6. Now we’re winning the war
  • 7. How shall we govern the peace?
  • 8. We must not be tyrants
  • 9. We should use our power for good
  • 10. Good: Education
  • 11. Good: Healthcare
  • 12. Good: Climate
  • 13. How can we be trusted?
  • 14. Trust: Transparency
  • 15. Trust: Best practices
  • 16. Trust: Define abuses
  • 17. Trust: Oversight
  • 18. Global effort
  • 19. Our duty as professionals
  • 20. Thank you! @cutting
  • 21. Cloudera Academic Partnership. Amr Awadallah | CTO & VP of Engineering @awadallah
  • 22. Cloudera + Higher Education
  • 23. Cloudera Academic Partnership: Overview
  • 24. Impact: Curriculum Provided
  • 25. “We were able to jumpstart an Introduction to Big Data Analytics course thanks to the support of Cloudera. The materials provided, including the lab setup, are integral to the class.”
  • 26. Impact: Enterprise Grade Cloudera Manager
  • 27. “Legacy systems were preventing our labs from mapping their genome sequences in a timely manner. Our partnership with Cloudera will cut the time required by scientists to deliver data from weeks to days and, eventually, to hours.”
  • 28. Thank You. Get involved with the Cloudera Academic Partnership: academic_partnerships@cloudera.com
  • 30. @duncan3ross @DataKindUK COME AND JOIN DATAKIND • DataKind UK is a charity that believes we can make the world better by using data • We work by linking data volunteers (you) with charities
  • 31. DATAKIND UK TODAY £ 808 2 £850K 6,850 25 6
  • 32. WHO HAVE WE WORKED WITH? Children, Education, Health, Young people, Advice and support, International and community
  • 33. We are hiring! London DataDive 17-19 July. Volunteers wanted. Join us: http://www.meetup.com/DataKind-UK/ THANK YOU
  • 34. CITIZENS ADVICE & Ian Ansell, Peter Passaro, Henry Simms & Billy Wong
  • 37. Our services (2013/14) ● 318 member bureaux in England and Wales (F2F, phone, web-chat, email/letter) ● 2,500+ regular community locations ● 1,000+ ad-hoc locations ● Consumer advice service (phone, email/letter) in England, Wales and Scotland ● Our website ‘Adviceguide’, providing extensive self-help information on a wide range of topics
  • 40. 2. Bureau Evidence Forms (BEFs)
  • 41. 3. Web data on the Adviceguide
  • 43. BUREAU ISSUE STATS ADVICEGUIDE STATS BUREAU ISSUE & PROFILE STATS
  • 44. The Problem: Could data science enable Citizens Advice to anticipate or even predict changes in the issues affecting people every day, so it can act sooner to prevent problems escalating?
  • 45. Identifying spikes and new issues - where are the next payday loans?
  • 46. The Project 1. To design a tool to harness Citizens Advice’s data so they could better identify and react to emerging social issues in the UK. 2. To build awareness among Citizens Advice staff of new methods for mining and using data, and to open up the data to staff and others.
  • 47. ● Original brief: Develop an Issues Early Warning System to find the next “payday loans” ● Run two DataDives to explore the data and find different approaches to the problem ● Run longer-term DataCorps to make sense of the DataDive findings and develop a solution
  • 48. The DataDive Experience Day 1: I can solve all the problems of the world with my AWESOME DATA SCIENTIST POWERS!
  • 49. The DataDive Experience Day 2: Why are all these null values here?!?!
  • 50. DataDive 1: What do we do with all this delicious data? ● Bureau Statistics (Visitors and their Issues) ● Bureau Evidence Forms ● Google Analytics What is the central theme across the organisation? Issue Codes!
  • 51. Bureau Statistics (~2M visits/yr, ~6M issues/yr; focus: Trends & Issues Exploration) ● Timestamp ● Issue Code ● Bureau ID ● Client ID. Evidence Forms (~50K forms/yr; focus: Topic Analysis & Issues Exploration) ● Timestamp ● Issue Code ● Bureau ID ● Client ID ● 6 Text Fields ● ~40 Demographic Fields. Google Analytics (~16M unique users; focus: Issue Code Labelling & Data Pipelining) ● Timestamp ● NO ISSUE CODE! ● Sessions ● Users ● New Users
  • 52. CAB DataCorps Project: How do we take the DataDive work forward? ● Grand Ambition - build a prediction engine ● Needed trends across all three data types ● Evidence Forms - Better Topic Modelling ● Bureau Statistics - Look for emerging issues ● Google Analytics Data - Issue code labelling and pipeline completion ● User Interface
  • 53. DataDive 2 Citizens Advice shares their data with: ● St Mungo’s Broadway ● Northeast Child Poverty Action Committee
  • 54. Elasticsearch and Kibana Save the Day - Struggling to get good predictions because of a lack of contextual data - Trend analysis was difficult because of changes in data collection - We already had all the evidence forms in Elasticsearch for topic analysis - Volunteer Ian Huston (Pivotal) started using Kibana to explore the data
  • 56. Focus Becomes the Dashboard. Final data clean up and normalisation ● Put everything into Elasticsearch ● Normalise issue codes across all 3 data types ● Other minor field normalisation ● Enrich geo data for bureau visits and evidence forms ● Evidence forms - full topic modelling
  • 58. Demo of the dashboard https://drive.google.com/file/d/0B0X-Agv6DH0GZGJMbEtQdE5qUTQ/view?usp=sharing
  • 60. Motivation ● At least 30% of the CAB’s usage is by repeat clients ● If we can offer preventive advice, we can reduce cost and provide better service
  • 61. Modelling the problem... ● Lift(B => A): given B, how much more likely is A? Lift(B => A) = P(A|B) / P(A) = P(A and B) / (P(A) * P(B)) ● All of the probabilities can be estimated* from case history for each client
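The lift definition on this slide can be illustrated with a minimal Python sketch. It is not the project's code: the function name `lift`, the representation of each client's case history as a set of issue codes, and the example codes are all assumptions made for illustration. Probabilities are estimated as simple fractions of clients, and the ordering of issues in time is ignored here.

```python
def lift(histories, a, b):
    """Estimate Lift(B => A) = P(A and B) / (P(A) * P(B)).

    `histories` is a list of sets, one set of issue codes per client
    (a hypothetical layout, not the Citizens Advice schema).
    """
    n = len(histories)
    if n == 0:
        raise ValueError("need at least one client history")
    n_a = sum(1 for issues in histories if a in issues)
    n_b = sum(1 for issues in histories if b in issues)
    n_ab = sum(1 for issues in histories if a in issues and b in issues)
    if n_a == 0 or n_b == 0:
        # Lift is undefined when either issue never occurs;
        # this sketch reports 0.0 in that case.
        return 0.0
    return (n_ab / n) / ((n_a / n) * (n_b / n))


# Made-up example: four clients and their issue codes.
histories = [
    {"debt", "housing"},
    {"debt", "housing"},
    {"debt"},
    {"benefits"},
]
print(lift(histories, a="housing", b="debt"))
```

A value above 1 suggests that clients with issue B see issue A more often than the client base as a whole; a value near 1 suggests no association.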
  • 62. Time matters ● There is a temporal element to the issue counts (i.e. A must follow B) ● If two issues happen two years apart, intuitively we would think that the link between them is not as strong as that between two issues that are two weeks apart o Use exponential decay to model the “aging” of the count
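One way to sketch the "aging" of a count is to weight each ordered pair of issues by an exponentially decaying function of the gap between them. The half-life parameterisation, the 90-day default, the function name `decayed_pair_weight` and the `(issue_code, date)` history layout are all assumptions for illustration; the slides do not say which decay rate or data layout the project used.

```python
from datetime import date


def decayed_pair_weight(history, a, b, half_life_days=90.0):
    """Sum decayed weights for each occurrence of `a` that follows `b`.

    `history` is one client's list of (issue_code, date) tuples.
    A pair with a gap of `half_life_days` contributes 0.5; pairs
    further apart contribute progressively less.
    """
    total = 0.0
    for code_b, day_b in history:
        if code_b != b:
            continue
        for code_a, day_a in history:
            if code_a != a:
                continue
            gap = (day_a - day_b).days
            if gap > 0:  # only count A when it comes after B
                total += 0.5 ** (gap / half_life_days)
    return total


# Made-up example: housing follows debt after two weeks and again after two years.
history = [
    ("debt", date(2014, 1, 1)),
    ("housing", date(2014, 1, 15)),
    ("housing", date(2016, 1, 1)),
]
print(decayed_pair_weight(history, a="housing", b="debt"))
```

Summing these weights across clients in place of raw co-occurrence counts makes recent follow-on issues dominate the estimate, while issues two years apart contribute almost nothing.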
  • 63. Demo
  • 64. Tools used - all open source ● Programming language - Python ● Statistics - SciPy ● Graph analysis - NetworkX ● Web framework - Spyre ● Graph visualisation - D3.js
  • 65. The Future. Dashboard and app ● gives us a comprehensive view of all our data ● helps to spot emerging issues and explore our hunches. Implementation ● being integrated into the Citizens Advice system
  • 66. New insights already discovered ● Adviceguide Consumer section hiding key details o just how big an issue fuel and utilities are ● Bipolar keeps cropping up in BEFs around the issues of debt
  • 67. So much more than a dashboard New analysis techniques learnt & new technologies introduced
  • 68. Excitement about data ● Kibana dashboard showcased and loved ● Could be replacing core systems, watch this space... ● Democratised our data - staff can access and play with it ● Now, how about delivering data to the bureaux?
  • 69. Citizens Advice is in love with data! display-screen.cab-alpha.org.uk
  • 70. Project Credits. DataKind: ● Emma Prest - General Manager ● Duncan Ross - Founder UK Branch. Original Data Ambassadors: ● Iago Martinez ● Arturo Sanchez Correa ● Peter Passaro. Volunteers: ● Henry Simms ● Billy Wong ● Sam Leach ● Emmanuel Lazardis. CAB Support: ● Laura Bunt ● Pete Watson ● Ian Ansell. About 30 additional volunteers who contributed at various stages! Elasticsearch and General Data Hosting: Google Analytics Pipelining: Advice and Support: Funding: (Alan Hardy & Livia Froelicher)