SlideShare a Scribd company logo
May 2013
From Big Data to Smart Data
Marin Dimitrov - CTO
About Ontotext
• Provides products and services for creating,
managing and exploiting semantic data
– Founded in 2000
– Offices in Bulgaria, USA and UK
• Major clients and industries
– Media & Publishing (BBC, Press Association, EuroMoney,
NDP Nieuwsmedia)
– HCLS (AstraZeneca, UCB, NIBIO)
– Cultural Heritage (The British Museum, The National
Archives, Polish National Museum, Dutch Public Library)
– Government (UK Parliament, United Nations FAO, LMI)
#2May 2013From Big Data to Smart Data (Semantic Days 2013)
Contents
• The Problem with Big Data for BI
• From Big Data to Smart Data
• Success Stories by Ontotext
#3From Big Data to Smart Data (Semantic Days 2013) May 2013
BIG DATA FOR BUSINESS
INTELLIGENCE
#4From Big Data to Smart Data (Semantic Days 2013) May 2013
The Problem with Big Data for BI
#5From Big Data to Smart Data (Semantic Days 2013) May 2013
The Problem with Big Data for BI
• It’s not only about Volume, Velocity & Variety
• Too much focus on processing speed & storage
volume
• “Brute force” approaches increase the amount of
data processed…
– But not necessarily the Value & insight derived from data
– May lead to even more data quality & inconsistency
problems
– Problems with data visualisation & exploration
– Often do not lead to better decision making
#6From Big Data to Smart Data (Semantic Days 2013) May 2013
The Problem with Big Data for BI
• BI success is not measured by Volume, Velocity &
Variety, but by more derived Value
• Organisations should learn how to better utilise their
“small data” before targeting Big Data
– Quality over quantity
– Better understanding of the data leads to better decision
making
– Avoid “needle in a haystack” situations
#7From Big Data to Smart Data (Semantic Days 2013) May 2013
The Problem with Big Data for BI
#8From Big Data to Smart Data (Semantic Days 2013) May 2013
Smart Data for Better BI
• Efficiently analyse unstructured data
– Most of the enterprise data is still unstructured
– Even within structured & transactional data sources there
is a lot of embedded unstructured data
– … and this unstructured data is poorly analysed (if at all) =>
lots of potential value still remains locked
– (sometimes even within semantic / Linked Data with
insufficient granularity)
#9From Big Data to Smart Data (Semantic Days 2013) May 2013
Smart Data for Better BI
• Focus on metadata first, Big Data later
– (As opposed to: Big Data first, metadata later)
• Enrich data
• Interlink data
• Provide a common metadata layer
– Break legacy silos
– Align heterogeneous metadata if necessary
• Better analysis of the data, better insight
#10From Big Data to Smart Data (Semantic Days 2013) May 2013
SUCCESS STORIES
#11From Big Data to Smart Data (Semantic Days 2013) May 2013
UK Job Market Intelligence
• Comprehensive recruitment database for the UK
– 4 million job ads / vacancies (dynamic)
– 220,000 company websites & 700 job boards monitored
• Questions we can answer
– What skills are in demand at present?
– Which are the top job boards in a region?
– Which is the right Job board for your industry sector?
– Which are the most active job advertisers / employers?
– Which are the agencies and employers that do not
advertise on your job board?
#12From Big Data to Smart Data (Semantic Days 2013) May 2013
UK Job Market Intelligence
#13From Big Data to Smart Data (Semantic Days 2013) May 2013
UK Job Market Intelligence
• Technology stack
– Web mining & focussed crawling
– KB construction from open & proprietary data sources
– Skills taxonomy (based on DISCO)
– Text mining & semantic enrichment
– Reconciliation & interlinking
– BI reporting & dashboards
#14From Big Data to Smart Data (Semantic Days 2013) May 2013
UK Job Market Intelligence
#15From Big Data to Smart Data (Semantic Days 2013) May 2013
UK Job Market Intelligence
#16From Big Data to Smart Data (Semantic Days 2013) May 2013
UK Job Market Intelligence
#17From Big Data to Smart Data (Semantic Days 2013) May 2013
Asset Recovery Intelligence System (ARIS)
• Support Financial Intelligence Units with tracking
stolen assets, fight corruption & money laundering
• Questions we can answer
– What are the reported activities related to a person?
– What is the person’s personal/professional network?
– What are corruptions cases reported in regional news?
• Data sources
– News feeds from major news agencies
– Dow Jones data & news feeds
– SARs to the FIU
– Open data (people & companies, Wikipedia)
#18From Big Data to Smart Data (Semantic Days 2013) May 2013
Asset Recovery Intelligence System (ARIS)
#19From Big Data to Smart Data (Semantic Days 2013) May 2013
Asset Recovery Intelligence System (ARIS)
• Technology stack
– Web Mining
– Text mining & semantic enrichment (KIM)
– ARIS ontology
• People, companies, assets, relations, financial transactions, …
– Reconciliation & Interlinking
– Triplestore (OWLIM)
– Semantic search & exploration UX
– BI reporting / factsheets / alerts
#20From Big Data to Smart Data (Semantic Days 2013) May 2013
Semantic Information Integration & Enrichment
#21From Big Data to Smart Data (Semantic Days 2013) May 2013
Q & A
Thank you!
@ontotext
#22From Big Data to Smart Data (Semantic Days 2013) May 2013

More Related Content

What's hot

Big data
Big dataBig data
Big data
Bhuvana Patt
 
Big data characteristics, value chain and challenges
Big data characteristics, value chain and challengesBig data characteristics, value chain and challenges
Big data characteristics, value chain and challenges
Musfiqur Rahman
 
Big Data & Analytics (Conceptual and Practical Introduction)
Big Data & Analytics (Conceptual and Practical Introduction)Big Data & Analytics (Conceptual and Practical Introduction)
Big Data & Analytics (Conceptual and Practical Introduction)
Yaman Hajja, Ph.D.
 
Big Data Analytics Proposal #1
Big Data Analytics Proposal #1Big Data Analytics Proposal #1
Big Data Analytics Proposal #1
Ziyad Saleh
 
Data-Ed Webinar: Demystifying Big Data
Data-Ed Webinar: Demystifying Big Data Data-Ed Webinar: Demystifying Big Data
Data-Ed Webinar: Demystifying Big Data
DATAVERSITY
 
Big Data, NoSQL, NewSQL & The Future of Data Management
Big Data, NoSQL, NewSQL & The Future of Data ManagementBig Data, NoSQL, NewSQL & The Future of Data Management
Big Data, NoSQL, NewSQL & The Future of Data Management
Tony Bain
 
Big data
Big dataBig data
Big data
Ami Redwan Haq
 
Big data analytics
Big data analyticsBig data analytics
Big data analytics
Vikram Nandini
 
Big data
Big dataBig data
Big data
Claire Choong
 
Personalized News and Video Recomendation System at LinkSure
Personalized News and Video Recomendation System at LinkSurePersonalized News and Video Recomendation System at LinkSure
Personalized News and Video Recomendation System at LinkSure
Leanne Hwee
 
Importance of Big data for your Business
Importance of Big data for your BusinessImportance of Big data for your Business
Importance of Big data for your Business
azuyo.com
 
Importance of Data Analytics
 Importance of Data Analytics Importance of Data Analytics
Importance of Data Analytics
Product School
 
Big Data
Big DataBig Data
BigData and Beyond
BigData and BeyondBigData and Beyond
BigData and Beyond
John Avery
 
Data science
Data scienceData science
Data science
SwapnilDahake2
 
Everis big data_wilson_v1.4
Everis big data_wilson_v1.4Everis big data_wilson_v1.4
Everis big data_wilson_v1.4
wilson_lucas
 
Big Data Presentation at SCQAA-SF on June 12 2013
Big Data Presentation at SCQAA-SF on June 12 2013Big Data Presentation at SCQAA-SF on June 12 2013
Big Data Presentation at SCQAA-SF on June 12 2013Sujit Ghosh
 
Big Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesBig Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture Capabilities
Ashraf Uddin
 
BIG Data and Methodology-A review
BIG Data and Methodology-A reviewBIG Data and Methodology-A review
BIG Data and Methodology-A review
Shilpa Soi
 
Big data
Big dataBig data

What's hot (20)

Big data
Big dataBig data
Big data
 
Big data characteristics, value chain and challenges
Big data characteristics, value chain and challengesBig data characteristics, value chain and challenges
Big data characteristics, value chain and challenges
 
Big Data & Analytics (Conceptual and Practical Introduction)
Big Data & Analytics (Conceptual and Practical Introduction)Big Data & Analytics (Conceptual and Practical Introduction)
Big Data & Analytics (Conceptual and Practical Introduction)
 
Big Data Analytics Proposal #1
Big Data Analytics Proposal #1Big Data Analytics Proposal #1
Big Data Analytics Proposal #1
 
Data-Ed Webinar: Demystifying Big Data
Data-Ed Webinar: Demystifying Big Data Data-Ed Webinar: Demystifying Big Data
Data-Ed Webinar: Demystifying Big Data
 
Big Data, NoSQL, NewSQL & The Future of Data Management
Big Data, NoSQL, NewSQL & The Future of Data ManagementBig Data, NoSQL, NewSQL & The Future of Data Management
Big Data, NoSQL, NewSQL & The Future of Data Management
 
Big data
Big dataBig data
Big data
 
Big data analytics
Big data analyticsBig data analytics
Big data analytics
 
Big data
Big dataBig data
Big data
 
Personalized News and Video Recomendation System at LinkSure
Personalized News and Video Recomendation System at LinkSurePersonalized News and Video Recomendation System at LinkSure
Personalized News and Video Recomendation System at LinkSure
 
Importance of Big data for your Business
Importance of Big data for your BusinessImportance of Big data for your Business
Importance of Big data for your Business
 
Importance of Data Analytics
 Importance of Data Analytics Importance of Data Analytics
Importance of Data Analytics
 
Big Data
Big DataBig Data
Big Data
 
BigData and Beyond
BigData and BeyondBigData and Beyond
BigData and Beyond
 
Data science
Data scienceData science
Data science
 
Everis big data_wilson_v1.4
Everis big data_wilson_v1.4Everis big data_wilson_v1.4
Everis big data_wilson_v1.4
 
Big Data Presentation at SCQAA-SF on June 12 2013
Big Data Presentation at SCQAA-SF on June 12 2013Big Data Presentation at SCQAA-SF on June 12 2013
Big Data Presentation at SCQAA-SF on June 12 2013
 
Big Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesBig Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture Capabilities
 
BIG Data and Methodology-A review
BIG Data and Methodology-A reviewBIG Data and Methodology-A review
BIG Data and Methodology-A review
 
Big data
Big dataBig data
Big data
 

Viewers also liked

Semantic Technologies for Big Data
Semantic Technologies for Big DataSemantic Technologies for Big Data
Semantic Technologies for Big DataMarin Dimitrov
 
Using the Semantic Web Stack to Make Big Data Smarter
Using the Semantic Web Stack to Make  Big Data SmarterUsing the Semantic Web Stack to Make  Big Data Smarter
Using the Semantic Web Stack to Make Big Data Smarter
Matheus Mota
 
Netquest Survey Manager - Software de encuestas online
Netquest Survey Manager - Software de encuestas online Netquest Survey Manager - Software de encuestas online
Netquest Survey Manager - Software de encuestas online
Netquest
 
How Semantics Solves Big Data Challenges
How Semantics Solves Big Data ChallengesHow Semantics Solves Big Data Challenges
How Semantics Solves Big Data Challenges
DATAVERSITY
 
Inference using owl 2.0 semantics
Inference using owl 2.0 semanticsInference using owl 2.0 semantics
Inference using owl 2.0 semantics
Craig Trim
 
Big Data and the Semantic Web: Challenges and Opportunities
Big Data and the Semantic Web: Challenges and OpportunitiesBig Data and the Semantic Web: Challenges and Opportunities
Big Data and the Semantic Web: Challenges and OpportunitiesSrinath Srinivasa
 
시스템 엔지니어가 바라보는 시맨틱웹과 빅데이터 기술
시스템 엔지니어가 바라보는 시맨틱웹과 빅데이터 기술시스템 엔지니어가 바라보는 시맨틱웹과 빅데이터 기술
시스템 엔지니어가 바라보는 시맨틱웹과 빅데이터 기술Haklae Kim
 
Linking Open, Big Data Using Semantic Web Technologies - An Introduction
Linking Open, Big Data Using Semantic Web Technologies - An IntroductionLinking Open, Big Data Using Semantic Web Technologies - An Introduction
Linking Open, Big Data Using Semantic Web Technologies - An Introduction
Ronald Ashri
 
DataGraft Platform: RDF Database-as-a-Service
DataGraft Platform: RDF Database-as-a-ServiceDataGraft Platform: RDF Database-as-a-Service
DataGraft Platform: RDF Database-as-a-ServiceMarin Dimitrov
 
OWLIM@AWS - On-demand RDF Data Management in the Cloud
OWLIM@AWS - On-demand RDF Data Management in the CloudOWLIM@AWS - On-demand RDF Data Management in the Cloud
OWLIM@AWS - On-demand RDF Data Management in the CloudMarin Dimitrov
 
S4: The Self-Service Semantic Suite
S4: The Self-Service Semantic SuiteS4: The Self-Service Semantic Suite
S4: The Self-Service Semantic Suite
Marin Dimitrov
 
On-Demand RDF Graph Databases in the Cloud
On-Demand RDF Graph Databases in the CloudOn-Demand RDF Graph Databases in the Cloud
On-Demand RDF Graph Databases in the Cloud
Marin Dimitrov
 
Low-cost Open Data As-a-Service
Low-cost Open Data As-a-ServiceLow-cost Open Data As-a-Service
Low-cost Open Data As-a-Service
Marin Dimitrov
 
Enabling Low-cost Open Data Publishing and Reuse
Enabling Low-cost Open Data Publishing and ReuseEnabling Low-cost Open Data Publishing and Reuse
Enabling Low-cost Open Data Publishing and Reuse
Marin Dimitrov
 
Ontotext in EC Funded Projects 2002-2012
Ontotext in EC Funded Projects 2002-2012Ontotext in EC Funded Projects 2002-2012
Ontotext in EC Funded Projects 2002-2012
Marin Dimitrov
 
RDF Database-as-a-Service with S4
RDF Database-as-a-Service with S4RDF Database-as-a-Service with S4
RDF Database-as-a-Service with S4
Marin Dimitrov
 
Text Analytics & Linked Data Management As-a-Service
Text Analytics & Linked Data Management As-a-ServiceText Analytics & Linked Data Management As-a-Service
Text Analytics & Linked Data Management As-a-Service
Marin Dimitrov
 
Hackconf 2016 - Да пишем код за хиляди сървъри
Hackconf 2016 - Да пишем код за хиляди сървъриHackconf 2016 - Да пишем код за хиляди сървъри
Hackconf 2016 - Да пишем код за хиляди сървъри
Nikolay Stoitsev
 
Scaling to Millions of Concurrent SPARQL Queries on the Cloud
Scaling to Millions of Concurrent SPARQL Queries on the CloudScaling to Millions of Concurrent SPARQL Queries on the Cloud
Scaling to Millions of Concurrent SPARQL Queries on the Cloud
Marin Dimitrov
 
From Python to Java
From Python to JavaFrom Python to Java
From Python to Java
Nikolay Stoitsev
 

Viewers also liked (20)

Semantic Technologies for Big Data
Semantic Technologies for Big DataSemantic Technologies for Big Data
Semantic Technologies for Big Data
 
Using the Semantic Web Stack to Make Big Data Smarter
Using the Semantic Web Stack to Make  Big Data SmarterUsing the Semantic Web Stack to Make  Big Data Smarter
Using the Semantic Web Stack to Make Big Data Smarter
 
Netquest Survey Manager - Software de encuestas online
Netquest Survey Manager - Software de encuestas online Netquest Survey Manager - Software de encuestas online
Netquest Survey Manager - Software de encuestas online
 
How Semantics Solves Big Data Challenges
How Semantics Solves Big Data ChallengesHow Semantics Solves Big Data Challenges
How Semantics Solves Big Data Challenges
 
Inference using owl 2.0 semantics
Inference using owl 2.0 semanticsInference using owl 2.0 semantics
Inference using owl 2.0 semantics
 
Big Data and the Semantic Web: Challenges and Opportunities
Big Data and the Semantic Web: Challenges and OpportunitiesBig Data and the Semantic Web: Challenges and Opportunities
Big Data and the Semantic Web: Challenges and Opportunities
 
시스템 엔지니어가 바라보는 시맨틱웹과 빅데이터 기술
시스템 엔지니어가 바라보는 시맨틱웹과 빅데이터 기술시스템 엔지니어가 바라보는 시맨틱웹과 빅데이터 기술
시스템 엔지니어가 바라보는 시맨틱웹과 빅데이터 기술
 
Linking Open, Big Data Using Semantic Web Technologies - An Introduction
Linking Open, Big Data Using Semantic Web Technologies - An IntroductionLinking Open, Big Data Using Semantic Web Technologies - An Introduction
Linking Open, Big Data Using Semantic Web Technologies - An Introduction
 
DataGraft Platform: RDF Database-as-a-Service
DataGraft Platform: RDF Database-as-a-ServiceDataGraft Platform: RDF Database-as-a-Service
DataGraft Platform: RDF Database-as-a-Service
 
OWLIM@AWS - On-demand RDF Data Management in the Cloud
OWLIM@AWS - On-demand RDF Data Management in the CloudOWLIM@AWS - On-demand RDF Data Management in the Cloud
OWLIM@AWS - On-demand RDF Data Management in the Cloud
 
S4: The Self-Service Semantic Suite
S4: The Self-Service Semantic SuiteS4: The Self-Service Semantic Suite
S4: The Self-Service Semantic Suite
 
On-Demand RDF Graph Databases in the Cloud
On-Demand RDF Graph Databases in the CloudOn-Demand RDF Graph Databases in the Cloud
On-Demand RDF Graph Databases in the Cloud
 
Low-cost Open Data As-a-Service
Low-cost Open Data As-a-ServiceLow-cost Open Data As-a-Service
Low-cost Open Data As-a-Service
 
Enabling Low-cost Open Data Publishing and Reuse
Enabling Low-cost Open Data Publishing and ReuseEnabling Low-cost Open Data Publishing and Reuse
Enabling Low-cost Open Data Publishing and Reuse
 
Ontotext in EC Funded Projects 2002-2012
Ontotext in EC Funded Projects 2002-2012Ontotext in EC Funded Projects 2002-2012
Ontotext in EC Funded Projects 2002-2012
 
RDF Database-as-a-Service with S4
RDF Database-as-a-Service with S4RDF Database-as-a-Service with S4
RDF Database-as-a-Service with S4
 
Text Analytics & Linked Data Management As-a-Service
Text Analytics & Linked Data Management As-a-ServiceText Analytics & Linked Data Management As-a-Service
Text Analytics & Linked Data Management As-a-Service
 
Hackconf 2016 - Да пишем код за хиляди сървъри
Hackconf 2016 - Да пишем код за хиляди сървъриHackconf 2016 - Да пишем код за хиляди сървъри
Hackconf 2016 - Да пишем код за хиляди сървъри
 
Scaling to Millions of Concurrent SPARQL Queries on the Cloud
Scaling to Millions of Concurrent SPARQL Queries on the CloudScaling to Millions of Concurrent SPARQL Queries on the Cloud
Scaling to Millions of Concurrent SPARQL Queries on the Cloud
 
From Python to Java
From Python to JavaFrom Python to Java
From Python to Java
 

Similar to From Big Data to Smart Data

Sr. Jon Ander, Internet de las Cosas y Big Data: ¿hacia dónde va la Industria?
Sr. Jon Ander, Internet de las Cosas y Big Data: ¿hacia dónde va la Industria? Sr. Jon Ander, Internet de las Cosas y Big Data: ¿hacia dónde va la Industria?
Sr. Jon Ander, Internet de las Cosas y Big Data: ¿hacia dónde va la Industria?
INACAP
 
Eduserv Symposium 2013 - Combatting the data headaches of the digital age
Eduserv Symposium 2013 - Combatting the data headaches of the digital ageEduserv Symposium 2013 - Combatting the data headaches of the digital age
Eduserv Symposium 2013 - Combatting the data headaches of the digital age
Eduserv
 
Ictam big data
Ictam big dataIctam big data
Ictam big data
Terry Bunio
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
Akshata Humbe
 
Social Media World presentation
Social Media World presentationSocial Media World presentation
Social Media World presentation
kperi
 
A beginner's guide to Big data
A beginner's guide to Big dataA beginner's guide to Big data
A beginner's guide to Big data
AnushkaGupta763558
 
Big Data for Business & Social Innovation
Big Data for Business & Social InnovationBig Data for Business & Social Innovation
Big Data for Business & Social Innovation
StartupSaturdayEurope
 
Big data
Big dataBig data
Big data
Prince Barai
 
Linked Data for the Enterprise: Opportunities and Challenges
Linked Data for the Enterprise: Opportunities and ChallengesLinked Data for the Enterprise: Opportunities and Challenges
Linked Data for the Enterprise: Opportunities and ChallengesMarin Dimitrov
 
Career in Data Science (July 2017, DTLA)
Career in Data Science (July 2017, DTLA)Career in Data Science (July 2017, DTLA)
Career in Data Science (July 2017, DTLA)
Thinkful
 
introduction to data science
introduction to data scienceintroduction to data science
introduction to data science
bhavesh lande
 
Big data
Big dataBig data
Big data
Debashish Jana
 
P02 | Big Data | Anurag Gupta | BCA
P02 | Big Data | Anurag Gupta | BCAP02 | Big Data | Anurag Gupta | BCA
P02 | Big Data | Anurag Gupta | BCA
ANURAGGUPTA570
 
Getting Started in Data Science
Getting Started in Data ScienceGetting Started in Data Science
Getting Started in Data Science
Thinkful
 
SME Breakfast Seminar - Keynote Session - The Data Landscape
SME Breakfast Seminar - Keynote Session - The Data LandscapeSME Breakfast Seminar - Keynote Session - The Data Landscape
SME Breakfast Seminar - Keynote Session - The Data Landscape
Nathean Technologies
 
SKILLWISE-BIGDATA ANALYSIS
SKILLWISE-BIGDATA ANALYSISSKILLWISE-BIGDATA ANALYSIS
SKILLWISE-BIGDATA ANALYSIS
Skillwise Consulting
 
Big data by_mcal
Big data by_mcalBig data by_mcal
big data analytics pgpmx2015
big data analytics pgpmx2015big data analytics pgpmx2015
big data analytics pgpmx2015
Sanmeet Dhokay
 

Similar to From Big Data to Smart Data (20)

Sr. Jon Ander, Internet de las Cosas y Big Data: ¿hacia dónde va la Industria?
Sr. Jon Ander, Internet de las Cosas y Big Data: ¿hacia dónde va la Industria? Sr. Jon Ander, Internet de las Cosas y Big Data: ¿hacia dónde va la Industria?
Sr. Jon Ander, Internet de las Cosas y Big Data: ¿hacia dónde va la Industria?
 
Eduserv Symposium 2013 - Combatting the data headaches of the digital age
Eduserv Symposium 2013 - Combatting the data headaches of the digital ageEduserv Symposium 2013 - Combatting the data headaches of the digital age
Eduserv Symposium 2013 - Combatting the data headaches of the digital age
 
Ictam big data
Ictam big dataIctam big data
Ictam big data
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
Social Media World presentation
Social Media World presentationSocial Media World presentation
Social Media World presentation
 
A beginner's guide to Big data
A beginner's guide to Big dataA beginner's guide to Big data
A beginner's guide to Big data
 
Big Data for Business & Social Innovation
Big Data for Business & Social InnovationBig Data for Business & Social Innovation
Big Data for Business & Social Innovation
 
Big data
Big dataBig data
Big data
 
Linked Data for the Enterprise: Opportunities and Challenges
Linked Data for the Enterprise: Opportunities and ChallengesLinked Data for the Enterprise: Opportunities and Challenges
Linked Data for the Enterprise: Opportunities and Challenges
 
Career in Data Science (July 2017, DTLA)
Career in Data Science (July 2017, DTLA)Career in Data Science (July 2017, DTLA)
Career in Data Science (July 2017, DTLA)
 
introduction to data science
introduction to data scienceintroduction to data science
introduction to data science
 
Big data
Big dataBig data
Big data
 
P02 | Big Data | Anurag Gupta | BCA
P02 | Big Data | Anurag Gupta | BCAP02 | Big Data | Anurag Gupta | BCA
P02 | Big Data | Anurag Gupta | BCA
 
Getting Started in Data Science
Getting Started in Data ScienceGetting Started in Data Science
Getting Started in Data Science
 
Final_Bigdata_pret
Final_Bigdata_pretFinal_Bigdata_pret
Final_Bigdata_pret
 
Big data in telecom
Big data in telecomBig data in telecom
Big data in telecom
 
SME Breakfast Seminar - Keynote Session - The Data Landscape
SME Breakfast Seminar - Keynote Session - The Data LandscapeSME Breakfast Seminar - Keynote Session - The Data Landscape
SME Breakfast Seminar - Keynote Session - The Data Landscape
 
SKILLWISE-BIGDATA ANALYSIS
SKILLWISE-BIGDATA ANALYSISSKILLWISE-BIGDATA ANALYSIS
SKILLWISE-BIGDATA ANALYSIS
 
Big data by_mcal
Big data by_mcalBig data by_mcal
Big data by_mcal
 
big data analytics pgpmx2015
big data analytics pgpmx2015big data analytics pgpmx2015
big data analytics pgpmx2015
 

More from Marin Dimitrov

Measuring the Productivity of Your Engineering Organisation - the Good, the B...
Measuring the Productivity of Your Engineering Organisation - the Good, the B...Measuring the Productivity of Your Engineering Organisation - the Good, the B...
Measuring the Productivity of Your Engineering Organisation - the Good, the B...
Marin Dimitrov
 
Mapping Your Career Journey
Mapping Your Career JourneyMapping Your Career Journey
Mapping Your Career Journey
Marin Dimitrov
 
Open Source @ Uber
Open Source @ Uber Open Source @ Uber
Open Source @ Uber
Marin Dimitrov
 
Trust - the Key Success Factor for Teams & Organisations
Trust - the Key Success Factor for Teams & OrganisationsTrust - the Key Success Factor for Teams & Organisations
Trust - the Key Success Factor for Teams & Organisations
Marin Dimitrov
 
Uber @ Telerik Academy 2018
Uber @ Telerik Academy 2018Uber @ Telerik Academy 2018
Uber @ Telerik Academy 2018
Marin Dimitrov
 
Machine Learning @ Uber
Machine Learning @ UberMachine Learning @ Uber
Machine Learning @ Uber
Marin Dimitrov
 
Career Advice for My Younger Self
Career Advice for My Younger SelfCareer Advice for My Younger Self
Career Advice for My Younger Self
Marin Dimitrov
 
Scaling Your Engineering Organization with Distributed Sites
Scaling Your Engineering Organization with Distributed SitesScaling Your Engineering Organization with Distributed Sites
Scaling Your Engineering Organization with Distributed Sites
Marin Dimitrov
 
Building, Scaling and Leading High-Performance Teams
Building, Scaling and Leading High-Performance TeamsBuilding, Scaling and Leading High-Performance Teams
Building, Scaling and Leading High-Performance Teams
Marin Dimitrov
 
Uber @ Career Days 2017 (Sofia University)
Uber @ Career Days 2017 (Sofia University)Uber @ Career Days 2017 (Sofia University)
Uber @ Career Days 2017 (Sofia University)
Marin Dimitrov
 
GraphDB Connectors – Powering Complex SPARQL Queries
GraphDB Connectors – Powering Complex SPARQL QueriesGraphDB Connectors – Powering Complex SPARQL Queries
GraphDB Connectors – Powering Complex SPARQL Queries
Marin Dimitrov
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked Data
Marin Dimitrov
 
Crossing the Chasm with Semantic Technology
Crossing the Chasm with Semantic TechnologyCrossing the Chasm with Semantic Technology
Crossing the Chasm with Semantic Technology
Marin Dimitrov
 
Delivering Linked Data Training to Data Science Practitioners
Delivering Linked Data Training to Data Science PractitionersDelivering Linked Data Training to Data Science Practitioners
Delivering Linked Data Training to Data Science PractitionersMarin Dimitrov
 
Career Days 2012 @ Sofia University
Career Days 2012 @ Sofia UniversityCareer Days 2012 @ Sofia University
Career Days 2012 @ Sofia University
Marin Dimitrov
 
Semantic Technologies and Triplestores for Business Intelligence
Semantic Technologies and Triplestores for Business IntelligenceSemantic Technologies and Triplestores for Business Intelligence
Semantic Technologies and Triplestores for Business IntelligenceMarin Dimitrov
 
Linked Data Marketplaces
Linked Data MarketplacesLinked Data Marketplaces
Linked Data MarketplacesMarin Dimitrov
 
Linked Data Management
Linked Data ManagementLinked Data Management
Linked Data Management
Marin Dimitrov
 

More from Marin Dimitrov (18)

Measuring the Productivity of Your Engineering Organisation - the Good, the B...
Measuring the Productivity of Your Engineering Organisation - the Good, the B...Measuring the Productivity of Your Engineering Organisation - the Good, the B...
Measuring the Productivity of Your Engineering Organisation - the Good, the B...
 
Mapping Your Career Journey
Mapping Your Career JourneyMapping Your Career Journey
Mapping Your Career Journey
 
Open Source @ Uber
Open Source @ Uber Open Source @ Uber
Open Source @ Uber
 
Trust - the Key Success Factor for Teams & Organisations
Trust - the Key Success Factor for Teams & OrganisationsTrust - the Key Success Factor for Teams & Organisations
Trust - the Key Success Factor for Teams & Organisations
 
Uber @ Telerik Academy 2018
Uber @ Telerik Academy 2018Uber @ Telerik Academy 2018
Uber @ Telerik Academy 2018
 
Machine Learning @ Uber
Machine Learning @ UberMachine Learning @ Uber
Machine Learning @ Uber
 
Career Advice for My Younger Self
Career Advice for My Younger SelfCareer Advice for My Younger Self
Career Advice for My Younger Self
 
Scaling Your Engineering Organization with Distributed Sites
Scaling Your Engineering Organization with Distributed SitesScaling Your Engineering Organization with Distributed Sites
Scaling Your Engineering Organization with Distributed Sites
 
Building, Scaling and Leading High-Performance Teams
Building, Scaling and Leading High-Performance TeamsBuilding, Scaling and Leading High-Performance Teams
Building, Scaling and Leading High-Performance Teams
 
Uber @ Career Days 2017 (Sofia University)
Uber @ Career Days 2017 (Sofia University)Uber @ Career Days 2017 (Sofia University)
Uber @ Career Days 2017 (Sofia University)
 
GraphDB Connectors – Powering Complex SPARQL Queries
GraphDB Connectors – Powering Complex SPARQL QueriesGraphDB Connectors – Powering Complex SPARQL Queries
GraphDB Connectors – Powering Complex SPARQL Queries
 
Scaling up Linked Data
Scaling up Linked DataScaling up Linked Data
Scaling up Linked Data
 
Crossing the Chasm with Semantic Technology
Crossing the Chasm with Semantic TechnologyCrossing the Chasm with Semantic Technology
Crossing the Chasm with Semantic Technology
 
Delivering Linked Data Training to Data Science Practitioners
Delivering Linked Data Training to Data Science PractitionersDelivering Linked Data Training to Data Science Practitioners
Delivering Linked Data Training to Data Science Practitioners
 
Career Days 2012 @ Sofia University
Career Days 2012 @ Sofia UniversityCareer Days 2012 @ Sofia University
Career Days 2012 @ Sofia University
 
Semantic Technologies and Triplestores for Business Intelligence
Semantic Technologies and Triplestores for Business IntelligenceSemantic Technologies and Triplestores for Business Intelligence
Semantic Technologies and Triplestores for Business Intelligence
 
Linked Data Marketplaces
Linked Data MarketplacesLinked Data Marketplaces
Linked Data Marketplaces
 
Linked Data Management
Linked Data ManagementLinked Data Management
Linked Data Management
 

Recently uploaded

Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Ramesh Iyer
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Product School
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
Ana-Maria Mihalceanu
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
Paul Groth
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
Frank van Harmelen
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
Laura Byrne
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
Alison B. Lowndes
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Prayukth K V
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
Product School
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
BookNet Canada
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
RTTS
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
Elena Simperl
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
Product School
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
OnBoard
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
Cheryl Hung
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
Safe Software
 

Recently uploaded (20)

Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...
 
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
Unsubscribed: Combat Subscription Fatigue With a Membership Mentality by Head...
 
Monitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR EventsMonitoring Java Application Security with JDK Tools and JFR Events
Monitoring Java Application Security with JDK Tools and JFR Events
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*Neuro-symbolic is not enough, we need neuro-*semantic*
Neuro-symbolic is not enough, we need neuro-*semantic*
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
 
The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........Bits & Pixels using AI for Good.........
Bits & Pixels using AI for Good.........
 
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 previewState of ICS and IoT Cyber Threat Landscape Report 2024 preview
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Leading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdfLeading Change strategies and insights for effective change management pdf 1.pdf
Leading Change strategies and insights for effective change management pdf 1.pdf
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 

From Big Data to Smart Data

  • 1. May 2013 From Big Data to Smart Data Marin Dimitrov - CTO
  • 2. About Ontotext • Provides products and services for creating, managing and exploiting semantic data – Founded in 2000 – Offices in Bulgaria, USA and UK • Major clients and industries – Media & Publishing (BBC, Press Association, EuroMoney, NDP Nieuwsmedia) – HCLS (AstraZeneca, UCB, NIBIO) – Cultural Heritage (The British Museum, The National Archives, Polish National Museum, Dutch Public Library) – Government (UK Parliament, United Nations FAO, LMI) #2May 2013From Big Data to Smart Data (Semantic Days 2013)
  • 3. Contents • The Problem with Big Data for BI • From Big Data to Smart Data • Success Stories by Ontotext #3From Big Data to Smart Data (Semantic Days 2013) May 2013
  • 4. BIG DATA FOR BUSINESS INTELLIGENCE #4From Big Data to Smart Data (Semantic Days 2013) May 2013
  • 5. The Problem with Big Data for BI #5From Big Data to Smart Data (Semantic Days 2013) May 2013
  • 6. The Problem with Big Data for BI • It’s not only about Volume, Velocity & Variety • Too much focus on processing speed & storage volume • “Brute force” approaches increase the amount of data processed… – But not necessarily the Value & insight derived from data – May lead to even more data quality & inconsistency problems – Problems with data visualisation & exploration – Often do not lead to better decision making #6From Big Data to Smart Data (Semantic Days 2013) May 2013
  • 7. The Problem with Big Data for BI • BI success is not measured by Volume, Velocity & Variety, but by more derived Value • Organisations should learn how to better utilise their “small data” before targeting Big Data – Quality over quantity – Better understanding of the data leads to better decision making – Avoid “needle in a haystack” situations #7From Big Data to Smart Data (Semantic Days 2013) May 2013
  • 8. The Problem with Big Data for BI #8From Big Data to Smart Data (Semantic Days 2013) May 2013
  • 9. Smart Data for Better BI • Efficiently analyse unstructured data – Most of the enterprise data is still unstructured – Even within structured & transactional data sources there is a lot of embedded unstructured data – … and this unstructured data is poorly analysed (if at all) => lots of potential value still remains locked – (sometimes even within semantic / Linked Data with insufficient granularity) #9From Big Data to Smart Data (Semantic Days 2013) May 2013
  • 10. Smart Data for Better BI • Focus on metadata first, Big Data later – (As opposed to: Big Data first, metadata later) • Enrich data • Interlink data • Provide a common metadata layer – Break legacy silos – Align heterogeneous metadata if necessary • Better analysis of the data, better insight #10From Big Data to Smart Data (Semantic Days 2013) May 2013
  • 11. SUCCESS STORIES #11From Big Data to Smart Data (Semantic Days 2013) May 2013
  • 12. UK Job Market Intelligence • Comprehensive recruitment database for the UK – 4 million job ads / vacancies (dynamic) – 220,000 company websites & 700 job boards monitored • Questions we can answer – What skills are in demand at present? – Which are the top job boards in a region? – Which is the right Job board for your industry sector? – Which are the most active job advertisers / employers? – Which are the agencies and employers that do not advertise on your job board? #12From Big Data to Smart Data (Semantic Days 2013) May 2013
  • 13. UK Job Market Intelligence #13From Big Data to Smart Data (Semantic Days 2013) May 2013
  • 14. UK Job Market Intelligence • Technology stack – Web mining & focussed crawling – KB construction from open & proprietary data sources – Skills taxonomy (based on DISCO) – Text mining & semantic enrichment – Reconciliation & interlinking – BI reporting & dashboards #14From Big Data to Smart Data (Semantic Days 2013) May 2013
  • 15. UK Job Market Intelligence #15From Big Data to Smart Data (Semantic Days 2013) May 2013
  • 16. UK Job Market Intelligence #16From Big Data to Smart Data (Semantic Days 2013) May 2013
  • 17. UK Job Market Intelligence #17From Big Data to Smart Data (Semantic Days 2013) May 2013
  • 18. Asset Recovery Intelligence System (ARIS) • Support Financial Intelligence Units with tracking stolen assets, fight corruption & money laundering • Questions we can answer – What are the reported activities related to a person? – What is the person’s personal/professional network? – What are corruptions cases reported in regional news? • Data sources – News feeds from major news agencies – Dow Jones data & news feeds – SARs to the FIU – Open data (people & companies, Wikipedia) #18From Big Data to Smart Data (Semantic Days 2013) May 2013
  • 19. Asset Recovery Intelligence System (ARIS) #19From Big Data to Smart Data (Semantic Days 2013) May 2013
  • 20. Asset Recovery Intelligence System (ARIS) • Technology stack – Web Mining – Text mining & semantic enrichment (KIM) – ARIS ontology • People, companies, assets, relations, financial transactions, … – Reconciliation & Interlinking – Triplestore (OWLIM) – Semantic search & exploration UX – BI reporting / factsheets / alerts #20From Big Data to Smart Data (Semantic Days 2013) May 2013
  • 21. Semantic Information Integration & Enrichment #21From Big Data to Smart Data (Semantic Days 2013) May 2013
  • 22. Q & A Thank you! @ontotext #22From Big Data to Smart Data (Semantic Days 2013) May 2013