A Semantic Big Data
Companion
Stefano Bortoli
bortoli@okkam.it
Flavio Pompermaier
pompermaier@okkam.it
The company (briefly)
• Okkam is
– an SME based in Trento, Italy.
– Started as a spin-off of the
University of Trento and FBK (2010)
• Okkam's core business is
– large-scale data integration using
semantic technologies and
an Entity Name System
• Okkam's sectors of operation
– Services for public administration
– Services for restaurants (and more)
– Research projects
• FP7, H2020, and Local agencies
Who we are
• Stefano Bortoli, PhD
– works as technical director and researcher at Okkam S.R.L.
(Trento, Italy). His research and development interests are in the
area of Information Integration, with a special focus on
entity-centric applications exploiting semantic technologies.
• Flavio Pompermaier, MSc.
– works as senior software engineer at Okkam S.R.L. (Trento, Italy).
Flavio is a passionate developer working with state-of-the-art
technologies, combining semantic and big data technologies.
What we do
Why we need Flink
Entiton data model (diagram; labels only)
• Database record, RDF statement
• Triplestore, NoSQL & Index
• Quad: provenance IRI, predicate, object, object type
• Subject local IRI, Subject ENS IRI, RDF Type
• Expensive data warehouse
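The Entiton diagram labels can be read as a record layout. The following is a minimal sketch under that reading, not Okkam's actual implementation: it assumes an Entiton groups all quads about one subject, and the class names, field names, the `build_entitons` helper, and the `example.org` IRIs are hypothetical, chosen only to mirror the slide labels.

```python
from dataclasses import dataclass, field
from typing import Optional

# Standard RDF predicate used to state the type of a subject.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


@dataclass(frozen=True)
class Quad:
    """One statement about a subject (field names follow the slide labels)."""
    provenance_iri: str  # source the statement came from
    predicate: str
    object: str
    object_type: str     # assumed encoding, e.g. "literal" or "iri"


@dataclass
class Entiton:
    """Hypothetical container: all quads about a single subject."""
    subject_local_iri: str
    subject_ens_iri: Optional[str] = None  # would be assigned by the Entity Name System
    rdf_types: set = field(default_factory=set)
    quads: list = field(default_factory=list)


def build_entitons(statements):
    """Group (subject, predicate, object, object_type, provenance) tuples by subject."""
    entitons = {}
    for subject, predicate, obj, obj_type, provenance in statements:
        ent = entitons.setdefault(subject, Entiton(subject_local_iri=subject))
        if predicate == RDF_TYPE:
            ent.rdf_types.add(obj)  # type statements are kept apart from other quads
        else:
            ent.quads.append(Quad(provenance, predicate, obj, obj_type))
    return entitons


statements = [
    ("http://example.org/r/1", RDF_TYPE, "http://example.org/Restaurant", "iri", "http://example.org/src/a"),
    ("http://example.org/r/1", "http://example.org/name", "Da Mario", "literal", "http://example.org/src/a"),
    ("http://example.org/r/2", "http://example.org/name", "Bella", "literal", "http://example.org/src/b"),
]
entitons = build_entitons(statements)
```

Grouping by subject keeps everything known about one entity in a single record, which is the shape a record-oriented batch job can process without joins.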
Why we are here
• We want to build and manage (very) large
entity-centric knowledge bases
• We have endorsed Flink as our data processing framework
since it was called Stratosphere (during the DOPA FP7 project)
• Our use cases for Apache Flink:
– Domain reasoning (Flink + Parquet + Thrift)
– RDF data lifecycle (Flink + Parquet + Jena/Sesame)
– RDF data intelligence (Flink + ELKiBi)
– Duplicate record detection (Flink + HBase + Solr)
– Entiton Record linkage (Flink + MongoDB + Kryo)
– Telemetry analysis (Flink + MongoDB + Weka)
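To make one of the use cases above concrete, duplicate record detection is commonly done by blocking (bucketing records under a cheap key) and then comparing only the pairs inside each bucket. The sketch below is plain Python rather than a Flink job and does not describe Okkam's pipeline; the blocking key, the similarity measure, the threshold, and the sample records are all assumptions made for illustration.

```python
from collections import defaultdict
from difflib import SequenceMatcher
from itertools import combinations


def blocking_key(record):
    """Cheap bucket key (assumed here: first three letters of the name)."""
    return record["name"].strip().lower()[:3]


def similarity(a, b):
    """String similarity in [0, 1] between the two record names."""
    return SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()


def find_duplicates(records, threshold=0.85):
    """Return id pairs of records in the same block whose names are similar enough."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[blocking_key(rec)].append(rec)

    pairs = []
    for block in blocks.values():
        # Only records sharing a blocking key are compared pairwise.
        for a, b in combinations(block, 2):
            if similarity(a, b) >= threshold:
                pairs.append((a["id"], b["id"]))
    return pairs


records = [
    {"id": 1, "name": "Trattoria Da Mario"},
    {"id": 2, "name": "Trattoria da Mario "},
    {"id": 3, "name": "Pizzeria Bella"},
]
duplicates = find_duplicates(records)
```

Blocking avoids comparing every record with every other one, so the pairwise step stays tractable; in a distributed engine the same idea maps naturally onto grouping by the blocking key and comparing within each group.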
Come to our session!
• We are the last to present, don’t leave us ALONE!
• We are hiring! (maybe ;-)

Flink Case Study: OKKAM
