SANSA ISWC 2017 Talk

Jens Lehmann, Researcher at University of Leipzig
September 2017
Source: LOD-Cloud (http://lod-cloud.net/)
[Figure: iterative processing. Labels: Disk; In-memory; Iteration 1, Iteration 2, ..., Iteration n; Intermediate Dataset (in cluster memory), shown twice; Output.]
| | “Big Data” Processing (Spark/Flink) | Semantic Technology Stack |
|---|---|---|
| Data Integration | Manual pre-processing | Partially automated, standardised |
| Modelling | Simple (often flat feature vectors) | Expressive |
| Support for data exchange | Limited (heterogeneous formats with limited schema information) | Yes (RDF & OWL W3C Standards) |
| Business value | Direct | Indirect |
| Horizontally scalable | Yes | No |
Idea: combine advantages of both worlds
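// Load an N-Triples file into a distributed collection of triples
val graph: TripleRDD = NTripleReader.load(spark, uri)
// Match all triples whose predicate is dbo:influenced (ANY is a wildcard)
graph.find(ANY, URI("http://dbpedia.org/ontology/influenced"), ANY)
// Compute property-usage statistics over the graph
val rdf_stats_prop_dist = PropertyUsage(graph, spark).PostProc()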
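To make the two operations above concrete without a cluster, here is a minimal plain-Scala sketch of wildcard triple matching and a per-predicate usage count on an in-memory collection. It does not use Spark or SANSA. The names Triple, TripleOps, find and propertyUsage are assumptions made up for this illustration, and the three triples are invented sample data.

```scala
// Plain-Scala illustration only: no Spark or SANSA dependency.
// All type and method names here are hypothetical.
final case class Triple(s: String, p: String, o: String)

object TripleOps {
  // None plays the role of the ANY wildcard in a triple pattern.
  def find(graph: Seq[Triple],
           s: Option[String] = None,
           p: Option[String] = None,
           o: Option[String] = None): Seq[Triple] =
    graph.filter { t =>
      s.forall(_ == t.s) && p.forall(_ == t.p) && o.forall(_ == t.o)
    }

  // Count how often each predicate occurs in the graph.
  def propertyUsage(graph: Seq[Triple]): Map[String, Int] =
    graph.groupBy(_.p).map { case (p, ts) => p -> ts.size }
}

object TripleOpsDemo {
  val influenced = "http://dbpedia.org/ontology/influenced"

  // Invented sample data.
  val graph: Seq[Triple] = Seq(
    Triple("ex:A", influenced, "ex:B"),
    Triple("ex:B", influenced, "ex:C"),
    Triple("ex:A", "rdf:type", "ex:Person")
  )

  val matches: Seq[Triple] = TripleOps.find(graph, p = Some(influenced))
  val usage: Map[String, Int] = TripleOps.propertyUsage(graph)
}
```

On a distributed collection the same two steps would be a filter and a grouped count. The sketch only shows the logic, not how SANSA implements it.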
// build an RDD of OWL axioms from a file in Manchester syntax
val rdd = ManchesterSyntaxOWLAxiomsRDDBuilder.build(spark, "file.owl")
// get all subclass-of axioms
val sco = rdd.filter(_.isInstanceOf[OWLSubClassOfAxiom])
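The subclass filter above keeps the general axiom type on the result. The plain-Scala sketch below shows the same filter next to a collect with a partial function, which narrows the element type so the axiom's fields can be used directly. The Axiom hierarchy is a hypothetical stand-in, not the OWL API classes, and the axioms are invented sample data.

```scala
// Hypothetical stand-ins for OWL axiom types; these are NOT the OWL API classes.
sealed trait Axiom
final case class SubClassOf(sub: String, sup: String) extends Axiom
final case class ClassAssertion(cls: String, individual: String) extends Axiom
final case class Declaration(entity: String) extends Axiom

object AxiomFilterDemo {
  // Invented sample data.
  val axioms: Seq[Axiom] = Seq(
    Declaration("ex:Dog"),
    SubClassOf("ex:Dog", "ex:Animal"),
    ClassAssertion("ex:Dog", "ex:rex"),
    SubClassOf("ex:Animal", "ex:LivingThing")
  )

  // filter + isInstanceOf selects the right elements but keeps the type Axiom.
  val untyped: Seq[Axiom] = axioms.filter(_.isInstanceOf[SubClassOf])

  // collect with a partial function narrows the result to Seq[SubClassOf].
  val typed: Seq[SubClassOf] = axioms.collect { case a: SubClassOf => a }

  // With the narrowed type, the superclass field is directly accessible.
  val superClasses: Seq[String] = typed.map(_.sup)
}
```

Spark RDDs also offer a collect variant that takes a partial function, so the typed form should carry over to an RDD of axioms. That has not been checked against the SANSA OWL layer.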
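// Load the N-Triples input as an RDD of triples
val graphRdd = NTripleReader.load(spark, input)
// Partition the triples so that the partitions can be queried as tables
val partitions = RdfPartitionUtilsSpark.partitionGraph(graphRdd)
// Create a rewriter that translates SPARQL into SQL over those partitions
val rewriter = SparqlifyUtils.createSparqlSqlRewriter(spark, partitions)
// Factory for executing SPARQL queries on Spark through the rewriter
val qef = new QueryExecutionFactorySparqlifySpark(spark, rewriter)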
[Figure: SANSA Engine. Labels: RDF Layer (Data Ingestion, Partitioning); Query Layer (Sparqlifying, Distributed Data Structures); Results; Views.]
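The idea behind partitioning triples and then answering SPARQL through SQL can be sketched in plain Scala. The sketch assumes the simplest possible scheme, one two-column (subject, object) table per predicate, which may well differ from what RdfPartitionUtilsSpark actually produces. A two-pattern SPARQL query then becomes a join between two of those tables. All names (Edge, PredicatePartitioning, joinPath) and the sample data are hypothetical.

```scala
// Plain-Scala illustration; no SANSA or Sparqlify API is used.
// Assumption: one (subject, object) table per predicate.
final case class Edge(s: String, p: String, o: String)

object PredicatePartitioning {
  type Tables = Map[String, Seq[(String, String)]]

  // Split the graph into one two-column table per predicate.
  def partition(graph: Seq[Edge]): Tables =
    graph.groupBy(_.p).map { case (p, es) => p -> es.map(e => (e.s, e.o)) }

  // Answers: SELECT ?x ?z WHERE { ?x <p1> ?y . ?y <p2> ?z }
  // as a join of table p1 (object column) with table p2 (subject column).
  def joinPath(tables: Tables, p1: String, p2: String): Seq[(String, String)] = {
    val left = tables.getOrElse(p1, Seq.empty)
    val rightBySubject = tables.getOrElse(p2, Seq.empty).groupBy(_._1)
    for {
      (x, y) <- left
      (_, z) <- rightBySubject.getOrElse(y, Seq.empty)
    } yield (x, z)
  }
}

object PredicatePartitioningDemo {
  // Invented sample data.
  val graph: Seq[Edge] = Seq(
    Edge("ex:a", "ex:knows", "ex:b"),
    Edge("ex:b", "ex:worksAt", "ex:org1"),
    Edge("ex:a", "ex:worksAt", "ex:org2"),
    Edge("ex:c", "ex:knows", "ex:a")
  )

  val tables: PredicatePartitioning.Tables = PredicatePartitioning.partition(graph)
  val result: Seq[(String, String)] =
    PredicatePartitioning.joinPath(tables, "ex:knows", "ex:worksAt")
}
```

Only the tables for the predicates named in the query are touched, which is the main appeal of predicate-based partitioning for selective queries.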
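// Load the RDF graph from disk
val graph = RDFGraphLoader.loadFromDisk(spark, uri)
// Forward-chaining reasoner using the OWL Horst rule set
val reasoner = new ForwardRuleReasonerOWLHorst(spark.sparkContext)
// Apply the rules to materialise the inferred triples
val inferredGraph = reasoner.apply(graph)
// Write the inferred graph back to disk
RDFGraphWriter.writeToDisk(inferredGraph, output)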
[Figure: RDFS rule dependency graph (simplified)]
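Forward-chaining inference can be illustrated with a small plain-Scala fixpoint loop. The sketch applies just two RDFS entailment rules, rdfs11 (transitivity of rdfs:subClassOf) and rdfs9 (type inheritance along rdfs:subClassOf), and repeats until no new triples appear. It is not SANSA's reasoner and does not cover the OWL Horst rule set. Fact, ForwardChaining and saturate are hypothetical names, and the input triples are invented.

```scala
import scala.annotation.tailrec

// Plain-Scala illustration of forward chaining; not the SANSA implementation.
final case class Fact(s: String, p: String, o: String)

object ForwardChaining {
  val SubClassOf = "rdfs:subClassOf"
  val Type = "rdf:type"

  // One round of rule application over the current set of facts.
  private def step(facts: Set[Fact]): Set[Fact] = {
    val subClass = facts.filter(_.p == SubClassOf)
    // rdfs11: (A subClassOf B), (B subClassOf C) => (A subClassOf C)
    val rdfs11 = for {
      ab <- subClass
      bc <- subClass if ab.o == bc.s
    } yield Fact(ab.s, SubClassOf, bc.o)
    // rdfs9: (x type A), (A subClassOf B) => (x type B)
    val rdfs9 = for {
      ty <- facts if ty.p == Type
      sc <- subClass if ty.o == sc.s
    } yield Fact(ty.s, Type, sc.o)
    facts ++ rdfs11 ++ rdfs9
  }

  // Repeat until a round derives nothing new (fixpoint).
  @tailrec
  def saturate(facts: Set[Fact]): Set[Fact] = {
    val next = step(facts)
    if (next.size == facts.size) facts else saturate(next)
  }
}

object ForwardChainingDemo {
  // Invented sample data.
  val input: Set[Fact] = Set(
    Fact("ex:Dog", ForwardChaining.SubClassOf, "ex:Mammal"),
    Fact("ex:Mammal", ForwardChaining.SubClassOf, "ex:Animal"),
    Fact("ex:rex", ForwardChaining.Type, "ex:Dog")
  )
  val inferred: Set[Fact] = ForwardChaining.saturate(input)
}
```

Because rdfs9 consumes the subclass triples that rdfs11 produces, running rdfs11 to completion first would let rdfs9 finish in a single pass. A rule dependency graph exposes this kind of ordering.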
Visit our demo at 6pm!
Web: http://sansa-stack.net
Twitter: @SANSA_Stack
Github: https://github.com/SANSA-Stack
Mail: sansa-stack@googlemail.com