
SANSA ISWC 2017 Talk

This talk, presented at ISWC (International Semantic Web Conference) 2017 in Vienna, describes SANSA, a software framework for distributed in-memory analytics ("Big Data") based on the semantic technology stack.


  1. September 2017
  2. Source: LOD-Cloud (http://lod-cloud.net/)
  3. [Figure labels: Disk; In-memory; Iteration 1, Iteration 2, ..., Iteration n; Intermediate Dataset (in cluster memory); Output]
  5. | Aspect | “Big Data” Processing (Spark/Flink) | Semantic Technology Stack |
     | --- | --- | --- |
     | Data integration | Manual pre-processing | Partially automated, standardised |
     | Modelling | Simple (often flat feature vectors) | Expressive |
     | Support for data exchange | Limited (heterogeneous formats with limited schema information) | Yes (RDF & OWL W3C Standards) |
     | Business value | Direct | Indirect |
     | Horizontally scalable | Yes | No |

     Idea: combine advantages of both worlds
  6.     val graph: TripleRDD = NTripleReader.load(spark, uri)
         graph.find(ANY, URI("http://dbpedia.org/ontology/influenced"), ANY)
         val rdf_stats_prop_dist = PropertyUsage(graph, spark).PostProc()
  8.     val rdd = ManchesterSyntaxOWLAxiomsRDDBuilder.build(spark, "file.owl")
         // get all subclass-of axioms
         val sco = rdd.filter(_.isInstanceOf[OWLSubClassOfAxiom])
  9.     val graphRdd = NTripleReader.load(spark, input)
         val partitions = RdfPartitionUtilsSpark.partitionGraph(graphRdd)
         val rewriter = SparqlifyUtils.createSparqlSqlRewriter(spark, partitions)
         val qef = new QueryExecutionFactorySparqlifySpark(spark, rewriter)

     [Figure labels: SANSA Engine; RDF Layer; Data Ingestion; Partitioning; Query Layer; Sparqlifying; Distributed Data Structures; Results; Views]
  11.    val graph = RDFGraphLoader.loadFromDisk(spark, uri)
         val reasoner = new ForwardRuleReasonerOWLHorst(spark.sparkContext)
         val inferredGraph = reasoner.apply(graph)
         RDFGraphWriter.writeToDisk(inferredGraph, output)

     [Figure: RDFS rule dependency graph (simplified)]
  14. Visit our demo at 6pm!
  17. Web: http://sansa-stack.net
      Twitter: @SANSA_Stack
      Github: https://github.com/SANSA-Stack
      Mail: sansa-stack@googlemail.com
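
The triple-pattern lookup and the property-usage statistic shown on slide 6 can be illustrated on an ordinary in-memory collection. The following is a minimal sketch in plain Scala that uses neither Spark nor the SANSA API: `Triple`, `TriplePatterns`, `find`, `propertyUsage`, and the `example.org` data are all hypothetical names introduced here for illustration, and `None` stands in for the `ANY` wildcard from the slide.

```scala
// Plain-Scala sketch: no Spark, no SANSA. All names here are illustrative assumptions.
final case class Triple(s: String, p: String, o: String)

object TriplePatterns {
  // Return the triples matching a pattern; None acts as a wildcard (ANY on the slide).
  def find(graph: Seq[Triple],
           s: Option[String] = None,
           p: Option[String] = None,
           o: Option[String] = None): Seq[Triple] =
    graph.filter { t =>
      s.forall(_ == t.s) && p.forall(_ == t.p) && o.forall(_ == t.o)
    }

  // Count how many triples use each predicate (a property-usage distribution).
  def propertyUsage(graph: Seq[Triple]): Map[String, Int] =
    graph.groupBy(_.p).map { case (pred, ts) => pred -> ts.size }
}

object TriplePatternsExample {
  val influenced = "http://dbpedia.org/ontology/influenced"
  val label = "http://www.w3.org/2000/01/rdf-schema#label"

  // Hypothetical toy data, not taken from DBpedia.
  val graph: Seq[Triple] = Seq(
    Triple("http://example.org/A", influenced, "http://example.org/B"),
    Triple("http://example.org/B", influenced, "http://example.org/C"),
    Triple("http://example.org/A", label, "\"A\"")
  )

  // Equivalent in spirit to graph.find(ANY, URI(influenced), ANY) on the slide.
  val influencedTriples: Seq[Triple] =
    TriplePatterns.find(graph, p = Some(influenced))

  val usage: Map[String, Int] = TriplePatterns.propertyUsage(graph)
}
```

In a distributed setting the same `filter` and group-and-count steps would run over partitions of the data rather than a local `Seq`; how SANSA implements them internally is not stated on the slides.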

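
Slide 11 applies a forward rule reasoner and refers to an RDFS rule dependency graph. As a rough illustration of forward chaining, the sketch below repeatedly applies two standard RDFS entailment rules (rdfs11, transitivity of `rdfs:subClassOf`, and rdfs9, type propagation along `rdfs:subClassOf`) until no new statements appear. It is not the OWL Horst rule set and not SANSA's implementation: `Stmt`, `ForwardChaining`, `step`, `saturate`, and the `ex:` data are hypothetical names assumed for this example, and everything runs on a local `Set` rather than on Spark.

```scala
// Plain-Scala sketch of forward chaining with two RDFS rules.
// Assumption: names and data are illustrative; this is not the SANSA reasoner.
final case class Stmt(s: String, p: String, o: String)

object ForwardChaining {
  val Type = "rdf:type"
  val SubClassOf = "rdfs:subClassOf"

  // Apply both rules once and return the input plus everything derived.
  def step(g: Set[Stmt]): Set[Stmt] = {
    val sco = g.filter(_.p == SubClassOf)
    // rdfs11: (a subClassOf b), (b subClassOf c) => (a subClassOf c)
    val rdfs11 = for { x <- sco; y <- sco if x.o == y.s }
      yield Stmt(x.s, SubClassOf, y.o)
    // rdfs9: (x type a), (a subClassOf b) => (x type b)
    val rdfs9 = for { t <- g if t.p == Type; c <- sco if t.o == c.s }
      yield Stmt(t.s, Type, c.o)
    g ++ rdfs11 ++ rdfs9
  }

  // Iterate until a fixpoint is reached, i.e. a round derives nothing new.
  @annotation.tailrec
  def saturate(g: Set[Stmt]): Set[Stmt] = {
    val next = step(g)
    if (next == g) g else saturate(next)
  }
}

object ForwardChainingExample {
  // Hypothetical toy ontology and instance data.
  val input: Set[Stmt] = Set(
    Stmt("ex:rex", ForwardChaining.Type, "ex:Dog"),
    Stmt("ex:Dog", ForwardChaining.SubClassOf, "ex:Mammal"),
    Stmt("ex:Mammal", ForwardChaining.SubClassOf, "ex:Animal")
  )

  val inferred: Set[Stmt] = ForwardChaining.saturate(input)
}
```

The fixpoint loop terminates because each round can only add statements built from terms already in the graph, so the set of derivable statements is finite. A distributed reasoner would typically replace the nested loops with joins over partitioned data.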