Our colleague Yuri Glikman of Fraunhofer FOKUS (LinDA partner) presented the LinDA transformation tool at the recent Samos Summit (http://samos-summit.blogspot.de/).
Sigma EE: Reaping low-hanging fruits in RDF-based data integration (Richard Cyganiak)
A presentation I gave at I-Semantics 2010 on Sigma EE, an RDF-based data integration front-end.
Sigma EE is now available for download here: http://sig.ma/?page=help
Enabling Low-cost Open Data Publishing and Reuse (Marin Dimitrov)
In the space of just a few years we’ve seen the transformational power of open data: transparency and accountability from public data, and efficiency and innovation from private-sector data. In its first year, institutions and individuals throughout Europe have supported public sector bodies in releasing data, and numerous start-ups, developers and SMEs in reusing this data for economic benefit.
However, we are still at the beginning of the open data movement, and there is still more that can be done to make open data simpler to use and to make it available to a wider audience.
The core goal of the DaPaaS project is to provide a Data- and Platform-as-a-Service environment, where 3rd parties (such as governmental organisations, SMEs, developers and larger companies) can publish and host both data sets and data-intensive applications, which can then be accessed by end-user applications in a cross-platform manner. You can find out more about DaPaaS on the detailed about page.
Essentially, DaPaaS aims to make publishing, consuming and reusing open data, as well as deploying open data applications, easier and cheaper for SMEs and small public bodies that may otherwise lack the technical expertise, infrastructure and resources required to do so.
See also http://www.slideshare.net/eswcsummerschool/wed-roman-tutopendatapub-38742186
On-Demand RDF Graph Databases in the Cloud (Marin Dimitrov)
Slides from the S4 webinar "On-Demand RDF Graph Databases in the Cloud".
RDF database-as-a-service running on the Self-Service Semantic Suite (S4) platform: http://s4.ontotext.com
A video recording of the talk is available at http://info.ontotext.com/on-demand-rdf-graph-database
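For a flavour of what "RDF database-as-a-service" means in practice, here is a minimal sketch of querying a hosted SPARQL endpoint with Python's SPARQLWrapper. The endpoint URL is a placeholder, not the actual S4 API; consult the S4 documentation for real URLs.

```python
# Hedged sketch: query a hosted RDF repository over SPARQL.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://rdf.example.org/repositories/demo"  # placeholder, not the real S4 URL

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery("""
    SELECT ?s ?p ?o
    WHERE { ?s ?p ?o }
    LIMIT 10
""")
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
for binding in results["results"]["bindings"]:
    print(binding["s"]["value"], binding["p"]["value"], binding["o"]["value"])
```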
Text Analytics & Linked Data Management As-a-Service (Marin Dimitrov)
Slides from the talk "Text Analytics & Linked Data Management As-a-Service with S4" at the ESWC 2015 workshop on Semantic Web Enterprise Adoption & Best Practices.
Full paper available at http://2015.wasabi-ws.org/papers/wasabi15_1.pdf
Slides from our talk "Low-Cost Open Data as-a-service" at the Semantic Web Developers workshop of ESWC 2015 (full paper: http://ceur-ws.org/Vol-1361/paper7.pdf).
What is Connected Data as a concept? Who is interested in Connected Data? What problems does Connected Data solve? What skills are used in Connected Data?
As of July 2017, Connected Data has been running for over a year, with a very successful conference and 9 meetups held to date on a range of topics. These have included knowledge representation, semantics, Linked Data, graph databases and ontology development, plus use cases and industry verticals including recommendations, telecoms and finance. Yet the group has never had a particularly formal terms of reference or a description defining what Connected Data actually means. Some would say this is something of an irony for a group so focused on semantics, schemas, definitions and structure!
This is an attempt (with some humour, and something of a journey included in it) to arrive at something resembling a definition and terms of reference for the group.
Advanced Analytics and Machine Learning with Data Virtualization (Denodo)
Watch here: https://bit.ly/3719Bi7
Advanced data science techniques, like machine learning, have proven extremely useful for deriving valuable insights from existing data. Platforms like Spark and sophisticated libraries for R, Python and Scala put advanced techniques at data scientists' fingertips. However, these data scientists spend most of their time looking for the right data and massaging it into a usable format. Data virtualization offers a new alternative for addressing these issues in a more efficient and agile way.
Attend this webinar and learn:
- How data virtualization can accelerate data acquisition and massaging, providing the data scientist with a powerful tool to complement their practice
- How popular tools from the data science ecosystem (Spark, Python, Zeppelin, Jupyter, etc.) integrate with Denodo
- How you can use the Denodo Platform with large data volumes in an efficient way
- About the success McCormick has had as a result of seasoning the Machine Learning and Blockchain landscape with data virtualization
DSpace-CRIS slides presented at ORCID's Better Together webinar on 19.09.2019; the full slide deck with the ORCID introduction is at https://doi.org/10.23640/07243.9884033.v2.
Video Recording available at https://vimeo.com/361523018
In this webinar Thomas Cook, Sales Director, AnzoGraph DB, provides a history lesson on the origins of SPARQL, including its roots in the Semantic Web, and how linked open data is used to create Knowledge Graphs. Then, he dives into "What is RDF?", "What is a URI?" and "What is SPARQL?", wrapping up with a real-world demonstration via a Zeppelin notebook.
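As a flavour of the webinar's three questions, here is a minimal, self-contained sketch using Python's rdflib (names and namespaces are illustrative): an RDF triple uses URIs to identify things, and SPARQL queries the resulting graph.

```python
# Minimal sketch: one typed resource, one literal, one SPARQL query.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, FOAF

g = Graph()
ex = Namespace("http://example.org/")

# One RDF statement per line: subject (URI), predicate (URI), object.
g.add((ex.alice, RDF.type, FOAF.Person))        # "What is RDF?" / "What is a URI?"
g.add((ex.alice, FOAF.name, Literal("Alice")))

# "What is SPARQL?" A query over the in-memory graph.
q = """
    PREFIX foaf: <http://xmlns.com/foaf/0.1/>
    SELECT ?person ?name
    WHERE { ?person a foaf:Person ; foaf:name ?name . }
"""
for row in g.query(q):
    print(row.person, row.name)
```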
It’s All About The Cards: Sharing on Social Media Encouraged HTML Metadata G... (Shawn Jones)
In a perfect world, all articles consistently contain sufficient metadata to describe the resource. We know this is not the reality, so we are motivated to investigate the evolution of the metadata that is present when authors and publishers supply their own. Because applying metadata takes time, we recognize that each news article author has a limited metadata budget with which to spend their time and effort. How are they spending this budget? What are the top metadata categories in use? How did they grow over time? What purpose do they serve? We also recognize that not all metadata fields are used equally. What is the growth of individual fields over time? Which fields experienced the fastest adoption? In this paper, we review 227,726 HTML news articles from 29 outlets captured by the Internet Archive between 1998 and 2016. Upon reviewing the metadata fields in each article, we discovered that 2010 began a metadata renaissance as publishers embraced metadata for improved search engine ranking, search engine tracking, social media tracking, and social media sharing. When analyzing individual fields, we find that one application of metadata stands out above all others: social cards -- the cards generated by platforms like Twitter when one shares a URL. Once a metadata standard was established for cards in 2010, its fields were adopted by 20% of articles in the first year and reached more than 95% adoption by 2016. This rate of adoption surpasses efforts like schema.org and Dublin Core by a fair margin. When confronted with these results on how news publishers spend their metadata budget, we must conclude that it is all about the cards.
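To make "card metadata" concrete, here is a small Python sketch that collects Twitter Card and Open Graph meta fields from an article's HTML; the sample markup is invented, not drawn from the paper's corpus.

```python
# Sketch: harvest social-card <meta> fields from article HTML.
from bs4 import BeautifulSoup

html = """
<head>
  <meta name="twitter:card" content="summary_large_image">
  <meta name="twitter:title" content="Example headline">
  <meta property="og:title" content="Example headline">
  <meta property="og:image" content="https://example.org/lead.jpg">
</head>
"""

soup = BeautifulSoup(html, "html.parser")
card_fields = {}
for tag in soup.find_all("meta"):
    # Twitter Cards use name="twitter:*"; Open Graph uses property="og:*".
    key = tag.get("name") or tag.get("property")
    if key and (key.startswith("twitter:") or key.startswith("og:")):
        card_fields[key] = tag.get("content")

print(card_fields)
```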
GraphDB Cloud: Enterprise Ready RDF Database on Demand (Ontotext)
GraphDB Cloud is an enterprise-grade RDF graph database providing high-performance querying over large volumes of RDF data. In this webinar, Ontotext demonstrates how to instantly create and deploy a fully managed graph database, then import and query data with the (OpenRDF) GraphDB Workbench, and finally explore and visualize data with the built-in visualization tools.
Manage traceability with Apache Atlas, a flexible metadata repository (Synaltic Group)
Do you know where your data is?
Do you know who is responsible for a specific dataset?
Do you know which application or task last modified an entity last Friday?
Apache Atlas helps you manage all of your data's metadata. With Apache Atlas you can trace the lineage between your datasets and the processes that use them.
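As a hedged illustration, Atlas exposes lineage per entity GUID through its v2 REST API; the host, credentials and GUID below are placeholders, and the exact response shape may vary by Atlas version, so check your version's API docs.

```python
# Sketch: fetch upstream/downstream lineage for one entity from Apache Atlas.
import requests

ATLAS = "http://atlas.example.org:21000"               # placeholder host
GUID = "00000000-0000-0000-0000-000000000000"          # placeholder entity GUID

resp = requests.get(
    f"{ATLAS}/api/atlas/v2/lineage/{GUID}",
    params={"direction": "BOTH", "depth": 3},
    auth=("admin", "admin"),                           # placeholder credentials
)
resp.raise_for_status()
lineage = resp.json()

# Each relation links an upstream entity to a downstream one.
for edge in lineage.get("relations", []):
    print(edge["fromEntityId"], "->", edge["toEntityId"])
```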
Enhancing Interoperability: The Implementation of OpenAIRE Guidelines and COA... (4Science)
ABSTRACT: The continuous work of the OpenAIRE community on guidelines for CRIS managers, literature repositories, and data archives, together with the publication of the “Behaviours and Technical Recommendations of the COAR Next Generation Repositories Working Group”, are raising important challenges for the CRIS and repository communities, which are working together to make research information more and more interoperable and, hopefully, open. The recommendations of the Open Science Policy Platform, published by the European Commission, identify FAIR (Findable, Accessible, Interoperable, Reusable) data among their priorities. In an interoperable world, all these indications lead in a common direction, where implementers are encouraged to use open protocols such as OAI-PMH and ResourceSync, open standards such as CERIF, and persistent identifiers such as DOIs and ORCID iDs to make this happen. The presentation will go through these challenges, illustrating how CRIS and repository managers should work together toward a successful information exchange, and exemplifying how a single free open platform, DSpace-CRIS, can implement both a CRIS and a repository and fulfill the requirements for a FAIR environment for research information and research objects.
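Of the open protocols mentioned, OAI-PMH is a plain HTTP/XML interface, so a harvest can be sketched in a few lines of Python; the base URL below is a placeholder for any OpenAIRE-compliant repository endpoint.

```python
# Sketch: issue an OAI-PMH ListRecords request in Dublin Core format.
import requests
import xml.etree.ElementTree as ET

BASE_URL = "https://repository.example.org/oai"  # placeholder endpoint

resp = requests.get(BASE_URL, params={
    "verb": "ListRecords",
    "metadataPrefix": "oai_dc",
})
resp.raise_for_status()

root = ET.fromstring(resp.content)

# Print the Dublin Core title of each harvested record.
for record in root.iter("{http://www.openarchives.org/OAI/2.0/}record"):
    for title in record.iter("{http://purl.org/dc/elements/1.1/}title"):
        print(title.text)
```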
The Business Case for Semantic Web Ontology & Knowledge Graph (Cambridge Semantics)
In this webinar, Mark Wallace, Ontologist & Developer at Semantic Arts, and Thomas Cook, Director of Sales for AnzoGraph DB at Cambridge Semantics, explore the benefits of building a semantic knowledge graph with RDF*, wrapping up with an airline data demo that illustrates the value of schema, inference and reasoning.
LinDA (Linked Data Analytics) project general presentation (Salvatore Virtuoso)
The LinDA project addresses one of the most significant challenges in the usage and publication of Linked Data: the renovation and conversion of existing data formats into structures that support the semantic enrichment and interlinking of data. The set of tools provided by LinDA will assist enterprises, especially SMEs, which often cannot afford the development and maintenance of dedicated information analysis and management departments, in efficiently developing novel data-analytics services linked to the available public data, thereby helping to improve their competitiveness and stimulating the emergence of innovative business models.
How Google is using Linked Data today and a vision for tomorrow (Vasu Jain)
In this presentation, I discuss how modern search engines, such as Google, make use of Linked Data spread in Web pages for displaying Rich Snippets. I also present an example of the technology and analyze its current uptake.
I then sketch some ideas on how Rich Snippets could be extended in the future, in particular for multimedia documents.
Original Paper :
http://scholar.google.com/citations?view_op=view_citation&hl=en&user=K3TsGbgAAAAJ&authuser=1&citation_for_view=K3TsGbgAAAAJ:u-x6o8ySG0sC
Another Presentation by Author: https://docs.google.com/present/view?id=dgdcn6h3_185g8w2bdgv&pli=1
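Rich Snippets are driven by structured markup embedded in pages. At the time of this talk that largely meant RDFa and microdata; today schema.org JSON-LD is the dominant carrier. Here is a minimal Python sketch of extracting such a block (the sample markup is invented):

```python
# Sketch: pull schema.org JSON-LD blocks, the kind of data behind rich snippets.
import json
from bs4 import BeautifulSoup

html = """
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Recipe",
 "name": "Pancakes", "aggregateRating": {"ratingValue": "4.5"}}
</script>
"""

soup = BeautifulSoup(html, "html.parser")
for script in soup.find_all("script", type="application/ld+json"):
    data = json.loads(script.string)
    print(data.get("@type"), ":", data.get("name"))
```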
Big Data in Action – Real-World Solution Showcase (Inside Analysis)
The Briefing Room with Radiant Advisors and IBM
Live Webcast on February 25, 2014
Watch the archive: https://bloorgroup.webex.com/bloorgroup/lsr.php?RCID=53c9b7fa2000f98f5b236747e3602511
The power of Big Data depends heavily upon the context in which it's used, and most organizations are just beginning to figure out where, how and when to leverage it. One key to success is integration with existing information systems, many of which still rely on relational database technologies. Finding ways to blend these two worlds can help companies generate measurable business value in fairly short order.
Register for this episode of The Briefing Room to hear Analysts Lindy Ryan and John O'Brien as they explain how the combination of traditional Business Intelligence with Big Data Analytics can provide game-changing results in today's information economy. They'll be briefed by Eric Poulin and Paul Flach of Stream Integration who will share best practices for designing and implementing Big Data solutions. They'll discuss the components of IBM BigInsights, and explain how BigSheets can empower non-technical users who need to explore self-structured data.
Visit InsideAnalysis.com for more information.
20140902 LinDA Workshop Semantics2014 - LinDA Project Overview (LinDa_FP7)
LinDa Project presentation - Challenges, tools, workplan and objectives
Presentation at LinDA Workshop on 2nd September 2014 at Semantics2014 by Spiros Mouzakitis
The PlanetData project was presented by Elena Simperl and Barry Norton from Karlsruhe Institute of Technology at the 1st International Symposium on Data-driven Process Discovery and Analysis on June 30, 2011 in Campione d’Italia, Italy.
Data Integration for Big Data (OOW 2016, Co-Presented With Oracle) (Rittman Analytics)
A set of product roadmap and capabilities slides from Oracle Data Integration Product Management, plus thoughts on data integration in big data implementations by Mark Rittman (Independent Analyst).
“Publishing and Consuming Linked Data. (Lessons learnt when using LOD in an a... (Marta Villegas)
Talk given at the "1st Summer Datathon on Linguistic Linked Open Data (SD-LLOD-15)"
In this talk we describe our experience publishing and, more crucially, consuming Linked Data at the Spanish CLARIN Knowledge Centre (http://lod.iula.upf.edu). The centre includes a catalogue of NLP resources and tools which aims to promote the use of language technology among researchers in the humanities and social sciences. Though the original data set followed an XML/XSD schema, it was rewritten in accordance with the LOD approach in order to maximize the information contained in our repositories and to be able to enrich the data there.
We address some critical aspects of RDFying XSD/XML data, focusing on the strategy followed when mapping controlled vocabularies expressed as XML enumerations; when dealing with certain unstructured data (where input strings may generate relevant instances); and when addressing identity resolution and linking tasks once the eventual instances are RDFied. We also report on data cleansing, a crucial and unavoidable task which we addressed as an incremental process in which SPARQL played an important role. We will see that some of the decisions taken depend on the eventual application we have in mind. The requirements of our catalogue (implemented as a web browser) include: displaying data to the user in a comprehensive way; aggregating external data in a sensible manner; and making hidden implicit relations explicit. In addition, the system needs to provide fresh data (regularly updated) with quick response times.
Finally, we report on our experiences with data integration and enrichment (via data mashups). We experimented with different strategies (e.g. using external URIs vs. caching local data) and faced different problems (time latency, dereferencing external URIs) that may be useful to share.
The Best of Both Worlds: Unlocking the Power of (big) Knowledge Graphs with S... (Gezim Sejdiu)
Over the past decade, vast amounts of machine-readable structured information have become available through the automation of research processes as well as the increasing popularity of knowledge graphs and semantic technologies.
A major and yet unsolved challenge that research faces today is to perform scalable analysis of large scale knowledge graphs in order to facilitate applications like link prediction, knowledge base completion, and question answering.
Most machine learning approaches which scale horizontally (i.e. can be executed in a distributed environment) work on simpler feature-vector-based input rather than more expressive knowledge structures.
On the other hand, the learning methods which exploit expressive structures, e.g. Statistical Relational Learning and Inductive Logic Programming approaches, usually do not scale well to very large knowledge bases owing to their computational complexity.
This talk gives an overview of the ongoing project Semantic Analytics Stack (SANSA), which aims to bridge this research gap by creating an out-of-the-box library for scalable, in-memory, structured learning.
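SANSA itself is a Scala library built on Apache Spark; as a rough, standalone illustration of the horizontal-scaling idea (not SANSA's API), the following PySpark fragment loads an N-Triples file and derives a simple per-vertex feature. The file name is a placeholder.

```python
# Sketch: distributed knowledge-graph feature extraction with plain PySpark.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("kg-features").getOrCreate()

# Each N-Triples line is "<s> <p> <o> ."; splitting on spaces is a
# simplification that ignores literals containing whitespace.
lines = spark.read.text("triples.nt")  # placeholder path
triples = lines.select(
    F.split(F.col("value"), " ").getItem(0).alias("s"),
    F.split(F.col("value"), " ").getItem(1).alias("p"),
    F.split(F.col("value"), " ").getItem(2).alias("o"),
)

# Out-degree per subject: a trivial feature-vector input for downstream ML.
degrees = triples.groupBy("s").agg(F.count("*").alias("out_degree"))
degrees.show(10, truncate=False)
```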
From http://www.csdn.net/article/2015-12-17/2826501
"Databricks co-founder and Spark chief architect Reynold Xin (辛湜): Spark's development: looking back at 2015, looking ahead to 2016"
Reynold Xin explained that Spark's goal is a "unified engine across data workloads and platforms". Asked about Spark's biggest change in 2015, he felt it was the addition of the DataFrames API. Regarding the Spark ecosystem, he said the focus is on three different directions: applications on top, the environments underneath, and, most importantly, the data sources Spark connects to.
How Data Virtualization Adds Value to Your Data Science Stack (Denodo)
Watch here: https://bit.ly/3cZGCxr
For their machine learning and data science projects to be successful, data scientists need access to all of the enterprise data, delivered through a myriad of data models. However, gaining access to all of that data, integrated into a central repository, has been a challenge; often 80% of project time is spent on these tasks. A virtual layer can help the data scientist speed up some of the most tedious tasks, like data exploration and analysis, while integrating well with the data science ecosystem: there is no need to change tools or learn new languages. The data virtualization platform helps data scientists offload these data integration tasks, allowing them to focus on advanced analytics.
In this session, you will learn how data virtualization:
- Provides all of the enterprise data, in real-time, and without replication
- Enables data scientists to create and share multiple logical models using simple drag and drop
- Provides a catalog of all business definitions, lineage, and relationships
Learn SQL from basic queries to advanced queries (manishkhaire30)
Dive into the world of data analysis with our comprehensive guide on mastering SQL! This presentation offers a practical approach to learning SQL, focusing on real-world applications and hands-on practice. Whether you're a beginner or looking to sharpen your skills, this guide provides the tools you need to extract, analyze, and interpret data effectively.
Key Highlights:
Foundations of SQL: Understand the basics of SQL, including data retrieval, filtering, and aggregation.
Advanced Queries: Learn to craft complex queries to uncover deep insights from your data.
Data Trends and Patterns: Discover how to identify and interpret trends and patterns in your datasets.
Practical Examples: Follow step-by-step examples to apply SQL techniques in real-world scenarios.
Actionable Insights: Gain the skills to derive actionable insights that drive informed decision-making.
Join us on this journey to enhance your data analysis capabilities and unlock the full potential of SQL. Perfect for data enthusiasts, analysts, and anyone eager to harness the power of data!
#DataAnalysis #SQL #LearningSQL #DataInsights #DataScience #Analytics
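For a runnable taste of the basics-to-advanced arc described above, the sketch below uses Python's built-in sqlite3 with an invented table: simple filtered retrieval first, then an aggregation.

```python
# Sketch: SQL fundamentals (filtering) then aggregation, via sqlite3.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("north", 120.0), ("north", 80.0), ("south", 200.0)])

# Basic retrieval with filtering.
for row in conn.execute("SELECT * FROM sales WHERE amount > 100"):
    print(row)

# Aggregation: total per region, largest first.
query = """
    SELECT region, SUM(amount) AS total
    FROM sales
    GROUP BY region
    ORDER BY total DESC
"""
for region, total in conn.execute(query):
    print(region, total)
```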
Levelwise PageRank with Loop-Based Dead End Handling Strategy: SHORT REPORT ... (Subhajit Sahu)
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables ranks to be calculated in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It does, however, come with a precondition: the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. The slowdown on the GPU is likely caused by a large number of small workload submissions, and is expected to be a non-issue when the computation is performed on massive graphs.
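For readers unfamiliar with the baseline, here is a compact sketch of monolithic power-iteration PageRank with one common dead-end handling strategy (leaked rank redistributed uniformly); this is a generic illustration, not the report's implementation.

```python
# Sketch: power-iteration PageRank with uniform dead-end redistribution.
def pagerank(out_edges, damping=0.85, iters=50):
    """out_edges: dict mapping each vertex to a list of its successors."""
    n = len(out_edges)
    rank = {v: 1.0 / n for v in out_edges}
    for _ in range(iters):
        # Rank held by dead ends (no successors) is spread over all vertices.
        dead_mass = sum(rank[v] for v, succ in out_edges.items() if not succ)
        nxt = {v: (1.0 - damping) / n + damping * dead_mass / n
               for v in out_edges}
        for v, succ in out_edges.items():
            if succ:
                share = damping * rank[v] / len(succ)
                for u in succ:
                    nxt[u] += share
        rank = nxt
    return rank

graph = {"a": ["b"], "b": ["a", "c"], "c": []}  # "c" is a dead end
print(pagerank(graph))
```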
Unleashing the Power of Data: Choosing a Trusted Analytics Platform (Enterprise Wired)
In this guide, we explore the key considerations and features to look for when choosing a trusted analytics platform that meets your organization's needs and delivers actionable intelligence you can trust.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... (John Andrews)
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
Analysis insight about a Flyball dog competition team's performance (roli9797)
Insights from my analysis of a Flyball dog competition team's performance over the past year. Find more: https://github.com/rolandnagy-ds/flyball_race_analysis/tree/main
Slide 2: Motivation behind LinDA
Linked Data is an active research field with, at present, relatively little activity towards ease of use and accessibility for non-experts. LinDA's motivation is to:
- bring lay people, especially from SMEs, closer to Linked Data and help them unlock the potential of their data in an affordable manner;
- help expand the Linked Open Data Cloud by publishing Linked Open Data.
Slide 3: LinDA is…
The renovation of Public Sector Information (PSI) and private data: a set of tools allowing for a rich, holistic workflow that seamlessly facilitates:
1. transforming and semantically enriching your data (RDB, CSV, Excel, … → RDF);
2. publishing your data (to the LOD Cloud or as Linked Enterprise Data);
3. linking and querying your data;
4. visualising and analysing your data.
(Icons made by Picol, Freepik, Yannick and Mario Purisic from www.flaticon.com, licensed under CC BY 3.0.)
Slide 4: The Transformation Tool
The Transformation Tool lays the groundwork for the subsequent steps in the workflow. Our state-of-the-art analysis has shown shortcomings in the currently available transformation tools:
- Most tools lack a simple, intuitive UI tailored to non-experts.
- Reconciliation methods against Linked Data resources (facilitating 5-star Linked Data) are rare.
- R2RML, the de facto standard for mapping RDB to RDF, is not always supported.
- Simple, non-technical user guidance is rarely found.
- Ontology-finding services are missing or insufficiently integrated.
Slide 5: The Transformation Tool, close-up
- RDF snapshots and on-the-fly transformation of semi-structured data (e.g. CSV, Excel)
- SQL rewriting and publishing of SPARQL endpoints
- Re-use of transformation mappings (R2RML + tool-specific)
- Integration with an intelligent vocabulary service for suggesting adequate classes and properties
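To make the mapping re-use idea concrete, here is a toy Python sketch that separates a declarative-style mapping from a generic RDB-to-RDF conversion loop. All table, column and vocabulary names are invented, and this illustrates the idea behind R2RML rather than R2RML syntax; a real engine would be driven by a Turtle mapping document.

```python
# Sketch: a reusable table-to-triples mapping applied to a relational source.
import sqlite3
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import RDF, FOAF

MAPPING = {
    "table": "employee",
    "subject_template": "http://example.org/employee/{id}",
    "class": FOAF.Person,
    "columns": {"name": FOAF.name},
}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?)", [(1, "Alice"), (2, "Bob")])

g = Graph()
cursor = conn.execute(f"SELECT * FROM {MAPPING['table']}")
col_names = [c[0] for c in cursor.description]
for row in cursor:
    record = dict(zip(col_names, row))
    subj = URIRef(MAPPING["subject_template"].format(**record))
    g.add((subj, RDF.type, MAPPING["class"]))          # type each row's resource
    for col, prop in MAPPING["columns"].items():
        g.add((subj, prop, Literal(record[col])))      # column -> property

print(g.serialize(format="turtle"))
```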
Slide 7: Ontology Finding
- Access to a repository of all public ontologies
- Oracle Service for best matches
- Ranking based on popularity and re-use
- Cross-lingual suggestions
Slide 8: LD resource reconciliation and type guessing
- Using 'DBpedia Lookup' to reconcile against Linked Data resources
- Support for 5-star Linked Data principles
Slide 9: Semantic enrichment of RDF
- Allows describing data using rdf:type for rich queries
Slide 10: View and publish to Triple Store
- RDF download
- Publish to your preferred triple store
Slide 11: LinDA at a glance
- Grant Agreement: 610565
- Start: 1 December 2013
- Duration: 24 months
- Objective: ICT-2013.4.3 - SME Initiative on Analytics
- Cost: €1,931,624.00
- Coordinator: National Technical University of Athens
- More info: http://www.linda-project.eu
- Linked Data is still an academic discipline and very hard for lay users to fathom. Furthermore, with the current state of the art, the use of Linked Data concepts is costly and yields slow ROI, as the learning curve is steep.
This needs to change, by creating useful tools and tool chains that abstract away its complexity.
This is the motivation of LinDA: to bring lay…. And help expand…
- The LinDA project will create a set of tools combined into a distinct workflow to help with the seamless renovation of PSI and private data. By renovation we mean transforming data and semantically enriching it to be able to unlock its potential.
The workflow consists of the following steps:
1. Transforming and semantically enriching your data
2. Publishing your data
3. Linking and querying your data
4. Visualising and analysing your data
Today we will focus on transformation, as this is the cornerstone of our workflow.
But first let's look at a business scenario where the LinDA tools bring added value.
The following questions may arise:
How is OTC liberalisation related to healthcare expenditures and self-medication?
Is the economic and political stability of a country somehow related to OTC liberalisation and/or sales?
What are the trends in drug consumption among different population groups?
Which countries will benefit most from OTC liberalisation?
The focus of this presentation is on the transformation of structured and semi-structured data into Linked Data.
The transformation lays the groundwork for the following steps in the presented workflow (visualisation, analytics, querying, etc.).
As in any project, we conducted a state-of-the-art analysis, which yielded the results above.
Emphasis lies on the existence of a simple, intuitive UI targeted at non-Linked-Data experts.
Reconciliation is resolving a literal (e.g. "Greece") to a unique Linked Data resource, such as http://dbpedia.org/resource/Greece
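A hedged sketch of that reconciliation step against the hosted DBpedia Lookup service follows; the endpoint path and response fields follow the current service but may differ between Lookup versions, so treat them as assumptions.

```python
# Sketch: reconcile the literal "Greece" via DBpedia Lookup.
import requests

resp = requests.get(
    "https://lookup.dbpedia.org/api/search",      # assumed current endpoint
    params={"query": "Greece", "maxResults": 5, "format": "json"},
    headers={"Accept": "application/json"},
)
resp.raise_for_status()

# Expected outcome: http://dbpedia.org/resource/Greece among the top hits.
for doc in resp.json().get("docs", []):           # assumed response shape
    print(doc.get("resource"), doc.get("label"))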
The UI is designed to give the user a view over his/her data along each step of the transformation together with useful and simple guidance.
In the case of CSV or Excel files:
- First we upload the data.
- Then we select the columns containing the data we want to transform.
- Then we start describing the data, using drag-and-drop mechanisms to aid in constructing a unique subject URI to identify the triples.
- As a next step, we consult our Oracle Service to identify the most relevant and best-matching classes in an ontology. With this ontology-finding technique we aid the user in one simple step to automatically and effortlessly describe his/her data using public ontologies.
- To enhance the RDF even more, we first try to automatically reconcile each column against DBpedia resources (more Linked Data resource sites will be added soon, e.g. Wikidata, GeoNames), while giving the user the option to select the best match out of a ranked list. With this approach we help the user get closer to 5-star Linked Data.
- For the columns that are left unreconciled, automatic type guessing is performed.
- The final step in describing the data is the semantic enrichment. Here we again aid the user in finding adequate ontologies to describe the dataset, i.e. what content is in each row. In this case the dataset is a listing of a company's employees, as in the sketch below.
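A minimal end-to-end sketch of this CSV workflow, assuming invented file contents, columns and vocabulary: read rows, build a subject URI per row, and enrich each resource with an rdf:type.

```python
# Sketch: CSV rows -> typed RDF resources with rdflib.
import csv
import io
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, FOAF

csv_data = io.StringIO("id,name,country\n1,Alice,Greece\n2,Bob,Germany\n")

ex = Namespace("http://example.org/employee/")
g = Graph()
for row in csv.DictReader(csv_data):
    subject = ex[row["id"]]                      # unique subject URI per row
    g.add((subject, RDF.type, FOAF.Person))      # semantic enrichment via rdf:type
    g.add((subject, FOAF.name, Literal(row["name"])))
    # In the LinDA tool, the country literal would next be reconciled to a
    # Linked Data resource such as http://dbpedia.org/resource/Greece.
    g.add((subject, ex.country, Literal(row["country"])))

print(g.serialize(format="turtle"))
```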