A 2015 update to the 2012 "Data Big and Broad" talk - http://www.slideshare.net/jahendler/data-big-and-broad-oxford-2012 - extending coverage and bringing more of it into the context of recent "big data" work.
Morning session talk at the second Keystone Training School, "Keyword search in Big Linked Data", held in Santiago de Compostela.
https://eventos.citius.usc.es/keystone.school/
The Unreasonable Effectiveness of Metadata - James Hendler
Invited talk at VIVO 2017 conference - explores the view of the semantic web as enriched metadata, and how that kind of information can be used in new and interesting ways.
Bringing Machine Learning and Knowledge Graphs Together
Six Core Aspects of Semantic AI:
- Hybrid Approach
- Data Quality
- Data as a Service
- Structured Data Meets Text
- No Black-box
- Towards Self-optimizing Machines
Presented at the 2011 SemTech.
Open government data and related services/applications are quickly growing on the Web. Although most agree that the government data has great potential in solving real world problems, there are still many challenges that must be addressed. This talk will describe several representative domain applications and provide concrete examples of evolving technical challenges remaining. We will show solution paths that have proven useful and make recommendations on the corresponding Semantic Web best practices.
• Scalability. How can we handle (e.g. search and cleanse) the 3,000+ raw/tool datasets, and the additional 300,000+ geo datasets, from data.gov?
• Interoperability. Multi-scale open government data came from city governments, state governments, and national governments. How can one compare the GDP of the US and China, and later link to state-level financial data? Open government data covers many domains. How can one associate open government data with domain knowledge to build a cancer prevention application?
• Provenance and quality. How should provenance be leveraged to facilitate high-quality data management interactions (e.g. reuse, mash-up and feedback) between the government and the public?
LinuxCon 2010 Education Mini-Summit: The State of Open Data in Education - computercolin
A call for more open data in education, so we can foster innovative applications, better tools for teaching, and plenty of uses we haven't even thought of yet.
Briefing on US EPA Open Data Strategy using a Linked Data Approach - 3 Round Stones
An overview presented by Ms. Bernadette Hyland on 18 November 2014 on the US EPA Open Data strategy, focusing on the Resource Conservation & Recovery Act (RCRA) dataset to be published as linked data. This work is in support of Presidential Memorandum M-13-13, Open Data Policy - Managing Information as an Asset.
Linked Open Data and data-driven journalism - Pia Jøsendal
A keynote held at the Media 3.0 seminar in Bergen. It is an introductory presentation of the key elements of linked open data, addressed to media and journalists: what data-driven journalism can look like, and why they should care about what linked open data can offer.
In search of lost knowledge: joining the dots with Linked Data - jonblower
These slides are from my seminar to the University of Reading Department of Meteorology, November 2013. They contain a (hopefully not very technical) introduction to the concepts of Linked Data and how we are applying them in the CHARMe project (http://www.charme.org.uk). In CHARMe we are using Open Annotation to connect users of climate data with community-generated "commentary information" that helps them to understand a dataset's strengths and weaknesses.
The slide notes contain some helpful context, so you might like to download the PPT file!
The slides are licensed as "Creative Commons Attribution 3.0", meaning that you can do what you like with these slides provided that you credit the University of Reading for their creation. See http://creativecommons.org/licenses/by/3.0/.
How Are Graph Databases Used in Police Departments? - Samet KILICTAS
This presentation delivers the basics of the graph concept and graph databases. It explains how graph databases are used, with sample use cases from industry, and how they can be applied in police departments. Questions like "When should I use a graph DB?" and "Should I solve this problem with a graph DB?" are answered.
US EPA Resource Conservation and Recovery Act published as Linked Open Data - 3 Round Stones
A presentation by 3 Round Stones to the US EPA on the new Linked Open Data Management System, including Linked Open Data on 4M facilities (from FRS), 25 years of Toxic Release Inventory (TRI), chemical substances (SRS), and Resource Conservation and Recovery Act (RCRA) content. This represents one of the largest Open Data projects published by a federal government agency using Open Source Software (OSS), Open Web Standards and government Open Data.
AI and IP (Artificial Intelligence and Intellectual Property) - voginip
Lecture by Fulco Blokhuis on the legal issues that arise with generative AI, such as ChatGPT, DALL-E and the like.
VOGIN-IP lecture, 28 April 2024, Amsterdam
Adjusting primitives for graph: SHORT REPORT / NOTES - Subhajit Sahu
Notes on graph algorithms such as PageRank over Compressed Sparse Row (CSR), an adjacency-list based graph representation.
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
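The storage-type comparison above (float vs bfloat16) comes down to accumulator precision. As a minimal pure-Python sketch, using float32 as a stand-in for a low-precision type (bfloat16 is not available in the standard library), one can see how a sequential reduction drifts when the accumulator is rounded at each step:

```python
import struct

def to_f32(x: float) -> float:
    """Round a Python float (float64) to the nearest float32 value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def sum_f64(xs):
    """Plain sequential sum in float64 (Python's native float)."""
    s = 0.0
    for x in xs:
        s += x
    return s

def sum_f32(xs):
    """Sequential sum with values and accumulator rounded to float32."""
    s = 0.0
    for x in xs:
        s = to_f32(s + to_f32(x))
    return s

xs = [0.1] * 1_000_000
exact = 100_000.0
print(abs(sum_f64(xs) - exact))  # small rounding error
print(abs(sum_f32(xs) - exact))  # noticeably larger error
```

The real experiments above use CUDA/OpenMP kernels; this only illustrates why the storage type matters for a reduce.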
Empowering the Data Analytics Ecosystem: A Laser Focus on Value
The data analytics ecosystem thrives when every component functions at its peak, unlocking the true potential of data. Here's a laser focus on key areas for an empowered ecosystem:
1. Democratize Access, Not Data:
Granular Access Controls: Provide users with self-service tools tailored to their specific needs, preventing data overload and misuse.
Data Catalogs: Implement robust data catalogs for easy discovery and understanding of available data sources.
2. Foster Collaboration with Clear Roles:
Data Mesh Architecture: Break down data silos by creating a distributed data ownership model with clear ownership and responsibilities.
Collaborative Workspaces: Utilize interactive platforms where data scientists, analysts, and domain experts can work seamlessly together.
3. Leverage Advanced Analytics Strategically:
AI-powered Automation: Automate repetitive tasks like data cleaning and feature engineering, freeing up data talent for higher-level analysis.
Right-Tool Selection: Strategically choose the most effective advanced analytics techniques (e.g., AI, ML) based on specific business problems.
4. Prioritize Data Quality with Automation:
Automated Data Validation: Implement automated data quality checks to identify and rectify errors at the source, minimizing downstream issues.
Data Lineage Tracking: Track the flow of data throughout the ecosystem, ensuring transparency and facilitating root cause analysis for errors.
5. Cultivate a Data-Driven Mindset:
Metrics-Driven Performance Management: Align KPIs and performance metrics with data-driven insights to ensure actionable decision making.
Data Storytelling Workshops: Equip stakeholders with the skills to translate complex data findings into compelling narratives that drive action.
Benefits of a Precise Ecosystem:
Sharpened Focus: Precise access and clear roles ensure everyone works with the most relevant data, maximizing efficiency.
Actionable Insights: Strategic analytics and automated quality checks lead to more reliable and actionable data insights.
Continuous Improvement: Data-driven performance management fosters a culture of learning and continuous improvement.
Sustainable Growth: Empowered by data, organizations can make informed decisions to drive sustainable growth and innovation.
By focusing on these precise actions, organizations can create an empowered data analytics ecosystem that delivers real value by driving data-driven decisions and maximizing the return on their data investment.
Levelwise PageRank with Loop-Based Dead End Handling Strategy: SHORT REPORT ... - Subhajit Sahu
Abstract: Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation of ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It comes, however, with a precondition: the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. The slowdown on the GPU is likely caused by the submission of a large number of small workloads, and is expected to be a non-issue when the computation is performed on massive graphs.
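For context, the Monolithic baseline mentioned in the abstract is the standard power-iteration PageRank. Below is a minimal sketch with one common dead-end handling strategy, redistributing dangling-node rank mass uniformly; this is an illustrative choice, not necessarily the loop-based strategy of the report:

```python
def pagerank(nodes, edges, d=0.85, tol=1e-10, max_iter=100):
    """Monolithic power-iteration PageRank.
    Dead ends (nodes with no out-links) have their rank mass
    redistributed uniformly over all nodes each iteration."""
    out = {u: [] for u in nodes}
    for u, v in edges:
        out[u].append(v)
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(max_iter):
        dangling = sum(rank[u] for u in nodes if not out[u])
        # teleport term + redistributed dangling mass
        new = {u: (1 - d) / n + d * dangling / n for u in nodes}
        for u in nodes:
            for v in out[u]:
                new[v] += d * rank[u] / len(out[u])
        done = sum(abs(new[u] - rank[u]) for u in nodes) < tol
        rank = new
        if done:
            break
    return rank

nodes = ["a", "b", "c", "d"]          # "d" is a dead end
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a"), ("b", "d")]
r = pagerank(nodes, edges)
print(sum(r.values()))  # ranks stay a probability distribution
```

Levelwise PageRank would instead run this iteration per strongly connected component, in topological order of the component DAG.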
Opendatabay - Open Data Marketplace.pptx - Opendatabay
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
First ever open hub for data enthusiasts to collaborate and innovate. A platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. Leverage cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay's AI-driven features streamline the data workflow. Finding the data you need shouldn't be complex. Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay breaks new ground with dedicated, AI-generated synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay. Marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
Chatty Kathy - UNC Bootcamp Final Project Presentation - Final Version - 5.23... - John Andrews
SlideShare Description for "Chatty Kathy - UNC Bootcamp Final Project Presentation"
Title: Chatty Kathy: Enhancing Physical Activity Among Older Adults
Description:
Discover how Chatty Kathy, an innovative project developed at the UNC Bootcamp, aims to tackle the challenge of low physical activity among older adults. Our AI-driven solution uses peer interaction to boost and sustain exercise levels, significantly improving health outcomes. This presentation covers our problem statement, the rationale behind Chatty Kathy, synthetic data and persona creation, model performance metrics, a visual demonstration of the project, and potential future developments. Join us for an insightful Q&A session to explore the potential of this groundbreaking project.
Project Team: Jay Requarth, Jana Avery, John Andrews, Dr. Dick Davis II, Nee Buntoum, Nam Yeongjin & Mat Nicholas
From the dream of the Semantic Web to the reality of Linked Open Data (Van de droom van het Semantic Web naar de realiteit van Linked Open Data)
1. Creative Commons CC BY 3.0: allowed to share & remix (also commercial), but must attribute
Van de droom van het Semantic Web naar de realiteit van Linked Open Data
Frank van Harmelen, Vrije Universiteit
2. “There is lots of data we all use every day,
and it’s not connected.
I can see my bank statements on the web, and my photos,
and I can see my appointments in a calendar.
But can I see my photos in a calendar
to see what I was doing when I took them?
Can I see bank statement lines in a calendar?
No... Why not?
Because we don’t have a web of data.
Because data is controlled by applications
and each application keeps it to itself.”
The Semantic Web Dream
3. Two problems to solve:
Distributed information
Heterogeneous information
The Semantic Web addresses both.
5. SW/Linked Data in 4 principles
1. Give all things a name
2. Make a graph of relations between the things
at this point we have (only) a Giant Graph
3. Make sure all names are URIs
at this point we have (only) a Giant Global Graph
4. Add semantics (= predictable inference)
Now we have a Giant Global Knowledge Graph
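Principles 1-3 can be sketched as plain data: URIs name things, triples are the edges of the graph, and a shared URI lets graphs from different owners merge into one. The URIs below are illustrative only:

```python
# Two graphs published by different owners, as sets of
# (subject, predicate, object) triples with URI names.
graph_a = {
    ("http://dbpedia.org/resource/Bussum",
     "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
     "http://dbpedia.org/ontology/Village"),
}
graph_b = {
    ("http://dbpedia.org/resource/Bussum",
     "http://www.geonames.org/ontology#locatedIn",
     "http://dbpedia.org/resource/Netherlands"),
}

# Because both owners used the same URI for Bussum, a simple set
# union yields the "giant global graph" in miniature.
merged = graph_a | graph_b

def properties_of(graph, subject):
    """Everything the merged graph says about one resource."""
    return {(p, o) for s, p, o in graph if s == subject}

print(properties_of(merged, "http://dbpedia.org/resource/Bussum"))
```

Principle 4 (semantics) is what this toy version lacks: nothing here yet forces any inference from the triples.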
8. P3. The names are addresses on the Web
Allows integration of data from different owners at different locations
(figure: a triple [<x> rdf:type <T>] spanning resources from geonames: and DBpedia/Village, published by different owners at different locations)
11. P4. Explicit & formal semantics
• assign types to things
• assign types to relations
• organise types in a hierarchy
• impose constraints on possible interpretations
OWL
12. Why is semantics hard for computers?
Or: What's it like to be a computer?
13. P4. Explicit & formal semantics
Example triples: Frank --has-birth-place--> Bussum and Frank --has-birth-place--> Meren
• "has-birth-place relates a person to a location" forces: Frank is a person, and Bussum and Meren are locations (lower bound on agreement)
• "has-birth-place relates 1 person to 1 location" forbids the two places being different, so Bussum = Meren (upper bound on agreement)
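The birth-place example can be sketched as a tiny rule engine: the shared background knowledge (domain, range, and a cardinality-1 constraint on has-birth-place) forces predictable inferences from the triples. The names and the rule encoding below are illustrative, not OWL syntax:

```python
triples = {("Frank", "has-birth-place", "Bussum"),
           ("Frank", "has-birth-place", "Meren")}

# Ontology: has-birth-place relates exactly 1 person to exactly 1 location.
DOMAIN, RANGE = "Person", "Location"

# Lower bound: conclusions the ontology forces us to accept.
inferred_types = set()
for s, p, o in triples:
    if p == "has-birth-place":
        inferred_types.add((s, DOMAIN))
        inferred_types.add((o, RANGE))

# Upper bound: the functional (cardinality-1) constraint forbids the
# objects being different, so they must denote the same place.
inferred_equal = set()
objs = sorted(o for s, p, o in triples if p == "has-birth-place" and s == "Frank")
for i in range(len(objs) - 1):
    inferred_equal.add((objs[i], objs[i + 1]))

print(inferred_types)  # Frank is a Person; Bussum and Meren are Locations
print(inferred_equal)  # Bussum and Meren must be the same place
```

This is the slogan in miniature: given the shared ontology, the sender can predict exactly what the receiver will infer.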
14. SW/Linked Data in 4 principles
Give all things a name
Make a graph of relations between the things
at this point we have (only) a Giant Graph
Make sure all names are URIs
at this point we have (only) a Giant Global Graph
Add semantics (= predictable inference)
Now we have a Giant Global Knowledge Graph
18. Is anybody using this for real?
Schema.org:
Vocabulary to describe “things on the web”
Agreed upon by all major search engines
600+ types, 1000 properties
used by 10M+ sites
shows up in 36% of all Google results
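In practice, schema.org markup usually appears as a JSON-LD snippet embedded in a page. The snippet below is illustrative (values invented for this deck) and is simply parsed with the standard json module:

```python
import json

# A minimal schema.org description of a talk, of the kind sites embed
# in a <script type="application/ld+json"> tag for search engines.
jsonld = """
{
  "@context": "https://schema.org",
  "@type": "PresentationDigitalDocument",
  "name": "Van de droom van het Semantic Web naar de realiteit van Linked Open Data",
  "author": {"@type": "Person",
             "name": "Frank van Harmelen",
             "affiliation": "Vrije Universiteit"},
  "license": "http://creativecommons.org/licenses/by/3.0/"
}
"""
doc = json.loads(jsonld)
print(doc["@type"], "-", doc["author"]["name"])
```

The shared, very lightweight vocabulary is what makes this work: all the major search engines agree on the same type and property names.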
22. data.gov.uk
started in early 2010 with 3,000 datasets
includes Ordnance Survey data
• map safety of bicycle routes
• inform home buyers about their new neighborhood.
• school finder
• nursery finder
• pollution alert
• fix my neighbourhood
• regional expenditure map
• www.wheredidmytaxgo.co.uk/
24. Data.gov
Followers from data.gov, worldwide: Linked Open Data for Open Government
6 other nations establishing open data: Canada, Ireland, Norway, Australia, New Zealand
8 states now offering data sites: California, Utah, Michigan, Massachusetts, Washington, ...
8 cities in America with open data: San Francisco, New York City, Austin, ...
Fact or Fiction?
28. • TOOI: Overheid.nl | KOOP Waardelijsten (value lists)
• Nieuwe metagegevensstandaard (new metadata standard) for the entire Dutch government | Archive-IT
• RIO, the registry of educational institutions - DUO (onderwijsregistratie.nl)
34. • NXP is a semiconductor (microchip) manufacturer
• Established: 2006 (formerly a division of Philips), with 50+ years of experience in semiconductors
• Headquarters: Eindhoven, The Netherlands
• Customers include Apple, Bosch, Continental, Delphi, Gemalto, Giesecke/Devrient, Huawei, NSN, Panasonic and Samsung
• Portfolio of 26,000+ products
37. Uber
“When an eater enters a query, we try to understand their
intent based on our knowledge of food organized as a graph”
Uber Engineering blog, June 6 2018
39. PILOD platform
Platform Linked Data Nederland (pldn.nl):
Great source of practical info on Linked Open Data:
• learning resources,
• good use-cases,
• steps to take
LinkedDataParels2019.pdf
Editor's Notes
The good news: a distributed knowledge-base that describes hundreds of millions of items through tens of billions of relations between them, classifying them into hundreds of thousands of different classes, hosted on a web of thousands of different servers across the world, with fully distributed access and open to contributions from anybody. A knowledge-base on this scale, of this size and of such broad coverage would have been unthinkable 15 years ago, but it has now become reality under a variety of names such as the Semantic Web, the Linked Open Data cloud, or the Web of Data.
The bad news: despite this success, we actually understand very little of the structure of the Web of Data. Its formal meaning is specified in logic, but with its scale, context dependency and dynamics, the Web of Data has outgrown its traditional model-theoretic semantics. Is the meaning of a logical statement (an edge in the graph) dependent on the cluster ("context") in which it appears? Does a more densely connected concept (node) contain more information? Is the path length between two nodes related to their semantic distance? Properties such as clustering, connectivity and path length are not described, much less explained, by model-theoretic semantics. Do such properties contribute to the meaning of a knowledge graph?
To properly understand the structure and meaning of knowledge graphs, we should no longer treat knowledge graphs as (only) a set of logical statements, but treat them properly as a graph. But how to do this is far from clear. In this talk, we'll report on some of our early results on some of these questions, but we'll ask many more questions for which we don't have answers yet.
We’re going to explain all of these.
Give all things a name (including non-physical things like a date, a year, a location, a movie, the color red, a disease, etc). That’s lots of names.
Make a graph of those names: nodes are the (names of) things, edges are the relations between them. Notice names for non-physical things like “1999”.
This creates a giant graph.
This slide is sloppy, all of these names should be URL, next principle
So now, anybody can assign any property to any object published by anybody else. Together this creates a giant GLOBAL graph
To make that giant global graph a knowledge graph, we need to assign formal meaning. That meaning will have a very simple structure. More or less the modelling primitives you find in any widely accepted modelling language
First, let's remind ourselves how hard it is for computers to find "meaning" in anything.
Mind-reading game to explain semantics.
If I show the audience the top triple, and we share a little bit of background knowledge in the square box (“ontology”), I can predict what the audience will infer from the top-triple. The shared background knowledge forces us to believe certain things (such that the right blobs must be locations) , and forbids us to believe certain things (such as that the two right blobs are different). By increasing the background knowledge the enforced conclusions (lowerbound on agreement) and the forbidden conlusions (upperbound on agreement) get closer and closer, and the remaining space for ambiguity and misunderstanding reduces. Not only misunderstanding between people, but also between machines.
Slogan: semantics is when I can predict what you will infer when I send you something.
We’re going to explain all of these.
From ivo@velitchkov.eu
1. For Morgan Stanley etc. see the case studies of Top Quadrant
2. For Volkswagen, Nokia, Daimler, Bosch, I couldn't quickly find an online resource, but they are all clients of eccenca
3. I can't remember seeing Schneider Electric, who are heavy RDF users. You can find them along many others on Stardog's customer page
4. Philips, Credit Suisse etc. at the PoolParty customer page.
5. Taxonic is now implementing an Asset Management system based on RDF at Schiphol Airport, but you should ask Jan if they are fine to associate their logo with that
6. I saw the logo of the European Commission, but not of the European Council (SPARQL: http://data.consilium.europa.eu/sparql) and Publications Office (SPARQL: http://publications.europa.eu/webapi/rdf/sparql)
We’ve seen this example
That works because all three major search engines are sharing a single very lightweight ontology.
The US government is publishing many many datasets in semantic web format. So that citizens and companies can re-use these data for their own purposes. (commercial, lobbying, education, science, etc)
Lots of Governments around the world do this.
In Europe too
Journalists re-use bits of information, text and images from other journalists all the time. Semweb technology made that process more efficient. The BBC website, powered by SemWeb technology was the busiest website in the world during the London Olympic Games.
And yes, just as NXP made an ontology about their electronic products, the BBC made an ontology about Olympic sports.
This company had so many variations on their products that their own engineers couldn’t find the specs of each others designs any more.
After it was such a success for their own engineers, they also made portions of it open to their customers.