3. Wikidata
1.1 billion edges
84 million nodes
(May 2020)
‘sister’ of Wikipedia
Q: pictures of animals with feminine grammatical gender in German but masculine grammatical gender in French
4. Common sense
the basic ability to perceive, understand, and judge things that
are shared by nearly all people and can be reasonably
expected of nearly all people without need for debate
5. Research questions
Q1: Does Wikidata contain relevant commonsense knowledge?
Q2: If so, is this complementary to other commonsense knowledge sources?
11. Principles of Commonsense Knowledge
P1: Concepts, not entities
Example: Houses have rooms
Counterexample: Versailles Palace has 700 rooms
Keep nodes with lowercase alphanumeric characters
P2: Commonness
Example: Container used for storage
Counterexample: Noma subclass of aphthous stomatitis
Frequent words ~ common concepts
Usage stats on a large (independent!) corpus
P3: General-domain knowledge
Example: wheel is part of a car
Counterexample: cholesterol has component cell membrane
Take the top 50 relations (97.4% of all edges)
Annotate: domain-specific?
Annotate: map to ConceptNet relations
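The three principles can be read as three filters applied in sequence. The following is a minimal sketch of that reading, not the authors' implementation. The toy edges reuse the slide's examples, while the word counts, the threshold of 100, the set of "general" relations, and all function names are invented for illustration. Allowing spaces inside labels is also an assumption.

```python
import re
from collections import Counter

# Toy edges (subject label, relation, object label), based on the slide's examples.
EDGES = [
    ("house", "has part", "room"),
    ("Versailles Palace", "has part", "room"),
    ("container", "use", "storage"),
    ("noma", "subclass of", "aphthous stomatitis"),
    ("wheel", "part of", "car"),
    ("cholesterol", "has component", "cell membrane"),
]

# Hypothetical word counts standing in for usage stats from an independent corpus.
WORD_FREQ = {
    "house": 900, "room": 800, "container": 300, "storage": 250,
    "wheel": 400, "car": 950, "cholesterol": 150, "cell": 500,
    "membrane": 120, "noma": 1, "aphthous": 1, "stomatitis": 2,
}

# Hypothetical outcome of the manual "domain-specific?" annotation.
GENERAL_RELATIONS = {"has part", "part of", "use", "subclass of"}


def is_concept(label):
    """P1: keep labels made only of lowercase alphanumeric words."""
    return re.fullmatch(r"[a-z0-9]+(?: [a-z0-9]+)*", label) is not None


def is_common(label, freq, min_count):
    """P2: treat a label as common if every word in it is frequent enough."""
    return all(freq.get(word, 0) >= min_count for word in label.split())


def top_relations(edges, k):
    """P3, step 1: the k most frequent relations and the share of edges they cover."""
    counts = Counter(rel for _, rel, _ in edges)
    top = counts.most_common(k)
    coverage = sum(count for _, count in top) / len(edges)
    return [rel for rel, _ in top], coverage


def distill(edges, freq, min_count, general_relations):
    """Apply P1, P2 and P3 to every edge and keep those that pass all three."""
    return [
        (s, r, o)
        for s, r, o in edges
        if is_concept(s) and is_concept(o)
        and is_common(s, freq, min_count) and is_common(o, freq, min_count)
        and r in general_relations
    ]


if __name__ == "__main__":
    print(top_relations(EDGES, 2))
    for edge in distill(EDGES, WORD_FREQ, 100, GENERAL_RELATIONS):
        print(edge)
```

On this toy input, the house, container and wheel edges survive. The Versailles Palace edge fails P1, the noma edge fails P2, and the cholesterol edge fails P3 under the assumed annotation.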
17. Discussion
1. Integrating Wikidata-CS with ConceptNet and other sources
2. Generalizing over instance-level knowledge
a. birthplace of people -> functional property
3. Missing knowledge types
a. typical/expected quantities (chairs have 4 legs, spiders have 8)
b. agent goals (compete in order to win)
c. symbolism (red - danger)
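One possible reading of item 2 is that regularities across instance-level statements can be lifted to statements about a property itself. The sketch below checks whether a property behaves functionally (at most one value per subject) in a small sample. The function name and the sample statements are chosen for illustration and are not taken from the talk.

```python
from collections import defaultdict

# Small sample of instance-level statements (subject, property, value).
STATEMENTS = [
    ("Marie Curie", "place of birth", "Warsaw"),
    ("Alan Turing", "place of birth", "London"),
    ("Marie Curie", "occupation", "physicist"),
    ("Marie Curie", "occupation", "chemist"),
]


def is_functional(statements, prop):
    """Return True if every subject that uses `prop` has exactly one value for it."""
    values = defaultdict(set)
    for subject, p, value in statements:
        if p == prop:
            values[subject].add(value)
    # A property that never occurs gives no evidence, so report False.
    return bool(values) and all(len(v) == 1 for v in values.values())


if __name__ == "__main__":
    for prop in ("place of birth", "occupation"):
        print(prop, is_functional(STATEMENTS, prop))
```

In this sample, "place of birth" comes out as functional and "occupation" does not, which would support a general statement such as "a person has one birthplace".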
18. Conclusions
Common concepts & general relations allow us to distill Wikidata-CS
Wikidata contains some commonsense knowledge (0.01%)
Very little overlap with existing commonsense KGs
Future work:
1. enrich commonsense coverage of Wikidata
2. integrate commonsense knowledge across sources