Paul Houle - The Supermen




  1. Paul Houle - The Supermen http://blog.databaseanimals.com/the-supermen [7/9/2014 3:37:19 PM] The Supermen: Fictional Characters in DBpedia, Freebase and other Generic Databases. Paul Houle, Creator of database animals and bayesian brains. June 27, 2014
  2. This is based on a response to a person who was looking at a record for the DC Comics character named Superman in :BaseKB, which is derived from Freebase.
  3. What I see in Freebase right now (June 2014)

       https://www.freebase.com/m/070vn#/award/ranked_item

     doesn't contain anything that strikes me as wrong, but there is a split discussion attached to it and it's a little fishy that the topic was created on June 4, 2013. If I look at it with the gold copy of :BaseKB dated March 2014,

       https://aws.amazon.com/marketplace/pp/B00KDO5IFA

     and run this query at the SQL prompt

       SQL> sparql select ?type { <http://rdf.basekb.com/ns/m.070vn> a ?type .};
       type
       LONG VARCHAR
       _______________________________________________________________________________
       http://rdf.basekb.com/ns/theater.theater_character
       http://rdf.basekb.com/ns/user.geektastique.superheroes.topic
       http://rdf.basekb.com/ns/base.zxspectrum.topic
       http://rdf.basekb.com/ns/common.topic
       http://rdf.basekb.com/ns/award.ranked_item
       http://rdf.basekb.com/ns/base.ontologies.ontology_instance
       http://rdf.basekb.com/ns/base.tagit.concept
       http://rdf.basekb.com/ns/book.book_character
       http://rdf.basekb.com/ns/comic_books.comic_book_character
       http://rdf.basekb.com/ns/fictional_universe.fictional_character
       http://rdf.basekb.com/ns/film.film_character
       http://rdf.basekb.com/ns/film.film_subject
       http://rdf.basekb.com/ns/tv.tv_character
       http://rdf.basekb.com/ns/base.fictionaluniverse.topic
       http://rdf.basekb.com/ns/cvg.game_character
       http://rdf.basekb.com/ns/user.duck1123.default_domain.primary_identity
       http://rdf.basekb.com/ns/user.duck1123.default_domain.adopted_character
       http://rdf.basekb.com/ns/amusement_parks.ride_theme
       http://rdf.basekb.com/ns/base.fictionaluniverse.cloned_character
       http://rdf.basekb.com/ns/user.geektastique.superheroes.superhero
       http://rdf.basekb.com/ns/user.jschell.default_domain.alter_ego

     I don't see anything that's obviously wrong there.
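One way to read a type list like the one above is to group the types by Freebase domain. The following is only a rough sketch: it assumes that the local name after http://rdf.basekb.com/ns/ has the form domain.type, with the domain being everything before the last dot. The helper name group_by_domain is made up for this illustration, and only a handful of the types from the query result are included.

```python
from collections import defaultdict

NS = "http://rdf.basekb.com/ns/"

# A few of the type URIs returned by the query above.
TYPES = [
    NS + "film.film_character",
    NS + "film.film_subject",
    NS + "tv.tv_character",
    NS + "amusement_parks.ride_theme",
    NS + "base.fictionaluniverse.topic",
    NS + "base.fictionaluniverse.cloned_character",
    NS + "user.geektastique.superheroes.superhero",
]


def group_by_domain(type_uris):
    """Group type URIs by domain (hypothetical helper, not part of :BaseKB).

    Assumption: the domain is everything before the last dot of the
    local name, e.g. "film.film_subject" -> domain "film".
    """
    groups = defaultdict(list)
    for uri in type_uris:
        # Strip the namespace if present, leaving e.g. "film.film_subject".
        local = uri[len(NS):] if uri.startswith(NS) else uri
        domain, _, type_name = local.rpartition(".")
        groups[domain].append(type_name)
    return dict(groups)


if __name__ == "__main__":
    for domain, names in sorted(group_by_domain(TYPES).items()):
        print(domain, "->", ", ".join(names))
```

Grouped this way, the types read as a set of roles drawn from unrelated domains (film, tv, amusement_parks, plus user and base namespaces) rather than as steps in a single class tree.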
  4. Types without Hierarchy
  5. Types in Freebase typically mean that something plays a role. For instance, Superman is a :film.film_subject because he is the subject of a film. He is an :amusement_parks.ride_theme because amusement park rides have been made about him. There's nothing contradictory about this, at least to first order, because these types don't fit into a hierarchy. This is similar to what people call a 'duck type' in some programming languages, and this is the way an RDFS reasoner thinks. If we define

       :film.film.subjects a rdf:Property .
       :film.film.subjects rdfs:domain :film.film .
       :film.film.subjects rdfs:range :film.film_subject .

     and tell the reasoner that

       :m.01_mdl :film.film.subjects :m.070vn .

     it infers that

       :m.01_mdl a :film.film .
       :m.070vn a :film.film_subject .

     It's liberating to not have types in a strict hierarchy. You'll hear people say

       :Person rdfs:subClassOf :Animal .

     but lawyers will tell you that

       :Corporation a :Person .
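The inferences described above can be reproduced with a toy forward-chainer. This is a minimal sketch, not a real RDFS reasoner: it represents triples as plain Python tuples of strings and implements only the domain, range and subclass rules. The function name rdfs_closure and the constant names are assumptions made for this illustration.

```python
RDF_TYPE = "a"
DOMAIN = "rdfs:domain"
RANGE = "rdfs:range"
SUBCLASS = "rdfs:subClassOf"


def rdfs_closure(triples):
    """Apply three RDFS rules until nothing new is inferred.

    Toy forward-chainer for illustration; it covers only the domain,
    range and subclass rules, not the full RDFS entailment regime.
    """
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            for s2, p2, o2 in facts:
                # Domain: (p rdfs:domain C) and (s p o)  =>  (s a C)
                if p2 == DOMAIN and s2 == p:
                    new.add((s, RDF_TYPE, o2))
                # Range: (p rdfs:range C) and (s p o)  =>  (o a C)
                if p2 == RANGE and s2 == p:
                    new.add((o, RDF_TYPE, o2))
                # Subclass: (C rdfs:subClassOf D) and (s a C)  =>  (s a D)
                if p == RDF_TYPE and p2 == SUBCLASS and s2 == o:
                    new.add((s, RDF_TYPE, o2))
        if new <= facts:
            return facts
        facts |= new


# The film statements from the text.
FILM = [
    (":film.film.subjects", "a", "rdf:Property"),
    (":film.film.subjects", "rdfs:domain", ":film.film"),
    (":film.film.subjects", "rdfs:range", ":film.film_subject"),
    (":m.01_mdl", ":film.film.subjects", ":m.070vn"),
]

# The two statements about :Person from the text.
PERSON = [
    (":Person", "rdfs:subClassOf", ":Animal"),
    (":Corporation", "a", ":Person"),
]

if __name__ == "__main__":
    for data in (FILM, PERSON):
        for triple in sorted(rdfs_closure(data) - set(data)):
            print("inferred:", " ".join(triple), ".")
```

On the film statements the loop adds exactly the two type assertions given in the text. On the last two statements it mechanically concludes :Corporation a :Animal, because nothing in the rules knows that :Person is being used in two senses.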
  6. This is a contradiction, of course. The problem isn't any of the statements, it's the fact that we're using :Person to mean two different things: a member of the species Homo sapiens versus an entity that can be a party to a contract. The FOAF vocabulary partially resolves this problem by creating the foaf:Agent concept such that

       foaf:Person rdfs:subClassOf foaf:Agent .
       foaf:Organization rdfs:subClassOf foaf:Agent .

     If we know

       :m.070vn a foaf:Person .

     the system infers

       :m.070vn a foaf:Agent .

     In the language of the standard, "Something is a Person if it is a person. We don't nitpic about whether they're alive, dead, real, or imaginary." (http://xmlns.com/foaf/spec/#term_Person) Though he's not a member of Homo sapiens, he looks like a person, talks like a person and flies faster than a speeding plane, so I guess he's a person. Practically, you could map the Freebase :people.person to foaf:Person and map :business.employer to foaf:Organization and feel comfortable. How you map things from there is more subjective. If you don't want your system to call Jack Bauer for help, you don't have to map fictional characters to foaf:Person. It's a choice you make based on what you want your system to think.

     Splitting hairs
  7. In databases such as DBpedia, Freebase and Wikidata, concepts like "Superman" get overloaded. The trouble is that they take multiple forms; for instance, Superman the character might have started in a comic book, but he has been in movies and TV shows and been the subject of pinball games, amusement park rides, video games, etc. So when you say there are multiple topics with the same id, you are right. Some people split topics finer than others do. Worse than that, since Superman has been around a long time there have been many different versions of him in the comic books. After the 1980s "Crisis on Infinite Earths", Superman is officially the "last Kryptonian", the only survivor of Krypton's explosion. Before then there were Krypto, General Zod, and Supergirl, but they all got wiped out, or sorta-kinda wiped out in the case of
  8. http://en.wikipedia.org/wiki/Superman_prime
     http://en.wikipedia.org/wiki/Supergirl_(Matrix)

     Marvel is almost as bad, to the point where it is hard to make statements about a subject like "Iron Man"; for instance, the original Iron Man kept his identity secret from almost everybody, including Pepper Potts, but he's completely open about it in the recent movies and comics. The Hulk has usually been named "Bruce Banner" except on the 1970s TV show where he was named "David Banner". Unless you split "Iron Man" and "The Hulk" into separate characters, you can't state even the most basic facts about them.

     Will the real Star Trek Stand up?

     You run into similar problems with "Star Trek", "Sailor Moon",
  9. "Halo" and other media franchises. If you need a finer grained description of a domain like this, you could try to build it into Freebase or DBpedia through the community process or you can create your own database. This starts with writing a better schema, but there's the challenge that people might have a hard time populating that schema or using it. I'd imagine a crack ontologist who's obsessed with comic books would probably define 50 or 100 "Supermen" to model the illustrious history of what most people think of as the one and only "Superman".

     We're only human

     There's a tension, however, between databases that are precise
  10. versus databases that can be maintained by a community. What's in Wikipedia, for instance, is controlled by a battle between inclusionists and exclusionists over what is "notable" enough to be in Wikipedia. Star Trek is notable enough that each episode has its own page, yet there are no individual pages for the 13,088 episodes of General Hospital. Although wikis dedicated to fictional worlds are encouraged on Wikia, detailed coverage of fictional worlds will always get pushback from deletionists in Wikipedia. We can have simple databases that everyone can contribute to, or more complex databases that require you to be an ontologist and a comic fan at the same time. It would be nice, though, to see something that's to Wikia what Freebase and DBpedia are to Wikipedia.

      Paul Houle, creator of database animals and bayesian brains
      © 2014 Paul Houle
