Will the Real
[or why we have to take matters into our own hands]
The “big” idea
• You can’t build a Spatial Web if you’re going to
IGNORE THE WEB
• Google & friends
• Next Steps…
Google vs. Catalogs
• Catalogs are fine but they tend to hide
behind obscure interfaces (even if they are
• Google - billions & billions of entries
• Entire NASA metadata holdings for
EOSDIS ~ 10GB.
– A drop in Google’s bucket
• Give me a link and I’ll crawl the world, give
me a CGI script and I’m stopped in my tracks.
• Meet me halfway… How about dual-entry
catalogs? Browsers, bots, spiders, &
crawlers. (Did I mention Google?)
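One way to picture a dual-entry catalog: keep the query interface, but also write every record out as a plain HTML page that a crawler can reach by following links. A minimal Python sketch, assuming a hypothetical in-memory record list with `id`, `title`, and `summary` fields (these names are illustrative and do not come from any real catalog):

```python
import html
from pathlib import Path

# Hypothetical catalog records; a real catalog would pull these from its own store.
RECORDS = [
    {"id": "rec-001", "title": "Sea surface temperature", "summary": "Daily global grids."},
    {"id": "rec-002", "title": "Land cover", "summary": "Annual classification maps."},
]

def flatten_catalog(records, out_dir):
    """Write one static HTML page per record plus an index page linking to each."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    links = []
    for rec in records:
        name = f"{rec['id']}.html"
        title = html.escape(rec["title"])
        body = html.escape(rec["summary"])
        page = (
            f"<html><head><title>{title}</title></head>"
            f"<body><h1>{title}</h1><p>{body}</p></body></html>"
        )
        (out / name).write_text(page, encoding="utf-8")
        # Plain <a href> links are what a crawler follows; no form or script needed.
        links.append(f'<li><a href="{name}">{title}</a></li>')
    index = "<html><body><ul>\n" + "\n".join(links) + "\n</ul></body></html>"
    index_path = out / "index.html"
    index_path.write_text(index, encoding="utf-8")
    return index_path
```

Pointing any public page at the generated `index.html` is then enough for a crawler to reach every record; the CGI query interface can stay in place for interactive users.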
Query Engine “Link Portal”
• Web Logs
• “Latest on top” series of semi-random
• Comments allowed but usually a level
• Structure imposed by Blog package
• RSS describes top level
Content can be found in (yup!) Google…
• WikiWikiWeb - a communal experience
• Free form, internally interlinked
• Structure imposed by content authors
• RSS about updates to a page (not always
Google… it’s all there, just leave a link somewhere.
• Persistent URLs
• Either make sure yours never change or use a
service. (Former is more scalable).
• Use PURLs instead of UUIDs
• ONLY USE RESOLVABLE URIs
– Even if you insist on URNs or outsource your PURLs
Will Google crawl URNs? How about PURLs? Hmmm…
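The persistent-URL idea boils down to a lookup table that sits between a stable address and wherever the content lives today. A minimal sketch, assuming a hypothetical table and made-up example.org addresses; a real PURL service would answer with an HTTP redirect rather than return a tuple:

```python
# Hypothetical mapping from persistent paths to current locations.
PURL_TABLE = {
    "/purl/dataset/sst-daily": "https://data.example.org/2004/sst/daily/index.html",
}

def resolve(purl_path, table):
    """Return (status, location) the way a redirecting resolver would."""
    target = table.get(purl_path)
    if target is None:
        return 404, None
    return 302, target

def move(purl_path, new_location, table):
    """Content moved: only the table changes, the persistent path stays the same."""
    table[purl_path] = new_location
```

Because the persistent path itself is an ordinary resolvable URL, a crawler can follow it like any other link, which is not something a bare identifier offers.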
What makes the Web GREAT?
• Well, yes.
• Well, YES!
• If you are a content provider, you should
provide links to your content!
• All Usenet postings since ca. 1980 are
available on… Google.
• We have alt.<your favorite kind of deviant
behavior here>, searchable on Google.
• Why not alt.metadata?
• Why not alt.metadata.opinions?
• Really Simple Syndication or RDF Site Summary
• Structured information about content
• Blogs & Wikis generate it
• Can be generated by other programs as well
• Example (won’t work for you…)
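Since RSS can be generated by programs other than blog and wiki packages, here is a minimal sketch of producing an RSS 2.0 feed with Python's standard library. The function name `build_rss` and the example.org URLs are assumptions made for illustration:

```python
import xml.etree.ElementTree as ET

def build_rss(channel_title, channel_link, channel_description, items):
    """Build a minimal RSS 2.0 document.

    `items` is a list of dicts with 'title', 'link', and 'description' keys.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires title, link, and description on the channel.
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = channel_description
    for entry in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = entry["title"]
        ET.SubElement(item, "link").text = entry["link"]
        ET.SubElement(item, "description").text = entry["description"]
    return ET.tostring(rss, encoding="unicode")

# Hypothetical feed announcing new metadata records.
feed = build_rss(
    "Metadata updates",
    "https://catalog.example.org/",
    "New and changed catalog records",
    [{"title": "Land cover",
      "link": "https://catalog.example.org/rec-002.html",
      "description": "Annual classification maps."}],
)
```

Each `item` carries a plain link, so the feed doubles as another trail of URLs for crawlers to follow.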
• Let’s go flatten a catalog
• Let Google find it
• Work on a string-based location scheme
• Develop some useful tags
• Build some Blogs, Wikis, etc.
• Flood some poor newsgroup with metadata
• Generate some data discussion
• Play for 2-4 months. Google will get it. Then we
can decide if it was worth it.