The Real-Time Web and its Future



Published in: Technology, Business


  1. 1. The Real-Time Web and its Future. Edited by Marshall Kirkpatrick
  2. 2. The following report is based largely on insights shared generously from these interviewees: Aardvark; Adrian Chan; AlertSite; AllVoices; Amber Case; Backtype; Bernardo A. Huberman, HP Social Computing Lab; Beth Kanter; Black Tonic; Brad Fitzpatrick, Google; Brett Slatkin, Google; Chris Messina; CitySourced; Cliqset; Collecta; DeWitt Clinton, Google; Evri; Factery Labs; Faroo; FirstRain; Jay Rosen, NYU; John Borthwick, BetaWorks; JS-Kit; Kaazing; Kevin Marks, BT; Lexalytics; Marnie Webb, Compumentor; Mendeley; Nomee; Notify.me; Nozzl Media; OLark; OneRiot; OrSiSo; PBWorks; Pipio; Postrank; Steve Gillmor; Superfeedr; Sysomos; Ted Roden, NY Times/EnjoysThings; The American Red Cross; Threadsy; Tibco; Tweetmeme; Twingly; Urban Airship; Warner Bros.; Wowd; YourVersion
  3. 3. Contents
     1.
        a. What is the real-time Web? Beyond Twitter and Facebook
        b. Matrix of issues and companies
     2. Case studies
        a. Ted Roden puts real-time into EnjoysThings and the New York Times
        b. Superfeedr: Transforming the legacy Web into real-time
        c. Real-time as a trigger: Evri’s news-parsing technology
        d. How Warner Brothers uses the real-time Web in the music business
        e. Urban Airship does real-time mobile push
        f. Nozzl Media: Bringing real-time to old media
        g. Aardvark and the real-time Web of people
        h. Mendeley and the real-time Web of science
        i. Black Tonic re-imagines the real-time Web as a controlled experience
        j. At the Red Cross, the real-time Web saves lives
     3. Key players
        a. John Borthwick: thoughtful prince of the real-time Web
        b. Chris Messina: Rebel with a proposed technical standard
        c. Brett Slatkin, Brad Fitzpatrick and PubSubHubbub
        d. Steve Gillmor: The real-time Web’s leading journalist
        e. Another 15 important people to follow to understand the real-time Web
     4. Sector overviews
        a. Stream readers: Interfaces for the real-time flow
        b. Real-time search: Challenges old and new
        c. Text analysis and filtering the real-time Web
     5. Visualizations
        a. The path to value
        b. Real-time in conjunction with the static or slower Web
        c. Information overload
     6. Selected background articles on real-time technology

     ReadWriteWeb | The Real-Time Web and its Future | 1
  4. 4. What is The Real-Time Web? Beyond Twitter and Facebook

     Dave Winer defines the real-time Web in four words: “It Happens Without Waiting.” That’s true, and appropriately vague. The phrase “real-time Web” means different things to different people, and it’s too early in the game to have anything but a loose, inclusive definition. Many of the different forms the real-time Web takes do have some common benefits, user experience elements, lessons learned, pitfalls and possibilities. This is what we explore in this report.

     It’s definitely a whole lot more than just Twitter and Facebook, though these are the best-known instances of what’s referred to as the real-time Web. Someday Facebook may open up its user data and play a larger role in the real-time Web than just the introduction to the stream model that it plays today. Someday Twitter may grow, discover how to retain users and effectively encourage contributions from more than the small number of people who today create the vast majority of content on that service.

     Today engineers estimate that Twitter sees about 1,000 messages published per second and between 5 and 10 million links shared per day, before de-duplication. That sounds like a lot, but the real-time Web as a whole is already much, much larger than Twitter.

     For infrastructure provider Kaazing, the real-time Web is using HTML5 Web Sockets technology to push live financial information to the Web browsers of banking customers; that information had always been limited to desktop applications for security reasons.

     For one consumer web app, the real-time Web is creating an XMPP-powered, chat-like experience for users to communicate with friends around objects like a Google Map or a streaming Netflix video playing in the web OS.

     For semantic recommendation company Evri, the real-time Web is the ebbing and flowing of traffic data on Wikipedia. That data points to hot topics for which Evri needs to build topic pages to serve its publisher customers.
  5. 5. For search engine OneRiot, the real-time Web is made up of the links people share on Twitter as well as Digg, Delicious and the click-streams of more than a million users who have opted in to exposing what they see online through the OneRiot toolbar.

     For Q&A service Aardvark, the real-time Web is the people inside the social circle of a user who happen to be available online at a given moment and interested in the topic of a user’s question.

     There are hundreds of thousands of blogs that now deliver updated content to any other application that subscribes to a PubSubHubbub or RSSCloud feed, immediately after that content is published.

     NYU Journalism Professor Jay Rosen says the real-time Web creates a sense of flow for users that’s comparable to the way television holds our attention. Google’s Brett Slatkin, developer of the PubSubHubbub real-time protocol, says the real-time Web is a foundation for efficient computing and use cases we can’t yet even imagine.

     In writing this report we interviewed 50 people who work on technologies that power or leverage what they consider to be the real-time Web. Those people have had a very diverse array of experiences, but articulate a common story. It’s a story of increased computational efficiency – and software that struggles to keep users from feeling overwhelmed. It’s a story of radically new possibilities but strategies based on adding value in conjunction with more traditional, slower-moving online resources.

     We hope you enjoy reading this overview of the emerging real-time Web. We believe this phenomenon is one that will play a major role in the Web and world of the future. The page-based model of destination sites, created by centralized expertise and navigated through authority-based search and clicking link by link, is being transcended. We think this survey of current strategies and experiences to date will prove very useful in helping you effectively participate in and help build the future of the real-time Web.
  6. 6. Matrix of Issues and Companies

     This matrix allows you to navigate the contents of this report by topic. For example, if you are a User Experience expert, the second column shows you where the most relevant content for you is.

     Topics (columns): Standards, Data Normalization & Text Analysis; User Experience; Analytics & Advertising; Changing Older Organizations; Benefits of Real-Time; Real-Time as a Service.

     Case studies (rows): NYT; SuperFeedr; Evri; Warner Bros.; Urban Airship; Nozzl Media; Aardvark; Mendeley; Black Tonic; Red Cross.

     People profiles (rows): John Borthwick; Chris Messina; Slatkin/Fitzpatrick; Steve Gillmor.

     Sector overviews (rows): Stream Readers; Search; Text Analysis.
  7. 7. Case Studies
  8. 8. Ted Roden puts real-time into EnjoysThings and the New York Times

     By day, Ted Roden works on the very top floor of the New York Times building, in the R&D department. The Times has a great team of engineers: it does cutting-edge work in APIs, data visualization and computer-assisted reporting. Roden does work with real-time data at his day job, but he gets full creative freedom when working on a side project called EnjoysThings.

     The primary contributions that Ted Roden makes to understanding the real-time Web include articulating the following:
     • The material benefits of going real-time;
     • The importance of user experience; and
     • The changing landscape in analytics and advertising.

     Roden is also writing a book about real-time for O’Reilly Publishing. We had a conversation with him about what happened after he added a real-time feed to EnjoysThings. He articulates well some of the biggest advantages of a real-time infrastructure.

     EnjoysThings is a visual bookmarking site, like Delicious for images and other media. Even bookmarked text snippets are highlighted visually. User experience is a key consideration in all of the site’s developments, and the service is a lot of fun to use.

     This summer Roden added a premium subscription option to the site, called Joy accounts. A Joy account costs $20 per year for access to all current and forthcoming premium features, or users can pay $5 for an individual premium feature, such as disabling ads on the site or being able to view NSFW content.

     One of the features that Joy account holders get is access to a real-time view of new shared content. That real-time stream can be viewed in any browser but may be best served up in a Firefox sidebar. A real-time feed as an up-sell value-add? That’s remarkable, and Roden says the response has been positive.

     The sidebar is simple but compelling. New content, including images, is pushed live into the side of the browser as soon as it’s shared on the site.
  9. 9. At first, Roden said, he used AJAX to poll his site every few seconds. Then he switched to pushing updates out to users. EnjoysThings is still very small, but the lessons from adding real-time to this site could benefit sites of any size.

     1. INCREASED TIME-ON-SITE

     “People leave it open all day long,” Roden said of the sidebar. “Time-on-site has seen a huge increase. It’s like when the new content comes in on the Facebook Live Feed: if you know it’s about to pop in five seconds, you’ll stick around.”

     A number of different factors are making time-on-site an increasingly important metric on the Web, compared to page views. Increased consumption of video is the best known, but as real-time streams of aggregated content become increasingly common, increased time-on-site will be an important measurement of how successful an implementation is.

     2. DECREASED SERVER COSTS

     After implementing real-time infrastructure, Roden reports that “my site runs a lot more smoothly. I’ll probably move the whole site to that technology, because deep down it’s much easier on the database for me.”

     “I used to get hit by Stumbleupon and [the site] would start to crawl. Then I changed to some of this real-time stuff, and I’ve reduced the number of servers. Instead of the users sitting on the page and refreshing, I push it out to them. My EC2 bill has gone way down.”

     Roden’s experience complements the story that Google’s Brad Fitzpatrick told us about using PubSubHubbub to push shared items from Google Reader to FriendFeed. Changing from polling to real-time push cut traffic between the two sites by 85%. Likewise, magazine-style feed reader Feedly says that the part of its service that now consumes PubSubHubbub from Google Reader has seen a 72% reduction in bandwidth.

     3. ADVERTISING COMPLICATIONS

     “Analytics totally change,” Roden told us. “If you never click around off the home page, then Google Analytics says it’s one page view. Now if you’re pushing stories to the top of the page, then you don’t know how many stories people have seen unless you start measuring differently.”

     “Measuring user engagement totally changes. People use EnjoysThings in a sidebar in Firefox: do I count that as a whole page view? Do I count it as one, even though some people have it open for eight hours? Can you convince an advertiser that they are going to see an ad 100 times while looking at a page just once, and do they want that? For projects like EnjoysThings, it’s going to be a scary world out there for advertising for a while.”

  10. 10. Roden has been placing display ads in the real-time feed and prioritizing the attractiveness of the creative. That’s been somewhat effective so far, but he says it’s very early days for advertising in a real-time model. He says that real-time won’t be an effective differentiator for ad sales in the future because everything will be real-time. “Otherwise it’s like looking at a Word doc in a Web browser. It has to be real-time,” Roden says.

     Ted Roden says that at the root of the change towards real-time is a long list of emerging technologies that make it easy. “It’s blowing my mind how quickly the tech is coming out,” he told us.

     What’s Roden most excited about now? Tornado, the highly scalable, open-source real-time infrastructure released by Facebook after its acquisition of FriendFeed. He’s switched all his prototypes at the New York Times to it. “I’ll be really interested to see if people pick that up as quickly as they did Django,” he says. “It’s an easy framework to work with.”

     The technology is becoming easier and easier; now it’s largely just the frame of mind that has to change. “It’s not hard to write real-time code,” says Roden, “but if you’re in a LAMP mindset, that doesn’t scale in real-time.”

     See also:
     • Ted Roden’s shared items on EnjoysThings;
     • Roden’s Delicious bookmarks (technical);
     • Roden on Twitter;
     • New York Times Labs on Twitter.
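The move from polling to push that Roden describes can be illustrated with some back-of-the-envelope arithmetic. The sketch below is only a rough model: the workload numbers (500 open sidebars, a 5-second poll interval, 40 new items per hour) and both function names are hypothetical and are not figures from Roden, Google or Feedly.

```python
def polling_requests(clients, open_seconds, poll_interval_seconds):
    """Requests the server handles when every client polls on a timer."""
    polls_per_client = open_seconds // poll_interval_seconds
    return clients * polls_per_client


def push_messages(clients, new_items):
    """Messages the server sends when it pushes each new item once per client."""
    return clients * new_items


# Hypothetical workload: 500 sidebars open for one hour,
# polling every 5 seconds, while 40 new items are shared.
polled = polling_requests(clients=500, open_seconds=3600, poll_interval_seconds=5)
pushed = push_messages(clients=500, new_items=40)
reduction = 1 - pushed / polled

print(f"polling: {polled} requests, push: {pushed} messages, reduction: {reduction:.0%}")
```

With polling, server load grows with how long pages stay open; with push, it grows with how much new content actually appears. That is one plausible reason a site where “people leave it open all day long” would see its server bill fall after switching.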
  11. 11. Superfeedr: Transforming the Legacy Web into real-time

     Superfeedr’s slogan is, “We’re doing something stupid so that you don’t have to.” Julien Genestoux’s Superfeedr is a service that pulls in content feeds from around the Web and then offers updates for those feeds in XMPP or PubSubHubbub format.

     Superfeedr’s primary contributions to understanding the real-time Web include articulating the following:
     • The opportunity to add value through technological transformation of legacy resources into real-time;
     • The ease of leveraging real-time, normalized data through use of services such as Superfeedr;
     • How consumer markets may not be as prepared for real-time data as developers.

     In practice, instead of polling feed publishers over and over again to check for new updates, a feed-consuming service can just sit and wait for Superfeedr to deliver updates automatically as they become available. The publisher doesn’t even have to publish real-time feeds: Superfeedr takes care of that. It’s real-time-as-a-Service.

     “We don’t just do polling,” Genestoux says. “For each feed, we actually try to determine what is the most appropriate way to get the updates: PubSubHubbub, RSSCloud, SUP, specific APIs (Twitter stream, etc). We do polling as a failover.”

     One year ago, Julien Genestoux launched a service called Notifixious. It delivered real-time updates from any feed to a user’s IM client or email. Ten thousand people signed up for it, but 90% of them were having just one blog delivered, usually by email. Not an inspiring predicament for Genestoux.

     A very small subset of users were using the service to follow thousands of blogs. Genestoux inquired and learned that they were using the service like an API. “The vast majority said they would pay to do this, too,” Genestoux told us, “as long as it was cheaper than doing it themselves.”

     Superfeedr now offers just that: transformation of feeds into real-time, at lower than the cost of your current feed-parsing system and in 15 minutes or less after publication – or your money back.
  12. 12. The company is working on lowering that to 3 minutes or less. Your first 1000 feeds are free; if you want to consume more than that, the company charges $1 for every 2000 items it delivers.

     Superfeedr pings feeds once and shares updates with all subscribed customers, dramatically lowering the polling overhead in the RSS ecosystem. “We also do feed normalization to make things easier for the subscriber and avoid the hassle of dealing with RSS/Atom + namespaces,” says Genestoux.

     Google’s Brett Slatkin, the primary developer of PubSubHubbub, is very supportive of what Superfeedr is doing. Genestoux says the companies using his service so far include SixApart, Adobe, Twitterfeed, StatusNet and a number of small services such as Webwag, EventVue, Quub, AppNotifications, and SmackSale.

     “So many services fetch feeds from other services,” Genestoux says. “The market is huge. In the end, everyone’s going to need real-time. It’s going to be the differential between services.”

     Genestoux firmly believes that the real-time Web will have the biggest impact on developers, not consumers.

     “The fact that services do not need to poll over and over, as well as have access to ‘normalized’ data, considerably lowers the bar to allow ‘free data’ to flow from one service to another. Up until now, if you wanted your app to include data from other apps, you had to massively invest in that (see Friendfeed), and maintaining such a component was a nightmare. If you make this ‘data flow/stream’ transparent to the services, you start seeing richer mashups and apps that integrate data from others. I sincerely think that more than for end users, the real-time will eventually change how Web apps are built and interact together.”

     What’s the downside? Genestoux admits that not all companies are comfortable relying on a third-party service for this kind of functionality. Superfeedr went down for several hours one evening in November. Genestoux wrote a blog post discussing the problem and his solution.

     Superfeedr isn’t the only real-time-as-a-service company online. Others we’ve spoken to include Kaazing. Surely, there are many more. But when it comes to lightweight feed-transformation services that are developer-friendly and engaged in cutting-edge Web technology conversations, Superfeedr certainly fits the bill.

     See also:
     • Julien Genestoux’s lifestream of links and bookmarks;
     • Genestoux on Twitter.

     His circle on Twitter includes:
     • Ilan Abehassera, NY entrepreneur
     • Stephane Delbecque, SF entrepreneur
     • Sylvain Hellegouarch, French developer
     • Johann Romefort, CTO at Seesmic
     • Guillaume Dumortier, SF entrepreneur
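The idea that a hub “pings feeds once and shares updates with all subscribed customers” can be sketched in a few lines. This is a toy, in-memory illustration only, not Superfeedr’s implementation or API: the `FanOutHub` class, its methods and the example feed URL are all hypothetical, and a real service would deliver over XMPP or PubSubHubbub rather than Python callbacks.

```python
from collections import defaultdict


class FanOutHub:
    """Toy in-memory hub: fetch a feed once, deliver new items to every subscriber."""

    def __init__(self, fetch):
        self._fetch = fetch                    # callable: feed_url -> list of item ids
        self._subscribers = defaultdict(list)  # feed_url -> list of callbacks
        self._seen = defaultdict(set)          # feed_url -> item ids already delivered
        self.fetch_count = 0                   # how many times any feed was fetched

    def subscribe(self, feed_url, callback):
        self._subscribers[feed_url].append(callback)

    def refresh(self, feed_url):
        """Fetch the feed a single time and push unseen items to all subscribers."""
        self.fetch_count += 1
        new_items = [i for i in self._fetch(feed_url) if i not in self._seen[feed_url]]
        self._seen[feed_url].update(new_items)
        for callback in self._subscribers[feed_url]:
            for item in new_items:
                callback(feed_url, item)
        return new_items


# Hypothetical feed with two subscribers.
FEED = "http://example.com/feed"
feeds = {FEED: ["post-1", "post-2"]}
hub = FanOutHub(fetch=lambda url: list(feeds[url]))

inbox_a, inbox_b = [], []
hub.subscribe(FEED, lambda url, item: inbox_a.append(item))
hub.subscribe(FEED, lambda url, item: inbox_b.append(item))

hub.refresh(FEED)             # first fetch delivers both existing posts
feeds[FEED].append("post-3")
hub.refresh(FEED)             # second fetch delivers only the new post
```

Both subscribers receive all three posts, yet the publisher was fetched only twice. Without the hub, each subscriber would have had to poll the publisher separately.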
  13. 13. Real-time as a Trigger: Evri’s News-Parsing Technology

     Evri is a semantic Web recommendation service for online publishers. The company tracks the real-time Web to know when it needs to create or update a topic page for one of its emerging news topics.

     The primary contributions that Evri makes to understanding the real-time Web include articulating the following:
     • Creative ways that real-time and slower-moving data sources can be used together to create value;
     • Wikipedia as a source of real-time data beyond Twitter and Facebook. We’ve seen Wikipedia used by other services before for disambiguation, but not as a source of real-time trending topics data;
     • Another example of text analysis as a very important part of a service provider working on time-sensitive content delivery; and
     • Struggles experienced by forward-looking startup companies seeking to bring real-time services to older businesses, in this case publishers.

     Evri watches news sources to see when a news topic is trending, including articles on Wikipedia that publicly available data shows have leaped in page views. Then it visits structured databases like Wikipedia and Freebase to check for updates to entries about related entities. It then creates or updates a topic page with news links, photos and Twitter search results. The language used in those Twitter posts is analyzed and the names of news entities in the posts are linked to other Evri topic pages, like pivots.

     “We’ve got it down to 15 minutes from when an event happens to when facts get updated,” Deep Dhillon, CTO of Evri, told us. “Nothing is manual.” That may have been true of Patrick Swayze’s death, as Dhillon pointed out in our interview, but it was not true of the death of anthropologist Claude Levi-Strauss. The Levi-Strauss topic page was filled with news of his death, but for hours afterward the excerpt from Wikipedia on his date of birth and death had not been updated to match the information about his death that Wikipedia and Freebase contained.

     “Another example is emergent entities,” Dhillon said. “The day after Michael Jackson died, there was a bunch of info online about Conrad Murray, the physician. Within minutes, we had structured information for a page but also for the rest of the system to link his ID with things like physician, Michael Jackson. It ripples through our whole system. We have some API customers that are all about emergent entities – we’re not just going to say that Conrad Murray is a person and a male.”

  14. 14. It’s a work in progress and Dhillon acknowledges that more work has to be done, on text analysis in particular.

     Evri is working with the publishers that it draws content from (it’s wider than just a Web search) on matters such as structured data and push notifications. The publishing industry has a lot of catching up to do, though, in moving on from old content management systems that did little to create metadata. The content that Evri receives for analysis comes in various forms (News Industry Text Format is one of the most common), and it has a wide variety of problems, but Dhillon says that publishers have a motive to make sure their product is annotated.

     More obvious is the incentive to do push notifications, Dhillon says: timeliness is a clear advantage for Google ranking.

     In the future, then, everyday publishers may push highly structured content out to aggregators for analysis, but today Evri is watching the real-time Web for news spikes, then using those as a trigger to go out and query other parts of the Web.

     See also:
     • Deep Dhillon on Twitter
     • Deep Dhillon’s blog
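The “spike as trigger” pattern described above can be sketched as a simple threshold check on page-view counts. This is only an illustration under stated assumptions: Evri’s actual detection method is not described in the report, and the function names, the threefold `factor`, the `min_views` floor and the sample counts are all hypothetical.

```python
from statistics import mean


def is_spike(history, current, factor=3.0, min_views=100):
    """Return True when current views exceed `factor` times the recent average."""
    if not history:
        return False
    baseline = mean(history)
    return current >= min_views and current > factor * baseline


def topics_to_refresh(page_views, factor=3.0):
    """page_views maps topic -> list of hourly view counts, oldest first."""
    return [
        topic
        for topic, counts in page_views.items()
        if len(counts) > 1 and is_spike(counts[:-1], counts[-1], factor)
    ]


# Hypothetical hourly page views for two topics.
views = {
    "Topic A": [120, 130, 110, 900],  # sudden jump in the latest hour
    "Topic B": [400, 420, 390, 410],  # busy but steady
}
trending = topics_to_refresh(views)
```

Only the topic whose latest hour far exceeds its own recent average is flagged. In a pipeline like the one Dhillon describes, a flagged topic would then trigger the slower queries against structured sources and the rebuild of its topic page.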
  15. 15. How Warner Brothers uses the real-time Web in the Music Business

     Ethan Kaplan is VP of Technology at Warner Brothers Records, and he’s a pretty savvy guy. He has built a real-time dashboard to display the number of people who visit each Warner Brothers artist website at any given time. When a site spikes on the dashboard, the team can hover over that part of the bar graph and see search results from blogs, Twitter and elsewhere to determine what caused the increase in traffic and to respond immediately.

     The primary contributions that Ethan Kaplan offers to understanding the real-time Web are articulating the following:
     • A legacy industry capable of taking new forms of action based on substantially decreased delays in information delivery;
     • The value of having your own data in real-time, instead of relying entirely on third parties;
     • Opportunities that arise from being able to create interfaces for real-time data display in-house; and
     • Opportunities still untapped when real-time data is analyzed in bulk.

     Kaplan tells us:

     “We used to be oriented around getting data only once a week, because that’s how it was fed to us from SoundScan, Mediabase, etc. We’d then reconcile that data against our plan for the week.

     “Now we’ve got a whole back end that exposes data in near and real-time: purchases going through the system, site visits, visitors logged in, comments left. The culture of that real-time environment has impacted how bands are being marketed and products are created. People want more and more real-time.

     “One day, for example, I saw a site with marginal traffic that suddenly had 7,000 people on it. We did a Twitter search, checked [celebrity blog aggregator] and found out that the artist was having a baby. No one told us! We immediately started planning to change the merch on the site, maybe have a baby shower; we added a poll asking people if they thought it would be a boy or a girl; all steps to take advantage of the traffic that was coming to the site at that moment.

     “Something like that happens every day. One of the sites might be trending more than usual because the artist just released a record. We can correlate and react right away. Omniture is good data, but it’s not as fast as we have here.”

  16. 16. Kaplan says the next step is to expose this data in ways that best suit different people throughout the company. He views it through an Adobe AIR application that he built in dashboard form, but different departments have different needs. He’d like to figure out effective ways to present that data all the way up to the CEO level.

     Traffic data is just one type of information that the company sees. Kaplan says Warner Brothers knows, for example, that promotions on Twitter tend to get higher click-through rates but lower conversions than promotions on Facebook. Authentic artist sites, even if they aren’t as contemporary in design as, say, Facebook, remain very important to the online music ecosystem.

     Kaplan says he’d like to see all data the company captures, including anonymous user-specific data in aggregate, run through artificial intelligence systems that quantify and detect patterns of engagement. “This user did X, Y and Z in a time period. That’s a huge amount of computation,” Kaplan says. He told us that he’s looking at MapReduce, cluster analysis and other methods, but the big takeaway for him is that the company can do a lot because it has the raw data and understands what types of data it needs.

     Lessons learned? “We’re still at such an early stage that we don’t have any lessons learned,” Kaplan says. “We’re just constantly learning new things.”

     See also:
     • Ethan Kaplan’s personal blog
     • Jeremy Welt, SVP of New Media at Warner Bros Records

     Kaplan’s circle on Twitter includes:
     • Eston Bond, Fox Entertainment
     • Mathew Ingram, Toronto Globe and Mail
     • Kyle Neath, GitHub
     • Andy Gadiel
     • Mikael Mossberg, Warner Bros.
  17. 17. Urban Airship does real-time Mobile Push

     Urban Airship is a mobile phone push-notification and in-app sales-infrastructure provider. The company powers push notifications for a wide variety of customers, large and small, filling a gap created primarily by Apple’s implementation of push in a way that’s just complicated enough for many developers to believe it warrants outsourcing.

     Urban Airship’s primary contributions to our understanding of the real-time Web include articulating the following:
     • The wide variety of potential use cases for real-time, including onto mobile platforms;
     • Another example of a real-time service provisioning as a business;
     • Limitations introduced by delivering real-time data through networks owned by other companies.

     Starting with the iPhone but aimed at cross- and multi-platform mobile services, Urban Airship told us a number of interesting things about its experience with real-time information delivery:
     • Machine-to-machine real-time messaging is now cheap and relatively easy to implement.
     • You can now get updates on a wide spectrum of activities. The technologies to deliver notifications are evolving faster than the use cases, and there remains some question of just what to do with these real-time capabilities. A number of real-time companies have told us that the technology is dropping in price and complexity so quickly that people are looking for particular ways to implement a clearly compelling general concept (real-time messaging). In other words, the real-time Web may be more tool-driven than demand-driven so far.
     • Use cases that Urban Airship has seen so far range from mobile social games to reminder apps to mobile storytelling that uses push notification to let a plot unfold over time. The company says it has other customers in sports and medical fields that it can’t discuss publicly. One that has just launched is a prescription drug-tracking service that pushes notifications shortly before a user’s prescription needs to be refilled.
  18. 18. • Push notifications have been used most visibly by media companies to send simple messages, but each iPhone push can carry a payload and allow recipients to take actions such as voting or approving a purchase. “There will be richer content in the future, not just a line of text,” founder Scott Kveton told us. “It’s going to move from alerts to real-time interactive: more personal, more social.”
     • Scaling large quantities of high-priority real-time information remains a challenge. (Shortly after our interview, Urban Airship launched a product aimed at filling this need.)
     • One very new expectation that clients have of many of the developers they hire is an ability to quickly build out real-time features.
     • Push notifications on the iPhone also require a download; push can come only from apps on the phone.

     So, Urban Airship says it is a cheap and easy mobile push-notification service and that rich use cases of the future are limited only by our imaginations.
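The point that an iPhone push “can carry a payload” can be made concrete with a small sketch of a notification body. Apple’s push service accepts a JSON dictionary whose `aps` entry holds keys such as `alert` and `badge`, with app-defined keys alongside it. Everything else here is an assumption for illustration: the `build_push_payload` helper and the `action` key are hypothetical, this is not Urban Airship’s API, and the 256-byte ceiling is assumed to be the limit that applied when the report was written.

```python
import json

# Assumption: payload size limit in force at the time of the report.
MAX_PAYLOAD_BYTES = 256


def build_push_payload(alert, badge=None, **custom):
    """Build a compact JSON push payload with optional app-defined keys."""
    if "aps" in custom:
        raise ValueError("'aps' is reserved for the notification itself")
    aps = {"alert": alert}
    if badge is not None:
        aps["badge"] = badge
    payload = {"aps": aps, **custom}
    encoded = json.dumps(payload, separators=(",", ":"), ensure_ascii=False).encode("utf-8")
    if len(encoded) > MAX_PAYLOAD_BYTES:
        raise ValueError(f"payload is {len(encoded)} bytes, limit is {MAX_PAYLOAD_BYTES}")
    return encoded


# Hypothetical refill reminder that also tells the app which action to offer.
body = build_push_payload("Your prescription is due for a refill", badge=1, action="confirm_refill")
decoded = json.loads(body)
```

The custom `action` key is what would let a receiving app offer something richer than a line of text, in the spirit of Kveton’s “alerts to real-time interactive” remark. The tight size limit is one reason payloads tend to carry a pointer to content rather than the content itself.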
  19. 19. Nozzl Media: Bringing real-time to Old Media

     Steve Suo and Brian Hendrickson were newspaper guys for decades. Then the confluence of declining revenue and institutional risk-aversion, during a period of historic change for the industry, led them to leave those institutions and strike out on their own. Suo has a background in automated public-records extraction and analysis and Hendrickson in real-time.

     The primary contributions to understanding the real-time Web that Nozzl Media offers are articulating the following:
     • The gap between legacy publishing and the real-time Web; that’s both opportunity and barrier;
     • Another filtering strategy: user-centric, client-side and full-text, instead of strategic, programmatic and as a pre-determined value-add; and
     • The opportunity available in transforming old data into real-time.

     Early this year, another long-time newspaper guy, Steve Woodward, joined them to found a startup called Nozzl Media. Nozzl aims to help newspapers embellish their original content with a real-time, filterable stream of hyper-local public records, news and blog posts. The company is building a mobile Web app and Web page widget that push that content live to readers.

     Public records tend to be largely inaccessible, relegated to arcane, search-driven websites and dumb PDFs. Nozzl Media says it has built technology to extract that information, put it in geographic context and push it live to the Web as soon as it’s discovered.
  20. 20. Nozzl is doing a number of particularly interesting things. PUBLIC RECORDS Nozzl extracts public records of interest – including Occupational Health and Safety Administration (OSHA) citations to businesses, approved building permits and doctors’ licensing information – from online repositories with what Nozzl calls its “automated form-pumping robot.” Many computer-assisted reporting specialists write scripts to perform one-off acts of data extraction for their research, but Nozzl has built software to perform these functions systematically, regularly, reliably and behind the scenes – and then make the information available in a published stream in real-time. The results can be quite interesting – and could qualify as news content. Is the raw feed of public records valuable, though? Or is a journalist with a trained eye still needed to find the real news in the feed and put it into context? Presumably, both the raw feed and the journalism it enables will support one another, but a raw feed of public records could possibly have a signal-to-noise ratio that no one but a journalist would find compelling. The fire hose is valuable, but sometimes the hand of a skilled, real-time curator is more valuable. Nozzl Media specializes in pushing the fire hose to the public as an act of media. Finding new forms of information that haven’t been available in real-time and making them easily available is a meaningful addition of value. People say that information about more and more social activities are becoming available as data – but someone has to build the infrastructure for that to happen, and that requires more technology in certain milieus than in others. Government data – so often made available in unsyndicated, opaque PDF files – is particularly challenging. Dislodging it into the cloud, then, becomes a particularly valuable act. 
LIVE FILTERING

The Nozzl team built its Web page widget with a live jQuery feature that allows for filtering of the current corpus of data on the fly; items on display are filtered as each letter is typed by the user in the filter box. It looks like Google Suggest in reverse.

Filtering the flow of data is something that every company in this space is talking about, and Nozzl has a unique way of doing it. Real-time, on-demand, full-text filtering at the user’s fingertips may or may not be a compelling user experience. It’s an option, though, that stands in contrast to the text analysis, entity extraction and imposed categorization that other filtering strategies emphasize and are slowed by.

EMBELLISHING LEGACY CONTENT

The time-frame for freshness in publishing is shrinking rapidly. While online publishing was so much faster than print publishing that it disrupted an entire industry, the manual creation of original content is the slow horse in the race online. That doesn’t mean it’s not valuable; it’s the primary source of value for the institutions in question (newspapers), but it’s not necessarily sufficient.
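The live filtering described under LIVE FILTERING can be illustrated with a minimal sketch. It is not Nozzl’s code: the original widget is a jQuery feature running in the browser, while this version uses Python only to show the logic. The `StreamItem` shape, the `filter_items` function and the sample records are all hypothetical, and the matching rule (every typed term must appear somewhere in the item’s full text) is an assumption about what “full-text filtering” means here.

```python
from dataclasses import dataclass


@dataclass
class StreamItem:
    """One entry in the currently loaded corpus (hypothetical shape)."""
    title: str
    body: str


def filter_items(items, query):
    """Return the items whose full text contains every term typed so far."""
    terms = query.lower().split()
    if not terms:
        # An empty filter box shows the whole stream.
        return list(items)
    result = []
    for item in items:
        text = f"{item.title} {item.body}".lower()
        if all(term in text for term in terms):
            result.append(item)
    return result


# Invented sample records, standing in for the live stream.
corpus = [
    StreamItem("Building permit approved", "New restaurant on Main Street"),
    StreamItem("OSHA citation", "Warehouse cited for missing guard rails"),
    StreamItem("Restaurant inspection", "Health violation at a downtown diner"),
]

# Simulate the user typing "rest" one letter at a time;
# the visible list is recomputed after each keystroke.
for typed in ("r", "re", "res", "rest"):
    matches = filter_items(corpus, typed)
    print(typed, [item.title for item in matches])
```

Because the filter is a plain substring test over items already in the browser, it needs no text analysis or categorization step, which is the contrast the report draws with other filtering strategies.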
  21. Embellishing the original content of newspapers with local real-time content is reminiscent of the old newswire model of newspapers syndicating AP or Reuters content. Will it save newspapers? A dose of Facebook newsfeed-style delivery of things like new doctor licenses, local restaurant health violations and aggregated blog posts on a newspaper website? That could make a big difference. Time will tell.

NEWSPAPER RETICENCE

Nozzl originally intended to focus on a mobile Web app, or delivering content for newspapers that need mobile apps. The company believes that newspapers and broadcasting organizations in general do not yet have effective mobile implementations; spend a little time using all but a few mobile newspaper efforts, and you’ll see the validity of this argument.

Newspapers were reticent to use Nozzl in that way, though. They wanted widgets for their websites instead. We assumed that was because websites are more effectively monetized using display ads. The widget economy and experience are crowded, though, and Nozzl argues that display ads have peaked and will only decline from here. Nozzl as a stand-alone, highly functioning local-news mobile app strikes us as incredibly compelling; Nozzl as one more widget on a Web page, less so.

Nozzl’s Steve Woodward says it was simpler than that, though. “The real reason [that newspapers were reticent about the mobile app],” he says, “has more to do with comfort level than any direct thoughts about monetization. Mobile is a new technology that most newspapers aren’t yet comfortable with. On the other hand, they feel they understand the Web, and they certainly understand content. So they are able to see value in adding real-time content to a news site, while they have a harder time seeing the same or greater value in mobile.”

So goes the story of innovators who would break free of aging institutions only to establish businesses that are built on adding value to those same institutions.
Web widgets it is, for now at least.

Bringing content to Web pages in real-time may not be a sufficient differentiator for Nozzl indefinitely, though. As Ted Roden of the NY Times R&D Department says, “Otherwise it’s like looking at a Word doc in a Web browser. [Everything in the future] has to be real-time.”

Woodward says that the type of content Nozzl delivers will be key. “We need to step up our game to bring in more, not fewer, public records,” he says. “That kind of content will be the thing that sets us apart most from future competitors.”

See also:
• Steve Woodward on Twitter
• Hendrickson on Twitter
  22. Aardvark and the real-time Web of People

Aardvark is a social search engine that combines artificial intelligence, natural-language processing and presence data to create what the company calls “the real-time Web of people.” The end result is “a magical experience,” CEO Max Ventilla says.

The primary contributions that Aardvark makes to understanding the real-time Web include:
• Leveraging presence data;
• Communicating across platforms;
• Emphasizing user experience;
• Harvesting social data from third-party profiles;
• Text analysis on-the-fly;
• Mediating human interactions with machine intelligence; and,
• Filtering the flow for both inquirers and respondents.

You can ask Aardvark any question, and it will try to find a person in your extended social circles who knows about that topic and is available to answer at that moment. Aardvark facilitates these conversations through a very polite IM bot, an iPhone app with push notifications, the company’s website, Twitter or email. Instead of broadcasting your question to everyone’s stream of information, Aardvark delivers the question only to people who are relevant and available.

Founded in 2007 but launched just this year, Aardvark’s got an all-star team of engineers from Google and Yahoo and high-profile investors. It’s already cutting deals with major tech brands, and the use cases are just beginning to be explored. The Web 2.0 Summit had a dedicated Aardvark circle for attendees to answer each other’s questions, and Federated Media will soon roll out a campaign sponsored by Microsoft in which Aardvark will facilitate a Q&A with relevant IT experts around the clock.

The company says that 90% of questions get answered in five minutes or less. During our extensive use of the system and conversations with many other users, we found the answers that were delivered were generally satisfactory or better. The system gets smarter the more you use it.
“When users come in and have a magical experience,” CEO Max Ventilla says, “that’s more important than the info they get back, to know that there are people who would help you immediately. This is
  23. social search as a complement to web search. The billions of pages on the Web are static data; that’s just a fraction of what’s available in people’s heads.”

Aardvark goes so far as to say in a blog post about the real-time Web [5] that, “What really matters is the increased accessibility of people online, not just information online.”

Users are tagged with areas of interest or expertise by the friends who invite them to the system, and then they add additional tags on their own. Further information about what a person knows is gleaned by analyzing the user’s Facebook profile page or Twitter stream.

“Data gets stale, even your profile data,” Ventilla says. “We want to keep that fresh, by taking advantage of all the data that’s passing by. The things you’re posting about [on other social sites] are things you have recent experience with. Being able to converse with someone who just had a learning experience adds a lot of relevance. Social graph and profile data built up over time, the fact that people are making that info available for building value with communication tools – that’s a dramatic shift with the Web.”

In addition to user tags and social network profiles, Aardvark analyzes the text of inquiries to find related users to query, and it keeps track of response times and types. The service notes the vocabulary that people use (including ‘off-color’ conversations), who likes little chats and who engages in extended conversations. It then pairs sets of users with questions and with answers that it believes will be compatible.
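The routing described above (expertise tags, presence and response history) can be sketched roughly as follows. This is an illustration under stated assumptions, not Aardvark’s algorithm: the `Person` fields, the `route_question` function and the sample circle are hypothetical, and simple keyword overlap stands in for the natural-language processing the company describes.

```python
from dataclasses import dataclass, field


@dataclass
class Person:
    """A member of the asker's extended social circle (hypothetical shape)."""
    name: str
    tags: set = field(default_factory=set)   # areas of interest or expertise
    available: bool = False                   # presence data: online right now?
    median_response_minutes: float = 10.0     # tracked response history


def route_question(question, candidates, limit=2):
    """Pick the available people whose tags best match the question's words.

    Keyword overlap is a stand-in for real text analysis. Ties on overlap
    are broken in favor of people who usually answer faster.
    """
    words = {w.strip("?.,!").lower() for w in question.split()}
    scored = []
    for person in candidates:
        if not person.available:
            continue  # never interrupt someone who is not present
        overlap = len(words & person.tags)
        if overlap == 0:
            continue  # not relevant, so the question is not delivered
        scored.append((-overlap, person.median_response_minutes, person.name))
    scored.sort()
    return [name for _, _, name in scored[:limit]]


# Invented sample data.
circle = [
    Person("Ana", {"pizza", "portland"}, available=True, median_response_minutes=3.0),
    Person("Ben", {"pizza"}, available=True, median_response_minutes=1.0),
    Person("Cy", {"pizza", "portland"}, available=False, median_response_minutes=0.5),
    Person("Dee", {"laundry"}, available=True, median_response_minutes=2.0),
]

print(route_question("Where can I get pizza delivered in Portland?", circle))
```

The point of the sketch is the filtering order: presence first, relevance second, so the question reaches a small set of people rather than everyone’s stream.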
  24. “This is a serendipity engine,” Ventilla says. “There’s variability in people’s experience, and we have to maximize the chance that something goes beautifully instead of bad. It’s about designing a user experience to keep a conversation on the rails.”

Aardvark scores high on user experience for most of its interfaces, the latest iteration of its website being one possible exception. With this service, the website isn’t that important.

Filtering the flow of information from the real-time Web is a concern that everyone who is touched by these technologies raises. Aardvark says it performs a filtering function by limiting the broadcast of a user’s question to relevant people they are socially connected to.

“[With Aardvark,] you have the ability to have a conversation,” CEO Max Ventilla says. “This is fundamentally different from other forms of real-time search.”

Conversations are so easy to have on demand with Aardvark that I once instigated and conducted three extended, simultaneous live interviews with topical experts around the world during a tech industry event [6], all through the Aardvark IM interface.

QUESTIONS THAT AARDVARK HAS ANSWERED WELL IN TESTING.
• Is there any good way to serve a butternut squash and a sweet potato in the same meal? I’m thinking maybe I should just do the squash. [I ended up making a great soup.]
• What are some examples of publicly available real-time data still excluded from search after today’s announcements by Bing and Google? [Best answer: commodities prices.]
• What’s a good email address for Mozilla PR? [I should have had this already, and it took one line of explanation, but a Mozilla employee gave me contact info for the head of PR there within minutes.]
• I have 5 minutes to choose: what tech, business, news or art podcast should I load up to take on a walk with my dogs? [Best suggestion: Monocle Weekly.]
• What’s in Arm & Hammer baking soda laundry detergent, and can I spread it on my carpet to vacuum up?
[I would have been too embarrassed to ask this in other contexts, but Aardvark subjected just a small number of people to my cry for help.]

QUESTIONS THAT AARDVARK HAS NOT ANSWERED WELL IN TESTING.
• What’s a romantic ocean cabin rental near San Diego that I might be able to get near New Year’s? [No answer.]
• What question should I ask the founder of Blog Talk Radio podcast service in an interview? [A 15-year-old gave me a generic question, and I didn’t resubmit.]
• Where can I get pizza delivered in North East Portland after 10pm? [To be fair, this may be an unanswerable question. I can’t believe I bought a house in an area with such bad pizza coverage.]
  25. Mendeley and the real-time Web of Science

Mendeley is a service for organizing scientific research papers and includes social features such as recommendations of research and other scientists you might like. The company says it’s like or iTunes for scientific research and has backers that include co-founders of and Skype. The company offers both Web and desktop software.

The primary contributions that Mendeley makes to understanding the real-time Web include articulating the following:
• Opportunities to transform legacy institutions in qualitative ways by reducing time and harnessing network effects;
• The importance of offering non-real-time, non-social value in order to get individual buy-in; and,
• The value of implicit data.

What’s the real-time element? Whereas scientists traditionally have had to attend events to learn about the hot research topics in their fields and who is doing related research, Mendeley can track reading and citation activity in real-time to provide recommendations and trending data. The company is also considering adding a feature to its Word plugin that captures and tracks citations as they are written.

Bringing real-time, social network effects and recommendation to science? If successful, the consequences could be profound. Effective online recommendations could change work in the lab and the quality of the face-to-face conversations. Real-world interaction now has a whole lot more preliminary context, thanks to the Web in general and services like Mendeley in particular.

Mendeley says it is on pace to become the largest repository of scientific literature on the Web sometime next year. The key to adoption of the software, the company says, has been that Mendeley offers value even when used alone: the metadata extraction and paper organizing are useful enough on their own. There are many different kinds of software for organizing scientific papers, though, and early versions of Mendeley had some trouble processing the content that users inputted.
The software is really aimed at social recommendations, and many scientists enjoy it for that.
  26. Librarians interested in discovering which journals are publishing the hottest research articles also use Mendeley; that is information that publishers of high-priced research journals haven’t had an interest in exposing. Mendeley envisions a future when university departments use the service to capture data about the productivity of their researchers, information that could influence hiring and tenure decisions.

“The real benefit of real-time is for those doing the science,” Mendeley’s Research Director Jason Hoyt told us. “The most relevant research to yours could be in a minor journal you might miss. If it’s popular and relevant, this search process will show you that.”

“You find researchers downloading a lot of papers,” Hoyt says. “Many times people will cite bad research; but implicit data – like opening a document several times, sharing it, etc. – that data says that a research document is really relevant.”

Mendeley isn’t the only real-time company that derives a lot of its value from a desktop client and the implicit behavioral data that it provides. Many of the best-known real-time search engines leverage local software that captures implicit data. There is far more implicit data (like clickstreams) in the world than explicit data (like shared links) – it’s just a matter of building support for software that makes it available.

Aren’t scientists famously private with their research in progress, though? “There might be some trade-off, even with anonymous aggregate data,” Hoyt told us. “But you have to communicate in science anyway – and you have to give a little to gain a lot. You do have the option to make what you’re reading private in Mendeley, but less than 5% of articles and citations are hidden from complete view.”

That’s a reasonable account, but some reviewers have said that Mendeley’s disposition towards sharing creates a flow that encourages users to either share publicly or not use the service at all.
(Private group sharing isn’t yet supported, for example.) Time will tell how well Mendeley can move a market that’s already crowded with other research organization tools that are far less social.

Hoyt says the company is still learning what to do with all the data it captures, but there are a lot of possibilities.
  27. “If we have a subset of research on a topic right now, we can then predict where the research is going to take us in future. We can predict how research topics are going to morph. Then you can know where to apply research funds or remove funds. People could start modeling their careers based on the data they are seeing.”

One of the next steps on a technical level, Hoyt says, will be for Mendeley to learn how to extract sets of data from papers and offer scientists recommendations of data that are similar to what they are working with.

This is disruptive work that Mendeley is doing.

See also:
• Jason Hoyt on Twitter

Hoyt’s social graph on Twitter includes:
• William Gunn, scientist
• Daniel Mietchen, scientist, biophysics
  28. Black Tonic Re-Imagines the real-time Web as a Controlled Experience

Black Tonic is unlike any other company covered in this report. The Black Tonic product is a presentation tool for designers to give controlled, remote presentations of proposed design work to clients.

The Black Tonic experience is not public. It’s not collaborative. It’s not a lot of things we associate with the most visible examples of real-time technology. It’s actually very controlled.

Black Tonic is a download-free, HTML- and JavaScript-only browser-synchronization and browser-sharing application with unlimited viewership and support for broadcasting to mobile browsers. Still pre-launch, the company says it plans to “offer prices and plans that scale from independent designers to large agencies.” The company calls this type of browser synchronizing technology DOMCasting [7].

It’s an interesting, relatively simple model. A common problem for designers working for remote clients is that work tends to be sent in PDF or PowerPoint formats, via email. The client then clicks through the presentation at their own pace, with no explanation from the designer, well before the two parties have a phone conversation to go through it together.

Designers don’t like this very much. “It frustrates the necessary process and work flow when reviewing work,” Black Tonic co-founder David Price says.

Black Tonic offers a way for designers to control in real-time what is displayed in the viewer’s browser, through nothing but a Web link, and with as many remote viewers on Web or mobile browsers as they choose to share the link with. Presentations – complete with explanations, concepts and story – can then be given at the designer’s pace.

Black Tonic argues that on real-time social networks such as Twitter and Facebook, the emphasis is on empowering individuals, and there’s no structure to the relationships between people.
A spectrum of options is available on the real-time Web, though, ranging from technologies that reinforce and empower the perspective of the individual to those that force an individual to view content from a different perspective or a larger structured context.
  29. “If you’re doing a remote client presentation, how do you prevent the client from having a subjective experience of the work?” Black Tonic co-founder Phillippe Blanc asks. “First, force them to view the work from a perspective guided by the designer. Once they understand the work and the context, you can have a collaborative, constructive discussion about the work.”

“Conversation is the new content. And true conversation only happens when people share time and space,” Blanc’s co-founder David Price says. “The designer’s inability to storyboard is a failure of the process.”

Historically, the two argue, when people find the limits of a technology, they develop workarounds. Then, when more powerful technology becomes available, people often fail to reconsider the workarounds and change the process accordingly.

The Black Tonic team believes that lightweight real-time technology is an opportunity to reconsider remote presentations, to add some structure to them and add the necessary control over presentation that they haven’t had with the workaround of emailing PDFs.

A whole lot of options arise when a new computing paradigm emerges. Real-time doesn’t have to only mean delivering a chaotic or filtered stream of social information to an individual at the center of the system. Black Tonic is a good example of looking outside the standard application of a new technology and instead taking advantage of the opportunity to reconsider standard practices that have been influenced by technological limitations that no longer exist.
  30. At the Red Cross, the real-time Web Saves Lives

The real-time Web isn’t just changing our lives online; it’s starting to make a big difference offline as well. Disaster relief efforts at the American Red Cross have been transformed by real-time technology. Walmart may be world famous for its powerful inventory-control system, but some people say the Red Cross is becoming another leading example of a highly effective, large-scale organization co-ordinating activities around the world in real-time.

The primary contributions that Michael Spencer’s discussion of the Red Cross makes to our understanding of the real-time Web include articulating the following:
• The real-world consequences of real-time technology;
• Transforming a legacy institution using real-time technology;
• Strategic reliance on third-party software in a real-time context; and,
• The importance of planning, relative to technology implementation.

Michael Spencer, lead for SharePoint technology at the American Red Cross National Headquarters, puts it like this:

“The Red Cross has been around for over 100 years. I’ve been here for 12 years, and with what I’ve seen over the last year in terms of real-time information, co-ordination and our dashboard overseeing everything, I think we’ve made 50 years’ worth of advancement in a year or two because of real-time technologies. At the Red Cross, the real-time Web saves lives.”
  31. The national Red Cross disaster response center responds to about 350 disasters every year, whenever a local chapter is beyond its capacity. When hurricanes strike, the organization has days to plan; with earthquakes or aviation disasters, it has no time at all to plan.

Spencer says:

“It used to take two days to inventory our available volunteers. Now that can be done in one or two hours. We used to call them, send them emails, try to process all of these incoming emails. It was a struggle to get people on the ground. Now I can see exactly who is available, trim the list down by region, by language, by specialty skills. That’s all at my fingertips instantly.

“We now put videos and photographs in an online disaster news room, where victims can also go for shelter locations. We’re feeding information into SharePoint and then posting that to All that info feeds into a public shelter database; as soon as one opens or closes, the information is available to the public. It’s a way for the media to see what they can publish on the radio and TV. This is critical info. With shelters, once one is filled to capacity, people need to be sent to a different shelter.

“We also have something called ‘Safe and Well.’ We can now register people through our website and then publish this information, so that anyone looking for info on family can search for people’s names, addresses or phone numbers. Displaced people can leave a message there – we can reassure so many people that their loved ones are safe.”

The Red Cross makes sure to keep latency and downtime on that “Safe and Well” site as low as possible.

One thing the organization has to do when responding to disasters is to verify the claims of home loss that people file. That used to take a long time, but no longer, Spencer says.

“In this last year, we’ve sent volunteers out with PDAs. We used to go around with a car and sheet of paper to verify damage.
Now we have handhelds that let you take a picture of a house – it has GPS in it – upload it to a satellite, and then we can do real-time monitoring from a dashboard.

“That dashboard view of houses damaged? That would have taken weeks before. Now we can do it right away. The government can also do fly-overs that feed rough estimates of damage from a plane into our portal, so we can get an overview within a few hours, and then our volunteers go out with
  32. devices. That used to take me a week and a half or two weeks, even longer. I could never get a fly-over by the government or get my volunteers in. Now it’s fed automatically to my dashboard. I don’t have to call people and report our new numbers. We even used to do shelter numbers by hand for meal ordering. Now it’s all done through the Web.”

From volunteer and shelter co-ordination to the “Safe and Well” program to sometimes millions of dollars in donations collected online in a single day, the Red Cross is heavily dependent on its Web presence. The organization uses a service called AlertSite to monitor its uptime. AlertSite runs continuous automatic tests of website functionality and sends the Red Cross real-time alerts and diagnostics whenever there’s a problem.

“We were having critical problems with SharePoint going out for 5 to 24 minutes,” Spencer says. “We can’t withstand that. AlertSite now pages all the engineers with diagnostics, and we respond immediately, sometimes just from our BlackBerrys.”

Despite those problems, Spencer remains a big advocate of SharePoint.

“We’ve seen the evolution of SharePoint over time. The biggest problem with SharePoint 2007 is when you fail to put a good governance plan in place. Your work should be 80% planning, 20% implementation. It tends to be just the opposite. People tend not to plan it out well and don’t have a good idea of what SharePoint could do. We’re only leveraging about 15%, maybe 20%, of its capabilities. We had to spin up a call center for Katrina, for example: we needed to track calls, see who’s following up, etc. I was able to create a solution in SharePoint in one day, and they are still using the same system three years later. It’s all about training users how to use it, empowering them to take it off IT’s shoulders.”

Another third-party service that the Red Cross uses heavily? Breaking News Online (BNO), the international newswire on Twitter and the iPhone. BNO is an amazing story.
The service was founded two years ago by a 17-year-old from Switzerland and is now run by a plucky little crew of online journalists around the world. It’s the fastest way to get breaking news from around the world, around the clock. Rafat Ali of the UK Guardian’s paidContent wrote last month that BNO is eating the mainstream media’s lunch and that someone really ought to try to buy the organization.

Apparently, BNO is so on top of things that even the Red Cross watches it closely. Spencer says that a lot of people at Red Cross headquarters are subscribed to BNO. He told us the story of an eight-hour work session on simultaneous disasters that the team finished late one recent night, only to receive push notifications from BNO as soon as they closed their laptops, breaking news that another disaster had struck.
  33. Key Players
  34. John Borthwick: Thoughtful prince of the real-time Web

John Borthwick is a complicated, thoughtful man. Business Week called him “perhaps the real-time Web’s key articulator.” He has already built, bought and invested in more high-profile real-time Web technologies than probably anyone else in the consumer Web world. He’s hardly an unqualified cheerleader for the real-time Web, though. Borthwick is unafraid to consider different sides of a situation or to change his mind.

(Photo credit: Creative Commons Attribution, Brian Solis)

In 1997, John Borthwick built and sold to AOL the content publishing company behind the site Total New York. The New York Times focused on the irony of the deal in its coverage: Borthwick had publicly called for independent content producers to stay independent just a month earlier. While at AOL, he testified in the US government’s case against Microsoft – but now he says he thinks the position he took was wrong. These days he argues instead that innovation will outpace monopoly in technology and that regulation isn’t the solution.

Borthwick saw AOL fall from grace, but he kept in touch with many of the smartest people he met there, and he has ties to several of their startup companies today. That circle of people includes Gerry Campbell of real-time search engine Collecta and the Summize crew, which both Borthwick and Campbell invested in before it was acquired to become Twitter’s in-house search engine.

Borthwick points to the rise of YouTube as proof that an entirely new kind of search can emerge fast. YouTube is now the second-most popular place for people to perform searches online, after Google. This summer he wrote, “I now see search as fragmenting and Twitter search doing to Google what broadband did to AOL.”

These days, Borthwick is the CEO of Betaworks, the best-known investment group on the real-time Web.
After Summize went to Twitter,, a link-sharing and analytics tool built by Betaworks and invested in by a constellation of Silicon Valley superstars, became the default URL shortener for
  35. Other Betaworks investments include the most popular Twitter client (TweetDeck), Howard Lindzon’s Twitter experience for stock traders (Stocktwits), the new database of gadget reviews (Gdgt), from Engadget and Gizmodo founders Ryan Block and Peter Rojas, the humor site Someecards, hyper-local news aggregator, lightweight customer support service UserVoice, content curation platform Tumblr and 13 others. Betaworks itself bought Twitterfeed, the service that every organization from CNN to the White House uses to pump RSS feeds into Twitter and now into Facebook.

For all this real-timeness, Borthwick watches out vigilantly for his own ability to think and communicate in long form:

“I write about one long blog post per quarter. I don’t show them to anyone. I’m long-winded and verbose. I try to make it intentionally long form because there’s a lot of things we’re touching on right now. I write about history. A lot of the tools we’re using today are washing away history. There’s a bunch of really profound implications of that. I try to do long form things periodically because you can get so fragmented in our world that you never dig into the long-term issues that we’re contributing to but not talking about.”

This leader of the real-time Web, one of the main men behind the biggest little link shortener on earth, is worried about the consequences of rapid-fire short-form communication? Thank goodness. Thoughtful consideration is very reassuring and too rare. Here’s Borthwick on why he does what he does:

“John Barlow said there was no Prana or life source energy in an Internet interaction, but could there be some sense of life and of energy that gets transmitted? Part of what’s happening in the real-time Web is the synchronicity that takes place in a real-time conversation.
There’s not time to package and prepare the meaning around the meaning of what you’re discussing; the liveness of the event yields an order of magnitude different interaction, and that interaction is more human. The Web is becoming a more human place. We’re humanizing the machine a bit. I think that’s a good thing. I have three kids, and I see the way they interact with machines, and this is something I strive toward. There’s a moral imperative in this – but I don’t want to imply that for anyone else.”

Borthwick is a believer in the data portability vision; he believes that identity will be separate from services in the future and that people will pick and choose between best-of-breed service options.

“In the early days, there was a sense that people were going to build portal sites,” Borthwick says. “Then people thought that social networks would provide a new way to navigate.” Now he sees search as a primary form of navigation, a way to track conversations, not pages.
  36. “We believe things are becoming more connected. In the future, everything will consume APIs and publish APIs. People on the business side would say over the last 5 or 10 years, ‘That’s not a company. That’s a service’ [i.e. services with APIs at both ends]. I would say to them, ‘If it’s just a product, then what is the whole it should be a part of?’ They’d say Yahoo should buy it, but in most cases they squander it. I sold a company to AOL and went through the squandering of my company, then did that to other companies. If the next generation is cohesive parts, the whole they belong to is the Internet.

“[Betaworks investment] Gdgt is a database. They are aggregating user-generated content around a structured data set. That’s central to what we think about at Betaworks. We view it as data structuring – that fits into our worldview of what’s important. They aren’t a gadget blog or a media company. They understand that many of those contributions won’t happen on their website, that the boundaries of their site need to be permeable. They are all involved in social real-time. They are also to a greater extent sharing open data.

“In the real-time stream, a core reason why we jumped in with TweetDeck (we wanted to buy the company) was because Iain was articulating the data in a column format. The Web is striving for new representations of data types. We’re supplementing the page-based metaphor with the stream-based metaphor. When you screw with metaphors, you destabilize things. All the clients before TweetDeck used the heritage metaphor of instant messaging.

“The metaphors people choose are so powerful for how people both publish and subscribe. I think we’re just scratching the surface of this stuff. The lock-in that we’ve had around pages has held us back in terms of innovation and how to use this medium.
When we got here [to the Web] there was nothing, and we flopped a 500-year-old metaphor of pages, a browser that by its name says you will browse, not touch, this content. But it was not meant to be a one-way experience. We’re only a fragment of the way into this journey.”34 | ReadWriteWeb | The Real-Time Web and its Future
ARE WE GOING TO GET BRAIN IMPLANTS?

I made casual mention of brain implants and what a bad idea I think they are in a recent conversation with Borthwick, and he had something to say about the matter.

“The brain implant is implicitly happening. I spend seven hours a day looking at and tied to the screen. We’ve extended ourselves into this network already; we’ve accepted it de facto. A good piece of the revolution for me is to humanize it more. There’s a large degree of computing and Web work that has occurred in the last twenty years that’s dehumanizing. The transition from portal to search to social distribution – part of that trajectory is that it’s becoming more human. But we are also placing ourselves into the network and into the machine. The day we wake up and realize that the network has ‘become self’ will be too late – we will have extended ourselves into the network.

“Once upon a time, people thought eyeglasses were technology. In that Umberto Eco book ‘The Name of the Rose’, a character made eyeglasses. People thought he was modifying sight. You read this and it’s quaint. We embrace them as an extension of self, but we don’t think of eyeglasses as technology. We’ve become comfortable with the technological mediation of what we see. It’s an example of how human beings are capable of extending sense of self and embedding technology into our sense of self.

“Filtering is already endemic to the stream. To some extent, everybody is curating the inputs into their stream, but sharing the curation tools is not available today or is very, very crude. Using other people’s brains to filter and help curate that data stream in a dynamic fashion is implicit to where all this is going. The data structuring stuff is important because we’ve got to find ways beyond search to find things. But as one of the engineers on Summize said, a computer science professor wouldn’t consider this search because the axis on which we measure is time, not relevancy. To me, it’s much more of a filtering metaphor. What we found with Summize was that people left multiple tabs open to run concurrent searches. All of the old PubSub Wyman stuff was coming back to the fore. Human filters, understanding how we can share, how we can do data structuring, using search and navigation for discovering relevant info is where this is going.
“I feel like we’ve got this concurrent stream of how we can plug into what other people are doing, thinking, feeling and experiencing. We can bring greater humanness to that, make the world more connected and more understanding because we can understand other people’s context. That’s what you’re feeding in. A lot of that is what I’m working on, what I wish for and think is fascinating.

“That said, I have a lot of respect for the sole contributor. My brother is an artist and has no interest in other people’s ideas. Many of the greatest works have been created that way. There’s a tension there that’s very interesting.”

See also: John Borthwick’s blog posts and other information is at

John Borthwick’s circle on Twitter includes:
• Andrew Weissman, Betaworks
• Bijan Sabet, VC at Spark Capital
• Terry Jones, CEO at Fluidinfo
• Nathan Folkman, Engineer at Foursquare, former Systems Architect at Betaworks
• Mary Hodder, serial entrepreneur
Chris Messina: Rebel with a proposed technical standard

Just 10 years ago, Chris Messina was a suburban teenager in New Hampshire who lost his faith in authority, stopped doing all his homework and tried to hold his high school’s website hostage after he was suspended for running an ad on it for a proposed gay/straight alliance student group.

Photo of Messina from Wikipedia, taken by Tara Hunt.

Since then, he’s enjoyed some impressive accomplishments. He designed the two-page ad that ran in the New York Times announcing the launch of Firefox [2]; he co-founded a network of public events (Barcamp [3]) in more than 350 cities; he serves on the boards of the OpenID Foundation and the influential new Open Web Foundation; and he is now one of the most closely watched players in the world of online social networking. He’ll turn 29 years old in January.

Now working as an independent consultant, Messina is one of the leading people behind a technical format for syndicating user activity data from one service to another in a human-readable way, called Activity Streams. Facebook, MySpace and Windows Live have already begun producing user data in the Activity Streams format. Twitter does not yet.

[2] 10,000 people donated $30 each to buy that ad and it featured all their names.
[3] http://barcamp.org
WHAT IS THE ACTIVITY STREAMS FORMAT?

Everybody talks about filtering the real-time stream of information online, but the Activity Streams community is where conversations take place between leading engineers at the world’s biggest and smallest social networks with the goal to replace the “walled garden” model of social networking with an open, inter-operable communication marketplace.

If Activity Streams succeeds, you will be able to subscribe to and filter the activities of your friends across multiple different networks, without having to sign up for or even know about those other networks. This is almost the equivalent of AT&T phones being able to make calls to Verizon phones, or of rail-transport companies being able to ship goods across the country over different railroad networks – because the rails for the trains are the same size.

It’s different, though, because of the granular filtering by type of activity. Applications built on top of Activity Streams will allow for the equivalent of a phone that accepts phone calls only about certain subjects from certain people... because, of course, we’re now receiving a lot more inbound communication than we did in the telephone era.

“The real-time river of news makes information available to you as it is created,” Messina told us, “but you need a way to consume it that respects your time, enhances the content or makes it easier to consume. The Activity Streams format aims to allow people to receive a stream in a way that they can manage.”

An extension of the Atom feed format, the spec explains it like this: “An activity is a description of an action that was performed (the verb) at some instant in time by some actor (the subject), usually on some social object (the object). An activity feed is a feed of such activities.”

In the current draft spec, you can perform such actions as Post, Share, Save, Mark as Favorite, Play, Start Following, Make Friend, Join and Tag Object. An Object could be an Article, Blog Entry, Note, File, Photo, Photo Album, Playlist, Video, Audio, Bookmark, Person, Group, Place or Comment. These actions can have such contexts as Location, Mood and Annotation.

Stream aggregator Cliqset publishes Activity Streams feeds that don’t require API authentication to view. You can see a sample one at:

The aim of Activity Streams is to have multiple social networks use a common language and have a common understanding of what all those things mean, so that messages can be read across different networking sites. Messina explains that both publishing and subscription technologies need to become more sophisticated in reading and writing streams of data in order for this vision to become a reality.
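To make the actor/verb/object idea concrete, here is a minimal Python sketch of an activity record and the kind of "only certain subjects from certain people" filtering described above. It is an illustration only, not the Activity Streams wire format (which is an Atom extension): the `Activity` class, its field names, the lowercase verb spellings, the `network` field, the sample names and the timestamp are all assumptions made up for this example.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Activity:
    """One entry in an activity feed: an actor performed a verb on an object."""
    actor: str           # the subject, e.g. a person's name
    verb: str            # e.g. "post", "share", "tag" (illustrative spellings)
    object_type: str     # e.g. "photo", "article", "note"
    published: datetime  # the instant the action happened
    network: str         # hypothetical field: which service published it


def filter_stream(stream, actors=None, verbs=None):
    """Keep only activities by the given actors and with the given verbs.

    Passing None for a criterion means "accept anything" for that criterion.
    """
    return [
        a for a in stream
        if (actors is None or a.actor in actors)
        and (verbs is None or a.verb in verbs)
    ]


# Arbitrary sample data spanning three made-up networks.
now = datetime(2009, 11, 1, 12, 0, tzinfo=timezone.utc)
stream = [
    Activity("alice", "post", "photo", now, "network-a"),
    Activity("bob", "share", "article", now, "network-b"),
    Activity("alice", "tag", "photo", now, "network-b"),
    Activity("carol", "post", "note", now, "network-c"),
]

# "Accept calls only about certain subjects from certain people."
posts_from_alice = filter_stream(stream, actors={"alice"}, verbs={"post"})
for activity in posts_from_alice:
    print(activity.actor, activity.verb, activity.object_type, activity.network)
```

Because every network describes its updates with the same three parts, a single filter works across all of them; the reader never has to know which service an activity came from.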
He says:

“The real-time Web is a shift towards something more like how humans interact with the world: the information just flows right in. When it comes to thinking about Activity Streams, how can we add a few more semantic hints to the original info coming to our [subscription] agents? And then how do we filter what’s relevant? Here’s an analogy. Dogs have 300 million receptors in their noses, so they can parse smells really well. We only have 6 million receptors in our noses. Imagine if we went from having 6 million to 300 million receptors that we could use to filter information. We haven’t developed those sensors yet in order to create more possibilities.”

Standardized, semantic clues from feed publishers and the ability to read them in whatever application we use to read updates are the kinds of receptors that Messina is helping to design and implement.

THE WEB OF PEOPLE

“The thing non-geeks can understand and bring to this is their identity,” Messina says. “We’re getting back to the individual as the primary actor in the system. They can hook up systems to their identity providers and do things.

“Facebook is one of the first services to orient itself in this direction; it is providing some good R&D into where this is going, and it is doing good work in this kind of direction. You log in to your Facebook account, and everything flows to you. Right now, that’s the best metaphor that we have.

“I think Facebook is going to play a very important role. I think it has a desire to align itself with the Web, just as Google does.

“Video games provide a great experience about what real-time on the Web would be like. Gaming has to be real-time to be enjoyable. Right now, most of the Web uses interfaces from the document-centric era of the Web that don’t scale or translate to the real-time Web.

“For example, we want to have longer conversations, but email is one of the big linchpins that’s broken. Outlook is so entrenched. It’s clear that these conversation systems are broken.

“But the ‘river of news’ doesn’t have handles that regular people can grasp. The number of old people who make Facebook wall posts and think they’re private is enormous! But there are a lot of benefits to this real-time Web, like being able to reply immediately to a photo. My mom would like iPhone push notifications of pictures of me or my girlfriend. How do we lead with a carrot to get people to shift away from email and into a real-time model?”

When Messina was 13 years old, he traveled to Greece and Italy and was shocked to find out that people in some European cultures left work in the middle of the day to have lunch with their families and take a nap. “The fact that a whole culture could exist and be so different from mine broke all my assumptions,” he says.

That realization gave him a great sense of hope. Now, as an adult, the tagline on his blog reads, “All of this can be made better. Ready? Begin.” He’s been working to make the world better ever since, and now he has a whole lot of traction. Watch his work for an important window onto the future of the Internet.

See also:
• Chris Messina’s blog
• Messina on Twitter
• His Flickr collection of notable user interfaces

To understand Messina and his work, pay attention to:
• David Recordon at Facebook
• Scott Kveton, Urban Airship
• Will Norris, independent software developer
• Joseph Smarr and John McCrea, Plaxo/Comcast
Brett Slatkin, Brad Fitzpatrick and PubSubHubbub

Brett Slatkin has long been an idealist. “If I made a great product, and Microsoft offered me a lot of money, I would spit in their faces,” he told Newsweek while a brash freshman at Columbia University in 2002. He joined Google after completing a computer science degree in 2005. Last year, Slatkin sprang into public view with the launch of Google App Engine, a product that lets developers run their Web applications on Google’s infrastructure.

Slatkin works on App Engine as his day job, but for his 20% time project he has led the creation of an important new real-time syndication format called PubSubHubbub. Slatkin calls it Hubbub for short.

HOW HUBBUB WORKS

The PubSubHubbub model has three parties. There’s a Publisher (FeedBurner, for example) and a Subscriber (perhaps Netvibes), and communication is facilitated through a Hub (Google’s AppSpot Hub was the demo and is the most popular Hub so far). The publisher knows that every time new content is published, it will notify the hub; the hub that gets notified will be declared at the top of the publisher’s document, just like an RSS feed URL. So, the publisher delivers new content to the hub, and then the hub delivers that message immediately to all the subscribers who have subscribed to receive updates from that particular publisher.

This is very different from the traditional model in which a subscriber polls a publisher directly every 5 to 30 minutes (or less) to see if there’s new content. There usually isn’t new content, and so that model is inefficient and slow. Hubbub is nearly immediate and only takes action when something important occurs. Protocol co-creator Brad Fitzpatrick says that the current system of websites polling each other for updates is like a kid in the back seat of a car saying “Are we there yet?” over and over again. Hubbub says, “Shut up, kid. I’ll tell you when we get there.” That’s how Fitzpatrick explains it.

It’s remarkably simple, at the end points in particular. If things ever get complicated, it would be in the hub, and that’s easily available as a service if a publisher doesn’t want to host their own. The hub does things like authenticate subscribers, check in with feeds that haven’t pinged it lately, deliver a single update from a publisher to multiple subscribers and act as a publisher itself for other hubs to subscribe to. Neither publishers nor subscribers have to worry about the hub’s details, though, unless they are looking for things like subscriber analytics.

Real-time PubSubHubbub feeds are already being published by FeedBurner, Blogger, LiveJournal, LiveDoor, Google Alerts and the feed republishing service Superfeedr. Facebook’s FriendFeed, LazyFeed and the newest version of Netvibes are consuming Hubbub feeds so far, as are a number of small sites and services that are using the feeds for machine-to-machine communication.

Slatkin is the public face of the protocol, but he created it with Google’s Brad Fitzpatrick.

Fitzpatrick, now 29 years old, grew up in Oregon and built the popular social-networking service LiveJournal while he was in high school in 1999. One year later, he hired Martin Atkins, then a high-schooler in the UK and now a SixApart engineer and a leader in the online identity community. (Atkins also had a big hand in formalizing Hubbub.) In 2003, LiveJournal grew fast and hired a number of additional engineers, including then high-school senior and now Senior Open Programs Manager at Facebook David Recordon. Also in 2003, Fitzpatrick’s company developed Memcached, an open-source memory caching system that’s used today by Twitter, Digg, YouTube, Craigslist, Wikipedia, WordPress, Flickr and more.

In 2005, Fitzpatrick sold LiveJournal to SixApart. Later that year, he created the first OpenID authentication protocol for LiveJournal. In other words, he’s been a whirlwind of technical innovation for the last 10 years.

Fitzpatrick is now at Google working on what could become the infrastructure for distributed, independent and inter-operable social networks, PubSubHubbub among them. Fitzpatrick explained:

“Real-time stuff is one dependency around federated social networking. No one would suggest a chat function that’s based on polling, for example. You can’t compete with walled gardens that have real-time internally if you don’t.
One of the obstacles has always been real-time: engaged conversation, news feed, etc. So in order to solve social networking we need to implement PubSubHubbub and WebFinger [a profile-syncing technology that Fitzpatrick is now working on:].

“Things are about to get interesting. I don’t need another social networking site – we need competition, we need the basic crap that all these sites do [posting, commenting, sharing, etc.] to be federated and all working together.”

So Atom-based Activity Streams may be the language in which functions such as posting, commenting and sharing are expressed; and then PubSubHubbub may be the method of delivering the Atom feeds of updates in real-time.
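The publisher, hub and subscriber roles described under "How Hubbub works" can be sketched in a few lines of Python. This is a toy, in-memory illustration of the push model only. The real protocol runs over HTTP, with subscribers registering callback URLs and the hub verifying them, and none of that is modeled here. The `Hub` and `Subscriber` classes, their method names and the example feed URLs are all hypothetical.

```python
from collections import defaultdict


class Hub:
    """In-memory stand-in for a hub: tracks subscriptions and fans out updates."""

    def __init__(self):
        # topic (a feed URL) -> list of subscriber callbacks
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        """Register a subscriber's callback for one publisher's topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic, entry):
        """Called by a publisher with new content; pushes it to every subscriber.

        Returns the number of subscribers the entry was delivered to.
        """
        for callback in self._subscribers[topic]:
            callback(topic, entry)
        return len(self._subscribers[topic])


class Subscriber:
    """Collects whatever the hub pushes to it; it never polls the publisher."""

    def __init__(self, name):
        self.name = name
        self.inbox = []

    def receive(self, topic, entry):
        self.inbox.append((topic, entry))


hub = Hub()
reader_a = Subscriber("reader-a")
reader_b = Subscriber("reader-b")
FEED = "http://publisher.example/feed"  # hypothetical topic URL

hub.subscribe(FEED, reader_a.receive)
hub.subscribe(FEED, reader_b.receive)
hub.subscribe("http://other.example/feed", reader_b.receive)

# The publisher tells the hub once; the hub delivers to both subscribers.
delivered = hub.publish(FEED, "New post: hello world")
print(delivered, reader_a.inbox, reader_b.inbox)
```

The contrast with polling is that nothing happens between publications: subscribers do no work until the hub calls them, and one notification from the publisher reaches every subscriber of that feed.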
The use cases are essential to consider, but Slatkin thinks of this work mostly as creating better building blocks that can then be used for anything. He emphasizes that engineers need to be building now to scale for the unforeseeable use cases of the future.

“Real-time implementers need to think about consistent [application] workloads,” he told us. “That’s the only way they can scale.”

“To sip from the fire hose you need to only get what you care about. If you have to cut anything out, then you’ll drown. People say ‘RSS and Atom are good enough!’ I don’t think people know where we’re going to be in 10 years. Right now our back ends can handle the load – but if we only cared about today, then we’d just stay home. The whole point of technology is to make new things. When people think about the real-time Web, they need to think about new use cases that no one has considered because they seemed technically unfeasible. If you told someone 10 years ago that you could have 15 people concurrently editing a document – that was crazy!”

Slatkin emphasizes that we can’t know what the ultimate killer apps for push will be, but he rattled off to us a short list of ways in which he could imagine them being put to use:

“Push as compliance with SEC for filing financial reports. Real-time monitoring of the performance of cloud services using Hubbub. Sensor networks: tiny sensors everywhere with little bits of data, sonar modules or IR pings. Put a thousand of those in a field and get a 3-D picture of what’s going on. So far, that’s been done with binary, proprietary, one-off protocols, hard to use. Open, real-time Web data could enable vast numbers of people to consume that sensor data. It could be used on battlefields, football fields or as road data.”

Fitzpatrick thinks Hubbub could even replace Google’s crawls of the Web. “All content should be real-time and subscribable,” he says. “You could replace crawling with this, every page on the Web. You could probably get most pages pretty soon, but one could imagine modifying Apache to support this by default.”

Former Googler Paul Buchheit (Googler #23, in fact), now at Facebook after selling FriendFeed to the company earlier this year, zooms into the smallest details. “The next step is for people to open more of their current activities and plans,” he wrote in a recent blog post.