The Real-Time Web and its Future


  • 1. The Real-Time Web and its Future. Edited by Marshall Kirkpatrick.
  • 2. The following report is based largely on insights shared generously by these interviewees: Aardvark; Adrian Chan; AlertSite; AllVoices; Amber Case; Backtype; Bernardo A. Huberman, HP Social Computing Lab; Beth Kanter; Black Tonic; Brad Fitzpatrick, Google; Brett Slatkin, Google; Chris Messina; CitySourced; Cliqset; Collecta; DeWitt Clinton, Google; Evri; Factery Labs; Faroo; FirstRain; Jay Rosen, NYU; John Borthwick, BetaWorks; JS-Kit; Kaazing; Kevin Marks, BT; Lexalytics; Marnie Webb, Compumentor; Mendeley; Nomee; Notify.me; Nozzl Media; OLark; OneRiot; OrSiSo; PBWorks; Pipio; Postrank; Steve Gillmor; Superfeedr; Sysomos; Ted Roden, NY Times / EnjoysThings; The American Red Cross; Threadsy; Tibco; Tweetmeme; Twingly; Urban Airship; Warner Bros.; Wowd; YourVersion.
  • 3. Contents
  1. a. What is the real-time Web? Beyond Twitter and Facebook (p. 2)
     b. Matrix of issues and companies (p. 4)
  2. Case studies (p. 5)
     a. Ted Roden puts real-time into EnjoysThings and the New York Times (p. 6)
     b. Superfeedr: Transforming the legacy Web into real-time (p. 9)
     c. Real-time as a trigger: Evri’s news-parsing technology (p. 11)
     d. How Warner Brothers uses the real-time Web in the music business (p. 13)
     e. Urban Airship does real-time mobile push (p. 15)
     f. Nozzl Media: Bringing real-time to old media (p. 17)
     g. Aardvark and the real-time Web of people (p. 20)
     h. Mendeley and the real-time Web of science (p. 23)
     i. Black Tonic re-imagines the real-time Web as a controlled experience (p. 26)
     j. At the Red Cross, the real-time Web saves lives (p. 28)
  3. Key players (p. 31)
     a. John Borthwick: thoughtful prince of the real-time Web (p. 32)
     b. Chris Messina: Rebel with a proposed technical standard (p. 37)
     c. Brett Slatkin, Brad Fitzpatrick and PubSubHubbub (p. 41)
     d. Steve Gillmor: The real-time Web’s leading journalist (p. 45)
     e. Another 15 important people to follow to understand the real-time Web (p. 51)
  4. Sector overviews (p. 56)
     a. Stream readers: Interfaces for the real-time flow (p. 57)
     b. Real-time search: Challenges old and new (p. 68)
     c. Text analysis and filtering the real-time Web (p. 72)
  5. Visualizations (p. 75)
     a. The path to value (p. 76)
     b. Real-time in conjunction with the static or slower Web (p. 77)
     c. Information overload (p. 78)
  6. Selected background articles on real-time technology (p. 79)
  ReadWriteWeb | The Real-Time Web and its Future | 1
  • 4. What is The Real-Time Web? Beyond Twitter and Facebook

Dave Winer defines the real-time Web in four words: “It Happens Without Waiting.” That’s true, and appropriately vague. The phrase “real-time Web” means different things to different people, and it’s too early in the game to have anything but a loose, inclusive definition. Still, the many forms the real-time Web takes share some common benefits, user-experience elements, lessons learned, pitfalls and possibilities. That is what we explore in this report.

It’s definitely a whole lot more than just Twitter and Facebook, though these are the best-known instances of what’s referred to as the real-time Web. Someday Facebook may open up its user data and play a larger role in the real-time Web than the role it plays today as an introduction to the stream model. Someday Twitter may grow and discover how to retain users and encourage contributions from more than the small number of people who today create the vast majority of its content. Engineers currently estimate that Twitter sees about 1,000 messages published per second and between 5 and 10 million links shared per day, before de-duplication. That sounds like a lot, but the real-time Web as a whole is already much, much larger than Twitter.

For infrastructure provider Kaazing, the real-time Web is using HTML5 WebSockets technology to push live financial information to the Web browsers of banking customers, information that had always been limited to desktop applications for security reasons. For one consumer web app we spoke with, the real-time Web is creating an XMPP-powered, chat-like experience in which users communicate with friends around objects like a Google Map or a streaming Netflix video playing in a web OS. For semantic recommendation company Evri, the real-time Web is the ebb and flow of traffic data on Wikipedia; that data points to hot topics for which Evri needs to build topic pages to serve its publisher customers.
  • 5. For search engine OneRiot, the real-time Web is made up of the links people share on Twitter, as well as on Digg and Delicious, and the click-streams of more than a million users who have opted in to exposing what they see online through the OneRiot toolbar.

For Q&A service Aardvark, the real-time Web is the people inside a user’s social circle who happen to be available online at a given moment and interested in the topic of the user’s question. And there are hundreds of thousands of blogs that now deliver updated content to any other application that subscribes to a PubSubHubbub or RSSCloud feed, immediately after that content is published.

NYU journalism professor Jay Rosen says the real-time Web creates a sense of flow for users that’s comparable to the way television holds our attention. Google’s Brett Slatkin, developer of the PubSubHubbub real-time protocol, says the real-time Web is a foundation for efficient computing and for use cases we can’t yet even imagine.

In writing this report we interviewed 50 people who work on technologies that power or leverage what they consider to be the real-time Web. Their experiences are remarkably diverse, but they articulate a common story. It’s a story of increased computational efficiency, and of software that struggles to keep users from feeling overwhelmed. It’s a story of radically new possibilities, but of strategies based on adding value in conjunction with more traditional, slower-moving online resources.

We hope you enjoy this overview of the emerging real-time Web. We believe this phenomenon will play a major role in the Web, and the world, of the future. The page-based model of destination sites, created by centralized expertise and navigated through authority-based search and clicking link by link, is being transcended. We think this survey of current strategies and experiences will prove very useful in helping you effectively participate in, and help build, the future of the real-time Web.
  • 6. Matrix of Issues and Companies

This matrix allows you to navigate the contents of this report by topic. For example, if you are a user-experience expert, the second column shows you where the most relevant content for you is.

Columns, in order: (1) Benefits of Real-Time; (2) User Experience; (3) Analytics & Advertising; (4) Standards, Data Normalization & Text Analysis; (5) Changing Older Organizations; (6) Real-Time as a Service.

CASE STUDIES
  NYT • • •
  SuperFeedr • •
  Evri • • •
  Warner Bros. • • •
  Urban Airship • •
  Nozzl Media • • •
  Aardvark • •
  Mendeley • • •
  Black Tonic • • •
  Red Cross • •
PEOPLE PROFILES
  John Borthwick • • • •
  Chris Messina • • • •
  Slatkin/Fitzpatrick • • • •
  Steve Gillmor •
SECTOR OVERVIEWS
  Stream Readers • •
  Search • • • •
  Text Analysis • • •
  • 7. Case Studies
  • 8. Ted Roden puts real-time into EnjoysThings and the New York Times

By day, Ted Roden works on the very top floor of the New York Times building, in the R&D department. The Times has a great team of engineers: it does cutting-edge work in APIs, data visualization and computer-assisted reporting. Roden works with real-time data at his day job, but he gets full creative freedom on a side project called EnjoysThings.

The primary contributions that Ted Roden makes to understanding the real-time Web include articulating the following:
  • The material benefits of going real-time;
  • The importance of user experience; and
  • The changing landscape in analytics and advertising.

Roden is also writing a book about real-time for O’Reilly. We had a conversation with him about what happened after he added a real-time feed to EnjoysThings, and he articulates well some of the biggest advantages of a real-time infrastructure. EnjoysThings is a visual bookmarking site, like Delicious for images and other media. Even bookmarked text snippets are highlighted visually. User experience is a key consideration in all of the site’s development, and the service is a lot of fun to use.

This summer Roden added a premium subscription option to the site, called Joy accounts. A Joy account costs $20 per year for access to all current and forthcoming premium features, or users can pay $5 for an individual premium feature, such as disabling ads on the site or being able to view NSFW content. One of the features that Joy account holders get is access to a real-time view of new shared content. That real-time stream can be viewed in any browser, but it may be best served up in a Firefox sidebar. A real-time feed as an up-sell value-add? That’s remarkable, and Roden says the response has been positive.

The sidebar is simple but compelling. New content, including images, is pushed live into the side of the browser as soon as it’s shared on the site.
  • 9. At first, Roden said, he used AJAX to poll his site every few seconds. Then he switched to a push model. EnjoysThings is still very small, but the implications of adding real-time could likely benefit sites of any size.

1. INCREASED TIME-ON-SITE
“People leave it open all day long,” Roden said of the sidebar. “Time-on-site has seen a huge increase. It’s like when the new content comes in on the Facebook Live Feed: if you know it’s about to pop in five seconds, you’ll stick around.”

A number of different factors are making time-on-site an increasingly important metric on the Web, compared to page views. Increased consumption of video is the best known, but as real-time streams of aggregated content become increasingly common, increased time-on-site will be an important measurement of how successful an implementation is.

2. DECREASED SERVER COSTS
After implementing real-time infrastructure, Roden reports that “my site runs a lot more smoothly. I’ll probably move the whole site to that technology, because deep down it’s much easier on the database for me.”

“I used to get hit by StumbleUpon and [the site] would start to crawl. Then I changed to some of this real-time stuff, and I’ve reduced the number of servers. Instead of the users sitting on the page and refreshing, I push it out to them. My EC2 bill has gone way down.”

Roden’s experience complements the story that Google’s Brad Fitzpatrick told us about using PubSubHubbub to push shared items from Google Reader to FriendFeed. Changing from polling to real-time push cut traffic between the two sites by 85%. Likewise, magazine-style feed reader Feedly says that the part of its service that now consumes PubSubHubbub from Google Reader has seen a 72% reduction in bandwidth.

3. ADVERTISING COMPLICATIONS
“Analytics totally change,” Roden told us. “If you never click around off the home page, then Google Analytics says it’s one page view.
Now if you’re pushing stories to the top of the page, then you don’t know how many stories people have seen unless you start measuring differently.”

“Measuring user engagement totally changes. People use EnjoysThings in a sidebar in Firefox: do I count that as a whole page view? Do I count it as one, even though some people have it open for eight hours? Can you convince an
  • 10. advertiser that they are going to see an ad 100 times while looking at a page just once, and do they want that? For projects like EnjoysThings, it’s going to be a scary world out there for advertising for a while.”

Roden has been placing display ads in the real-time feed and prioritizing the attractiveness of the creative. That’s been somewhat effective so far, but he says it’s very early days for advertising in a real-time model. He says that real-time won’t be an effective differentiator for ad sales in the future, because everything will be real-time. “Otherwise it’s like looking at a Word doc in a Web browser. It has to be real-time,” Roden says.

Ted Roden says that at the root of the change toward real-time is a long list of emerging technologies that make it easy. “It’s blowing my mind how quickly the tech is coming out,” he told us. What’s Roden most excited about now? Tornado, the highly scalable, open-source real-time infrastructure released by Facebook after its acquisition of FriendFeed. He’s switched all his prototypes at the New York Times to it. “I’ll be really interested to see if people pick that up as quickly as they did Django,” he says. “It’s an easy framework to work with.”

The technology is becoming easier and easier; now it’s largely just the frame of mind that has to change. “It’s not hard to write real-time code,” says Roden, “but if you’re in a LAMP mindset, that doesn’t scale in real-time.”

See also:
  • Ted Roden’s shared items on EnjoysThings;
  • Roden’s Delicious bookmarks (technical);
  • Roden on Twitter;
  • New York Times Labs on Twitter.
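Roden’s move from AJAX polling to pushing new content the moment it arrives can be pictured as a small long-polling hub: clients block on a request until something new exists, instead of refreshing every few seconds. The following is a minimal, stdlib-only Python sketch of the general pattern, not Roden’s or Tornado’s actual code; the `Hub` class and its item-index protocol are illustrative assumptions.

```python
import threading


class Hub:
    """A toy in-memory hub for long-polling clients.

    Clients ask for everything after the last index they saw; the call
    blocks until new items exist or the long-poll times out.
    """

    def __init__(self):
        self._cond = threading.Condition()
        self._items = []

    def publish(self, item):
        """Add a new item and wake every waiting client."""
        with self._cond:
            self._items.append(item)
            self._cond.notify_all()

    def wait_for(self, last_seen, timeout=30):
        """Block until items beyond index `last_seen` exist, then return them.

        Returns [] on timeout; the client simply re-issues the poll. The
        server does no per-client work while a client waits, which is the
        effect behind the bandwidth reductions quoted above.
        """
        with self._cond:
            if not self._cond.wait_for(lambda: len(self._items) > last_seen,
                                       timeout=timeout):
                return []
            return self._items[last_seen:]
```

A sidebar-style client would call `wait_for` in a loop, rendering whatever comes back, while the site calls `publish` whenever a user shares something.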
  • 11. Superfeedr: Transforming the Legacy Web into real-time

Superfeedr’s slogan is, “We’re doing something stupid so that you don’t have to.” Julien Genestoux’s Superfeedr is a service that pulls in content feeds from around the Web and then offers updates for those feeds in XMPP or PubSubHubbub format.

Superfeedr’s primary contributions to understanding the real-time Web include articulating the following:
  • The opportunity to add value through technological transformation of legacy resources into real-time;
  • The ease of leveraging real-time, normalized data through services such as Superfeedr; and
  • How consumer markets may not be as prepared for real-time data as developers are.

That means that instead of polling feed publishers over and over again to check for new updates, a feed-consuming service can just sit and wait for Superfeedr to deliver updates automatically as they become available. The publisher doesn’t even have to publish real-time feeds: Superfeedr takes care of that. It’s real-time-as-a-service.

“We don’t just do polling,” Genestoux says. “For each feed, we actually try to determine the most appropriate way to get the updates: PubSubHubbub, RSSCloud, SUP, specific APIs (Twitter stream, etc.). We do polling as a failover.”

One year ago, Julien Genestoux launched a service called Notifixious. It delivered real-time updates from any feed to a user’s IM client or email. Ten thousand people signed up for it, but 90% of them were having just one blog delivered, usually by email. Not an inspiring predicament for Genestoux. A very small subset of users were using the service to follow thousands of blogs; Genestoux inquired and learned that they were using the service like an API.
“The vast majority said they would pay to do this, too,” Genestoux told us, “as long as it was cheaper than doing it themselves.”

Superfeedr now offers just that: transformation of feeds into real-time, at lower than the cost of your current feed-parsing system, delivered within 15 minutes of publication or your money back.
  • 12. The company is working on lowering that to 3 minutes or less. Your first 1,000 feeds are free; beyond that, the company charges $1 for every 2,000 items it delivers. Superfeedr pings feeds once and shares updates with all subscribed customers, dramatically lowering the polling overhead in the RSS ecosystem. “We also do feed normalization to make things easier for the subscriber and avoid the hassle of dealing with RSS/Atom + namespaces,” says Genestoux. Google’s Brett Slatkin, the primary developer of PubSubHubbub, is very supportive of what Superfeedr is doing.

Genestoux says the companies using his service so far include SixApart, Adobe, Twitterfeed, StatusNet and a number of small services such as Webwag, EventVue, Quub, AppNotifications and SmackSale. “So many services fetch feeds from other services,” Genestoux says. “The market is huge. In the end, everyone’s going to need real-time. It’s going to be the differential between services.”

Genestoux firmly believes that the real-time Web will have the biggest impact on developers, not consumers. “The fact that services do not need to poll over and over, as well as having access to ‘normalized’ data, considerably lowers the bar to allow ‘free data’ to flow from one service to another. Up until now, if you wanted your app to include data from other apps, you had to invest in that massively (see FriendFeed), and maintaining such a component was a nightmare. If you make this ‘data flow/stream’ transparent to the services, you start seeing richer mashups and apps that integrate data from others. I sincerely think that, more than for end users, real-time will eventually change how Web apps are built and interact together.”

What’s the downside? Genestoux admits that not all companies are comfortable relying on a third-party service for this kind of functionality. Superfeedr went down for several hours one evening in November.
Genestoux wrote a blog post discussing the problem and his solution. Superfeedr isn’t the only real-time-as-a-service company online: others we’ve spoken to include Kaazing, and surely there are many more. But when it comes to lightweight feed-transformation services that are developer-friendly and engaged in cutting-edge Web technology conversations, Superfeedr certainly fits the bill.

See also: Julien Genestoux’s lifestream of links and bookmarks; Genestoux is also on Twitter. His circle on Twitter includes:
  • Ilan Abehassera, NY entrepreneur
  • Stephane Delbecque, SF entrepreneur
  • Sylvain Hellegouarch, French developer
  • Johann Romefort, CTO at Seesmic
  • Guillaume Dumortier, SF entrepreneur
  • 13. Real-time as a Trigger: Evri’s News-Parsing Technology

Evri is a semantic Web recommendation service for online publishers. The company tracks the real-time Web to know when it needs to create or update a topic page for an emerging news topic.

The primary contributions that Evri makes to understanding the real-time Web include articulating the following:
  • Creative ways that real-time and slower-moving data sources can be used together to create value;
  • Wikipedia as a source of real-time data beyond Twitter and Facebook. We’ve seen Wikipedia used by other services for disambiguation, but not as a source of real-time trending-topics data;
  • Another example of text analysis as a very important part of a service working on time-sensitive content delivery; and
  • Struggles experienced by forward-looking startups seeking to bring real-time services to older businesses, in this case publishers.

Evri watches news sources to see when a news topic is trending, including Wikipedia articles whose publicly available data shows a leap in page views. It then visits structured databases like Wikipedia and Freebase to check for updates to entries about related entities, and creates or updates a topic page with news links, photos and Twitter search results. The language used in those Twitter posts is analyzed, and the names of news entities in the posts are linked to other Evri topic pages, like pivots.

“We’ve got it down to 15 minutes from when an event happens to when facts get updated,” Deep Dhillon, CTO of Evri, told us. “Nothing is manual.” That may have been true of Patrick Swayze’s death, as Dhillon pointed out in our interview, but it was not true of the death of anthropologist Claude Levi-Strauss.
The Levi-Strauss topic page was filled with news of his death, but for hours afterward the excerpt from Wikipedia on his dates of birth and death had not been updated to match the information about his death that Wikipedia and Freebase contained.

“Another example is emergent entities,” Dhillon said. “The day after Michael Jackson died, there was a bunch of info online about Conrad Murray, the physician. Within minutes, we had structured
  • 14. information for a page, but also for the rest of the system to link his ID with things like physician, Michael Jackson. It ripples through our whole system. We have some API customers that are all about emergent entities – we’re not just going to say that Conrad Murray is a person and a male.”

It’s a work in progress, and Dhillon acknowledges that more work has to be done, on text analysis in particular. Evri is working with the publishers it draws content from (its reach is wider than just a Web search) on matters such as structured data and push notifications. The publishing industry has a lot of catching up to do, though, in moving on from old content-management systems that did little to create metadata. The content that Evri receives for analysis comes in various forms (News Industry Text Format is one of the most common), and it has a wide variety of problems, but Dhillon says that publishers have a motive to make sure their product is annotated. More obvious still is the incentive to do push notifications, Dhillon says: timeliness is an obvious advantage for Google ranking.

In the future, then, everyday publishers may push highly structured content out to aggregators for analysis. But today, Evri watches the real-time Web for news spikes, then uses those spikes as a trigger to go out and query other parts of the Web.

See also:
  • Deep Dhillon on Twitter
  • Deep Dhillon’s blog
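Evri’s trigger, a leap in Wikipedia page views, is essentially spike detection over a traffic series. A minimal sketch follows, assuming hourly view counts and a hypothetical threshold of three times the trailing average; Evri’s real thresholds and signals are not published.

```python
def is_spiking(hourly_views, window=24, threshold=3.0):
    """Return True if the latest hour's views exceed `threshold` times
    the trailing `window`-hour average.

    A topic flagged here would trigger the downstream step described
    above: querying Wikipedia and Freebase and rebuilding a topic page.
    """
    if len(hourly_views) < window + 1:
        return False  # not enough history to establish a baseline
    recent = hourly_views[-1]
    baseline = sum(hourly_views[-window - 1:-1]) / window
    return baseline > 0 and recent >= threshold * baseline
```

The design choice here mirrors the “real-time as a trigger” idea: the cheap, always-on signal (page-view counts) gates the expensive work (structured-data queries and page generation), so the slow resources are only touched when something is actually happening.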
  • 15. How Warner Brothers uses the real-time Web in the Music Business

Ethan Kaplan is VP of Technology at Warner Brothers Records, and he’s a pretty savvy guy. He has built a real-time dashboard to display the number of people who visit each Warner Brothers artist website at any given time. When a site spikes on the dashboard, the team can hover over that part of the bar graph and see search results from blogs, Twitter and elsewhere to determine what caused the increase in traffic, and respond immediately.

The primary contributions that Ethan Kaplan offers to understanding the real-time Web are articulating the following:
  • A legacy industry capable of taking new forms of action based on substantially decreased delays in information delivery;
  • The value of having your own data in real-time, instead of relying entirely on third parties;
  • Opportunities that arise from being able to create interfaces for real-time data display in-house; and
  • Opportunities still untapped when real-time data is analyzed in bulk.

Kaplan tells us:

“We used to be oriented around getting data only once a week, because that’s how it was fed to us from SoundScan, Mediabase, etc. We’d then reconcile that data against our plan for the week.

“Now we’ve got a whole back end that exposes data in near and real-time: purchases going through the system, site visits, visitors logged in, comments left. The culture of that real-time environment has impacted how bands are being marketed and products are created. People want more and more real-time.

“One day, for example, I saw a site with marginal traffic that suddenly had 7,000 people on it. We did a Twitter search, checked [celebrity blog aggregator] and found out that the artist was having a baby. No one told us! We immediately started planning to change the merch on the site, maybe have
  • 16. a baby shower; we added a poll asking people if they thought it would be a boy or a girl – all steps to take advantage of the traffic coming to the site at that moment.

“Something like that happens every day. One of the sites might be trending more than usual because the artist just released a record. We can correlate and react right away. Omniture is good data, but it’s not as fast as what we have here.”

Kaplan says the next step is to expose this data in ways that best suit different people throughout the company. He views it through an Adobe AIR application that he built in dashboard form, but different departments have different needs. He’d like to figure out effective ways to present that data all the way up to the CEO level.

Traffic data is just one type of information the company sees. Kaplan says Warner Brothers knows, for example, that promotions on Twitter tend to get higher click-through rates but lower conversions than promotions on Facebook. Authentic artist sites, even if they aren’t as contemporary in design as, say, Facebook, remain very important to the online music ecosystem.

Kaplan says he’d like to see all the data the company captures, including anonymous user-specific data in aggregate, run through artificial-intelligence systems that quantify and detect patterns of engagement. “This user did X, Y and Z in a time period. That’s a huge amount of computation,” Kaplan says. He told us that he’s looking at MapReduce, cluster analysis and other methods, but the big takeaway for him is that the company can do a lot because it has the raw data and understands what types of data it needs.

Lessons learned? “We’re still at such an early stage that we don’t have any lessons learned,” Kaplan says.
“We’re just constantly learning new things.”

See also: Ethan Kaplan’s personal blog; Jeremy Welt, SVP of New Media at Warner Bros. Records.

Kaplan’s circle on Twitter includes:
  • Eston Bond, Fox Entertainment
  • Mathew Ingram, Toronto Globe and Mail
  • Kyle Neath, GitHub
  • Andy Gadiel,
  • Mikael Mossberg, Warner Bros.
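Kaplan’s “this user did X, Y and Z in a time period” computation can be pictured as a two-phase, MapReduce-style count of action sequences per user. The toy Python sketch below rests on our own assumptions (event tuples, action triples, a one-hour window); it is an illustration of the idea, not Warner’s actual pipeline.

```python
from collections import Counter, defaultdict


def engagement_patterns(events, window=3600):
    """Count ordered triples of actions each user performs within
    `window` seconds. `events` is an iterable of (timestamp, user, action).
    """
    by_user = defaultdict(list)
    for ts, user, action in sorted(events):  # "map" phase: bucket by user
        by_user[user].append((ts, action))
    patterns = Counter()
    for seq in by_user.values():             # "reduce" phase: count triples
        for i in range(len(seq) - 2):
            if seq[i + 2][0] - seq[i][0] <= window:
                patterns[tuple(a for _, a in seq[i:i + 3])] += 1
    return patterns
```

Run over purchase, visit and comment events, the resulting counts would surface the kind of engagement patterns (visit, then comment, then purchase, say) that Kaplan wants to detect in bulk.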
  • 17. Urban Airship does real-time Mobile Push

Urban Airship is a mobile-phone push-notification and in-app sales-infrastructure provider. The company powers push notifications for a wide variety of customers, large and small, filling a gap created primarily by Apple’s implementation of push in a way that’s just complicated enough for many developers to believe it warrants outsourcing.

Urban Airship’s primary contributions to our understanding of the real-time Web include articulating the following:
  • The wide variety of potential use cases for real-time, including on mobile platforms;
  • Another example of real-time service provisioning as a business; and
  • Limitations introduced by delivering real-time data through networks owned by other companies.

Starting with the iPhone but aimed at cross- and multi-platform mobile services, Urban Airship told us a number of interesting things about its experience with real-time information delivery:
  • Machine-to-machine real-time messaging is now cheap and relatively easy to implement.
  • You can now get updates on a wide spectrum of activities. The technologies to deliver notifications are evolving faster than the use cases, and there remains some question of just what to do with these real-time capabilities. A number of real-time companies have told us that the technology is dropping in price and complexity so quickly that people are looking for particular ways to implement a clearly compelling general concept (real-time messaging). In other words, the real-time Web may be more tool-driven than demand-driven so far.
  • Use cases that Urban Airship has seen so far range from mobile social games to reminder apps to mobile storytelling that uses push notifications to let a plot unfold over time. The company says it has other customers in the sports and medical fields that it can’t discuss publicly. One that has just launched is a prescription-drug-tracking service that pushes notifications shortly before a user’s prescription needs to be refilled.
  • 18.
  • Push notifications have been used most visibly by media companies to send simple messages, but each iPhone push can carry a payload and allow recipients to take actions such as voting or approving a purchase. “There will be richer content in the future, not just a line of text,” founder Scott Kveton told us. “It’s going to move from alerts to real-time interactive: more personal, more social.”
  • Scaling large quantities of high-priority real-time information remains a challenge. (Shortly after our interview, Urban Airship launched a product aimed at filling this need.)
  • One very new expectation that clients have of the developers they hire is the ability to quickly build out real-time features.
  • Push notifications on the iPhone also require a download; push can come only from apps on the phone.

So, Urban Airship says it is a cheap and easy mobile push-notification service, and that the rich use cases of the future are limited only by our imaginations.
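The “payload” each iPhone push can carry is a small JSON document: Apple’s `aps` dictionary for the alert itself, plus any custom keys an app wants for in-app actions, all under a tight size cap (256 bytes in the early APNs). A hedged sketch follows; `build_push` and its parameters are our own illustration, not Urban Airship’s API.

```python
import json


def build_push(alert, badge=None, extra=None, max_bytes=256):
    """Build an APNs-style payload.

    The 'aps' dict carries the visible alert, while custom top-level
    keys (e.g. an id the app uses to offer a vote or purchase action)
    ride alongside it.
    """
    aps = {"alert": alert}
    if badge is not None:
        aps["badge"] = badge
    payload = {"aps": aps}
    if extra:
        payload.update(extra)
    data = json.dumps(payload, separators=(",", ":"))  # compact encoding
    if len(data.encode("utf-8")) > max_bytes:
        raise ValueError("payload exceeds the push size limit")
    return data
```

A prescription-refill push of the kind described above might be `build_push("Refill due", badge=1, extra={"rx_id": "1234"})`, where the hypothetical custom `rx_id` key lets the app deep-link to the right prescription when the user taps the alert.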
  • 19. Nozzl Media: Bringing real-time to Old Media

Steve Suo and Brian Hendrickson were newspaper guys for decades. Then the confluence of declining revenue and institutional risk-aversion, during a period of historic change for the industry, led them to leave those institutions and strike out on their own. Suo has a background in automated public-records extraction and analysis, and Hendrickson in real-time.

The primary contributions to understanding the real-time Web that Nozzl Media offers are articulating the following:
  • The gap between legacy publishing and the real-time Web, which is both opportunity and barrier;
  • Another filtering strategy: user-centric, client-side and full-text, instead of strategic, programmatic and pre-determined value-add; and
  • The opportunity available in transforming old data into real-time.

Early this year, another long-time newspaper guy, Steve Woodward, joined them to found a startup called Nozzl Media. Nozzl aims to help newspapers embellish their original content with a real-time, filterable stream of hyper-local public records, news and blog posts. The company is building a mobile Web app and a Web-page widget that push that content live to readers.

Public records tend to be largely inaccessible, relegated to arcane, search-driven websites and dumb PDFs. Nozzl Media says it has built technology to extract that information, put it in geographic context and push it live to the Web as soon as it’s discovered.
  • 20. Nozzl is doing a number of particularly interesting things.

PUBLIC RECORDS
Nozzl extracts public records of interest – including Occupational Safety and Health Administration (OSHA) citations to businesses, approved building permits and doctors’ licensing information – from online repositories with what Nozzl calls its “automated form-pumping robot.” Many computer-assisted reporting specialists write scripts to perform one-off acts of data extraction for their research, but Nozzl has built software to perform these functions systematically, regularly, reliably and behind the scenes – and then make the information available in a published stream in real-time. The results can be quite interesting – and could qualify as news content.

Is the raw feed of public records valuable, though? Or is a journalist with a trained eye still needed to find the real news in the feed and put it into context? Presumably, the raw feed and the journalism it enables will support one another, but a raw feed of public records could have a signal-to-noise ratio that no one but a journalist would find compelling. The fire hose is valuable, but sometimes the hand of a skilled, real-time curator is more valuable. Nozzl Media specializes in pushing the fire hose to the public as an act of media.

Finding new forms of information that haven’t been available in real-time and making them easily available is a meaningful addition of value. People say that more and more social activities are becoming available as data – but someone has to build the infrastructure for that to happen, and that requires more technology in certain milieus than in others. Government data – so often made available in unsyndicated, opaque PDF files – is particularly challenging. Dislodging it into the cloud, then, becomes a particularly valuable act.
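The core of pushing public records “as soon as discovered” is a repeated extract-diff-publish loop: fetch whatever the source currently exposes, keep only records not seen before, and hand those to the live stream. Here is a minimal Python sketch of that loop; the `fetch` callable and `id` field stand in for Nozzl’s site-specific “form-pumping” extraction, which is not public.

```python
def new_records(fetch, seen_ids):
    """Run one poll cycle against a public-records source.

    `fetch` is a stand-in for the site-specific extraction step and
    returns dicts with an 'id' field. Records whose id hasn't been seen
    before are remembered and returned; these are what would be pushed
    to the live stream.
    """
    fresh = []
    for record in fetch():
        if record["id"] not in seen_ids:
            seen_ids.add(record["id"])
            fresh.append(record)
    return fresh
```

Because the sources themselves are slow and un-syndicated, the real-time character comes entirely from this diffing layer: the same cycle run every few minutes turns a static records repository into a stream.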
LIVE FILTERING

The Nozzl team built its Web page widget with a live jQuery feature that allows for filtering of the current corpus of data on the fly; items on display are filtered as each letter is typed by the user in the filter box. It looks like Google Suggest in reverse.

Filtering the flow of data is something that every company in this space is talking about, and Nozzl has a unique way of doing it. Real-time, on-demand, full-text filtering at the user’s fingertips may or may not be a compelling user experience. It’s an option, though, that stands in contrast to the text analysis, entity extraction and imposed categorization that other filtering strategies emphasize and are slowed by.

EMBELLISHING LEGACY CONTENT

The time-frame for freshness in publishing is shrinking rapidly. While online publishing was so much faster than print publishing that it disrupted an entire industry, the manual creation of original content is the slow horse in the race online. That doesn’t mean it’s not valuable; it’s the primary source of value for the institutions in question (newspapers), but it’s not necessarily sufficient.
Embellishing the original content of newspapers with local real-time content is reminiscent of the old newswire model of newspapers syndicating AP or Reuters content. Will it save newspapers? A dose of Facebook newsfeed-style delivery of things like new doctor licenses, local restaurant health violations and aggregated blog posts on a newspaper website? That could make a big difference. Time will tell.

NEWSPAPER RETICENCE

Nozzl originally intended to focus on a mobile Web app, or delivering content for newspapers that need mobile apps. The company believes that newspapers and broadcasting organizations in general do not yet have effective mobile implementations; spend a little time using all but a few mobile newspaper efforts, and you’ll see the validity of this argument.

Newspapers were reticent to use Nozzl in that way, though. They wanted widgets for their websites instead. We assumed that was because websites are more effectively monetized using display ads. The widget economy and experience are crowded, though, and Nozzl argues that display ads have peaked and will only decline from here. Nozzl as a stand-alone, highly functioning local-news mobile app strikes us as incredibly compelling; Nozzl as one more widget on a Web page, less so.

Nozzl’s Steve Woodward says it was simpler than that, though. “The real reason [that newspapers were reticent about the mobile app],” he says, “has more to do with comfort level than any direct thoughts about monetization. Mobile is a new technology that most newspapers aren’t yet comfortable with. On the other hand, they feel they understand the Web, and they certainly understand content. So they are able to see value in adding real-time content to a news site, while they have a harder time seeing the same or greater value in mobile.”

So goes the story of innovators who would break free of aging institutions only to establish businesses that are built on adding value to those same institutions.
Web widgets it is, for now at least.

Bringing content to Web pages in real-time may not be a sufficient differentiator for Nozzl indefinitely, though. As Ted Roden of the NY Times R&D Department and says, “Otherwise it’s like looking at a Word doc in a Web browser. [Everything in the future] has to be real-time.”

Woodward says that the type of content Nozzl delivers will be key. “We need to step up our game to bring in more, not fewer, public records,” he says. “That kind of content will be the thing that sets us apart most from future competitors.”

See also:
Steve Woodward on Twitter
Hendrickson on Twitter
Aardvark and the real-time Web of People

Aardvark is a social search engine that combines artificial intelligence, natural-language processing and presence data to create what the company calls “the real-time Web of people.” The end result is “a magical experience,” CEO Max Ventilla says.

The primary contributions that Aardvark makes to understanding the real-time Web include:

• Leveraging presence data;
• Communicating across platforms;
• Emphasizing user experience;
• Harvesting social data from third-party profiles;
• Text analysis on-the-fly;
• Mediating human interactions with machine intelligence; and,
• Filtering the flow for both inquirers and respondents.

You can ask Aardvark any question, and it will try to find a person in your extended social circles who knows about that topic and is available to answer at that moment. Aardvark facilitates these conversations through a very polite IM bot, an iPhone app with push notifications, the company’s website, Twitter or email. Instead of broadcasting your question to everyone’s stream of information, Aardvark delivers the question only to people who are relevant and available.

Founded in 2007 but launched just this year, Aardvark’s got an all-star team of engineers from Google and Yahoo and high-profile investors. It’s already cutting deals with major tech brands, and the use cases are just beginning to be explored. The Web 2.0 Summit had a dedicated Aardvark circle for attendees to answer each other’s questions, and Federated Media will soon roll out a campaign sponsored by Microsoft in which Aardvark will facilitate a Q&A with relevant IT experts around the clock.

The company says that 90% of questions get answered in five minutes or less. During our extensive use of the system and conversations with many other users, we found the answers that were delivered were generally satisfactory or better. The system gets smarter the more you use it.
“When users come in and have a magical experience,” CEO Max Ventilla says, “that’s more important than the info they get back, to know that there are people who would help you immediately. This is
social search as a complement to web search. The billions of pages on the Web are static data; that’s just a fraction of what’s available in people’s heads.”

Aardvark goes so far as to say in a blog post about the real-time Web that, “What really matters is the increased accessibility of people online, not just information online.”

Users are tagged with areas of interest or expertise by the friends who invite them to the system, and then they add additional tags on their own. Further information about what a person knows is gleaned by analyzing the user’s Facebook profile page or Twitter stream.

“Data gets stale, even your profile data,” Ventilla says. “We want to keep that fresh, by taking advantage of all the data that’s passing by. The things you’re posting about [on other social sites] are things you have recent experience with. Being able to converse with someone who just had a learning experience adds a lot of relevance. Social graph and profile data built up over time, the fact that people are making that info available for building value with communication tools – that’s a dramatic shift with the Web.”

In addition to user tags and social network profiles, Aardvark analyzes the text of inquiries to find related users to query, and it keeps track of response times and types. The service notes the vocabulary that people use (including ‘off-color’ conversations), who likes little chats and who engages in extended conversations. It then pairs sets of users with questions and with answers that it believes will be compatible.
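The pairing logic just described — relevance from tags and text, delivery gated by presence — can be sketched roughly as follows. The scoring rule and data model here are illustrative assumptions, not Aardvark’s actual algorithm.

```python
# A rough sketch of question routing: score contacts by topic overlap with the
# question, keep only those currently available, and deliver to the top few.
# The scoring and the field names are assumptions for illustration.
from dataclasses import dataclass
from typing import List, Set

@dataclass
class Contact:
    name: str
    tags: Set[str]       # from invitations, self-tagging and profile analysis
    available: bool      # presence data: is this person reachable right now?

def route(question_topics: Set[str], contacts: List[Contact],
          limit: int = 3) -> List[str]:
    scored = [(len(c.tags & question_topics), c.name)
              for c in contacts if c.available]
    return [name for score, name in sorted(scored, reverse=True)
            if score > 0][:limit]

contacts = [
    Contact("ana", {"cooking", "wine"}, available=True),
    Contact("ben", {"cooking", "baking"}, available=False),  # relevant, offline
    Contact("cal", {"javascript"}, available=True),          # online, off-topic
]
print(route({"cooking", "soup"}, contacts))  # ['ana']
```

The point of the availability gate is exactly what the text describes: the question never reaches everyone’s stream, only the few people who are both relevant and present.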
“This is a serendipity engine,” Ventilla says. “There’s variability in people’s experience, and we have to maximize the chance that something goes beautifully instead of bad. It’s about designing a user experience to keep a conversation on the rails.”

Aardvark scores high on user experience for most of its interfaces, the latest iteration of its website being one possible exception. With this service, the website isn’t that important.

Filtering the flow of information from the real-time Web is a concern that everyone who is touched by these technologies raises. Aardvark says it performs a filtering function by limiting the broadcast of a user’s question to relevant people they are socially connected to.

“[With Aardvark,] you have the ability to have a conversation,” Ventilla says. “This is fundamentally different from other forms of real-time search.”

Conversations are so easy to have on demand with Aardvark that I once instigated and conducted three extended, simultaneous live interviews with topical experts around the world during a tech industry event, all through the Aardvark IM interface.

QUESTIONS THAT AARDVARK HAS ANSWERED WELL IN TESTING

• Is there any good way to serve a butternut squash and a sweet potato in the same meal? I’m thinking maybe I should just do the squash. [I ended up making a great soup.]
• What are some examples of publicly available real-time data still excluded from search after today’s announcements by Bing and Google? [Best answer: commodities prices.]
• What’s a good email address for Mozilla PR? [I should have had this already, and it took one line of explanation, but a Mozilla employee gave me contact info for the head of PR there within minutes.]
• I have 5 minutes to choose: what tech, business, news or art podcast should I load up to take on a walk with my dogs? [Best suggestion: Monocle Weekly.]
• What’s in Arm & Hammer baking soda laundry detergent, and can I spread it on my carpet to vacuum up? [I would have been too embarrassed to ask this in other contexts, but Aardvark subjected just a small number of people to my cry for help.]

QUESTIONS THAT AARDVARK HAS NOT ANSWERED WELL IN TESTING

• What’s a romantic ocean cabin rental near San Diego that I might be able to get near New Year’s? [No answer.]
• What question should I ask the founder of the Blog Talk Radio podcast service in an interview? [A 15-year-old gave me a generic question, and I didn’t resubmit.]
• Where can I get pizza delivered in Northeast Portland after 10pm? [To be fair, this may be an unanswerable question. I can’t believe I bought a house in an area with such bad pizza coverage.]
Mendeley and the real-time Web of Science

Mendeley is a service for organizing scientific research papers and includes social features such as recommendations of research and other scientists you might like. The company says it’s like or iTunes for scientific research and has backers that include co-founders of and Skype. The company offers both Web and desktop software.

The primary contributions that Mendeley makes to understanding the real-time Web include articulating the following:

• Opportunities to transform legacy institutions in qualitative ways by reducing time and harnessing network effects;
• The importance of offering non-real-time, non-social value in order to get individual buy-in; and,
• The value of implicit data.

What’s the real-time element? Whereas scientists traditionally have had to attend events to learn about the hot research topics in their fields and who is doing related research, Mendeley can track reading and citation activity in real-time to provide recommendations and trending data. The company is also considering adding a feature to its Word plugin that captures and tracks citations as they are written.

Bringing real-time, social network effects and recommendation to science? If successful, the consequences could be profound. Effective online recommendations could change work in the lab and the quality of the face-to-face conversations. Real-world interaction now has a whole lot more preliminary context, thanks to the Web in general and services like Mendeley in particular.

Mendeley says it is on pace to become the largest repository of scientific literature on the Web sometime next year. The key to adoption of the software, the company says, has been that Mendeley offers value even when used alone: the metadata extraction and paper organizing are useful enough on their own. There are many different kinds of software for organizing scientific papers, though, and early versions of Mendeley had some trouble processing the content that users inputted. The software is really aimed at social recommendations, and many scientists enjoy it for that.
Librarians interested in discovering which journals are publishing the hottest research articles also use Mendeley; that is information that publishers of high-priced research journals haven’t had an interest in exposing. Mendeley envisions a future when university departments use the service to capture data about the productivity of their researchers, information that could influence hiring and tenure decisions.

“The real benefit of real-time is for those doing the science,” Mendeley’s Research Director Jason Hoyt told us. “The most relevant research to yours could be in a minor journal you might miss. If it’s popular and relevant, this search process will show you that.”

“You find researchers downloading a lot of papers,” Hoyt says. “Many times people will cite bad research; but implicit data – like opening a document several times, sharing it, etc. – that data says that a research document is really relevant.”

Mendeley isn’t the only real-time company that derives a lot of its value from a desktop client and the implicit behavioral data that it provides. Many of the best-known real-time search engines leverage local software that captures implicit data. There is far more implicit data (like clickstreams) in the world than explicit data (like shared links) – it’s just a matter of building support for software that makes it available.

Aren’t scientists famously private with their research in progress, though? “There might be some trade off, even with anonymous aggregate data,” Hoyt told us. “But you have to communicate in science anyway – and you have to give a little to gain a lot. You do have the option to make what you’re reading private in Mendeley, but less than 5% of articles and citations are hidden from complete view.”

That’s a reasonable account, but some reviewers have said that Mendeley’s disposition towards sharing creates a flow that encourages users to either share publicly or not use the service at all.
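Hoyt’s point about implicit signals can be made concrete with a toy scoring function. The weights below are illustrative assumptions, not Mendeley’s model; the idea is only that repeated opens and shares can outweigh a raw citation count.

```python
# Toy relevance score weighting implicit signals (opens, shares) above the
# explicit signal (citations). The weights are assumptions for illustration.
def relevance(opens: int, shares: int, citations: int) -> float:
    return 2.0 * opens + 3.0 * shares + 1.0 * citations

papers = {
    "minor-journal gem": relevance(opens=9, shares=4, citations=2),   # heavily read
    "often-cited dud":   relevance(opens=1, shares=0, citations=15),  # cited, unread
}
ranked = sorted(papers, key=papers.get, reverse=True)
print(ranked[0])  # the heavily read paper outranks the merely cited one
```

Under any weighting along these lines, a paper that people keep opening and sharing surfaces ahead of one that is cited often but rarely read — which is exactly the “bad research gets cited” problem Hoyt describes.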
(Private group sharing isn’t yet supported, for example.) Time will tell how well Mendeley can move a market that’s already crowded with other research organization tools that are far less social. Hoyt says the company is still learning what to do with all the data it captures, but there are a lot of possibilities.
“ If we have a subset of research on a topic right now, we can then predict where the research is going to take us in future. We can predict how research topics are going to morph. Then you can know where to apply research funds or remove funds. People could start modeling their careers based on the data they are seeing.”

One of the next steps on a technical level, Hoyt says, will be for Mendeley to learn how to extract sets of data from papers and offer scientists recommendations of data that are similar to what they are working with.

This is disruptive work that Mendeley is doing.

See also:
Jason Hoyt on Twitter
Hoyt’s social graph on Twitter includes:
• William Gunn, scientist,
• Daniel Mietchen, scientist, biophysics,
Black Tonic Re-Imagines the real-time Web as a Controlled Experience

Black Tonic is unlike any other company covered in this report. The Black Tonic product is a presentation tool for designers to give controlled, remote presentations of proposed design work to clients. The Black Tonic experience is not public. It’s not collaborative. It’s not a lot of things we associate with the most visible examples of real-time technology. It’s actually very controlled.

Black Tonic is a download-free, HTML- and JavaScript-only browser-synchronization and browser-sharing application with unlimited viewership and support for broadcasting to mobile browsers. Still pre-launch, the company says it plans to “offer prices and plans that scale from independent designers to large agencies.” The company calls this type of browser-synchronizing technology DOMCasting.

It’s an interesting, relatively simple, model. A common problem for designers working for remote clients is that work tends to be sent in PDF or PowerPoint formats, via email. The client then clicks through the presentation at their own pace, with no explanation from the designer, well before the two parties have a phone conversation to go through it together. Designers don’t like this very much. “It frustrates the necessary process and work flow when reviewing work,” Black Tonic co-founder David Price says.

Black Tonic offers a way for designers to control in real-time what is displayed in the viewer’s browser, through nothing but a Web link, and with as many remote viewers on Web or mobile browsers as they choose to share the link with. Presentations – complete with explanations, concepts and story – can then be given at the designer’s pace.

Black Tonic argues that on real-time social networks such as Twitter and Facebook, the emphasis is on empowering individuals, and there’s no structure to the relationships between people.
A spectrum of options is available on the real-time Web, though, ranging from technologies that reinforce and empower the perspective of the individual to those that force an individual to view content from a different perspective or a larger structured context.
“If you’re doing a remote client presentation, how do you prevent the client from having a subjective experience of the work?” Black Tonic co-founder Phillippe Blanc asks. “First, force them to view the work from a perspective guided by the designer. Once they understand the work and the context, you can have a collaborative, constructive discussion about the work.”

“Conversation is the new content. And true conversation only happens when people share time and space,” Blanc’s co-founder David Price says. “The designer’s inability to storyboard is a failure of the process.”

Historically, the two argue, when people find the limits of a technology, they develop workarounds. Then, when more powerful technology becomes available, people often fail to reconsider the workarounds and so change the process.

The Black Tonic team believes that lightweight real-time technology is an opportunity to reconsider remote presentations, to add some structure to them and add the necessary control over presentation that they haven’t had with the workaround of emailing PDFs.

A whole lot of options arise when a new computing paradigm emerges. Real-time doesn’t have to only mean delivering a chaotic or filtered stream of social information to an individual at the center of the system. Black Tonic is a good example of looking outside the standard application of a new technology and instead taking advantage of the opportunity to reconsider standard practices that have been influenced by technological limitations that no longer exist.
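The DOMCasting idea — one presenter driving many synchronized viewers — reduces to a publish-subscribe broadcast. This sketch models only the control flow; the names are hypothetical, and the real product does this with HTML and JavaScript in the browser, with no download required.

```python
# A toy model of presenter-controlled browser synchronization: the presenter
# publishes navigation events and every subscribed viewer replays them, so all
# clients show the same thing at the designer's pace.
class Presenter:
    def __init__(self):
        self.viewers = []

    def subscribe(self, viewer):
        self.viewers.append(viewer)

    def show(self, slide: str):
        for viewer in self.viewers:      # broadcast: no viewer can skip ahead
            viewer.current_slide = slide

class Viewer:
    def __init__(self):
        self.current_slide = None

presenter = Presenter()
client_a, client_b = Viewer(), Viewer()
presenter.subscribe(client_a)
presenter.subscribe(client_b)
presenter.show("concept-2")              # designer advances the presentation
print(client_a.current_slide, client_b.current_slide)  # concept-2 concept-2
```

The inversion is the point: unlike a mailed PDF, the viewer holds no independent navigation state — every client’s view is a function of the presenter’s last event.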
At the Red Cross, the real-time Web Saves Lives

The real-time Web isn’t just changing our lives online; it’s starting to make a big difference offline as well. Disaster relief efforts at the American Red Cross have been transformed by real-time technology. Walmart may be world famous for its powerful inventory-control system, but some people say the Red Cross is becoming another leading example of a highly effective, large-scale organization co-ordinating activities around the world in real-time.

The primary contributions that Michael Spencer’s discussion of the Red Cross makes to our understanding of the real-time Web include articulating the following:

• The real-world consequences of real-time technology;
• Transforming a legacy institution using real-time technology;
• Strategic reliance on third-party software in a real-time context; and,
• The importance of planning, relative to technology implementation.

Michael Spencer, lead for SharePoint technology at the American Red Cross National Headquarters, puts it like this:

“ The Red Cross has been around for over 100 years. I’ve been here for 12 years, and with what I’ve seen over the last year in terms of real-time information, co-ordination and our dashboard overseeing everything, I think we’ve made 50 years worth of advancement in a year or two because of real-time technologies. At the Red Cross, the real-time Web saves lives.”
The national Red Cross disaster response center responds to about 350 disasters every year, whenever a local chapter is beyond its capacity. When hurricanes strike, the organization has days to plan; with earthquakes or aviation disasters, it has no time at all to plan.

Spencer says:

“ It used to take two days to inventory our available volunteers. Now that can be done in one or two hours. We used to call them, send them emails, try to process all of these incoming emails. It was a struggle to get people on the ground. Now I can see exactly who is available, trim the list down by region, by language, by specialty skills. That’s all at my fingertips instantly.”

“We now put videos and photographs in an online disaster news room, where victims can also go for shelter locations. We’re feeding information into SharePoint and then posting that to All that info feeds into a public shelter database; as soon as one opens or closes, the information is available to the public. It’s a way for the media to see what they can publish on the radio and TV. This is critical info. With shelters, once one is filled to capacity, people need to be sent to a different shelter.

“We also have something called ‘Safe and Well.’ We can now register people through our website and then publish this information, so that anyone looking for info on family can search for people’s names, addresses or phone numbers. Displaced people can leave a message there – we can reassure so many people that their loved ones are safe.”

The Red Cross makes sure to keep latency and downtime on that “Safe and Well” site as low as possible.

One thing the organization has to do when responding to disasters is to verify the claims of home loss that people file. That used to take a long time, but no longer, Spencer says.

“ In this last year, we’ve sent volunteers out with PDAs. We used to go around with a car and sheet of paper to verify damage.
Now we have handhelds that let you take a picture of a house – it has GPS in it – upload it to a satellite, and then we can do real-time monitoring from a dashboard.

“That dashboard view of houses damaged? That would have taken weeks before. Now we can do it right away. The government can also do fly-overs that feed rough estimates of damage from a plane into our portal, so we can get an overview within a few hours, and then our volunteers go out with
devices. That used to take me a week and a half or two weeks, even longer. I could never get a fly-over by the government or get my volunteers in. Now it’s fed automatically to my dashboard. I don’t have to call people and report our new numbers. We even used to do shelter numbers by hand for meal ordering. Now it’s all done through the Web.”

From volunteer and shelter co-ordination to the “Safe and Well” program to sometimes millions of dollars in donations collected online in a single day, the Red Cross is heavily dependent on its Web presence. The organization uses a service called AlertSite to monitor its uptime. AlertSite runs continuous automatic tests of website functionality and sends the Red Cross real-time alerts and diagnostics whenever there’s a problem.

“We were having critical problems with SharePoint going out for 5 to 24 minutes,” Spencer says. “We can’t withstand that. AlertSite now pages all the engineers with diagnostics, and we respond immediately, sometimes just from our BlackBerrys.”

Despite those problems, Spencer remains a big advocate of SharePoint.

“ We’ve seen the evolution of SharePoint over time. The biggest problem with SharePoint 2007 is when you fail to put a good governance plan in place. Your work should be 80% planning, 20% implementation. It tends to be just the opposite. People tend not to plan it out well and don’t have a good idea of what SharePoint could do. We’re only leveraging about 15%, maybe 20%, of its capabilities. We had to spin up a call center for Katrina, for example: we needed to track calls, see who’s following up, etc. I was able to create a solution in SharePoint in one day, and they are still using the same system three years later. It’s all about training users how to use it, empowering them to take it off IT’s shoulders.”

Another third-party service that the Red Cross uses heavily? Breaking News Online (BNO), the international newswire on Twitter and the iPhone. BNO is an amazing story.
The service was founded two years ago by a 17-year-old from Switzerland and is now run by a plucky little crew of online journalists around the world. It’s the fastest way to get breaking news from around the world, around the clock. Rafat Ali of the UK Guardian’s paidContent wrote last month that BNO is eating the mainstream media’s lunch and that someone really ought to try to buy the organization.

Apparently, BNO is so on top of things that even the Red Cross watches it closely. Spencer says that a lot of people at Red Cross headquarters are subscribed to BNO. He told us the story of an eight-hour work session on simultaneous disasters that the team finished late one recent night, only to receive push notifications from BNO as soon as they closed their laptops, breaking news that another disaster had struck.
Key Players
John Borthwick: Thoughtful prince of the real-time Web

(Photo: Brian Solis, Creative Commons Attribution)

John Borthwick is a complicated, thoughtful man. Business Week called him “perhaps the real-time Web’s key articulator.” He has already built, bought and invested in more high-profile real-time Web technologies than probably anyone else in the consumer Web world. He’s hardly an unqualified cheerleader for the real-time Web, though. Borthwick is unafraid to consider different sides of a situation or to change his mind.

In 1997, John Borthwick built and sold to AOL the content publishing company behind the site Total New York. The New York Times focused on the irony of the deal in its coverage: Borthwick had publicly called for independent content producers to stay independent just a month earlier. While at AOL, he testified in the US government’s case against Microsoft – but now he says he thinks the position he took was wrong. These days he argues instead that innovation will outpace monopoly in technology and that regulation isn’t the solution.

Borthwick saw AOL fall from grace, but he kept in touch with many of the smartest people he met there, and he has ties to several of their startup companies today. That circle of people includes Gerry Campbell of real-time search engine Collecta and the Summize crew, which both Borthwick and Campbell invested in before it was acquired to become Twitter’s in-house search engine.

Borthwick points to the rise of YouTube as proof that an entirely new kind of search can emerge fast. YouTube is now the second-most popular place for people to perform searches online, after Google. This summer he wrote, “I now see search as fragmenting and Twitter search doing to Google what broadband did to AOL.”

These days, Borthwick is the CEO of Betaworks, the best-known investment group on the real-time Web.
After Summize went to Twitter,, a link-sharing and analytics tool built by Betaworks and invested in by a constellation of Silicon Valley superstars, became the default URL shortener for
Other Betaworks investments include the most popular Twitter client (TweetDeck), Howard Lindzon’s Twitter experience for stock traders (Stocktwits), the new database of gadget reviews (Gdgt) from Engadget and Gizmodo founders Ryan Block and Peter Rojas, the humor site Someecards, hyper-local news aggregator, lightweight customer support service UserVoice, content curation platform Tumblr and 13 others. Betaworks itself bought Twitterfeed, the service that every organization from CNN to the White House uses to pump RSS feeds into Twitter and now into Facebook.

For all this real-timeness, Borthwick watches out vigilantly for his own ability to think and communicate in long form:

“ I write about one long blog post per quarter. I don’t show them to anyone. I’m long-winded and verbose. I try to make it intentionally long form because there’s a lot of things we’re touching on right now. I write about history. A lot of the tools we’re using today are washing away history. There’s a bunch of really profound implications of that. I try to do long form things periodically because you can get so fragmented in our world that you never dig into the long-term issues that we’re contributing to but not talking about.”

This leader of the real-time Web, one of the main men behind the biggest little link shortener on earth, is worried about the consequences of rapid-fire short-form communication? Thank goodness. Thoughtful consideration is very reassuring and too rare. Here’s Borthwick on why he does what he does:

“ John Barlow said there was no Prana or life source energy in an Internet interaction, but could there be some sense of life and of energy that gets transmitted? Part of what’s happening in the real-time Web is the synchronicity that takes place in a real-time conversation.
There’s not time to package and prepare the meaning around the meaning of what you’re discussing; the liveness of the event yields an order of magnitude different interaction, and that interaction is more human. The Web is becoming a more human place. We’re humanizing the machine a bit. I think that’s a good thing. I have three kids, and I see the way they interact with machines, and this is something I strive toward. There’s a moral imperative in this – but I don’t want to imply that for anyone else.”

Borthwick is a believer in the data portability vision; he believes that identity will be separate from services in the future and that people will pick and choose between best-of-breed service options.

“In the early days, there was a sense that people were going to build portal sites,” Borthwick says. “Then people thought that social networks would provide a new way to navigate.” Now he sees search as a primary form of navigation, a way to track conversations, not pages.
“ We believe things are becoming more connected. In the future, everything will consume APIs and publish APIs. People on the business side would say over the last 5 or 10 years, ‘That’s not a company. That’s a service’ [i.e. services with APIs at both ends]. I would say to them, ‘If it’s just a product, then what is the whole it should be a part of?’ They’d say Yahoo should buy it, but in most cases they squander it. I sold a company to AOL and went through the squandering of my company, then did that to other companies. If the next generation is cohesive parts, the whole they belong to is the Internet.

“[Betaworks investment] Gdgt is a database. They are aggregating user-generated content around a structured data set. That’s central to what we think about at Betaworks. We view it as data structuring – that fits into our worldview of what’s important. They aren’t a gadget blog or a media company. They understand that many of those contributions won’t happen on their website, that the boundaries of their site need to be permeable. They are all involved in social real-time. They are also to a greater extent sharing open data.

“In the real-time stream, a core reason why we jumped in with TweetDeck (we wanted to buy the company) was because Iain was articulating the data in a column format. The Web is striving for new representations of data types. We’re supplementing the page-based metaphor with the stream-based metaphor. When you screw with metaphors, you destabilize things. All the clients before TweetDeck used the heritage metaphor of instant messaging.

“The metaphors people choose are so powerful for how people both publish and subscribe. I think we’re just scratching the surface of this stuff. The lock-in that we’ve had around pages has held us back in terms of innovation and how to use this medium.
When we got here [to the Web] there was nothing, and we flopped a 500-year-old metaphor of pages, a browser that by its name says you will browse, not touch, this content. But it was not meant to be a one-way experience. We’re only a fragment of the way into this journey.”
ARE WE GOING TO GET BRAIN IMPLANTS?

I made casual mention of brain implants and what a bad idea I think they are in a recent conversation with Borthwick, and he had something to say about the matter.

“ The brain implant is implicitly happening. I spend seven hours a day looking at and tied to the screen. We’ve extended ourselves into this network already; we’ve accepted it de facto. A good piece of the revolution for me is to humanize it more. There’s a large degree of computing and Web work that has occurred in the last twenty years that’s dehumanizing. The transition from portal to search to social distribution – part of that trajectory is that it’s becoming more human. But we are also placing ourselves into the network and into the machine. The day we wake up and realize that the network has ‘become self’ will be too late – we will have extended ourselves into the network.

“Once upon a time, people thought eyeglasses were technology. In that Umberto Eco book ‘The Name of the Rose’, a character made eyeglasses. People thought he was modifying sight. You read this and it’s quaint. We embrace them as an extension of self, but we don’t think of eyeglasses as technology. We’ve become comfortable with the technological mediation of what we see. It’s an example of how human beings are capable of extending sense of self and embedding technology into our sense of self.

“Filtering is already endemic to the stream. To some extent, everybody is curating the inputs into their stream, but sharing the curation tools is not available today or is very, very crude. Using other people’s brains to filter and help curate that data stream in a dynamic fashion is implicit to where all this is going. The data structuring stuff is important because we’ve got to find ways beyond search to find things. But as one of the engineers on Summize said, a computer science professor wouldn’t consider this search, because the axis on which we measure is time, not relevancy.
To me, it’s much more of a filtering metaphor. What we found with Summize was that people left multiple tabs open to run concurrent searches. All of the old PubSub Wyman stuff was coming back to the fore. Human filters, understanding how we can share, how we can do data structuring, using search and navigation for discovering relevant info is where this is going.
“ I feel like we’ve got this concurrent stream of how we can plug into what other people are doing, thinking, feeling and experiencing. We can bring greater humanness to that, make the world more connected and more understanding because we can understand other people’s context. That’s what you’re feeding in. A lot of that is what I’m working on, what I wish for and think is fascinating.

“That said, I have a lot of respect for the sole contributor. My brother is an artist and has no interest in other people’s ideas. Many of the greatest works have been created that way. There’s a tension there that’s very interesting.”

See also: John Borthwick’s blog posts and other information.

John Borthwick’s circle on Twitter includes:
• Andrew Weissman, Betaworks
• Bijan Sabet, VC at Spark Capital
• Terry Jones, CEO at Fluidinfo
• Nathan Folkman, Engineer at Foursquare, former Systems Architect at Betaworks
• Mary Hodder, serial entrepreneur
Chris Messina: Rebel with a proposed technical standard

Just 10 years ago, Chris Messina was a suburban teenager in New Hampshire who lost his faith in authority, stopped doing all his homework and tried to hold his high school’s website hostage after he was suspended for running an ad on it for a proposed gay/straight alliance student group.

Photo of Messina from Wikipedia, taken by Tara Hunt.

Since then, he’s enjoyed some impressive accomplishments. He designed the two-page ad that ran in the New York Times announcing the launch of Firefox (10,000 people donated $30 each to buy that ad, and it featured all their names); he co-founded a network of public events (Barcamp, http://barcamp.org) in more than 350 cities; he serves on the Boards of the OpenID Foundation and the influential new Open Web Foundation; and he is now one of the most closely watched players in the world of online social networking. He’ll turn 29 years old in January.

Now working as an independent consultant, Messina is one of the leading people behind a technical format for syndicating user activity data from one service to another in a human-readable way, called Activity Streams. Facebook, MySpace and Windows Live have already begun producing user data in the Activity Streams format. Twitter does not yet.
WHAT IS THE ACTIVITY STREAMS FORMAT?

Everybody talks about filtering the real-time stream of information online, but the Activity Streams community is where conversations take place between leading engineers at the world’s biggest and smallest social networks, with the goal of replacing the “walled garden” model of social networking with an open, inter-operable communication marketplace. If Activity Streams succeeds, you will be able to subscribe to and filter the activities of your friends across multiple different networks, without having to sign up for or even know about those other networks.

This is almost the equivalent of AT&T phones being able to make calls to Verizon phones, or of rail-transport companies being able to ship goods across the country over different railroad networks – because the rails for the trains are the same size. It’s different, though, because of the granular filtering by type of activity. Applications built on top of Activity Streams will allow for the equivalent of a phone that accepts phone calls only about certain subjects from certain people... because, of course, we’re now receiving a lot more inbound communication than we did in the telephone era.

“ The real-time river of news makes information available to you as it is created,” Messina told us, “but you need a way to consume it that respects your time, enhances the content or makes it easier to consume. The Activity Streams format aims to allow people to receive a stream in a way that they can manage.”

Activity Streams is an extension of the Atom feed format, and the spec explains it like this: “An activity is a description of an action that was performed (the verb) at some instant in time by some actor (the subject), usually on some social object (the object). An activity feed is a feed of such activities.” In the current draft spec, you can perform such actions as Post, Share, Save, Mark as Favorite, Play, Start Following, Make Friend, Join and Tag Object.
An Object could be an Article, Blog Entry, Note, File, Photo, Photo Album, Playlist, Video, Audio, Bookmark, Person, Group, Place or Comment. These actions can have such contexts as Location, Mood and Annotation. Stream aggregator Cliqset publishes Activity Streams feeds that don’t require API authentication to view.

The aim of Activity Streams is to have multiple social networks use a common language and have a common understanding of what all those things mean, so that messages can be read across different networking sites. Messina explains that both publishing and subscription technologies need to become more sophisticated in reading and writing streams of data in order for this vision to become a reality.
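The actor/verb/object structure described above can be made concrete with a small sketch. The snippet below is illustrative only, not taken from any live feed: the entry content is invented, and the namespace, verb and object-type URIs follow the Activity Streams 1.0 Atom draft as we understand it, which was still evolving at the time of writing.

```python
# A sketch of reading one Activity Streams entry with Python's standard
# library. Namespace and verb/object-type URIs follow the Activity Streams
# 1.0 Atom draft; the entry itself is an invented example.
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
ACTIVITY = "http://activitystrea.ms/spec/1.0/"

SAMPLE_ENTRY = """<?xml version="1.0" encoding="UTF-8"?>
<entry xmlns="http://www.w3.org/2005/Atom"
       xmlns:activity="http://activitystrea.ms/spec/1.0/">
  <title>Geraldine posted a photo</title>
  <author><name>Geraldine</name></author>
  <activity:verb>http://activitystrea.ms/schema/1.0/post</activity:verb>
  <activity:object>
    <activity:object-type>http://activitystrea.ms/schema/1.0/photo</activity:object-type>
    <title>My Cat</title>
  </activity:object>
</entry>
"""

def parse_activity(xml_text):
    """Extract the actor, verb and object type from one activity entry."""
    entry = ET.fromstring(xml_text)
    actor = entry.findtext(f"{{{ATOM}}}author/{{{ATOM}}}name")
    # Verbs and object types are identified by URI; keep only the last
    # path segment ("post", "photo") for readability.
    verb = entry.findtext(f"{{{ACTIVITY}}}verb").rsplit("/", 1)[-1]
    obj_type = entry.findtext(
        f"{{{ACTIVITY}}}object/{{{ACTIVITY}}}object-type"
    ).rsplit("/", 1)[-1]
    return actor, verb, obj_type

actor, verb, obj_type = parse_activity(SAMPLE_ENTRY)
print(actor, verb, obj_type)  # Geraldine post photo
```

The point of the design is visible even in this toy: because the verb and object type are machine-readable URIs rather than free text, any consuming application can filter on “photos posted by people I follow” without knowing anything about the site that published the entry.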
Messina says:

“ The real-time Web is a shift towards something more like how humans interact with the world: the information just flows right in. When it comes to thinking about Activity Streams, how can we add a few more semantic hints to the original info coming to our [subscription] agents? And then how do we filter what’s relevant? Here’s an analogy. Dogs have 300 million receptors in their noses, so they can parse smells really well. We only have 6 million receptors in our noses. Imagine if we went from having 6 million to 300 million receptors that we could use to filter information. We haven’t developed those sensors yet in order to create more possibilities.”

Standardized, semantic clues from feed publishers and the ability to read them in whatever application we use to read updates are the kinds of receptors that Messina is helping to design and implement.

THE WEB OF PEOPLE

“ The thing non-geeks can understand and bring to this is their identity,” Messina says. “We’re getting back to the individual as the primary actor in the system. They can hook up systems to their identity providers and do things.

“Facebook is one of the first services to orient itself in this direction; it is providing some good R&D into where this is going, and it is doing good work in this kind of direction. You log in to your Facebook account, and everything flows to you. Right now, that’s the best metaphor that we have.

“I think Facebook is going to play a very important role. I think it has a desire to align itself with the Web, just as Google does.

“Video games provide a great experience about what real-time on the Web would be like. Gaming has to be real-time to be enjoyable. Right now, most of the Web uses interfaces from the document-centric era of the Web that don’t scale or translate to the real-time Web.

“For example, we want to have longer conversations, but email is one of the big linchpins that’s broken. Outlook is so entrenched.
It’s clear that these conversation systems are broken.

“But the ‘river of news’ doesn’t have handles that regular people can grasp.
The number of old people who make Facebook wall posts and think they’re private is enormous! But there are a lot of benefits to this real-time Web, like being able to reply immediately to a photo. My mom would like iPhone push notifications of pictures of me or my girlfriend. How do we lead with a carrot to get people to shift away from email and into a real-time model?”

When Messina was 13 years old, he traveled to Greece and Italy and was shocked to find out that people in some European cultures left work in the middle of the day to have lunch with their families and take a nap. “The fact that a whole culture could exist and be so different from mine broke all my assumptions,” he says. That realization gave him a great sense of hope.

Now, as an adult, the tagline on his blog reads, “All of this can be made better. Ready? Begin.” He’s been working to make the world better ever since, and now he has a whole lot of traction. Watch his work for an important window onto the future of the Internet.

See also:
• Chris Messina’s blog
• Messina on Twitter
• His Flickr collection of notable user interfaces

To understand Messina and his work, pay attention to:
• David Recordon at Facebook
• Scott Kveton, Urban Airship
• Will Norris, independent software developer
• Joseph Smarr and John McCrea, Plaxo/Comcast
Brett Slatkin, Brad Fitzpatrick and PubSubHubbub

Brett Slatkin has long been an idealist. “If I made a great product, and Microsoft offered me a lot of money, I would spit in their faces,” he told Newsweek while a brash freshman at Columbia University in 2002. He joined Google after completing a computer science degree in 2005. Last year, Slatkin sprang into public view with the launch of Google App Engine, a product that lets developers run their Web applications on Google’s infrastructure.

Slatkin works on App Engine as his day job, but for his 20% time project he has led the creation of an important new real-time syndication format called PubSubHubbub. Slatkin calls it Hubbub for short.

HOW HUBBUB WORKS

The PubSubHubbub model has three parties. There’s a Publisher (FeedBurner, for example) and a Subscriber (perhaps Netvibes), and communication is facilitated through a Hub (Google’s AppSpot Hub was the demo and is the most popular Hub so far). The publisher knows that every time new content is published, it will notify the hub; the hub that gets notified is declared at the top of the publisher’s document, just like an RSS feed URL. So, the publisher delivers new content to the hub, and then the hub delivers that message immediately to all the subscribers who have subscribed to receive updates from that particular publisher.

This is very different from the traditional model, in which a subscriber polls a publisher directly every 5 to 30 minutes (or less) to see if there’s new content. There usually isn’t new content, and so that model is inefficient and slow. Hubbub is nearly immediate and only takes action when something important occurs. Protocol co-creator Brad Fitzpatrick says that the current system of websites polling each other for updates is like a kid in the back seat of a car saying “Are we there yet?” over and over again. Hubbub says, “Shut up, kid.
I’ll tell you when we get there.” That’s how Fitzpatrick explains it.

It’s remarkably simple, at the end points in particular. If things ever get complicated, it would be in the hub, and that’s easily available as a service if a publisher doesn’t want to host their own. The hub does things like authenticate subscribers, check in with feeds that haven’t pinged it lately, deliver a single update from a publisher to multiple subscribers and act as a publisher itself for other hubs to subscribe
to. Neither publishers nor subscribers have to worry about the hub’s details, though, unless they are looking for things like subscriber analytics.

Real-time PubSubHubbub feeds are already being published by FeedBurner, Blogger, LiveJournal, LiveDoor, Google Alerts and the feed republishing service Superfeedr. Facebook’s FriendFeed, LazyFeed and the newest version of Netvibes are consuming Hubbub feeds so far, as are a number of small sites and services that are using the feeds for machine-to-machine communication.

Slatkin is the public face of the protocol, but he created it with Google’s Brad Fitzpatrick. Fitzpatrick, now 29 years old, grew up in Oregon and built the popular social-networking service LiveJournal while he was in high school in 1999. One year later, he hired Martin Atkins, then a high-schooler in the UK and now a SixApart engineer and a leader in the online identity community. (Atkins also had a big hand in formalizing Hubbub.) In 2003, LiveJournal grew fast and hired a number of additional engineers, including then high-school senior David Recordon, now Senior Open Programs Manager at Facebook. Also in 2003, Fitzpatrick’s company developed Memcached, an open-source memory caching system that’s used today by Twitter, Digg, YouTube, Craigslist, Wikipedia, WordPress, Flickr and more. In 2005, Fitzpatrick sold LiveJournal to SixApart. Later that year, he created the first OpenID authentication protocol for LiveJournal. In other words, he’s been a whirlwind of technical innovation for the last 10 years.

Fitzpatrick is now at Google working on what could become the infrastructure for distributed, independent and inter-operable social networks, PubSubHubbub among them. Fitzpatrick explained:

“ Real-time stuff is one dependency around federated social networking. No one would suggest a chat function that’s based on polling, for example. You can’t compete with walled gardens that have real-time internally if you don’t.
One of the obstacles has always been real-time: engaged conversation, news feed, etc. So in order to solve social networking we need to implement PubSubHubbub and WebFinger [a profile-syncing technology that Fitzpatrick is now working on].

“Things are about to get interesting. I don’t need another social networking site – we need competition, we need the basic crap that all these sites do [posting, commenting, sharing, etc.] to be federated and all working together.”

So Atom-based Activity Streams may be the language in which functions such as posting, commenting and sharing are expressed; and then PubSubHubbub may be the method of delivering the Atom feeds of updates in real-time.
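The publisher–hub–subscriber flow described above can be sketched in a few lines. The snippet below shows the form-encoded POST bodies each party sends, with parameter names per the early Hubbub draft; the hub, feed and callback URLs are hypothetical placeholders, and a real deployment would POST these bodies over HTTP rather than just building them.

```python
# A sketch of the PubSubHubbub interactions as form-encoded POST bodies.
# Parameter names follow the early Hubbub draft; every URL below is a
# made-up placeholder, not a real endpoint.
from urllib.parse import urlencode

HUB = "https://hub.example.com/"

# 1. The publisher advertises its hub in the feed itself, next to the
#    usual self link; subscribers discover the hub from here.
FEED_LINKS = (
    '<link rel="hub" href="https://hub.example.com/"/>\n'
    '<link rel="self" href="https://blog.example.com/feed"/>'
)

def subscribe_body(topic, callback):
    """Body a subscriber POSTs to the hub to start receiving pushes."""
    return urlencode({
        "hub.mode": "subscribe",
        "hub.topic": topic,        # the feed being subscribed to
        "hub.callback": callback,  # URL the hub will POST new entries to
        "hub.verify": "async",     # hub confirms subscriber intent later
    })

def publish_ping_body(topic):
    """Body a publisher POSTs to the hub when new content appears.
    The hub then fetches the feed once and fans it out to subscribers."""
    return urlencode({"hub.mode": "publish", "hub.url": topic})

print(subscribe_body("https://blog.example.com/feed",
                     "https://reader.example.com/callback"))
print(publish_ping_body("https://blog.example.com/feed"))
```

Note where the work lands: the publisher sends one tiny ping per update and the subscriber does nothing until content arrives at its callback; all the fan-out cost sits in the hub, which is exactly why Slatkin describes the end points as remarkably simple.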
The use cases are essential to consider, but Slatkin thinks of this work mostly as creating better building blocks that can then be used for anything. He emphasizes that engineers need to be building now to scale for the unforeseeable use cases of the future.

“Real-time implementers need to think about consistent [application] workloads,” he told us. “That’s the only way they can scale.”

“ To sip from the fire hose you need to only get what you care about. If you have to cut anything out, then you’ll drown. People say ‘RSS and Atom are good enough!’ I don’t think people know where we’re going to be in 10 years. Right now our back ends can handle the load – but if we only cared about today, then we’d just stay home. The whole point of technology is to make new things. When people think about the real-time Web, they need to think about new use cases that no one has considered because they seemed technically unfeasible. If you told someone 10 years ago that you could have 15 people concurrently editing a document – that was crazy!”

Slatkin emphasizes that we can’t know what the ultimate killer apps for push will be, but he rattled off to us a short list of ways in which he could imagine them being put to use:

“ Push as compliance with SEC for filing financial reports. Real-time monitoring of the performance of cloud services using Hubbub. Sensor networks: tiny sensors everywhere with little bits of data, sonar modules or IR pings. Put a thousand of those in a field and get a 3-D picture of what’s going on. So far, that’s been done with binary, proprietary, one-off protocols, hard to use. Open, real-time Web data could enable vast numbers of people to consume that sensor data. It could be used on battlefields, football fields or as road data.”

Fitzpatrick thinks Hubbub could even replace Google’s crawls of the Web. “All content should be real-time and subscribable,” he says. “You could replace crawling with this, every page on the Web.
You could probably get most pages pretty soon, but one could imagine modifying Apache to support this by default.”

Former Googler Paul Buchheit (Googler #23, in fact), now at Facebook after selling FriendFeed to the company earlier this year, zooms into the smallest details. “The next step is for people to open more of their current activities and plans,” he wrote in a recent blog post.
“ This is often referred to as ‘real-time’, but since real-time is also a technical term, we often focus too much on the technical aspect of it. The ‘real-time’ that matters is the human part -- what I’m doing and thinking right now, and my ability to communicate that to the world, right now... When this activity reaches critical mass, it should be very interesting for society. It dramatically alters the time and growth coefficients in group formation. It enables a much higher degree of serendipity and ad hoc socializing.

“The basic pattern of openness is that better access to information and better systems lead to better decisions and better living. This general principle is broadly accepted, but we’re just now discovering that it also applies to the minutiae of our lives.”

See also our May article about Buchheit, “The Man Who Made Gmail Says Real-Time Conversation is What’s Next.”

So, matters large and small will be shared on the Internet; they’ll be marked up in standard formats, and they’ll be pushed in real-time to anyone or any application that wants them. Then, we’ll analyze and learn from them individually and in aggregate.

See also:
• Brett Slatkin streams his activities online
• Brad Fitzpatrick posts frequently to Twitter
• Google’s DeWitt Clinton is good to follow as well for related topics
Steve Gillmor: The real-time Web’s leading journalist

Steve Gillmor is a long-time technology journalist who now specializes in covering emerging real-time technologies in consumer and enterprise markets. That’s the tamest way you could describe Gillmor. He has described his work to us like so: “I am an anarchist trying to be as subjective as possible about what I think is important.”

Photo © Copyright Laughing Squid, used with permission.

Gillmor’s primary contributions to our understanding of the real-time Web include:
• Convening in-depth conversations with major players building the real-time Web;
• Reporting on the spread of Twitter-like messaging in the enterprise; and,
• Articulating a user-centric model of filtering real-time streams based on explicit and implicit attention data.

Put those two descriptions together and you get a prolific stream of agenda-driven, perhaps strategically semi-coherent, multimedia interviews with many of the most high-profile people who are advancing the bleeding edge of the Web. Gillmor has had every major real-time player highlighted in this report on his show at some time. If you want a deep and broad understanding of this class of technologies, you should pay attention to Gillmor’s work, as frustrating as it can be at times. He wears his heart on his sleeve, but he is the leading journalist covering the real-time Web.
Recent interviews by Steve Gillmor:
• Bob Muglia, President of Microsoft’s Server and Tools Business, on Silverlight and real-time
• Steve Mills, SVP IBM Software, on real-time
• Phil Windley (Internet Identity Workshop), Chris Messina, and Craig Burton (networking consultant)
• Bret Taylor, Director of Products at Facebook
• Ray Ozzie (Microsoft), Sergey Brin (Google), Marc Benioff (Salesforce)

STEVE GILLMOR’S WRITING CAN BE HARD TO READ

Reading Gillmor can be frustrating because he often abandons the formal, journalistic writing that he employed in covering the enterprise for publications like Information Week and InfoWorld. He was Editor in Chief at Enterprise Development Magazine and at XML Magazine. He wrote the first blog at eWeek. And he spent a long time at ZDNet.

Now Gillmor has carved out a place for himself at TechCrunchIT, the small, ostensibly enterprise-focused channel on the Web’s leading tech blog. There, he indulges in long, metaphor-filled flights of beat-poetic tech analysis at every opportunity. Those posts can be hard to read. The metaphors are many and a little hard to follow, the agenda just a bit unclear. It’s not very accessible writing – but it’s often funny, something very few tech bloggers can claim. If you can follow the inside baseball, you’ll be several steps ahead in comprehension. Read it once and you won’t want to read it again. Read it twice or very carefully and you’ll want to spend more time thinking about it. That’s just the analytical pieces, of which there are many. Gillmor’s news writing, if no longer what he’s best known for, remains as well written and readable as ever.

Not everyone is a Gillmor fan, and Steve himself has argued that he doesn’t belong on a short list of key players in this field. We respectfully disagree.
STEVE GILLMOR’S VIDEO JOURNALISM IS ESSENTIAL VIEWING

If you’ve got an iPhone, keeping up with these videos is easy.
1. First, make sure you’ve got a YouTube account.
2. Then fire up the YouTube app on your phone, and search for “stevegillmor” (two l’s, no e).
3. Click on the blue arrow to the right of one of his videos, then click on the blue arrow again on that page.
4. Click to the “More videos” tab, where you’ll find a button that reads “Subscribe to videos from stevegillmor.” Click on that and you’re set.

This is the best way to consume Steve Gillmor’s work. Stick the phone in your pocket to listen to the videos like an audio podcast.

The best journalism Gillmor does these days is on video. Once or twice a week, he posts either an hour-long-plus episode of his long-running show The Gillmor Gang or a short interview with someone from the tech industry. Most of his guests are engineers-turned-executives at major companies. Gillmor has great access to interesting people and dives deep in the conversations.

The Gillmor Gang is not for the faint of heart. It tends to be a group of five or more industry-leading people talking about their work in fair detail. Gillmor sometimes tries to guide his guests to making the discourse accessible to viewers who are new to the topics, but most guests seem to recognize that few people new to the field are likely to watch an hour-long round-table video podcast about the relatively arcane plumbing of the real-time social Web. It’s not that it’s terribly technical, just rather detailed and forward-looking.

As a fellow journalist, I listen to each episode of the Gillmor Gang and wish I had the time to excerpt 15 or 20 key news items or explanations of important topics and explain them (more slowly) to my readers in text.
Some day I may do that, but for now I just listen to the show and recommend that others interested in these matters do, too.

To listen to Gillmor’s video content, you’ll need to be comfortable with his tendency to push a strong, if not always clear, agenda.

STEVE GILLMOR’S AGENDA

Steve Gillmor was President of a non-profit organization called the AttentionTrust, which launched in July 2005. The AttentionTrust was all about supporting the rights of users to control their own attention data. Attention data can be roughly understood as our history of activities online. It includes but goes beyond our browsing histories. It’s made up of “gestures.” The concept of gestures is less obtuse than it might sound. In one unsympathetic translation of some of Gillmor’s writing, programming guru Joel Spolsky called it “an esoteric astronaut architecture.”

Gestures are the ways in which we interact with content (reading, sharing, saving, commenting on, subscribing to, unsubscribing to or consciously ignoring it), and Gillmor believes these communicate intent. That intent can be harnessed to build filters that deliver personalized streams of content, which
will be most useful given our shortage of time and over-abundance of options. These ideas aren’t of Gillmor’s invention, but he’s one of the primary articulators of them today.

The AttentionTrust was a short-lived group that aimed to gather both political and technical support for users to capture their attention data. Another leader of the group, Seth Goldstein, now CEO of a social-media advertising company, was then working on a futures market for users to sell that data. That was and remains just one of many possible uses of attention data. All of the above carried on against a backdrop of the rise of blogging and RSS, two technologies that changed the world dramatically.

Here’s how that agenda has been manifested in this most recent era online, the era of Twitter. In September 2007, Twitter added a feature called “Track,” which allowed users to subscribe to a site-wide search for any keyword and have the results delivered by instant messaging. It was a valuable feature but expensive for Twitter to maintain, and so it lasted less than a year until it was removed. A year and a half later, Gillmor still talks about it all the time. His focus on real-time “started with the withdrawal of Track in May of 2008,” he told us. “Once Twitter started to collapse inward, then there was a bunch of people who tried to work around API rate limits.”

By “collapsing inward,” Gillmor refers to the now widely held belief among developers that Twitter the corporation slowly became a disengaged, uncommunicative picker of favorites more than an open, engaged development platform as had been hoped. (Some contend that the company is just small, young and overwhelmed with attention.) In July of that year Twitter bought Summize, a Betaworks-funded sentiment analysis service founded by a team of ex-AOL scientists.
Summize, which turned into Twitter’s search tool, was explicitly identified in Twitter’s announcement of the acquisition as the next step to bring Track back to the platform. The XMPP instant-messaging feed never returned. That pain point and the related wave of innovation around parsing the Twitter stream were the primary topics of discussion at a Gillmor-organized event in September 2008 called Bearhug Camp.

Gillmor says:

“ When Summize was acquired, then people were realizing there was something to this track thing. Then FriendFeed manifested itself as semi-real-time with an open doorway; you could talk to the founders and engineers and it wouldn’t just fall into a black hole like with Twitter. FriendFeed had certain characteristics that, if amplified, could be really important: real-time for one, not just for users but also for developers. FriendFeed ignited the fuse. I think they were surprised and disrupted, as we all were, by the fact that their work led to the acquisition by Facebook.”

Facebook acquired innovative cross-network social aggregator FriendFeed, built by ex-Googlers Paul Buchheit and Bret Taylor, for an estimated $50 million in August 2009. Development on FriendFeed has all but stopped, and the FriendFeed team members have new jobs at Facebook, where they are being very cautious about subjecting Facebook’s mainstream users to the power features that people loved
about FriendFeed. This according to interviews by Gillmor with Bret Taylor. Gillmor is pressing hard to get details and to advance his agenda in support of user-controlled granular filtering in his public interviews with the FriendFeed-turned-Facebook team. If you’re interested in understanding where the largest social network in the history of the Web (Facebook) is going, the conversations that Gillmor gets himself into are a good place to start.

“Once people are used to a relatively fast reaction time [as with Track and several features of FriendFeed], then things will move into full-blown stream readers. Like the death of RSS, that’s what this filtering is about [i.e. speed and responsiveness]. I think we’re at the doorway, but there’s some noise and smoke along with some fire.”

Gillmor contends that RSS is dead because it is based on an old, slow architecture of polling for updates (instead of a real-time push of updates) and because RSS readers are cumbersome compared to newer frictionless tools like Twitter and online video.

“ This goes back to attention and gestures. Although it was opaque to a lot of the people we were trying to explain their own business models to [back in the Attention Trust days], it is all about gestures. How people demonstrate their interests and willingness to consume streams is the center of this next great evolution of the network: what is now called the social graph intersecting with the stream.

“I expect that there will continue to be a misunderstanding of this, fueled by the economics. Investors aren’t particularly interested in a utility that doesn’t come with a payout. They are interested in a big cloud or strategies that mine it for economic gain. VCs typically take that attention-gestures model and go for explicit data. There was a company at our last event that showed a screen about authority and highlighted Pete Cashmore [of Mashable].
If your algorithm says that Mashable is the biggest authority on something, you need to fix the algorithm. As Scoble said, number of followers doesn’t translate into authority.

“It’s about understanding who the user respects in terms of gestures. The notion of the social graph intersecting with the stream is where the big payoff is going to come. Right now they [the streams] are in aggregate not very efficient. But this isn’t about algorithms. The human brain is much more efficient, and modeling the human brain will hopefully take a very long time. Being able to mine the social graph has value not just in harnessing
the user's thinking but in discovering the people who the user thinks are important, and the people who those people follow. That's the sweet spot. That's what the work on gestures was about."

That's the perspective of Steve Gillmor when he's interviewing many of the leading players who are building the consumer and enterprise real-time Web.

See also:
• Steve Gillmor on Twitter
• Steve Gillmor's shared links

Steve Gillmor's circle on Twitter includes:
• Karoli Kuns, political blogger
• Cliff Gerrish, Web designer in financial services
• Andrew Keen, skeptical tech writer
• Hugh MacLeod, tech cartoonist
• Loren Feldman, videographer
Another 15 Important People to Follow to Understand the Real-Time Web

Throughout this report, we discuss a substantial number of individuals and their companies, but the following are some people we want to mention lest we forget them. These are all people to be aware of if you're tracking the development of the real-time Web.

PAUL BUCHHEIT Follow him on Twitter @paultoo

Paul Buchheit was the co-founder of FriendFeed, which was acquired this year by Facebook. As Google employee #23, he developed the first prototypes of both Gmail and AdSense. We wrote at length in May about his belief that real-time conversations are the next big thing. Buchheit says he's still most active on FriendFeed because of Facebook's closed nature, but he believes that Facebook is moving in a good direction. Presumably he's going to influence that direction now. Buchheit coined the Google motto "Don't be evil" nine years ago; hopefully he can help make sure that Facebook is a force for good in the emerging real-time Web. You can read his blog posts, which tend to circle for a while before providing a lot of insight, on his blog.

LOÏC LE MEUR Follow him on Twitter @loic

Loïc Le Meur is a French entrepreneur now living in San Francisco. He has already sold three companies: one to an advertising company, one to France Telecom and another to leading blog software provider SixApart. Now he's the founder of Seesmic, which is stream-reading software for the Web, desktop and soon mobile. Seesmic began as a short-video conversation app but, after acquiring one of the leading desktop Twitter clients, became a service for reading social network updates from Twitter, Facebook and elsewhere. Seesmic is probably far behind competitor TweetDeck in market share but is very innovative. Le Meur is also the organizer of the Le Web conference in Paris. In its sixth year, the conference's theme is the real-time Web. Le Meur is important to watch because he's super-connected, a good communicator and a risk-taker with a history of success.
Unlike most of Silicon Valley, he's been focused on real-time for years. You can find Le Meur on Twitter @loic. His circle on Twitter includes blogger Robert Scoble (@scobleizer), educator and author Howard Rheingold (@hrheingold), TechCrunch writer Paul Carr (@paulcarr), technologist Kevin Marks (@kevinmarks) and Heiko Hebig, a former SixApart co-worker and now media development co-ordinator at German firm Hubert Burda Media (@heiko).
DAVE WINER Follow him on Twitter @davewiner

Dave Winer is a serial innovator of Web technologies, particularly syndication-related ones. One of the first bloggers and a key innovator in podcasting, Winer is responsible for moving RSS forward and for creating OPML (the outline format in which collections of RSS feeds are moved between feed readers), among other things. Winer is now working on a feed reader called River2 and a real-time syndication format called RSSCloud. RSSCloud is similar to PubSubHubbub but was created in 2001, is based on RSS instead of Atom and is different in a number of other ways. Winer is a controversial figure but is very important to pay attention to, as he has a long history of digging into what turns out to be the future of the Web. In addition to his blog, Winer is on Twitter @davewiner, and his circle there includes NYU journalism professor Jay Rosen (@jayrosen_nyu), political blogger and consultant Karoli Kuns (@karoli), blogger Robert Scoble (@scobleizer), analyst Michael Gartenberg (@gartenberg) and angel investor Francine Hardaway (@hardaway).

KEVIN MARKS Follow him on Twitter @kevinmarks

Kevin Marks is VP of Web Services at British Telecom (BT). Previously at Google working on social network interoperability and at Technorati working on blog search, Marks is one of the social Web's most outspoken and intelligent critics of shallow thinking. He is involved in several matters that the rest of us consider part of the "real-time Web," from Google Wave to activity streams, but doesn't like the term at all. In an insightful blog post this summer, Marks drew the following conclusion: "Much of the supposed 'real-time' Web is enabled by the relaxation of real-time constraints in favour of the 'eventually consistent' model of data propagation.
Google Wave, for example, enables simultaneous editing by relaxing the 'one person can edit at a time' rule in favour of reconciling simultaneous edits smoothly." In addition to his blog, Marks can be seen a lot on Twitter (@kevinmarks). Marks' circle on Twitter includes entrepreneur Mary Hodder (@maryhodder), developer Gabe Wachob (@gwachob), BT chief scientist JP Rangaswami (@jobsworth), microformats innovator Tantek Çelik (@t) and social media consultant Suw Charman-Anderson (@suw).

MONICA KELLER Follow her on Twitter @ciberch

Monica Keller is an engineer at MySpace and one of the most important contributors to the Activity Streams specification. She's all about the semantic Web, and as interoperability is built between real-time publishing platforms, Keller will be a key voice behind the scenes. She can be found on Twitter @ciberch, but she's best connected on MySpace.

RON CONWAY

Ron Conway is one of the most prolific investors in Silicon Valley and announced in August that he is now focused almost entirely on companies in the real-time market. In August, he outlined a list of 10
ways in which real-time can be monetized. Conway is not publicly active on any social networks, but his public appearances offer insight into what kinds of real-time strategies will be backed by a budget and introductions to potential high-profile partners.

MICHAEL ARRINGTON Follow him on Twitter @arrington

Michael Arrington is the most controversial tech blogger on the Web but also its most effective. His annual summertime events are now called Real-Time Crunchups and focus primarily on the business side of Silicon Valley's love affair with real-time. You can see all the TechCrunch blog posts about real-time companies via the site's search page. Arrington is good to watch because he covers Silicon Valley's most high-profile companies. He has a knack for finding companies on a high-profile trajectory and makes lesser-known companies high-profile through his coverage.

PETER THIEL

Peter Thiel is the billionaire co-founder of PayPal and was one of the original backers of Facebook. Facebook is the primary real-time interface for almost 400 million people around the world, and Thiel's influence is not something to take lightly. He's a big believer in "The Singularity," a theory that machines will someday become smarter than humans. He also reportedly funded the group that did the high-profile video sting of ACORN. He told Business Insider in a November interview:

"[An artificially intelligent computer] could be very good, it could be very bad, it could be somewhere in between. Certainly we would hope that it would be friendly to human beings. At the same time, I don't think you'd want to be known as one of the human beings that is against computers and makes a living being against computers.
So probably at the margins it would be prudent not to make a name for yourself as an anti-technological human being just as these computers are coming onto the scene."

That perspective from one of the men behind the real-time flow of activity data from hundreds of millions of people around the world seems relevant to the future of the real-time Web.

BIJAN SABET Follow him on Twitter @bijan

Bijan Sabet is an unusually tech-savvy venture capitalist who sits on the Twitter board of directors. All of Sabet's online activities can be found via his personal site. Sabet's circle on Twitter
includes VCs Fred Wilson (@fredwilson), Brad Feld (@bfeld), Charlie O'Donnell (@ceonyc) and Moshe Koyfman (@mokoyfman), and entrepreneurs Nabeel Hyatt (@nabeel) and Scott Rafer (@rafer). That's an unusually "roll up your sleeves" circle among Twitter leadership. It implies that Sabet is an influence in this company, which will be paying particular attention to what other innovative companies are doing.

ROBERT SCOBLE Follow him on Twitter @Scobleizer

Robert Scoble is a power user of social media tools and a prolific publisher of content that analyzes cutting-edge Web tech. His primary focus is figuring out how such tools will help him with his own publishing, but he also enjoys other geeky things just for their geekiness. From live video broadcasting to data extraction from social networks to curation of content, Scoble has been engaged with the real-time Web longer than most people. Initially famous for his work as the public face of Microsoft in the blogosphere, he is now the public face of Web hosting company Rackspace. He has strong industry contacts and is better described as a blogger and experimenter than as a marketer. He stirs up important conversations and strikes an unusual balance between being high-profile and thoughtful.

STEVE RUBEL Follow him on Twitter @steverubel

Steve Rubel is Director of Insights for Edelman Digital, a very large PR firm. Rubel is a thinker and a doer. He built up a popular blog full of power-user hacks and then left it to publish a lifestream instead. That site is watched closely by people throughout PR and marketing for insights into cutting-edge use of social media. Rubel has a clear interest in real-time and warrants attention as a harbinger of power-user practices of the future.
He can be found on Twitter @steverubel, and his circle there includes Edelman's Rick Murray (@rickmurray), Houston Chronicle journalist Dwight Silverman (@dsilverman), developer Paul Mooney (@moon), marketer Michael Wiley (@wiley) and marketing analyst Jeremiah Owyang (@jowyang).

JEFF PULVER Follow him on Twitter @jeffpulver

Jeff Pulver is co-founder of VoIP telephony service Vonage, a professional events organizer and an Internet legal policy advocate. He's now organizing events on the cultural and commercial implications of the real-time Web in general, and Twitter specifically, called the #140Conf. These events, held in LA, New York and London to date, emphasize marketers and celebrities. Pulver is valuable to watch because he's been effective in previous technology communities and is interested in the human experience. Combine that with a celebrity communication platform, and interesting things are bound to happen. Pulver can be followed on Twitter @jeffpulver.
MARK KRYNSKY Follow him on Twitter @krynsky

Mark Krynsky is the Director of Web Production for the XPRIZE Foundation by day and, by night, the author of a blog devoted to lifestreaming. Krynsky provides in-depth coverage of the newest software for independent publishing of online activity data. The site defines a lifestream as "a chronological aggregated view of your life activities both online and offline. It is only limited by the content and sources that you use to define it." Whether these services become popular beyond the niche of creatives who use them today or continue to influence the way large social networks display user activity data, they are important. As a big part of the user experience of the real-time Web (one of the most important factors in its viability), lifestreaming software is where you'll find many illuminating experiments, and Krynsky is the leading voice chronicling and analyzing this small but significant niche. He can be joined on Twitter at @krynsky. Krynsky's circle on Twitter includes ReadWriteWeb writer Sarah Perez (@sarahintampa), social media consultant Corvida Raven (@corvida), Rails developer Dan Ahern (@danahern), Warner Bros. Records new media community director Mike Fabio (@revrev) and LA entrepreneur Mike Prasad (@mikeprasad).

ANIL DASH Follow him on Twitter @anildash

Anil Dash is a long-time blogger who until recently was at blog software company SixApart. This fall, Dash announced that he is now the director of a new incubator for emerging technology to serve government. Dash has been in the middle of tech innovation for years and articulated the state of the real-time Web at length this summer in a piece titled "The Pushbutton Web: Realtime Becomes Real." In addition to his blog, Dash can be found on Twitter @anildash.
Dash's circle on Twitter includes Lifehacker founding editor and Google Wave fan Gina Trapani (@ginatrapani), Matt Haughey of MetaFilter fame (@MattHaughey), New York DJ Jay Smooth (@jsmooth995), self-described "Narcissistic Dilettante" Rex Sorgatz (@fimoculous) and designer Mike Monteiro (@mike_ftw). Dash has ridiculously hip friends but is good to keep an eye on as he makes the inaccessibly hip accessible.

THE GOOGLE WAVE TEAM Follow them on Twitter: Lars @larsras, Jens @jensrasmussen and Stephanie @twephanie

The Google Wave team is Lars Rasmussen, Jens Rasmussen and Stephanie Hannon. Wave is a real-time document collaboration product that we haven't discussed elsewhere in this report. Google has posted a short interview with Wave's engineers, and the official hour-long demo video that was released prior to launch has been cut down to a highlight reel by Lifehacker's Gina Trapani. Trapani has also authored an e-book called "The Complete Guide to Google Wave." We suggest checking out those resources to learn more about this innovative collaboration tool, but know that Wave isn't a technology to replace every other technology you use (as some pre-launch hype suggested).
Sector Overviews
Stream Readers: Interfaces for the Real-Time Flow

Stream readers. What better name could be applied to the crowded market of emerging tools that let you view updates from your friends and the feeds flowing in from across social networks and blogs? Hundreds of millions of people now consume Facebook News Feeds, a model that is reminiscent of instant messaging but incorporates asynchronous updates from outside the network as well as from inside. Outside of Facebook, more innovative startups than we can keep track of are building interfaces to read and write to personalized streams of data.

Interface is the key word, because user experience and design are the fundamental contributions that these services are making to the evolution of the Web. Design and features add value to the streams, from visualization to customer relationship management. From art to business, these different applications serve a wide variety of niches. No one has solved all of the problems that the stream poses, but Facebook may be doing the best job overall.

That said, many of these smaller services offer single features or design elements that Facebook does not offer – but that would be on our list of features for the "ultimate stream reader."

In addition to user experience and design, other issues are being tackled in the stream reader market, such as data feed standardization (see the section of this report on the Activity Streams protocol, for example). Cliqset is the only stream reader focused substantially on activity streams, but a number of parties are publishing standardized activity feeds (among them Facebook, MySpace and Netflix).

Monetization is another issue that this market is just beginning to address. Some products are aimed at the enterprise market (PeopleBrowsr, for example), and others will end up there.

Netscape founder Marc Andreessen is backing a stealth project called RockMelt that will likely take the form of a stream reader – although it's being described as a Facebook browser for now.
(We reported first on that project, and the company is looking to hire a hotshot Mac developer as we speak.)
That's just one sign that this is a field of great opportunity. Here's a selection of some of the best features across 12 of the most interesting stream-reading services on the market today. Put them all together and perhaps you'll have the ultimate stream reader after all. Check them out individually and you're sure to find some hints about where the user experience of the real-time stream is headed. You may want to use some of these services, you may want to know about some of these features, and you may benefit from applying some of the thinking behind this software in other contexts.

THE ULTIMATE STREAM READER MIGHT INCLUDE...

PROFILE DISCOVERY AND MERGING

Nomee auto-discovers accounts that your contacts on one network have across more than 100 other services around the social Web, then merges those profiles together per person. It's an incredibly compelling experience: like one of the best features of FriendFeed taken to a much higher level. With the addition of this feature, Nomee went from being a ghost town to being a useful tool buzzing with content from people you care about that you might never have seen otherwise.
A BEAUTIFUL UI

Skimmer is a beautiful AIR app for viewing updates from across Twitter, Facebook, Flickr, YouTube and blogs. It is hands down the prettiest stream reader in its class, a model of simple aesthetic elegance.

THIRD-PARTY SERVICE INTEGRATION

Seesmic started out as a video conversation service; it acquired a desktop Twitter client, and it is now a full-featured cross-network stream reader. The Seesmic Web interface is particularly beautiful in both form and function, but its new integration with the Mr. Tweet API to perform social graph analysis and display each user's closest contacts on their profile pages is pure genius. Twitter may be most useful for listening, though it's most often talked about as a broadcast technology. Programmatic determination
of who those you listen to are paying attention to is a great listening-oriented value-add on top of the stream.

Threadsy does something similar but is more ambitious and ultimately disappointing. This "universal inbox" service puts your social network replies into the same inbox as your email, which doesn't make sense if the signal-to-noise ratio in email is much worse than it is in social network replies (and it probably is). One useful thing it does do is display information from other networks whenever you look at the profile of someone who has sent you a message. Unfortunately, until the WebFinger protocol (see the section in this report about Brett Slatkin and Brad Fitzpatrick) becomes a reality, this will be very hard to do well.
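WebFinger was still being specified when this report was written; in the form the protocol eventually settled on, discovering a person's linked profiles starts with a plain HTTPS lookup against their email-style identifier on their home domain, which returns a JSON document pointing at their other accounts. A minimal sketch of building that lookup URL (the account name is invented):

```python
from urllib.parse import urlencode

def webfinger_url(account: str) -> str:
    """Build the well-known WebFinger lookup URL for a user@host identifier.

    The JSON document served there links the same person's profiles across
    services -- exactly what a "universal inbox" needs in order to show a
    sender's other accounts.
    """
    user, _, host = account.partition("@")
    if not (user and host):
        raise ValueError("expected an identifier of the form user@host")
    query = urlencode({"resource": f"acct:{account}"})
    return f"https://{host}/.well-known/webfinger?{query}"
```

The point of the design is that the host in the identifier is the only thing a client needs to know in advance; everything else is discoverable.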
SMART PROCESSING OF SHARED ITEMS

Sobees is a Silverlight-based Twitter client that includes awesome integration with an API from Factery Labs. Factery follows links shared on Twitter and pulls out key sentences from the shared article. Sobees gives users the option to view those sentences underneath the links in their streams. The result: a great way to get a feel for what the linked article says. Performance is a little hit-and-miss, but when it works, it's great. It's a fabulous example of a stream reader reaching into the stream and adding value to the experience based on the content being shared.

CIRCLES OF SOCIAL PROXIMITY

Orsiso organizes your contacts by circles of social proximity, including automatic suggestions based on your past interactions. You can even, for example, prioritize a contact's photo albums higher than their Twitter messages.
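Factery's actual extraction technology isn't public, but the general idea of pulling key sentences can be approximated with something as crude as scoring each sentence by the article's own word frequencies. A toy sketch, not Factery's method:

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "is", "it", "of", "and", "to", "in", "that", "on"}

def key_sentences(text: str, n: int = 2) -> list[str]:
    """Rank sentences by how many of the article's frequent words they contain."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    # Build a frequency table of content words across the whole article.
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
    freq = Counter(words)

    def score(sentence: str) -> int:
        return sum(freq[w] for w in re.findall(r"[a-z']+", sentence.lower()))

    return sorted(sentences, key=score, reverse=True)[:n]
```

A production system would add much more (titles, position, named entities, spam heuristics), but even this naive version tends to surface the sentence that states the article's main subject.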
EASY SHARING

Streamy is a gorgeous stream reader that lets you share items with people or external networks. You do this by clicking and dragging a link until a menu wheel of options and contacts appears. Drop the link into the corresponding sphere, and a sharing box pops up. The vision of a flowing stream that you can part with your hand, to curate content for one friend or network at a time, is a reality at Streamy.

PUBLISH AND SUBSCRIBE TO STANDARDIZED ACTIVITY STREAMS

Cliqset publishes its streams in the Activity Streams format. Because it's an aggregator, it transforms feeds from other services into this standardized format. That makes it a super-good citizen of the open Web. When stream readers are interoperable, they will have to compete on features, not lock-in. RSS is the standard that makes importing activity feeds possible, but by normalizing the namespaces of different activity types, the Activity Streams protocol will allow for much more sophisticated filtering across services.
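What "normalizing the namespaces of different activity types" buys you can be shown in a few lines. This sketch (with invented input field names, not Cliqset's real mappings) converts two services' raw items into the shared actor/verb/object shape that Activity Streams defines, after which filtering across both services becomes trivial:

```python
def normalize(item: dict, source: str) -> dict:
    """Map a service-specific feed item onto a shared actor/verb/object shape.
    The input field names are invented stand-ins for each service's raw feed."""
    if source == "photo_service":
        return {"actor": item["owner"], "verb": "post",
                "object": {"type": "photo", "url": item["link"]}}
    if source == "microblog":
        return {"actor": item["user"], "verb": "post",
                "object": {"type": "note", "content": item["text"]}}
    raise ValueError(f"no mapping for source: {source}")

def of_type(activities: list, object_type: str) -> list:
    """Once object types share one namespace, one filter works everywhere."""
    return [a for a in activities if a["object"]["type"] == object_type]
```

A reader built this way can offer a "photos only" view spanning every connected network without writing per-service filter code.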
REAL-TIME CONVERSATION

Twingly Channels pulls in feeds and uses long polling to show when people are typing comments about an item in a feed with multiple subscribers. It uses those comments, as well as other gestures, to create a "popular items" view that sits beside every raw aggregated feed from multiple sources.
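Long polling gets near-real-time behavior out of plain HTTP: the server simply withholds its response until there is news (or a timeout passes), and the client re-requests the moment an answer lands. A sketch of the client loop, with a simulated server call standing in for the blocking HTTP request:

```python
def long_poll(fetch, cursor=0, rounds=3):
    """Client loop for long polling.

    `fetch(cursor)` stands in for an HTTP request that the server holds open
    until something newer than `cursor` exists; it returns (new_cursor, events).
    As soon as a response arrives, we re-request from the updated cursor.
    """
    received = []
    for _ in range(rounds):
        cursor, events = fetch(cursor)  # would block server-side until data arrives
        received.extend(events)
    return received

def fake_fetch(cursor):
    """Simulated server: in production this response would be delayed
    until a subscriber actually typed a new comment."""
    return cursor + 1, [f"comment-{cursor}"]
```

The cursor is what keeps the client from missing events between requests; each response tells the client where to resume.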
COLLABORATIVE CURATION

A new entrant in this space is like a reverse Tumblr or Posterous, for OPML files. You curate collections of feeds, read them through a beautiful interface and subscribe to other people's collections. The service includes recommendations based on any collection you view. Brand new, it could use a lot more development, but it's already a great way to build a collection of reading material curated by people interested in particular topics. It is also one of an increasing number of stream reader services we're seeing that don't offer a "river of news" view; rather, it forces you to look at collections one at a time. Lazyfeed does it that way as well. A compiled list of the most recent updates across all subscriptions (i.e. the river of news) is most useful for serious work-related feed reading, but this forced channel-clicking model preserves context and is a fun way to do discovery.
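An OPML collection is nothing exotic: an XML outline whose entries point at feed URLs, which is exactly what lets one person's curated reading list be imported and subscribed to by someone else's reader. A sketch of writing one with Python's standard library (feed names and URLs invented):

```python
from xml.etree import ElementTree as ET

def opml_collection(title: str, feeds: list) -> str:
    """Serialize a curated list of (name, feed_url) pairs as an OPML document."""
    root = ET.Element("opml", version="2.0")
    head = ET.SubElement(root, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(root, "body")
    for name, url in feeds:
        # type="rss" and xmlUrl are the conventional OPML attributes
        # that tell a reader where to fetch each feed.
        ET.SubElement(body, "outline", text=name, type="rss", xmlUrl=url)
    return ET.tostring(root, encoding="unicode")
```

Because the format is this small, "subscribe to someone's collection" reduces to fetching one file and handing each `xmlUrl` to your feed fetcher.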
A LITTLE CRM ACTION

PeopleBrowsr offers several levels of service, up to a Business account or custom development for the enterprise. It includes auto-replies, manual sentiment tracking, notes, response assignments, email alerts for high-priority messages based on your criteria and more. It's limited to Twitter, though.
ARTIFICIAL INTELLIGENCE?

TweetDeck consumes Twitter, Facebook and MySpace feeds and is the most robust-feeling client available after its recent updates. Its column-based display was the primary differentiator in its early days, and its mobile group sync is only going to get better once Twitter List support is added. But the next steps the service takes will be towards a "lightweight AI," according to an interview we did with founder Iain Dodsworth this fall. "At its most basic," Dodsworth said, "if TweetDeck could predict what the user was probably about to require next, based on current activity, then it could start to collate that data in the background – cross Twitter, Facebook, LinkedIn data, for example."
BLOW MY MIND

Favit does a whole lot in one stream-reading client – so much that the user might take a little while to get comfortable with the tool. It recommends Twitter and Favit users to follow. It does major and minor filter creation: minor filters are for keywords over feeds that you're subscribed to, and a major filter is just a big button at the top of your screen that you press to turn a global filter on or off. It recommends filtered collections of blogs assembled by other users with similar interests. It's amazing.

Put all of this together and what do you get? Something that doesn't exist yet and may never exist. Some very exciting work is being done in the stream reader market, though.
Real-time Search: Challenges Old and New

In the spring of 2006, the story goes, Google launched Google Finance onto the Web and was promptly dismayed to find that the service didn't appear in a Google search for its own name later that day. It was after that and a few other similar experiences that Google engineers created an algorithm called QDF, or Query Deserves Freshness. QDF determines when results for a query need to be augmented with the newest content available, in addition to the content with the highest PageRank.

Three years later, Google executives still talk publicly about the need for search to become more real-time. In that time, Twitter and Facebook have captured the public's attention. A long list of startup companies has emerged trying to solve the real-time search problem. Most are presumed to be over-hyped Twitter search engines, but the real-time Web is much larger than Twitter, and so too are the ambitions of most of these real-time search engines.

GOOGLE'S NOT TOO SHABBY

Meanwhile, Google itself is already doing a better job with real-time search than most people in the real-time startup community would like to admit. Between the very responsive Google Suggest and Google OneBox insertion of news results into search results, Google is often more timely than people give it credit for. Want to see pages Google has indexed in the last two minutes? You can just change the search results URL and get exactly that. Google Labs has begun to roll out a product called Social Search; Google will highlight the search results it finds on pages owned by your friends on social networks. Soon Google will unveil the first version of its

Larry Page: "I have always thought we needed to index the Web every second to allow real-time search. At first, my team laughed and did not believe me. With Twitter, now they know they have to do it.
Not everybody needs sub-second indexing, but people are getting pretty excited about real-time."
partnership with Twitter; hopefully it will be more exciting than what Bing and Yahoo have done.

A plane crashes on the Hudson River and the first reporting comes from a tweet? A Google search for "Plane on Hudson" may not highlight Twitter, but that tweet is at the very top of the page when you search for "Plane Hudson Twitter."

People say that ranking is a challenge with real-time signals, and they use the plane-on-the-Hudson story as an example of how hard it is. Finding the high-profile tweet in question in real time is much more difficult than it is for Google to find it months later. As Bing's Antonio Gulli said in a recent interview, it's a problem to find content that is both popular and fresh. (Gulli, by the way, says that Twitter's high volume of messages is in fact a substantial challenge.)

In the meantime, you can get naked Twitter search results put at the top of your Google search results page now with several Greasemonkey scripts. In many cases, using these scripts makes the value of real-time search immediately evident.

Finally, someday websites could use the Google-developed real-time protocol PubSubHubbub to notify Google of new content for indexing immediately after publication. That's part of the vision of protocol co-creator and serial world-changer Brad Fitzpatrick.

THE SEARCH MARKET CAN BE CHANGED

Think real-time search is just a flashy trend? That standard Web search is the only form of search with lasting power? See our profile elsewhere in this report of real-time Web investor and former AOL exec John Borthwick. From that section:

Borthwick points to the rise of YouTube as proof that an entirely new kind of search can emerge fast. YouTube is now the second-most popular place for people to perform searches online, after Google. This summer he wrote, "I now see search as fragmenting and Twitter search doing to Google what broadband did to AOL."

Borthwick's emphasis on Twitter may be misplaced, though.
Real-time search engineers estimate that there are between 500 and 1,000 messages posted to Twitter per second and 5 to 10 million links shared per day, before de-duplication.

That might sound like a lot, but there's a huge real-time Web that's much larger than that. Consider Facebook, MySpace, blogs, browser click-streams, traditional websites, voting/bookmarking/sharing services, IM-like user presence data, photos, videos and machine-generated data. The list of data types flooding the Web and potentially available for real-time analysis is much larger than Twitter. That means real-time search is much larger than Twitter, too.
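De-duplication matters at that volume because the same story arrives as dozens of superficially different URLs. A naive sketch of collapsing them by canonicalizing each link first (the normalization rules here are illustrative, not any engine's actual ones):

```python
from urllib.parse import urlsplit, urlunsplit

def canonical(url: str) -> str:
    """Normalize a shared link so trivially different copies collapse:
    lowercase the host, drop the fragment, trailing slash and common
    tracking parameters."""
    parts = urlsplit(url)
    query = "&".join(p for p in parts.query.split("&")
                     if p and not p.startswith("utm_"))
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(),
                       parts.path.rstrip("/") or "/", query, ""))

def dedupe(urls):
    """Keep the first sighting of each canonically-distinct link."""
    seen, unique = set(), []
    for url in urls:
        key = canonical(url)
        if key not in seen:
            seen.add(key)
            unique.append(url)
    return unique
```

Real engines go further (redirect resolution, shortened-URL expansion, content fingerprinting), but even this cheap pass cuts the working set substantially.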
IS REAL-TIME SEARCH TOO BIG FOR ONE WEBSITE?

In fact, real-time search is so big that some companies believe it's too big to tackle alone. At least three well-known real-time search engines are using downloaded software on their users' computers to power distributed indexing of the real-time Web: Wowd, Faroo and OneRiot. Wowd and Faroo are P2P-powered, in fact. Wowd co-founder Borislav Agapiev writes at length about how he believes distributed search through client software is the most effective and efficient way to scale extensive indexing of the real-time Web. In a world of Web applications, asking users to download software might seem crazy – but sought-after investors Draper Fisher Jurvetson, KPG Ventures and the Stanford University Engineering Venture Fund don't think so. They invested in Wowd.

Server farm, lightweight Web app – or something in between, like client-side software with a whole lot of engineering stitching it together: there are a number of different strategies being applied to powering real-time search.

THEN THERE ARE SPAM AND SECRETS

Spam in real-time streams is no small matter. Searching Twitter turns up a very large amount of it. Collecta's Gerry Campbell says his company pushes live to its site only about 10% of the content it discovers – and it has relationships with publishing platforms as its primary source. Spam and relevance are related issues, and relevance must be balanced with timeliness. OneRiot explains how explicit and implicit data work together in real time for relevance and spam control.

"On any service where you're sharing info, you're going to share the cool stuff," Tobias Peggs of OneRiot said to us. "I'm not going to tweet about my secret love of ABBA. But that ABBA content is actually very useful. The explicit data we all publish is filtered to sound cool. The panel data OneRiot captures [through opt-in clickstream tracking by its browser toolbar] is unfiltered data about what people are actually doing.
If you peel away a little bit more, then you've got spam control – there's so much spam on Twitter, there are spam rings retweeting Viagra ads. If you're able to cross-reference that with toolbar data and other social services, that really helps spam detection."

LIMITING YOUR SCOPE

Some people believe that a good real-time search engine needs to capture the whole flowing stream and keep it forever. Others disagree. Gerry Campbell is the CEO of Collecta and a search veteran. Collecta develops relationships with established publishing platforms like WordPress, Reuters, Flickr and Qik. That data isn't held for long, though, before Collecta dumps it.

"At AltaVista we were indexing documents that updated infrequently. Now the way the Web works, and it's only going to accelerate, is through short utterances. The temporal nature is that you don't need to keep a tweet for
5 years. Our data proves that when you get out past 120 days the timely value of something is zilch; if they want to go beyond 120 days, people won’t come to a real-time search engine. I don’t want to replace Google; it’s an adjunct to existing experiences.

“There’s lots of opportunity here. I think there’s between a half billion and $6 billion per year potentially in real-time search. 20% of our queries are monetizable. I’ve built a big business on less.”

SEARCH VS. DISCOVERY

Because of the role that time plays in what’s called real-time search, some people don’t consider this search at all. Respected SEO consultant Bruce Clay expressed exasperation in a recent episode of the WebMaster Radio podcast about real-time search – “it’s not even search!” Listening to Search Engine Optimization specialists offer their advice for dealing with real-time search, usually mistaken for Twitter search alone, is interesting. Publish a lot and listen for sales opportunities seem to be the most common suggestions.

For users, the flowing stream of real-time data is something best experienced with immersion, rather than through a refined search. Some people say that real-time search is best suited for tracking conversations, not looking for destinations.

Dan Olsen, CEO of a topic-driven discovery engine, says that search and discovery are two different things. He argues that the real-time Web is better suited for discovery. “Search is like looking for a needle in a haystack – but it’s not optimized for recurring discovery,” he says. “For topics you’re interested in again and again, search isn’t optimized for that. The real-time Web is most useful when you’re doing ongoing discovery.”

YourVersion, LazyFeed and Regator are all examples of services that act like a persistent search for topics that users have an ongoing interest in. You enter topic keywords and these services stream in blog posts, videos and other online media assets about those topics, as soon as they can be
discovered. (LazyFeed reads PubSubHubbub and RSSCloud feeds for part of its discovery process.)

If those services can keep the pipeline of high-quality content full, fresh and spam-free, even for niche topics, then the Web’s understanding of search could change dramatically.

From the growth of publishing platforms like blogs, Twitter and Facebook to the increasing access social software affords developers to things like IM presence or distributed computing, it seems clear that search is very likely to change at the hands of the real-time Web. Just as YouTube rose from nowhere to become the second-largest search destination on the Web, so too could any number of real-time search engines be substantially disruptive.
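None of these services has published its internals, but the mechanics they appear to share can be sketched in a few lines: a standing topic subscription is matched against each newly discovered item, and relevance is discounted as items age, echoing Campbell’s point that timely value decays fast. Everything below (the class name, the naive keyword match, the 120-day half-life) is an illustrative assumption, not any vendor’s actual implementation:

```python
import time

class StandingQueryIndex:
    """Toy persistent search: users register topic keywords once, and every
    newly discovered item is matched and scored the moment it arrives."""

    def __init__(self, half_life_days=120.0):
        self.subscriptions = {}                  # topic keyword -> subscriber ids
        self.half_life = half_life_days * 86400.0

    def subscribe(self, user, topic):
        self.subscriptions.setdefault(topic.lower(), set()).add(user)

    def score(self, base_relevance, published_ts):
        """Halve an item's relevance for every half-life of age, so stale
        items fade out of a 'real-time' result set on their own."""
        age = max(0.0, time.time() - published_ts)
        return base_relevance * 0.5 ** (age / self.half_life)

    def ingest(self, item_text, published_ts, base_relevance=1.0):
        """Match one incoming item against all standing queries; return a
        {user: score} dict of subscribers who should see it pushed."""
        words = set(item_text.lower().split())
        matches = {}
        for topic, users in self.subscriptions.items():
            if topic in words:
                s = self.score(base_relevance, published_ts)
                for user in users:
                    matches[user] = max(matches.get(user, 0.0), s)
        return matches
```

A subscriber to “abba” would receive a fresh item at close to full relevance, while the same item surfaced 120 days later would score only half as much – roughly the horizon beyond which Campbell says people stop coming to a real-time search engine at all.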
  • 74. Text Analysis and Filtering the Real-Time Web

Middleware for the real-time Web is a growing market filled with startups serving business customers. Most of these companies perform functions like sentiment analysis, spam control and entity extraction. We interviewed a number of companies in this market, and below we offer some of the most illuminating highlights of those conversations. These should help paint a more detailed picture of the opportunities and challenges in real-time text analysis and filtering. In some cases text analysis delivers immediate value from the real-time Web, but most often that value is derived from putting the analysis in context with historical information.

SYSOMOS
Nick Koudas
Business intelligence for social media

“From a science perspective, if you take traditional techniques and algorithms – non-real-time computer science, with the luxury of time – and put those into real-time, they simply break down. If you want to have real-time guarantees then you need to change your techniques. There is new thinking that one has to put into the tech because of the volume and the lack of luxury to say, let’s wait for an hour.”

FACTERY LABS
Sean Gaddis and Paul Pedersen
Extracts key “facts” from real-time feeds of text

“The challenge is determining who the good people are and what good content they are pointing to. The best technique is to fetch the pages, do a lightweight semantic analysis of the page, then judge the page on its intrinsic value. Then we create a fact index based on that. We tell users, ‘here are the good facts generated by good links.’ A user who shares a good link is a good user. Preferred users and good pages give each other a bump up.

“We learned this from experience with the semantic overkill at [now Microsoft-owned] Powerset: you can do 10% of the work and get 90% of the value.”
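Factery’s “preferred users and good pages give each other a bump up” describes a mutual-reinforcement loop much like the classic HITS link-analysis algorithm. A minimal sketch of that general idea follows; the function, its data shapes and the iteration count are our own illustrative assumptions, not Factery’s actual system:

```python
def reinforce(shares, iterations=20):
    """shares: list of (user, page) pairs meaning `user` linked to `page`.
    Users and pages score each other iteratively, HITS-style: a user is
    good if they share good pages; a page is good if good users share it."""
    users = {u for u, _ in shares}
    pages = {p for _, p in shares}
    user_score = {u: 1.0 for u in users}
    page_score = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # pages get bumped by the users who shared them
        for p in pages:
            page_score[p] = sum(user_score[u] for u, q in shares if q == p)
        # users get bumped by the pages they shared
        for u in users:
            user_score[u] = sum(page_score[p] for v, p in shares if v == u)
        # normalize so scores stay comparable between rounds
        for d in (user_score, page_score):
            total = sum(d.values()) or 1.0
            for k in d:
                d[k] /= total
    return user_score, page_score
```

After a few iterations, a page shared by several well-regarded users outranks one shared only by a low-scoring user, and vice versa – the “bump up” in both directions at once.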
  • 75. LEXALYTICS
Jeff Catlin
Sentiment and text analysis software used by companies like Thomson Reuters and Scout Labs

“We take streams of data and extract key concepts. We work well in applications where you’re not sure what the precise question will be. If you need to browse around through the data, do discovery, etc., we enable that. We’re primarily technology middleware; when we’re direct it’s usually in the financial space.

“At our foundation we’re a speech tagger – verbs, nouns, etc. On top of that we’ve got a full grammar parser. So when we’re scoring sentiment we will understand who the speaker is, and the tone will be applied to the subject of the sentence; we don’t just look at the proximity of words in a sentence.

“In doing sentiment analysis, if you’re looking at an individual story then a human will beat a machine to death. But on the aggregate a machine will do the same to a human.

“One lesson we’ve learned, that has never changed, but nobody seems to grok, is this: in this world of unstructured data, 90% of the problem is getting your hands around the data and getting it cleaned up to work with. The data is messy, and getting the technology around it is the big part of the work that needs to be done. Can someone you’re working with take somewhat dirty data and work with it? Anyone can work with clean data. Most people sweep this under the rug; they don’t want to talk about it because that’s where they look stupid.”

POSTRANK
Jim Murphy and Ilya Grigorik
Finds distributed social media feedback about feeds from publishers

“Content parsing and normalizing in real-time is hard; it took years to figure out how to effectively do things like finding the author tags in different blogs, filling in the language fields, etc., all in real-time. Even though PubSubHubbub has lowered the barrier to real-time, it’s just the tip of the iceberg.

“Now we do full language classification. Most blogs don’t offer this, and when they do it’s often formatted wrong. We now incorporate sentiment analysis.
We’re building the data services to productize all that and then let other people build on top of it.”

FIRSTRAIN
Marty Betz and YY Lee
Financial services research suite

“Our goal is to percolate up business-relevant events in real-time. These are the kinds of things that are easy for our system to point out day-to-day, but what’s happening with these online discussions over a day? Are there meaningful things we can conclude over the course of the day? As you compress the time period to intra-day, when looking at open and unstructured sources like blogs and industry journals, it requires a different strategy. Bloggers and Twitter feeds publish at random
times; sometimes journalists are tweeting about things they are still writing about, so we are trying to figure out emerging trends in that context. Our tools are able to identify a frequency of activity, but the frequency of relevant tones is getting faster and faster. For us to predict or identify that something is happening means we have to be incorporating those new, faster signals.

“The biggest issue with real-time sources is that we have to filter out junk. Most of it is junk; you have to pick out the rare, interesting things. We have some idea about which sources are good and bad, but even good sources will only write something interesting 20% of the time. That’s a much lower ratio than press releases.

“Of course a single bit of data coming through from real-time won’t change most decision making; instead we assemble a bunch of these signals to make a mosaic of information. Rarely does that single bit percolate through immediately. Our customers are sales teams trying to be smart before walking into customers, or looking for info on industry trends. That’s the mosaic of information. People see our system and they picture themselves using it to find a gem. For example, we once delivered a personal blog post from an engineer working at Qualcomm who complained about missing a family vacation because of a late tapeout. That was very relevant to the chip industry. But that’s rare; it’s really about briefing sheets and trend graphs. There is significance to things that happen moment by moment when they show trends. What we want is to build those mosaics with few enough data points to push out alerts.

“Our users know and are willing to put in time to craft very precise descriptions of what they need. Then they go day in and day out back to that same process. They want us to detect when changes occur. The majority of positive feedback we get has to do with longitudinal pattern recognition.
Even if there’s a single point of value, it’s based on historical knowledge. Mostly the value is in understanding the trends and seeing the rising edge earlier.”
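FirstRain’s “rising edge” can be made concrete with a toy detector: track how many mentions a topic receives per interval, and flag the moment the current interval jumps well above the longitudinal baseline. The class name, window size and threshold below are our own illustrative assumptions, not FirstRain’s actual method:

```python
from collections import deque

class RisingEdgeDetector:
    """Fires when mentions-per-interval jump above a multiple of the
    longitudinal baseline -- a toy version of spotting the 'rising edge'
    of a trend from moment-by-moment signals."""

    def __init__(self, window=10, threshold=3.0):
        self.window = window        # how many recent intervals form the baseline
        self.threshold = threshold  # ratio over baseline that counts as a spike
        self.history = deque(maxlen=window)

    def observe(self, mentions_this_interval):
        """Record one interval's mention count; return True on a rising edge."""
        baseline = (sum(self.history) / len(self.history)) if self.history else None
        self.history.append(mentions_this_interval)
        if baseline is None or baseline == 0:
            return False
        return mentions_this_interval >= self.threshold * baseline
```

A steady trickle of two or three mentions per interval never fires; a sudden burst of ten does. A real system would assemble many such signals into the “mosaic” Betz and Lee describe before pushing an alert.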
  • 77. Visualizations
  • 78. THE PATH TO VALUE

[Figure: “A Continuum of Real-Time Web Use Cases.” A matrix plotting use-case types – People to People, People to Machine/Machine to People, Machine to Machine, and Object to People – against maturity labels: Traditionally Light, Traditionally Heavy, Now Heavily Developed, and Now Just Emerging. Examples placed on the grid include Twitter, Webfinger, Sync, Aardvark, Olark, Replace Crawling, Most RTW Search, PostRank and IM.]
  • 79. REAL-TIME IN CONJUNCTION WITH THE STATIC OR SLOWER WEB

[Figure: real-time Web streams carrying events such as “AOL sells Netscape,” “GM files bankruptcy,” “AAPL releases iPod” and a Microsoft/Yahoo takeover flow around static and slower Web destinations: a blog, its comments and a static Wikipedia topic page. Callouts show where Pip.io, FirstRain, Echo and Evri operate. The legend reads:]

PIP.IO puts real-time presence and communication around the static web.

FIRSTRAIN pulls patterns from the real-time stream to combine with patterns from the slower web to produce finance reports.

ECHO finds bookmarks and tweets that have flown away from static blog posts and pulls copies of them back into comments.

EVRI finds spikes in the real-time stream and uses them as a trigger to go through and check slower media (changing sites like Wikipedia and Freebase), and then uses all that to create Evri Topic Pages.
  • 80. INFORMATION OVERLOAD

Making use of the river of data

[Figure: three charts contrasting how many users find the real-time stream useful versus overwhelming.]

NOW: The amount of information on the real-time stream leaves a majority of users overwhelmed.

FUTURE, Best Case: With the right education, tools, and useful, compelling use cases, most will find the real-time web useful, and few will find it purely overwhelming.

FUTURE, Worst Case: People will sacrifice usefulness in order to avoid being overwhelmed.
  • 81. Selected background articles on real-time technology, from the ReadWriteWeb Archives
  • 82. INTRODUCTION TO THE REAL-TIME WEB

This May ’09 collection of 20 early articles we’ve written on the topic provides good background context.

Real-time information delivery is fast emerging as one of the most important elements of our online experience. No more waiting for the Pony Express to deliver a parcel cross-country, no more waiting for web services to communicate from one polling instance to another. This is information being available to you at nearly the moment it’s produced, whether you’re watching for it or not. In May ’09, Google declared real-time search to be one of the biggest unsolved challenges it faces, the NYTimes put a link to a new real-time view of all its news stories on the front page of its site, and Facebook announced a new feature that will let users be notified instantly when their friends interact with media related to themselves on the site. This is big stuff, but what does it all mean?

THE REAL-TIME WEB: A PRIMER

Serial entrepreneur Ken Fromm wrote an in-depth three-part primer on ReadWriteWeb this summer from his perspective, focusing largely on Twitter.

REAL-TIME WEB PROTOCOL PUBSUBHUBBUB EXPLAINED

Google’s Brett Slatkin is the co-creator of the PubSubHubbub real-time protocol. In this post we shared his slide deck from a trip to Facebook where he explained the technology.

3 MODELS OF VALUE IN THE REAL-TIME WEB

Our May ’09 article on forms of value latent in the real-time stream wandered toward the psychedelic, but was based on extensive implementations of real-time research technologies.

Hey web DJ. Reach into your magic bag of search tools and pull out a big result – dripping with related ephemera born just moments ago. Those could hold the grain of information you’re really looking for, or they could sparkle with data that changes your course of action in unexpected ways. Alert! Another factor has emerged, elsewhere on another site.
You said you wanted to be told, right away, about any online artifacts that crossed a threshold of popularity within a certain group of people in your field. That has just occurred, so it’s time to watch the replay of how it got so hot, evaluate its usefulness and decide whether to bring this emergent phenomenon into the work you were doing before you were interrupted, drop the former for the latter or return to your original focus. How would you like this to be your job description? It could well be – if the red hot real-time Web keeps showing up on sites all around the internet...
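To make the PubSubHubbub article mentioned above slightly more concrete: the protocol’s subscriber side must prove it controls its callback URL by echoing back a hub.challenge value that the hub sends during verification. Below is a framework-free sketch of just that step, following the public spec; the function name and parameter shapes are our own illustrative choices:

```python
def handle_hub_verification(params, pending_topics):
    """PubSubHubbub subscriber verification: the hub issues a GET to our
    callback URL with hub.mode, hub.topic and hub.challenge, and we must
    echo the challenge back with a 200 to confirm we requested this
    subscription. `params` stands in for the parsed query string."""
    mode = params.get("hub.mode")
    topic = params.get("hub.topic")
    challenge = params.get("hub.challenge", "")
    if mode in ("subscribe", "unsubscribe") and topic in pending_topics:
        return 200, challenge   # confirm: response body is the challenge, verbatim
    return 404, ""              # refuse verification for topics we never requested
```

Wired into any HTTP handler, this is enough for a hub to start pushing new entries to the callback the moment publishers ping it, instead of the subscriber polling feeds on a schedule.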
  • 83. If you liked this report, check out our other reports:

The ReadWriteWeb Guide to Online Community Management
Edited by Marshall Kirkpatrick, May 2009

Our first premium report for businesses comes in two parts: a 75-page collection of case studies, advice and discussion concerning the most important issues in online community; and a companion online aggregator that delivers the most-discussed articles each day written by experts on community management from around the Web.