Web Meets World: Privacy and the Future of the Cloud
An introduction to privacy issues around cloud computing, with an eye to the ubiquitous computing future of the cloud. First given 20/11/2008 to the Privacy Forum in Auckland, NZ.


Published in: News & Politics, Technology
  • 1. Web Meets World Nat Torkington
  • 2. Stories About The Present Thank you. Hello, everyone. Before I talk about the future of the cloud, I’d like to begin with a few stories about the present.
  • 3. (1) First story.
  • 4. AOL America OnLine. In August 2006, AOL released
  • 5. 20,000,000 650,000 20M web queries from 650k users, sampled over three months, to help researchers improve the state of search technology. AOL claimed that the data had been
  • 6. anonymized anonymized by turning usernames into unique IDs. However, many searches contained identifying information such as
  • 7. addresses addresses,
  • 8. names names,
  • 9. e-mail messages and even e-mail messages. The New York Times was able to identify, for example, user
  • 10. 4417749 4417749 as
  • 11. 4417749 Thelma Arnold Thelma Arnold of
  • 12. 4417749 Thelma Arnold Lilburn, Georgia Lilburn Georgia, who has an interest in
  • 13. 4417749 Thelma Arnold Lilburn, Georgia numb fingers numb fingers
  • 14. 4417749 Thelma Arnold Lilburn, Georgia numb fingers 60 single men 60-ish single men, and
  • 15. 4417749 Thelma Arnold Lilburn, Georgia numb fingers 60 single men dog that urinates on everything dogs that urinate on everything. The Times went to visit her and ran a great caption to the photo accompanying the story:
  • 16. On the subject of AOL, remember that they
  • 17. tried tried to anonymize. The privacy loss, while you and I might think it was predictable, wasn’t
  • 18. deliberate deliberate. Shit, as they say, happened.
  • 19. (2) Second story
  • 20. Google Google. They collect a lot of data about people:
  • 21. searches searches,
  • 22. ad impressions ad impressions,
  • 23. clickthroughs clickthroughs,
  • 24. mail mail
  • 25. chat messages chat messages
  • 26. documents documents
  • 27. spreadsheets spreadsheets
  • 28. presentations presentations,
  • 29. addresses addresses
  • 30. medical records medical records, and more. Fortunately Google know that all this data can be dangerous, and take steps to safeguard the
  • 31. privacy privacy of their users. In February 08 they posted to the Google Blog saying they would
  • 32. cookie shorten cookie lifetimes
  • 33. two years to a mere two years, and
  • 34. anonymize anonymize their search logs
  • 35. two years after 18-24 months. It didn’t take long for these steps to be shown up as
  • 36. good enough not good enough. You see, another Google property,
  • 37. YouTube YouTube, which serves
  • 38. 1,000,000+ more than a million videos every day, came under attack.
  • 39. Viacom v Google by Viacom, parent company of Paramount, Dreamworks, MTV, and Nickelodeon. In March 2007, Viacom sued Google over YouTube claiming that its copyrighted TV shows were available on YouTube and that Google wasn’t doing enough to prevent this unauthorised copying and distribution.
  • 40. U.S. District Judge Louis L. Stanton In July this year, five months after Google announced their “anonymize after two years” policy, U.S. District Judge Louis L. Stanton granted a motion to give Viacom
  • 41. “the motion to compel production of all data from the Logging database concerning each time a YouTube video has been viewed on the YouTube website or through embedding on a third-party website is granted” a copy of the YouTube access logs. This information includes:
  • 42. IP addresses your IP address
  • 43. YouTube username your YouTube user name
  • 44. time of day the time of day you
  • 45. which videos watched a particular video. While you and I might be as horrified as Google was, the Judge said
  • 46. “privacy concerns are speculative” privacy concerns are
  • 47. “privacy concerns are speculative” Speculative. And he quoted a line from that blog post made back in February ’08, where Google said that an IP address
  • 48. “The reality is though that in most cases, an IP address without additional information cannot [identify you]” - Google Blog, Feb ‘08 alone isn’t identifying information. Fortunately Viacom are on top of the privacy concerns raised by their request for YouTube logs. They say they will limit access to the data
  • 49. “is going to be limited to outside advisers who can use it solely for the purpose of enforcing our rights against YouTube and Google” - Michael D. Fricklas, Viacom’s general counsel to Viacom’s advisers. Whew, thank you Viacom. And thank you, Google, for choosing to retain that data and only anonymize after two years!
  • 50. (3) Now let’s look at Europe.
  • 51. Directive 2006/24/EC On March 15, 2006, the EU adopted Directive 2006/24/EC. This mandates
  • 52. “the retention of data generated or processed in connection with the provision of publicly available electronic communications services or of public communications networks” the retention of data generated or processed in connection with the provision of publicly available electronic communications services or of public communications networks.
  • 53. 6 months - 2 years For anywhere between 6 months and 2 years, depending on the type of data. Fortunately it only covers
  • 54. telephony, mobile telephony, Internet access, Internet email and Internet telephony. telephony, mobile telephony, Internet access, Internet email and Internet telephony. Member states have the option to delay the Internet bits until March 2009 and most have opted to delay. Member states also have the option to go beyond the EU recommendations, as
  • 55. .dk Denmark has chosen to do. Their draft provision enacting the EU order would require ISPs to log the
  • 56. source source
  • 57. time time, and
  • 58. destination destination of
  • 59. every single internet data packet every single Internet data packet. I’m reasonably confident this is impossible, but it does show the ambitions of the states here. The states are all confident that the data won’t be
  • 60. “misused” misused, but don’t define “misuse”. I won’t define “misuse” either, but will simply note that Germany makes their logs available for prosecutions under
  • 61. © civil copyright law so that Disney will be able to see when you logged in and who you spoke to.
  • 62. Ok, that’s the end of the stories.
  • 63. common What did they all have in common?
  • 64. Cloud They’re all about the cloud. Or, rather, the
  • 65. “Cloud” “the cloud” (air quotes). It’s a broad and popular term, and like all broad popular terms it’s become more of a buzzword to get venture funding, or to sound like you know what you’re talking about than a useful marker of technological progress.
  • 66. “Cloud” = On Demand + Utility Computing + SOA + ASP + SaaS + ... It covers a multitude of sins. We talked about Software As A Service (YouTube), and that’s a real trend. Lots of different types of software are moving into “the cloud”.
  • 67. Meet Google Healthcare. Patients upload their medical records and get a friendly useful interface for checking drug interactions, etc. The privacy aspect to this is a little odd, though.
  • 68. Google Healthcare slips through a crack in the medical record legislation in the USA-- because they’re not holding records on behalf of providers, no security is mandated. The expectation is that the market will determine an appropriate level of security. Hoorah for the market! And as we all know, markets immediately find the appropriate level of security with no incentive to provide too little. And fortunately there can be no privacy problems flowing from an inadequate level of security. Right? Right?
  • 69. is Customer Relationship Management--salespeople and their managers can keep track of contacts, accounts, etc. online. You know, “Joe at IBM buys widgets, here’s his name and address and a record of every conversation that anyone in the company has had with him”.
  • 70. Google Mail.
  • 71. Here’s, one of several web-based apps playing in Quicken’s back yard. They fetch your bank statements from your bank, process them, and tell you where you’re spending your money.
  • 72. Utility Computing That was software as a service. Another aspect to the cloud is utility computing, such as
  • 73. Amazon Web Services Amazon’s web services. They offer many services over the web that have nothing to do with books, including
  • 74. Amazon EC2 their “Elastic Compute Cloud”, EC2. You upload a virtual machine image (think: copy of a hard drive) and Amazon runs it for you. No longer do you have to worry about managing bandwidth, installing operating systems on servers, replacing hard drives, and all the stuff that IT departments used to have to do. Consequently
  • 75. many people have built their companies on EC2 and its sister storage service, S3. Many of these are startups, attracted because it lets them scale without substantial investment, but many EC2 customers (Eli Lilly, for example) are sizable. Amazon’s services are getting a lot of use,
  • 76. as this graph shows. Amazon’s Web Services overtook, in amount of traffic, in mid-2007, just 18 months after launch.
  • 77. Google also have a utility computing play: you write your program to run on their environment and away you go, all of Google’s more than 2 million servers are yours for the using. Like Amazon, you’re charged by the CPU-hour.
  • 78. Microsoft are getting into the game as well, with their Live platform. According to a recent Economist article, they’re rolling out
  • 79. 35,000/month 2,500/container $500M/data centre 35,000 new servers every month in data centers. They’re building the data centers from 40-foot shipping containers that each house 2,500 servers. The new data center in Chicago cost $500M. These numbers aren’t unusual for the industry.
  • 80. “Cloud” What do Utility Computing and Software as a Service have in common? What unites Google Apps, your ISP, Amazon EC2, and
  • 81. SEP They all add up to the same thing: they make computing
  • 82. Someone Else’s Problem someone else’s problem. It’s
  • 83. Outsourced IT Outsourced IT. Every program on every machine is a pain in someone’s ass. IT departments must keep the versions up to date, deal with bitrot (when Windows stops working and everything has to be reinstalled), etc. Using Google Docs means all the administrative hassle of Microsoft Office has become Google’s problem. And part and parcel of outsourced IT is
  • 84. Outsourced Security outsourced security. It’s a pain in the bum locking down machines and applications against
  • 85. Hackers, Viruses, et al. bad guys. Every IT department asks itself “can Google do a better job of this than I can?” and probably answers, correctly, “yes”.
  • 86. Privacy & Security But because there’s a strong relationship between privacy and security, namely it’s impossible to have privacy without good security,
  • 87. Outsourced Privacy using Google Docs means that we’ve outsourced privacy as well.
  • 88. Where’s the problem? Many people ask “so what?” when you tell them that they’ve outsourced their privacy. “Google will look after me, right?” they say. There are two reasons why this isn’t necessarily true.
  • 89. (a) Google decides First, Google gets to decide how much security to put around your data, and have no doubt--it is a decision. There’s always
  • 90. the next threat another threat to defend against. If you’re looking for
  • 91. perfect security perfect security, then
  • 92. perfect security you might as well look for perfect happiness. There is no such thing as perfect security, just acceptable levels of risk. Your company decides upon that acceptable level every time they decide not to train their IT staff in the latest security practices, every time they choose between one antivirus product and another, every time they decide not to buy a $75,000 network security appliance. That’s perfectly normal. What’s different in the world of the cloud, though, is that
  • 93. decide you don’t get to decide. Your vendor does. And no vendor will tell you all the security precautions and procedures they have. Their security is
  • 94. opaque opaque. Few vendors even report incidents, and even fewer countries have mandatory reporting laws. So you’re asked to make a security (and thus privacy) decision on the basis of imperfect, or even absent, information. That’s the first reason that outsourced privacy isn’t all good.
  • 95. (b) Gubmint = Black Hats The second reason is that it’s not just rogue teenagers hellbent on getting your credit card details that you have to worry about. As the EU directive clearly shows, governments are black hats, bad guys. They just have
  • 96. subpoena / court order / search warrant / request different attack vectors from those of your average 21 year old Ukrainian virus writer. And as
  • 97. YouTube usernames IP addresses Viacom Viacom’s logfile data grab shows, many private companies have learned to piggyback on the judiciary’s and legislature’s attack vectors.
  • 98. Warshak v USA The situation’s even worse in the USA. Until now there have been two standards for searching your records:
  • 99. constitutional Constitutional, preventing unreasonable search and seizure, which covers stuff in your home. Constitutional protections are hard to circumvent, as they were written by angry 18th century revolutionaries and tend to be quite explicit in their distrust of government. The other protection is
  • 100. statutory statutory, the laws that fill the gaps in the Constitution. Statutory protections guard your data outside the home
  • 101. statutory = weaker and generally require law enforcement to have less justification than do the constitutional standards. This difference between two standards means law enforcement can more easily intercept your data outside the home than in it -- bad news for hosted apps like Google Docs, Mail, etc. Let me go over this again:
  • 102. data in the home if you download your mail to your laptop and keep it there,
  • 103. data in the home SAFE! the police need a search warrant to get to it, and the burden of probable cause that goes with it. But if you use Gmail and
  • 104. data in the cloud keep your mail on Google’s servers,
  • 105. data in the cloud PWNED! and the police need only a court order, which has a much lower hurdle to clear. A recent lawsuit,
  • 106. Warshak v USA Warshak v USA has led to a ruling that the two should be treated the same (and Constitutional standards should apply to getting data from ISPs). It’s bouncing around appeals at the moment, and the double standard will continue until and unless it’s resolved in Warshak’s favour.
  • 107. “Relying on the Government to ensure your privacy is like asking a peeping Tom to install your window blinds.” - John Perry Barlow In short, security and privacy are as much threatened by legal and regulatory means as by viral and Trojan.
  • 108. But that’s enough about the past. Let’s get back to the Future of the Cloud.
  • 109. Future of the Cloud Did you hear those capitals? Futurism needs capital letters. I don’t plan to make predictions about
  • 110. CO2-burning flying cars we’ll all be driving, or
  • 111. the silver robots that will do our bidding. All I do is identify
  • 112. trends trends, growing numbers of similar things done by people who live on the cutting edge of what’s possible with today’s technology, because Alan Kay knew how to do futurism:
  • 113. “The best way to predict the future is to invent it.” - Alan Kay He invented object-oriented programming, the windowing system, pulldown menus, GUIs and computer interfaces as we know them today basically. So I look for people or projects that seem to me to be inventing the future. Here are a few.
  • 114. Let’s start in an unlikely place: the foot. The Nike+ is a shoe with an accelerometer that counts steps. It communicates with an iPod and syncs your running record to a web site. People who use it report a new way of thinking about their run: it’s become a
  • 115. “My run is now a videogame, and I want you to play with me.” - Jane McGonigal video game. You can have collaborative and competitive challenges and you get virtual trophies. Now many pieces of gym equipment can also report to the iPod, so people can track their treadmill, cross-trainer, and bike miles on the same web site as they keep track of their running.
  • 116. Or take the Amazon Kindle. This is an electronic book reader--super high resolution screen, keyboard, and all that but the main innovation behind it is cellular connectivity. You don’t buy a network contract when you buy the Kindle, but the books you buy (from, naturally) arrive via the cellular data network. Introduced a year ago, Amazon will have sold 380,000 of them by the end of this year.
  • 117. This is the Wattson. It’s a power meter that reports the information to you, not just the power company. This is the display, there’s also a transmitter that clips onto a mains cable. There’s companion software, Holmes, that uploads your energy usage to the Holmes web site and lets you track and compare over time.
  • 118. This is Botanicalls, a gizmo that sends Twitter updates when your plants need watering. What you see is a sensor that clips into the potplant’s soil and a network cable to send the updates.
  • 119. Andy Stanford-Clark, an IBM “Master Inventor”, has instrumented his house and hooked it up to Twitter. You can follow his house and see his power consumption, when the phone rings, when the motion-sensitive lights turn on and off, etc. These people use Twitter, by the way, because Twitter runs a free SMS gateway in the USA and so it has become a cheap and easy way for programs to send SMS. For example,
  • 120. Former Apple engineer Gordon Meyer hooked his doorbell up to Twitter. Now, no matter where he is, he knows if his doorbell is ringing.
  • 121. This is the Availabot by Matt Webb and Jack Schulze. It’s a little toy that plugs into your USB port and stands to attention whenever a particular instant message buddy comes online.
  • 122. These researchers from Iowa State University are holding sensors that will help farmers understand nutrient and water flow in their soil. They’re 2 inches by 4 inches at the moment and will live underground, communicating wirelessly with a central computer.
  • 123. WineM is a prototype of a smart wine rack: a reader senses the RFID tags on bottles and uses lights to display the type of wine so you don’t have to turn the bottle to check the label. When you add a new bottle to your collection, the hardware scans the UPC code and uses public databases to translate that into a variety of wine--no keyboard necessary.
  • 124. This is a MobileTEEN GPS unit. AIG, before they needed bailing out, offered a Teen GPS insurance. In exchange for lower premiums (anyone here tried to insure a teenage driver lately? You know what I’m talking about) your teen (or their car) must carry this GPS device around, which reports their location back to AIG. You can tell AIG’s web site to SMS you if your teenager speeds, and you can even set up a GeoFENCE (that’s a registered service mark, by the way) and it’ll SMS you if they leave that area.
  • 125. This is the Dash mobile GPS unit for your car. It has a cellular network card in it, and uploads your location to the Dash servers. In return, you get incredibly accurate up-to-the-second traffic information as reported by the Dash units of the other cars in the city.
  • 126. This is the next President of the United States of America, using his Blackberry. The Blackberry is a mobile phone with email, it lets you send and receive email no matter where you are. The latest news is that Obama will have to give up his Blackberry because the emails will be a matter of public record, and because it’s not a good idea for a cellphone company to have a database in which can be easily found the location of the President of the United States of America.
  • 127. Ok, enough examples.
  • 128. Common? What do these devices have in common?
  • 129. Web meets World They all connect the Internet to a physical device, whether it’s a potplant, a teenage driver, or a book. In some cases the devices are
  • 130. sensors sensors, reporting back speeds, power consumption, Nitrogen concentration. In other cases the devices are
  • 131. displays displays, reflecting the state of the networked data: whether your friend is online or which books you have paid for. But in none of these cases is
  • 132. dumb device the device dumb. We’re entering an age of
  • 133. smart devices smart devices, where
  • 134. everything everything
  • 135. will be will be
  • 136. networked networked. Not because it’s cool and trendy but because
  • 137. the network the network
  • 138. makes makes
  • 139. the device the device
  • 140. better better, or because
  • 141. the device the device
  • 142. makes makes
  • 143. the network the network
  • 144. better better. You might be thinking this is all sounding a bit
  • 145. science fiction science fiction. You’re right, there are a few science fiction authors who have done a great job of predicting the near future. One of them,
  • 146. William Gibson William Gibson, was interviewed by Rolling Stone on their 40th anniversary. He was asked about the major challenges we faced, and he identified
  • 147. ubiquitous computing ubiquitous computing, this idea that everything is connected all the time. He said: One of the things our grandchildren will find
  • 148. quaintest quaintest about us is that we distinguish
  • 149. the digital the digital from
  • 150. the real the real,
  • 151. the virtual the virtual, from
  • 152. the real the real
  • 153. literally impossible In the future, that will become literally impossible
  • 154. cyberspace The distinction between cyberspace and
  • 155. ! cyberspace that which isn't cyberspace is going to be
  • 156. unimaginable unimaginable. But you don’t have to look to science fiction to see that as we go through life today, we leave
  • 157. invisible traces invisible traces of our presence,
  • 158. data trails data trails that linger behind us like contrails in the air after the plane has disappeared from sight. You know we do it now: we leave trails in
  • 159. credit card credit card databases,
  • 160. phone calls phone company databases,
  • 161. ISPs ISP databases. But when we pay cash, or visit someone in real life, or use a payphone, we can
  • 162. go off the grid go off the grid, be
  • 163. anonymous anonymous. But as the web meets the world, this will be harder and harder to do. Each new
  • 164. service = database service is backed by a database, and that database is vulnerable to
  • 165. Viacom bad guys,
  • 166. Directive 2006/24/EC governments, and
  • 167. AOL incompetence. I know it sounds grim, but it’s
  • 168. BAD not all bad.
  • 169. easy It’s not easy, but it’s not bad. There are three things that would make a lot of these problems go away.
  • 170. (0) Don’t collect it The most obvious solution is simply not to collect the data in the first place. A lot of these advertising-driven companies are packrats--think of Google and the two year lifetime of unanonymized data (after the Viacom ruling, by the way, they announced they were changing those two-year lifetimes to nine months). They keep this data because it might be useful to their future selves and not for you or your future self.
  • 171. (1) Same goals Next, vendors should have the same goals as you do. Amazon does: when you run your web site on EC2, you’re paying Amazon for that service. Amazon isn’t mining what you do and they’re not storing your data for later use. Google’s business model, however, is advertising. So you get GMail and it looks like it’s free, but the price you pay is Google’s data collection.
  • 172. (2) Cryptography Finally, your data should be encrypted in such a way that the service provider can’t decrypt it without you. We can do this technically, but few companies do it in practice.
  • 173. problems But there are problems with these solutions, and they cut to the heart of what’s so attractive about the very things that could threaten privacy:
  • 174. 0) data makes services better It’d be sad if Google weren’t to collect data because the data that Google collects makes its services better. And we benefit when this happens. Yes, it makes Google rich, but it also makes the Internet navigable, our email spam-free, and fills the world with unicorns and rainbows.
  • 175. 1) free is cheap Aligning interests is all very well, but advertising business models make services free that would otherwise be a direct cost. We can all host our domains and our email on Google without paying a cent, whereas ISPs typically charge for it. Aligning Google’s interests with our privacy interests may damage our pecuniary interests. And, finally,
  • 176. 2) shared data makes individual experiences better If you encrypt personally-identifying data, you make it very difficult to create those sites that crunch everyone’s data and make suggestions or recommendations. If the Holmes web site can’t see the power use of my Wattson, how can I compare myself to other people?
  • 177. impossible? So is it impossible to do this right? To respect privacy yet have a cloud application?
  • 178. impossible? No, there are companies trying hard to do it right. For example,
  • 179. Wesabe is a Quicken-like program on the web, like Mint. These guys get privacy right. First, they don’t hold the keys to your Internet banking to download them: you run an uploader on your PC that fetches your electronic statements from the bank and then sends them to Wesabe. Second, Wesabe encrypt your personal data so your records can’t be subpoena’d because your records can’t be tracked back to you. Third, it’s a commercial service and not advertising supported--your best interests are their best interests. And finally, Wesabe still manages to provide collective intelligence (you’re not saving as much as everyone else, for example). I contract to O’Reilly Media, who invested in Wesabe precisely because they get this so right.
  • 180. So here we are, nearly to the end. What did we learn?
  • 181. (1) Web meets World Physical devices are being integrated with the Web, or “cloud” if you will
  • 182. (2) Data in the cloud can be a privacy problem Data in the cloud can be a privacy problem, because
  • 183. (3) You’ve outsourced privacy you’ve outsourced your privacy, so you’re vulnerable to attack not just from hackers but also from
  • 184. (4) Governments governments,
  • 185. (5) Competitors competitors, and
  • 186. (6) Incompetence AOL. And while it is possible to build useful services while
  • 187. (7) Choose not to collect choosing not to gather some data,
  • 188. (8) Cryptography encrypting the data you do collect, and
  • 189. (9) Aligning interests making sure that your service provider doesn’t have a motive to undermine your privacy ...
  • 190. easy It’s not easy. Thank you, I’ll now take
  • 191. questions? questions.
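
The AOL story turns on a technical point: swapping usernames for unique IDs hides the account name but leaves every query still linked to one person. Here is a minimal sketch of that idea in Python. The log entries, the names `alice` and `bob`, the town `exampleville`, and the helper functions are all made up for illustration; this is not AOL's actual data or method, only an assumption about what "turning usernames into unique IDs" roughly looks like.

```python
from collections import defaultdict
from itertools import count


def pseudonymize(log):
    """Replace each username with a stable numeric ID, leaving queries untouched."""
    ids = {}
    next_id = count(1000)  # arbitrary starting point for the made-up IDs
    released = []
    for user, query in log:
        if user not in ids:
            ids[user] = next(next_id)
        released.append((ids[user], query))
    return released


def profiles(released):
    """Group queries by ID, as anyone holding the released log could do."""
    grouped = defaultdict(list)
    for uid, query in released:
        grouped[uid].append(query)
    return dict(grouped)


def find_ids_mentioning(released, clue):
    """Return the IDs whose queries contain an identifying clue (a name, a town)."""
    clue = clue.lower()
    return sorted({uid for uid, query in released if clue in query.lower()})


# Hypothetical raw log: (username, query) pairs.
RAW_LOG = [
    ("alice", "numb fingers"),
    ("bob", "cheap flights to auckland"),
    ("alice", "landscapers in exampleville"),
    ("alice", "alice example genealogy"),
]

released = pseudonymize(RAW_LOG)
suspects = find_ids_mentioning(released, "exampleville")
for uid in suspects:
    # One identifying query exposes every other query under the same ID.
    print(uid, profiles(released)[uid])
```

The usernames are gone from `released`, yet a single query that names a town or a person is enough to pull up the whole search history behind that ID. The weakness is the stable ID combined with free-text queries, not any one careless search.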