2. understanding systems by making models
We have always tried to understand systems by creating models of them. We create rules that match reality just closely enough that we can study reality by studying the model. MONIAC is one such example, created at the London School of Economics in 1949 by Bill Phillips. It uses fluid dynamics to model an economy, with the flow between water tanks standing in for the monetary flow between the Treasury, Education and so forth.
3. “The more we learn about biology, the further we find ourselves from a model that can explain it.” Chris Anderson, http://www.wired.com/science/discoveries/magazine/16-07/pb_theory
“All models are wrong, but some are useful.” George Box, Statistician, quoted in http://www.wired.com/science/discoveries/magazine/16-07/pb_theory
As our knowledge advances in a field like biology, our inaccurate models give us diminishing returns. In “The End Of Theory”, Chris Anderson argues that the future of science is transitioning to analysing empirical data gathered from observation of the world. He calls this The Petabyte Age, pioneered by companies such as Google who created techniques for large-scale analysis of data out of the necessity to analyse the whole internet.
Credit: http://www.flickr.com/photos/timo/851027757/
4. people are city biology
We can try to study cities with models. But human behaviour, the biology of the city, makes cities too complex to model.
5. Recent visualisations of the movement of hire-bikes through London emphasise for me the organic, biological nature of human city-data.
6. “We can’t see how the street is immersed in a twitching, pulsing cloud of data.” Dan Hill: http://www.cityofsound.com/blog/2008/02/the-street-as-p.html
Dan Hill continues, “This is over and above the well-established electromagnetic radiation, crackles of static, radio waves conveying radio and television broadcasts in digital and analogue forms, police voice traffic. This is a new kind of data, collective and individual, aggregated and discrete, open and closed, constantly logging impossibly detailed patterns of behaviour. The behaviour of the street.”
The data that flows through modern cities is not even visible to the human eye. We can’t gather this data with interviews, surveys and clipboards.
7. city samplers
So at Nokia, we’ve been asking the question, can the phone be the entire source of data that allows us to know our cities?
8. This is plausible because so many people carry a phone with them 24 hours a day, wherever they go in the city. It’s also because the modern mobile phone is packed with sensors. Early phones had a microphone and a radio. Phones today know which way up they are, where they are in the world, can record images and video, and can sense the presence of many other devices, networks and signals.
9. This brings the city into the Petabyte Age. What allows us to process the data is a technique developed by Google and popularised in open-source in the Hadoop project.
Map-Reduce is a system for specifying a data-processing algorithm that allows the work to be split up and distributed to a network of computers to solve in pieces. It maps raw input data to processed output data, then reduces the output data into final results.
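To make the two phases concrete, here is a minimal single-process sketch in Python. It is not Hadoop code and not the pipeline used at Nokia; the record format, the `SEARCH_LOG` sample and the function names are all assumptions made for illustration. A real framework would run the map and reduce functions on many machines and do the grouping step between them.

```python
from collections import defaultdict
from itertools import chain

# Hypothetical raw input: one (query, district) pair per search event.
SEARCH_LOG = [
    ("ikea", "Tempelhof"),
    ("ikea", "Spandau"),
    ("cafe", "Mitte"),
    ("ikea", "Tempelhof"),
]


def map_phase(record):
    """Map one raw record to zero or more (key, value) pairs."""
    query, district = record
    if query == "ikea":
        yield (district, 1)


def shuffle(pairs):
    """Group values by key, as a framework would between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce all values collected for one key to a final result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in sequence on a single machine."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


if __name__ == "__main__":
    print(run_job(SEARCH_LOG))
```

Because each record is mapped independently and each key is reduced independently, the same two functions could be handed to a cluster without changing their logic.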
10. With map-reduce, we can run an algorithm on a rack of servers...
http://www.flickr.com/photos/johnseb/3425464/
11. ... or a corridor full of racks of servers ...
12. ... or a data-centre full of corridors full of racks of servers.
We can start small and scale up our processing capability to keep pace with the scale of our data. It sidesteps the limit we hit with traditional single-machine analytics, when we can no longer process 24 hours of data in 24 hours of CPU time.
13. learning from search
My first example shows what we can learn by looking at what people search for on a map, and where they are when they search.
14. Ikea Spandau, Ikea Schoenefeld, Ikea Tempelhof
This map of Berlin (made by Nokia’s Josh Devins) aggregates searches made over the last four months for the word “Ikea”. It clearly shows that people all over Berlin look for Ikea, but that there are obvious clusters near the 3 Berlin Ikea stores.
Thursday, January 27, 2011
Ikea geo-searches bounded to Berlin
can we make any assumptions about what the actual locations are?
kind of, but not much data here
clearly there is a Tempelhof cluster but the others are not very evident
certainly shows the relative popularity of all the locations
Ikea Lichtenberg was not open yet during this time frame
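One way to build a map like this is to keep only the searches that fall inside a bounding box and count them per grid cell. The sketch below is an assumption about the general approach, not the method actually used for this map; the bounding box is only approximate, the events are invented, and the cell size is arbitrary.

```python
import math
from collections import Counter

# Approximate bounding box for Berlin (an assumption, for illustration only).
BERLIN = {"south": 52.34, "north": 52.68, "west": 13.09, "east": 13.76}

# Hypothetical search events: (query, latitude, longitude). Not real data.
EVENTS = [
    ("ikea", 52.47, 13.37),
    ("ikea", 52.47, 13.38),
    ("ikea", 52.54, 13.21),
    ("ikea", 48.14, 11.58),  # outside the bounding box
    ("cafe", 52.52, 13.40),  # a different query
]


def in_box(lat, lon, box):
    """Return True if the point lies inside the bounding box."""
    return box["south"] <= lat <= box["north"] and box["west"] <= lon <= box["east"]


def cluster_counts(events, query, box, cell_deg=0.05):
    """Count searches for `query` inside `box`, per grid cell of `cell_deg` degrees."""
    counts = Counter()
    for q, lat, lon in events:
        if q == query and in_box(lat, lon, box):
            cell = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
            counts[cell] += 1
    return counts


if __name__ == "__main__":
    for cell, n in cluster_counts(EVENTS, "ikea", BERLIN).most_common():
        print(cell, n)
```

Cells with high counts are the candidate clusters; a smaller `cell_deg` gives a finer map but needs more data per cell.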
15. Prenzl Berg Yuppies, Ikea Spandau, Ikea Schoenefeld, Ikea Tempelhof
The fourth obvious cluster is a demographic - the young middle-class families who tend to live in the Prenzlauer Berg district of Berlin.
Incidentally, the data also shows that people don’t search for Ikea on a Sunday as much as the rest of the week. This is because Germany still has Sunday-closing laws and even Ikea is not open on Sundays.
16. learning from maps
We can learn plenty about a city just from looking at its maps, and the places on the map.
17. The “Starbucks Index”, invented by designer Tom Coates, is calculated from the number of Starbucks cafes per square kilometre of the city. By analysing Nokia’s places registry, we can show the difference between different cities, or different parts of a city, by looking at what companies choose to base themselves there. We could equally well calculate a McDonald’s index, or an Italian food index, or a public parks index.
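The index itself is simple arithmetic: count the matching places in an area and divide by the size of the area. Here is a hedged sketch, assuming a hypothetical list of (brand, district) records and made-up district areas rather than the real places registry.

```python
from collections import Counter

# Hypothetical place records (brand, district). Not real registry data.
PLACES = [
    ("Starbucks", "north"),
    ("Starbucks", "north"),
    ("Starbucks", "south"),
    ("McDonald's", "north"),
    ("Starbucks", "north"),
]

# Hypothetical district areas in square kilometres.
AREA_KM2 = {"north": 2.0, "south": 4.0}


def brand_index(places, areas_km2, brand):
    """Return the number of `brand` places per square kilometre for each district."""
    counts = Counter(district for b, district in places if b == brand)
    # Counter returns 0 for districts that have no matching place.
    return {district: counts[district] / area for district, area in areas_km2.items()}


if __name__ == "__main__":
    print(brand_index(PLACES, AREA_KM2, "Starbucks"))
    print(brand_index(PLACES, AREA_KM2, "McDonald's"))
```

Swapping the `brand` argument, or filtering on a category field instead, gives the other indices mentioned above with the same calculation.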
18. Searches are goal-driven user behaviour - someone typed something into a search box on a phone. But we can even learn from activity that isn’t so explicit.
When someone views a Nokia Ovi map on the web or phone, the visuals for the map are served up in square “tiles” from our servers. We can analyse the number of requests made for each tile and take it as a measure of interest or attention in that part of the world.
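Counting tile requests can be sketched in a few lines. The (zoom, x, y) tile address and the sample log below are assumptions for illustration; the actual tile scheme and log format are not described in this talk.

```python
from collections import Counter

# Hypothetical request log: one (zoom, x, y) tile address per request.
TILE_REQUESTS = [
    (12, 700, 1635),
    (12, 700, 1635),
    (12, 701, 1635),
    (12, 700, 1635),
    (12, 702, 1636),
]


def attention_by_tile(requests):
    """Return each tile's share of all requests, as a rough attention measure."""
    counts = Counter(requests)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tile: n / total for tile, n in counts.items()}


if __name__ == "__main__":
    shares = attention_by_tile(TILE_REQUESTS)
    for tile, share in sorted(shares.items(), key=lambda item: -item[1]):
        print(tile, round(share, 2))
```

Colouring each tile by its share is then enough to draw a heatmap of attention.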
20. LA attention heatmap
This is the attention map of Los Angeles, California. We can clearly see several important hotspots such as Downtown, Hollywood and LAX airport.
21. LA driving heatmap
If we turn to the navigation logs, we get another map of Los Angeles. This data is recorded whenever someone requests a car route from one place to another. You can clearly see the roads, and it heavily emphasises major roads because that’s what is favoured by route-planning algorithms. It’s also a map made by people who don’t know where they’re going - if they knew exactly what route to take, they wouldn’t be using navigation on their phones.
22. business perspective
City data also reflects business activity. In Berlin our local coffee shop owner uses pen and paper to record every sale he makes. He uses this to optimise his pricing and the kinds of coffee he sells. We can do some of the same analysis on a larger scale.
23. business context
Looking at the check-in and search patterns around coffee shops, we made this map of the San Francisco Dolores Park area. Red circles are coffee shops, and blue circles are other businesses. The larger the circle, the more popular the location is to visit.
24. usage patterns
We discovered we could deduce more than just business information from this data. When we looked at one specific venue, Dolores Park itself, we could tell that San Francisco is cold at night. No matter the time of year, check-ins at the park are much lower in the evening and night than in daytime.
When we looked at the day of the week that people visit the park, we thought we had a bug in our data collection. Why would Thursday be different from other days for popularity of parks? When we cross-referenced the data with weather records, we realised that this particular Thursday was wet and cold.
Like many other examples in this presentation, we were excited by the fact that we can find verifiable real-world information in pure data, without any human guidance.
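The two breakdowns described above amount to grouping check-in timestamps by part of the day and by day of the week. Here is a small sketch, assuming hypothetical ISO 8601 timestamps and an arbitrary 07:00 to 19:00 definition of daytime; neither comes from the real check-in data.

```python
from collections import Counter
from datetime import datetime

# Hypothetical check-in timestamps in local time. Not real data.
CHECKINS = [
    "2011-01-22T14:05:00",
    "2011-01-22T15:30:00",
    "2011-01-23T13:10:00",
    "2011-01-23T21:45:00",
    "2011-01-27T12:00:00",
]

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def weekday_name(t):
    """Name of the weekday; datetime.weekday() counts Monday as 0."""
    return WEEKDAYS[t.weekday()]


def day_part(t):
    """Label a timestamp as 'day' or 'night' (07:00 to 19:00 is an assumption)."""
    return "day" if 7 <= t.hour < 19 else "night"


def checkins_by(checkins, key):
    """Count check-ins grouped by the result of `key` applied to each timestamp."""
    return Counter(key(datetime.fromisoformat(ts)) for ts in checkins)


if __name__ == "__main__":
    print(checkins_by(CHECKINS, weekday_name))
    print(checkins_by(CHECKINS, day_part))
```

An unusually low count for one weekday is the kind of anomaly that prompted the cross-check against weather records.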
25. “Information is quickly becoming a material to design with.” Mike Kuniavsky: http://orangecone.com/archives/2010/08/information_is_.html
In his recent book “Smart Things”, Mike Kuniavsky compares information to traditional materials such as wood and rubber. It has now become a material that we can build with in the real world, to connect the physical and the digital worlds together.
26. [nod to Matt Jones, for many conversations we had about cities while working together at Dopplr]
27. Thank you. Matt Biddulph @mattb | firstname.lastname@example.org
After the talk, there were questions from the audience...
28. Audience question
What about individual privacy, and the ethics of profiting from individual user data?
  1. We only ever analyse the aggregate, anonymised set of all users’ data. We didn’t track any individuals in any part of this work.
  2. I believe that it could only be unethical to profit from analysing user data if you don’t return some value to users by making them a useful, desirable product.
29. Audience question
I’m not uncomfortable with services analysing my data, but I am unhappy if I feel like I don’t own my personal data.
In my personal opinion, individual data belongs to the individual. Putting your data into a large service gives you access to economies of scale, allowing it to do useful analysis of the aggregate data that you couldn’t achieve with your data alone. You benefit from this when their service gets better the more you use it.
A company you deposit data with should act like a bank: hold it in trust, generate some benefit, give it back when you ask.