DATA MINING THE CITY
Weds 7p-9p 200 Buell
Violet Whitney, vw2205@columbia.edu
please take a moment to
say why you’re here:
shoutkey.com/carrot
No computers please
Except when we need them
Class Overview
D<>D Getting Data
What are data?
D<>D Cleaning Data
Reflection/Attendance
hi!
( ゚ヮ゚)
Platform Society
Foursquare 2008
Big Data
Widespread Adoption
Democratization of data
Statistics
Bayes’ Theorem (1763)
Regression (1805)
Computer Age
Turing (1936)
Neural Networks (1943)
Evolutionary Computation (1965)
Databases (1970s)
Genetic Algorithms (1975)
Data Mining
KDD, or Knowledge Discovery in Databases (1989)
Supervised machine learning (1992)
Data Science (2001)
Platform Society, Statistics, Computer Age, Data Mining
class overview
(ʘᗩʘ')
cities are platforms
cities are networks
redevelopment of “blighted” areas racial redlining
city data isn’t new
quantity and ubiquity
data for designers
develop hypothesis
Python
APIs
Processing
Batch Processes
After Effects scripts
Sorting Excel
formulas
Geocoding
Recommendation Systems
critical data usage
Data Mining draws on: Machine Learning, Pattern Recognition, Algorithms, High-Performance Computing, Statistics, Database Systems, Data Warehouse, Information Retrieval, Applications, Visualization
class overview
(ʘᗩʘ')
Data Selection
Pre-processing & Cleaning
Data Mining
Interpretation/Evaluation
Feature Selection
Attendance: 40%
Work: Weekly Posts 30%, Final Project 30%
YOUR EPIC PROJECT!
Are Airbnb prices higher in neighborhoods that are more diverse?
party!!!
d<>d
(☞゚ヮ゚)☞ ☜(゚ヮ゚☜)
Best of Luck with the Wall
Officer Involved
Data Selection
API
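One way to picture what an API is: a request is just a URL with parameters. The sketch below builds a request URL for Google's Street View Static API from a latitude and longitude. The helper name `street_view_url`, the example coordinates, and the placeholder key are assumptions for illustration; an actual request needs your own API key.

```python
from urllib.parse import urlencode

# Base endpoint of Google's Street View Static API.
BASE_URL = "https://maps.googleapis.com/maps/api/streetview"

def street_view_url(lat, lng, api_key, size="600x400"):
    """Return a request URL for one street-level image (hypothetical helper)."""
    params = {
        "size": size,                # image width x height in pixels
        "location": f"{lat},{lng}",  # where the camera should be placed
        "key": api_key,              # your own API key goes here
    }
    # safe="," keeps the comma between lat and lng readable in the URL
    return BASE_URL + "?" + urlencode(params, safe=",")

# Approximate coordinates near Columbia University, for illustration only.
url = street_view_url(40.8075, -73.9626, api_key="YOUR_API_KEY")
print(url)
```

Looping this helper over a list of coordinates is what turns the manual "drop the pegman, take a screenshot" routine into a batch process.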
What are data?
¯_(ツ)_/¯
Why visualization matters
Anscombe’s Quartet
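A minimal sketch of why the quartet is striking: the code below computes the mean of x, the mean of y, and the Pearson correlation for each of the four datasets, using the values from Anscombe's 1973 paper. The helper names `mean` and `pearson` are written here for illustration; plotting the four sets (not shown) is what reveals how different they are.

```python
from math import sqrt

# Anscombe's four datasets. Sets I to III share the same x values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I":   (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II":  (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV":  ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
            [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# One (mean x, mean y, correlation) triple per dataset.
summary = {name: (mean(xs), mean(ys), pearson(xs, ys))
           for name, (xs, ys) in quartet.items()}

for name, (mx, my, r) in summary.items():
    print(f"{name}: mean x = {mx:.2f}, mean y = {my:.2f}, r = {r:.3f}")
```

The printed rows come out nearly identical, which is the point: the numbers alone cannot tell a straight line from a curve or from a single outlier.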
Why size matters
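To see how a measured length can depend on the scale of measurement, here is a small sketch using the Koch curve, a textbook fractal, as a stand-in for a coastline. It is not Mandelbrot's coastline data; the function names are made up for this example. Each level of detail replaces every straight segment with four segments one third as long, so the total length keeps growing as the "ruler" shrinks.

```python
from math import hypot, sqrt

SIN60 = sqrt(3) / 2  # sine of 60 degrees
COS60 = 0.5          # cosine of 60 degrees

def koch_points(depth):
    """Return the vertices of a Koch curve drawn between (0, 0) and (1, 0)."""
    points = [(0.0, 0.0), (1.0, 0.0)]
    for _ in range(depth):
        refined = [points[0]]
        for (x1, y1), (x2, y2) in zip(points, points[1:]):
            dx, dy = (x2 - x1) / 3, (y2 - y1) / 3
            a = (x1 + dx, y1 + dy)          # one third of the way along
            b = (x1 + 2 * dx, y1 + 2 * dy)  # two thirds of the way along
            # Tip of the bump: the middle third rotated by 60 degrees about a
            peak = (a[0] + dx * COS60 - dy * SIN60,
                    a[1] + dx * SIN60 + dy * COS60)
            refined += [a, peak, b, (x2, y2)]
        points = refined
    return points

def path_length(points):
    """Sum of straight-line distances between consecutive vertices."""
    return sum(hypot(x2 - x1, y2 - y1)
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

for depth in range(5):
    print(depth, round(path_length(koch_points(depth)), 4))
```

Every extra level of detail multiplies the measured length by 4/3, so there is no single "true" length, only a length at a given scale.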
How are data represented?
11111111 01010011 00001000
11111111 = FF (hexadecimal) → red = 255
01010011 = 53 (hexadecimal) → green = 83
00001000 = 08 (hexadecimal) → blue = 8
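The same three bytes can be read three ways. This short sketch, using only Python built-ins, converts the binary strings from the slide to decimal numbers, to hexadecimal pairs, and finally to the kind of color code you would type into Photoshop. The variable names are chosen here for illustration.

```python
# The three bytes from the slide, written as binary strings.
bits = ["11111111", "01010011", "00001000"]

# Read each string as a base-2 number to get its decimal value.
values = [int(b, 2) for b in bits]

# Format each value as two uppercase hexadecimal digits.
hex_pairs = [format(v, "02X") for v in values]

# Given the right context, the three bytes are one RGB color.
red, green, blue = values
hex_color = "#" + "".join(hex_pairs)

print(values)     # decimal values of the three bytes
print(hex_pairs)  # the same bytes in hexadecimal
print(hex_color)  # the color code built from them
```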
binary <----------------- human readable: “encoding”
binary -----------------> human readable: “decoding”
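A minimal round trip, assuming UTF-8 as the encoding (the slides do not name one): Python's `str.encode` turns human-readable text into bytes, and `bytes.decode` turns those bytes back into text. The sample string is arbitrary.

```python
text = "Buell"

# Encoding: human-readable characters -> bytes
encoded = text.encode("utf-8")

# Show each byte as eight binary digits and as two hexadecimal digits.
as_bits = " ".join(format(byte, "08b") for byte in encoded)
as_hex = encoded.hex()

# Decoding: bytes -> human-readable characters
decoded = encoded.decode("utf-8")

print(as_bits)
print(as_hex)
print(decoded)
```

Decoding with a different encoding than the one used to encode is a common source of garbled characters in scraped data.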
d<>d
(☞゚ヮ゚)☞ ☜(゚ヮ゚☜)
Cleaning data
(ノ◕ヮ◕)ノ*:・゚✧
Pre-processing & Cleaning
Attributes: rating, address, open hours
Objects: Restaurant 1, Restaurant 2, Restaurant 3, Restaurant 4
Feature Selection: address
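The terms on these slides map onto a simple data structure: each object (a restaurant) is one record, and each attribute (rating, address, open hours) is one field. The sketch below uses made-up placeholder values, including one deliberately missing value, and a hypothetical helper `select_features` that keeps only the attributes needed for the next step.

```python
# Each object is one record; each attribute is one key.
# All values are invented placeholders, not real restaurants.
restaurants = [
    {"name": "Restaurant 1", "rating": 4.5, "address": "1 Example St", "open_hours": "9-17"},
    {"name": "Restaurant 2", "rating": 3.0, "address": "2 Example St", "open_hours": "11-23"},
    {"name": "Restaurant 3", "rating": 4.0, "address": "3 Example St", "open_hours": "8-14"},
    # A missing attribute is typical of messy data and must be handled.
    {"name": "Restaurant 4", "rating": 5.0, "address": "4 Example St", "open_hours": None},
]

def select_features(records, keep):
    """Return copies of the records containing only the attributes in `keep`."""
    return [{key: record.get(key) for key in keep} for record in records]

# Feature selection: keep only what is needed to look up a location.
selected = select_features(restaurants, keep=["name", "address"])

# Cleaning check: list the objects with a missing attribute.
incomplete = [r["name"] for r in restaurants if r["open_hours"] is None]

print(selected)
print(incomplete)
```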
First Assignment
attendance/reflection:
shoutkey.com/us

Week 1 - Data Mining the City

Editor's Notes

  1. We’re going to do some exercises: this first one will be on getting data, which will start the weekly assignment. D<>D just means paired designers; we’re going to pair up with whoever has computers because it’s more fun together, and then we can meet each other.
  2. I just graduated with my MArch from GSAPP
  3. Aleppo project at CSR
  4. sidewalk
  5. Where we fit into history
  6. King’s College practiced statistics through engineering. The world’s most powerful computer at Watson Lab, 1954. Paperless studio (CAD). CBIP (Columbia Building Intelligence Project): data/metric-driven design of the built environment. Columbia also hosted Cities Lab and Network Cities. Center for Spatial Research: humanitarian mapping. This is the best place for technology and architecture.
  7. As Professor José van Dijck has described, the computerization of every aspect of life has created a Platform society.
  8. Today most of our social and economic relations take place through platforms like Facebook and Venmo
  9. Tinder’s matching algorithm leads to an increasing number of matches and marriages each year. Ultimately its algorithm will shape the genetic makeup of the human race, as swipes are made, humans are matched and babies are born.
  10. The filters of StreetEasy and Apartment Finder literally filter the makeup of who lives in what neighborhoods, reprogramming entire city zones.
  11. Where the Nolli map once exposed accessible public space, Yelp is now telling individuals what spaces they should like, but everyone sees a different map. These recommendation systems algorithmically segregate cities, generating spatialized filter bubbles which choreograph pedestrian flows through siloed canals across the city.
  12. From Yelp reviews directing people to preferred restaurants to Airbnb reprogramming homes into vacation rentals, the invisible code that powers a city’s use may have more drastic influence than any physical invention in the last century.
  13. But cities have always operated as platforms, as Manuel Castells states - they are the ‘material interfaces’ that connect individual city dwellers.
  14. Just like the networks on the internet, room adjacencies and hallways too act like networks.
  15. Not only have cities operated like platforms; the usage of data in cities isn’t new. In the 1930s, surveys and statistics about the makeup of a place were used to justify the redevelopment of “blighted areas” and for racial redlining. So what is so different about data in the city now?
  16. Today it’s the quantity and ubiquity of that data which is new. The democratization of data through public APIs allows various apps and lone coders to access giant pools of data dropped by tiny transactions throughout the city.
  17. The interconnectedness and availability of this data give immense power to designers to choreograph the use of cities and speculate creatively about the urban environment.
  18. This course will focus on encoding spatial analytical processes. We will hypothesize about the relationships of tools and space, as well as develop models and simulations so designers can gain a foothold in the changing landscape of the digital city.
  19. We will develop a technical training in relevant techniques: using Python, public APIs, batch image and video processes, and visualization techniques in Processing
  20. As well as a critical understanding of the social, economic, and political dynamics caused by these technologies, such as data bias and privacy issues.
  21. In Session A, we will learn about data types, preprocessing data, about location and accuracy
  22. About mapping data and other visualization techniques, about defining spatial patterns, and about recommendation systems
  23. And about Pixels, Images, Video, and computer vision
  24. Session B will be run as workshops tailored to your specific interests (such as sentiment analysis or natural language processing) and will give you the opportunity to deep dive into your own project which can orient around your studio.
  25. Workshops will include expert guest critics from data, cloud computing and urban analytics.
  26. Set of processes or methods for discovering patterns
  27. We’ll do a quick reflection at the end of each class through a google form to give you the opportunity to submit regular feedback on the class as well as mark yourself as here
  28. Every week there will be a tutorial or an assignment that will develop your Project which you will post on Medium. Who knows what Medium is?
  29. Every week there will be a tutorial or an assignment that will develop your Project which you will post on Medium. We’ll get started on the first week’s assignment and you’ll continue it at the end of class.
  30. The course project asks students to use at least 2 NYC datasets to generate a visual argument about change in the city. Projects will be individual, however students are encouraged to share their data sets and methods with a pair coding partner.
  31. Super open on what people want to do for midterm and final review. critics?
  32. Who has computers? groups
  33. Google Street View is an amazing archive of the city but is not yet easily sortable. If we want to see all locations that are marked as historic in New York City, we would need to look up each location from a database of addresses, copy the address into Google Maps, drop the pegman into each location, screenshot each street scene, and then repeat the steps for each location before being able to compare them all.
  34. Artists like Josh Begley have found smarter ways to sample Google Street View. He uses Google’s API and custom scripts to automate the downloading of street view from various locations. In “Officer Involved”, he uses databases of police brutality (collected by non-governmental and news organizations) to sample Street View scenes at the location of each incident, thus immersing us in “the environment of someone’s last moment”.
  35. Where is data stored? Flat files, databases and websites, APIs. What’s an API? Google Maps (church, CVS, bridge, bar, etc.) → Google Sheets: manually scraping
  36. Each dataset has nearly the same summary statistics (mean, standard deviation, correlation)...
  37. ...and yet the datasets are clearly different and visually distinct. Anscombe’s Quartet is the classic example showing how visualization can trump statistics alone.
  38. In a paper on the coastline of Britain, Benoit Mandelbrot showed that it is inherently nonsensical to discuss certain spatial concepts (such as the length of the perimeter of the coastline), even though there may be an inherent presumption that discussing the length of a coastline is valid. Lengths in ecology depend directly on the scale at which they are measured and experienced. So while surveyors commonly measure the length of a river, this length only has meaning in the context of the relevance of the measuring technique to the question under study.
  39. He depicted this idea behind fractal geometry, that certain forms and branching patterns could be seen at multiple scales
  40. Binary is the way computers store data at their lowest level, as electric charge.
  41. We don’t usually write out ones and zeroes. When working with binary data, we often use hexadecimal instead.
  42. But given the proper context, this hexadecimal string actually represents color (you’ve probably used these numbers in Photoshop)
  43. What you may not know is that internally, most data are held as long, one-dimensional sequences of values, either binary (as hexadecimal) or text (as characters).
  44. In computers, encoding is the process of putting a sequence of characters (letters, numbers, punctuation, and certain symbols) into a specialized format for efficient transmission or storage.
  45. Decoding is the opposite process -- the conversion of an encoded format back into the original sequence of characters.
  46. Now that we know a bit about what data are and how they’re stored… let’s get into formatting data
  47. We’re going to use location data to get Street View images from Google’s API (their open data)
  48. We want to clean our data to turn our addresses into latitude and longitude
  49. When we’re talking about our data, there are a couple terms to know...
  50. When we’re talking about our data, there are a couple terms to know...
  51. ...
  52. ...