A Guide to Getting Data


    John Myles White




      April 10, 2013
A Hierarchy of Data Access Schemes


Bulk Downloads
API Access
Web-Scraping
Bulk Downloads
Collections of Bulk Downloads


https://delicious.com/jhofman/data
http://bitly.com/bundles/hmason/1
Some Available Data Sets


Wikipedia
IMDB
Million Song Dataset
SNAP (Social Networks)
Sunlight (Congressional Votes)
Data Formats


Delimited Values
    CSV
    TSV
    WSV
JSON
XML
Ad Hoc Formats
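
A minimal sketch of reading the three delimited formats with Python's standard library. The sample rows are made up for illustration, and reading WSV as "whitespace-separated values" is an assumption about the abbreviation on the slide.

```python
import csv
import io

# Hypothetical sample data; in practice these would be files on disk.
csv_text = "name,score\nSam,5\nJosh,6\n"
tsv_text = "name\tscore\nSam\t5\nJosh\t6\n"
wsv_text = "name score\nSam   5\nJosh  6\n"


def read_delimited(text, delimiter):
    """Parse delimited text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def read_whitespace_separated(text):
    """Split each line on runs of whitespace; the first line is the header."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    return [dict(zip(header, row)) for row in rows]


csv_rows = read_delimited(csv_text, ",")
tsv_rows = read_delimited(tsv_text, "\t")
wsv_rows = read_whitespace_separated(wsv_text)
print(csv_rows)
```

All three calls yield the same records. Every value comes back as a string, so numeric columns still need an explicit conversion.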
JSON

JSON sees the world as hash tables and arrays:

    Hash tables: {"a":    1, "b":    2}
    Arrays: [1, 2, 3]
JSON

Example from json.org:

{"menu": {
   "id": "file",
   "value": "File",
   "popup": {
     "menuitem": [
       {"value": "New", "onclick": "CreateNewDoc()"},
       {"value": "Open", "onclick": "OpenDoc()"},
       {"value": "Close", "onclick": "CloseDoc()"}
     ]
   }
}}
XML

XML views the world as a recursive container:

<container>
    <item>A</item>
    <item>B</item>
    <item>
        <container>
            <item attr="SomePropertyOfC">C’</item>
        </container>
    </item>
</container>
XML

From Wikipedia XML dump:

<mediawiki xml:lang="en">
  <page>
    <title>Page title</title>
    <restrictions>edit=sysop:move=sysop</restrictions>
    <revision>
      <timestamp>2001-01-15T13:15:00Z</timestamp>
      <contributor><username>Foobar</username></contributor>
      <comment>I have just one thing to say!</comment>
      <text>A bunch of [[text]] here.</text>
      <minor />
    </revision>
  </page>
</mediawiki>
Ad Hoc Data Formats


Fixed Width Files
Graph Edgelists
Voting Record Format
Many others...
Fixed Width Format

7-5-5 Format:

Sam    5    6
Josh   6    1211
Nicole 9983 200
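
One way to read the 7-5-5 layout is to slice each line at fixed column offsets. This is a sketch, assuming the three numbers give the field widths in characters; the helper name is made up for illustration.

```python
# Field widths for the 7-5-5 layout shown on the slide.
WIDTHS = (7, 5, 5)

lines = [
    "Sam    5    6",
    "Josh   6    1211",
    "Nicole 9983 200",
]


def parse_fixed_width(line, widths=WIDTHS):
    """Cut a line into fields of the given widths and strip the padding."""
    fields = []
    start = 0
    for width in widths:
        fields.append(line[start:start + width].strip())
        start += width
    return fields


records = [parse_fixed_width(line) for line in lines]
print(records)
```

Splitting on whitespace would also work for these three rows, but it breaks as soon as a field is empty or contains a space, which is why fixed-width files are usually sliced by position.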
Graph Edgelist

Directed Graph Format:

1   2
1   3
1   4
2   3
4   4
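
A short sketch that loads the edgelist above into an adjacency list. It assumes each line reads "source target" and that node labels are integers.

```python
from collections import defaultdict

edgelist_text = """1 2
1 3
1 4
2 3
4 4
"""


def read_edgelist(text):
    """Build {source: [targets]} from whitespace-separated 'source target' lines."""
    adjacency = defaultdict(list)
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        source, target = line.split()
        adjacency[int(source)].append(int(target))
    return dict(adjacency)


graph = read_edgelist(edgelist_text)
print(graph)
```

The last edge, 4 to 4, is a self-loop. Node 3 has no outgoing edges, so it appears only as a target.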
Voting Records

KH Format:
1109991099 0USA 200 BUSH
9999999999999999696996999999996. . .
Unstructured or Misstructured Data


Which Wikipedia articles link to each other?
Have Wikipedia dump of raw text
Need to parse XML, find links, extract them
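
A sketch of those three steps on a tiny stand-in for the dump. Assumptions: the snippet has no XML namespace (real dumps may declare one), the second link is invented to show the `[[Target|label]]` form, and the regular expression is a simplification of wiki link syntax.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical miniature dump modeled on the Wikipedia example.
dump = """<mediawiki xml:lang="en">
  <page>
    <title>Page title</title>
    <revision>
      <text>A bunch of [[text]] here, plus a [[Target page|label]].</text>
    </revision>
  </page>
</mediawiki>"""

# Capture the link target: everything after "[[" up to "]" or "|".
WIKILINK = re.compile(r"\[\[([^\]|]+)")


def extract_links(xml_text):
    """Return {page title: [link targets]} for every <page> element."""
    root = ET.fromstring(xml_text)
    links = {}
    for page in root.iter("page"):
        title = page.findtext("title")
        body = page.findtext("revision/text") or ""
        links[title] = WIKILINK.findall(body)
    return links


links = extract_links(dump)
print(links)
```

The XML parser handles the container structure; the regular expression is only used on the free text inside `<text>`, where no formal parser applies.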
API Access
Sites with APIs


NY Times
Twitter
Google
Facebook
Foursquare
Live Demo of NY Times API

http://developer.nytimes.com/docs
Live Demo of Twitter API

https://dev.twitter.com/docs/api/1.1
Use wget or curl:

wget http://google.com
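
The same kind of HTTP GET can be issued from Python's standard library. This is a sketch: the endpoint `api.example.com`, the parameter names, and the key are placeholders, not a real API, and `fetch_json` is defined but not called here so the block runs without network access.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_url(base_url, params):
    """Append URL-encoded query parameters to a base URL."""
    return base_url + "?" + urlencode(params)


def fetch_json(url, timeout=10):
    """GET a URL and decode the response body as JSON."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Placeholder endpoint and parameters, for illustration only.
url = build_url(
    "https://api.example.com/search",
    {"q": "data wrangling", "api-key": "YOUR_KEY"},
)
print(url)
```

Most APIs in the list above also expect a key or an OAuth token, so check the provider's documentation before calling `fetch_json` on a real endpoint.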
API Wrappers

    Google API Client
    Tweepy - Twitter API
    twitteR
    ...
Tweepy usage:

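import tweepy

# Create API object (the four credentials come from your Twitter app settings)
# ...
auth = tweepy.OAuthHandler(consumer_key, consumer_secret)
auth.set_access_token(access_token, access_token_secret)
api = tweepy.API(auth)

screen_name = "BarackObama"
user_info = api.get_user(screen_name=screen_name)

# Collect the IDs of every account this user follows, one page at a time.
# friends_ids is the method name in the Tweepy release these slides used;
# newer releases (Tweepy 4.x) rename it to get_friend_ids.
user_friends = []
for page in tweepy.Cursor(api.friends_ids,
                          screen_name=screen_name).pages():
    for user_id in page:
        user_friends.append(user_id)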
Parsing Data


Regular Expressions
Formal Parsers
    XML Parsers
    HTML Parsers
Basics of Regular Expressions

A Pattern Language for Text with Three Parts

    Character literals: a, b, 5
    Repetition operator: *
    Logical OR: |
(cat)|(dog)
(cats*)|(dogs*)
(ha)*
grep "cat" /usr/share/dict/words
grep -E "(cats*)|(dogs*)" /usr/share/dict/words
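
The same patterns can be tried in Python's `re` module. A sketch with a made-up word list: note that grep reports any line containing a match, whereas `fullmatch` below requires the whole string to match.

```python
import re

# Hypothetical word list for illustration.
words = ["cat", "cats", "catss", "dog", "dogs", "bird", "haha", ""]

animal = re.compile(r"(cats*)|(dogs*)")
laugh = re.compile(r"(ha)*")

# Whole-string matches only.
animals = [w for w in words if animal.fullmatch(w)]
laughs = [w for w in words if laugh.fullmatch(w)]

# search() behaves like grep: a match anywhere in the string counts.
found_inside = animal.search("concatenate") is not None

print(animals, laughs, found_inside)
```

Because `*` allows zero repetitions, `(ha)*` also matches the empty string.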
Advanced Tools:

    Complex Repetition: *, +, ?, {m,n}
    Character Classes: [0-9], [a-z]
    Special Character Classes: \d, \w
Complex Repetition:

    a*: 0 or more occurrences of a
    a+: 1 or more occurrences of a
    a?: 0 or 1 occurrences of a
    a{m,n}: At least m and no more than n occurrences of a
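
A quick check of the four repetition operators in Python, as a sketch. The bounded form is written without a space after the comma; Python's `re` module, for one, does not read `{2, 3}` as a repetition.

```python
import re


def matches(pattern, text):
    """True if the whole string matches the pattern."""
    return re.fullmatch(pattern, text) is not None


star = [matches(r"a*", s) for s in ("", "a", "aaa")]
plus = [matches(r"a+", s) for s in ("", "a", "aaa")]
optional = [matches(r"a?", s) for s in ("", "a", "aa")]
bounded = [matches(r"a{2,3}", s) for s in ("a", "aa", "aaa", "aaaa")]

# With a space inside the braces the pattern is no longer a repetition.
spaced = matches(r"a{2, 3}", "aa")

print(star, plus, optional, bounded, spaced)
```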
Character Classes:

    [0-9]
    [a-z]
    [0-9a-zA-Z]
    [^0-9] Negate a character class
Special Character Classes:

    \d: Any digit
    \D: Any non-digit
    \w: Any word character
    \s: Any whitespace character
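
A sketch of the bracketed and backslash classes (`\d`, `\D`, `\w`, `\s`) in Python. The sample sentence is made up for illustration.

```python
import re

# Hypothetical sample text.
text = "Room 42, floor 7: call_me at 9am"

digits = re.findall(r"\d", text)          # single digits
digit_runs = re.findall(r"[0-9]+", text)  # runs of digits
words = re.findall(r"\w+", text)          # runs of word characters
only_digits = re.sub(r"\D", "", text)     # drop every non-digit
same_thing = re.sub(r"[^0-9]", "", text)  # negated class, same effect here
pieces = re.split(r"\s+", text)           # split on whitespace

print(digits, digit_runs, words, only_digits, pieces)
```

Note that `\w` counts the underscore as a word character, so `call_me` stays in one piece.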
Matching Phone Numbers


555-5757
800-555-5757
800.555.5757
1-800-555-5757
+1 800 555 5757
5--52---25
First Draft Regular Expression

(\d|-)+
Second Draft

\d\d\d-\d\d\d\d
Third Draft

\d\d\d[.-]\d\d\d\d
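
A sketch that runs the three drafts, written with the digit class `\d`, against the candidate strings from the phone number slide. The non-number at the end of that list is typed here with plain ASCII hyphens, and `555.5757` is an extra, hypothetical candidate added to show what the third draft changes.

```python
import re

candidates = [
    "555-5757",
    "800-555-5757",
    "800.555.5757",
    "1-800-555-5757",
    "+1 800 555 5757",
    "5--52---25",   # not a phone number
    "555.5757",     # hypothetical extra case, not on the slide
]

drafts = {
    "first": r"(\d|-)+",
    "second": r"\d\d\d-\d\d\d\d",
    "third": r"\d\d\d[.-]\d\d\d\d",
}

# For each draft, keep the candidates that match in full.
accepted = {
    name: [c for c in candidates if re.fullmatch(pattern, c)]
    for name, pattern in drafts.items()
}

for name, hits in accepted.items():
    print(name, hits)
```

The first draft is too loose: it accepts the string of stray hyphens. The second and third drafts are too strict: neither handles area codes, country codes, or spaces yet.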
Formal Parsers
Python JSON

import json
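# Serialize a Python dict to a JSON string
print(json.dumps({'4': 5, '6': [7, 8]}))
# Parse a JSON string back into Python lists and dicts
print(json.loads('["foo", {"bar": ["baz", null, 1.0, 2]}]'))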
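
Once parsed, JSON is ordinary Python dicts and lists. A sketch using the json.org menu example from the earlier slide:

```python
import json

menu_json = """
{"menu": {
   "id": "file",
   "value": "File",
   "popup": {
     "menuitem": [
       {"value": "New", "onclick": "CreateNewDoc()"},
       {"value": "Open", "onclick": "OpenDoc()"},
       {"value": "Close", "onclick": "CloseDoc()"}
     ]
   }
}}
"""

doc = json.loads(menu_json)

# Hash tables become dicts, arrays become lists.
items = doc["menu"]["popup"]["menuitem"]
labels = [item["value"] for item in items]

# JSON null becomes None in Python.
parsed = json.loads('["foo", {"bar": ["baz", null, 1.0, 2]}]')

# dumps followed by loads gives back an equal structure.
round_trip = json.loads(json.dumps(doc))

print(labels, parsed[1]["bar"])
```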
Python XML Parser

<data>
    <items>
        <item   name="item1"></item>
        <item   name="item2"></item>
        <item   name="item3"></item>
        <item   name="item4"></item>
    </items>
</data>
Python XML Parser

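from xml.dom import minidom

# Parse the XML file shown on the previous slide (saved as items.xml)
xmldoc = minidom.parse('items.xml')
itemlist = xmldoc.getElementsByTagName('item')
print(len(itemlist))
print(itemlist[0].attributes['name'].value)
# Print the name attribute of every <item>
for s in itemlist:
    print(s.attributes['name'].value)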
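
For comparison, a sketch of the same extraction with `xml.etree.ElementTree`, another parser in the standard library. It is swapped in here for minidom, and the XML is parsed from a string rather than from `items.xml` so the example is self-contained.

```python
import xml.etree.ElementTree as ET

xml_text = """<data>
    <items>
        <item name="item1"></item>
        <item name="item2"></item>
        <item name="item3"></item>
        <item name="item4"></item>
    </items>
</data>"""

root = ET.fromstring(xml_text)

# iter() walks the whole tree, so nesting depth does not matter.
names = [item.get("name") for item in root.iter("item")]

print(len(names))
print(names)
```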
Web-Scraping
Crawling web
Spidering data
Scraping HTML for information
wget google.com
Developer Console Demo
Many HTML parsing libraries:

    Beautiful Soup
    Nokogiri
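
A sketch of link extraction using only the standard library's `html.parser`, shown here instead of Beautiful Soup or Nokogiri so the example runs without extra installs. The page content and the class name are made up for illustration.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical page; in practice this would be the HTML fetched by wget.
page = (
    '<html><body><p>See <a href="/data">data</a> and '
    '<a href="http://example.com/api">the API</a>.</p></body></html>'
)

collector = LinkCollector()
collector.feed(page)
links = collector.links
print(links)
```

Dedicated libraries add conveniences such as CSS selectors and tolerance for badly broken markup, which matters on real pages.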
Generic UNIX Tools


grep
sort
more
wc
cut
awk
...

More Related Content

What's hot

Traversals for all ocasions
Traversals for all ocasionsTraversals for all ocasions
Traversals for all ocasionsLuka Jacobowitz
 
Haskell for Scala-ists
Haskell for Scala-istsHaskell for Scala-ists
Haskell for Scala-istschriseidhof
 
Elm introduction
Elm   introductionElm   introduction
Elm introductionMix & Go
 
A Higher-Order Data Flow Model for Heterogeneous Big Data
A Higher-Order Data Flow Model for Heterogeneous Big DataA Higher-Order Data Flow Model for Heterogeneous Big Data
A Higher-Order Data Flow Model for Heterogeneous Big DataSimon Price
 
Spark + Clojure for Topic Discovery - Zalando Tech Clojure/Conj Talk
Spark + Clojure for Topic Discovery - Zalando Tech Clojure/Conj TalkSpark + Clojure for Topic Discovery - Zalando Tech Clojure/Conj Talk
Spark + Clojure for Topic Discovery - Zalando Tech Clojure/Conj TalkZalando Technology
 
Programming in python Unit-1 Part-1
Programming in python Unit-1 Part-1Programming in python Unit-1 Part-1
Programming in python Unit-1 Part-1Vikram Nandini
 
Databases for Beginners SQLite
Databases for Beginners SQLiteDatabases for Beginners SQLite
Databases for Beginners SQLiteChristopher Wimble
 
A la découverte de TypeScript
A la découverte de TypeScriptA la découverte de TypeScript
A la découverte de TypeScriptDenis Voituron
 
2019-01-24 Sequelize ORM (Object Relational Mapper): models, migrations, oh my
2019-01-24 Sequelize ORM (Object Relational Mapper): models, migrations, oh my2019-01-24 Sequelize ORM (Object Relational Mapper): models, migrations, oh my
2019-01-24 Sequelize ORM (Object Relational Mapper): models, migrations, oh myRyan M Harrison
 
Facebook Graph Search by Ole martin mørk for jdays2013 Gothenburg www.jdays.se
Facebook Graph Search by Ole martin mørk for jdays2013 Gothenburg www.jdays.seFacebook Graph Search by Ole martin mørk for jdays2013 Gothenburg www.jdays.se
Facebook Graph Search by Ole martin mørk for jdays2013 Gothenburg www.jdays.sehamidsamadi
 
The Global Performing Arts Database
The Global Performing Arts DatabaseThe Global Performing Arts Database
The Global Performing Arts DatabasePaul Houle
 
The Ring programming language version 1.8 book - Part 48 of 202
The Ring programming language version 1.8 book - Part 48 of 202The Ring programming language version 1.8 book - Part 48 of 202
The Ring programming language version 1.8 book - Part 48 of 202Mahmoud Samir Fayed
 
ITT 2015 - Saul Mora - Object Oriented Function Programming
ITT 2015 - Saul Mora - Object Oriented Function ProgrammingITT 2015 - Saul Mora - Object Oriented Function Programming
ITT 2015 - Saul Mora - Object Oriented Function ProgrammingIstanbul Tech Talks
 
2013 CrossRef Workshops Boot Camp Introduction Patricia Feeney
2013 CrossRef Workshops Boot Camp Introduction Patricia Feeney2013 CrossRef Workshops Boot Camp Introduction Patricia Feeney
2013 CrossRef Workshops Boot Camp Introduction Patricia FeeneyCrossref
 

What's hot (20)

CSS for developers
CSS for developersCSS for developers
CSS for developers
 
Traversals for all ocasions
Traversals for all ocasionsTraversals for all ocasions
Traversals for all ocasions
 
Haskell for Scala-ists
Haskell for Scala-istsHaskell for Scala-ists
Haskell for Scala-ists
 
Elm introduction
Elm   introductionElm   introduction
Elm introduction
 
04 standard class library c#
04 standard class library c#04 standard class library c#
04 standard class library c#
 
A Higher-Order Data Flow Model for Heterogeneous Big Data
A Higher-Order Data Flow Model for Heterogeneous Big DataA Higher-Order Data Flow Model for Heterogeneous Big Data
A Higher-Order Data Flow Model for Heterogeneous Big Data
 
Spark + Clojure for Topic Discovery - Zalando Tech Clojure/Conj Talk
Spark + Clojure for Topic Discovery - Zalando Tech Clojure/Conj TalkSpark + Clojure for Topic Discovery - Zalando Tech Clojure/Conj Talk
Spark + Clojure for Topic Discovery - Zalando Tech Clojure/Conj Talk
 
Programming in python Unit-1 Part-1
Programming in python Unit-1 Part-1Programming in python Unit-1 Part-1
Programming in python Unit-1 Part-1
 
Interfaces to xapian
Interfaces to xapianInterfaces to xapian
Interfaces to xapian
 
Databases for Beginners SQLite
Databases for Beginners SQLiteDatabases for Beginners SQLite
Databases for Beginners SQLite
 
A la découverte de TypeScript
A la découverte de TypeScriptA la découverte de TypeScript
A la découverte de TypeScript
 
2019-01-24 Sequelize ORM (Object Relational Mapper): models, migrations, oh my
2019-01-24 Sequelize ORM (Object Relational Mapper): models, migrations, oh my2019-01-24 Sequelize ORM (Object Relational Mapper): models, migrations, oh my
2019-01-24 Sequelize ORM (Object Relational Mapper): models, migrations, oh my
 
Listview to dif
Listview to difListview to dif
Listview to dif
 
Facebook Graph Search by Ole martin mørk for jdays2013 Gothenburg www.jdays.se
Facebook Graph Search by Ole martin mørk for jdays2013 Gothenburg www.jdays.seFacebook Graph Search by Ole martin mørk for jdays2013 Gothenburg www.jdays.se
Facebook Graph Search by Ole martin mørk for jdays2013 Gothenburg www.jdays.se
 
Graph search with Neo4j
Graph search with Neo4jGraph search with Neo4j
Graph search with Neo4j
 
The Global Performing Arts Database
The Global Performing Arts DatabaseThe Global Performing Arts Database
The Global Performing Arts Database
 
Fs2 - Crash Course
Fs2 - Crash CourseFs2 - Crash Course
Fs2 - Crash Course
 
The Ring programming language version 1.8 book - Part 48 of 202
The Ring programming language version 1.8 book - Part 48 of 202The Ring programming language version 1.8 book - Part 48 of 202
The Ring programming language version 1.8 book - Part 48 of 202
 
ITT 2015 - Saul Mora - Object Oriented Function Programming
ITT 2015 - Saul Mora - Object Oriented Function ProgrammingITT 2015 - Saul Mora - Object Oriented Function Programming
ITT 2015 - Saul Mora - Object Oriented Function Programming
 
2013 CrossRef Workshops Boot Camp Introduction Patricia Feeney
2013 CrossRef Workshops Boot Camp Introduction Patricia Feeney2013 CrossRef Workshops Boot Camp Introduction Patricia Feeney
2013 CrossRef Workshops Boot Camp Introduction Patricia Feeney
 

Viewers also liked

Computational Social Science, Lecture 08: Counting Fast, Part II
Computational Social Science, Lecture 08: Counting Fast, Part IIComputational Social Science, Lecture 08: Counting Fast, Part II
Computational Social Science, Lecture 08: Counting Fast, Part IIjakehofman
 
Computational Social Science, Lecture 11: Regression
Computational Social Science, Lecture 11: RegressionComputational Social Science, Lecture 11: Regression
Computational Social Science, Lecture 11: Regressionjakehofman
 
Computational Social Science, Lecture 07: Counting Fast, Part I
Computational Social Science, Lecture 07: Counting Fast, Part IComputational Social Science, Lecture 07: Counting Fast, Part I
Computational Social Science, Lecture 07: Counting Fast, Part Ijakehofman
 
Computational Social Science, Lecture 03: Counting at Scale, Part I
Computational Social Science, Lecture 03: Counting at Scale, Part IComputational Social Science, Lecture 03: Counting at Scale, Part I
Computational Social Science, Lecture 03: Counting at Scale, Part Ijakehofman
 
Computational Social Science, Lecture 10: Online Experiments
Computational Social Science, Lecture 10: Online ExperimentsComputational Social Science, Lecture 10: Online Experiments
Computational Social Science, Lecture 10: Online Experimentsjakehofman
 
Computational Social Science, Lecture 13: Classification
Computational Social Science, Lecture 13: ClassificationComputational Social Science, Lecture 13: Classification
Computational Social Science, Lecture 13: Classificationjakehofman
 
Computational Social Science, Lecture 06: Networks, Part II
Computational Social Science, Lecture 06: Networks, Part IIComputational Social Science, Lecture 06: Networks, Part II
Computational Social Science, Lecture 06: Networks, Part IIjakehofman
 
Computational Social Science, Lecture 05: Networks, Part I
Computational Social Science, Lecture 05: Networks, Part IComputational Social Science, Lecture 05: Networks, Part I
Computational Social Science, Lecture 05: Networks, Part Ijakehofman
 
Computational Social Science, Lecture 04: Counting at Scale, Part II
Computational Social Science, Lecture 04: Counting at Scale, Part IIComputational Social Science, Lecture 04: Counting at Scale, Part II
Computational Social Science, Lecture 04: Counting at Scale, Part IIjakehofman
 
Computational Social Science, Lecture 02: An Introduction to Counting
Computational Social Science, Lecture 02: An Introduction to CountingComputational Social Science, Lecture 02: An Introduction to Counting
Computational Social Science, Lecture 02: An Introduction to Countingjakehofman
 
Modeling Social Data, Lecture 6: Regression, Part 1
Modeling Social Data, Lecture 6: Regression, Part 1Modeling Social Data, Lecture 6: Regression, Part 1
Modeling Social Data, Lecture 6: Regression, Part 1jakehofman
 
Modeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to CountingModeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to Countingjakehofman
 
Modeling Social Data, Lecture 1: Overview
Modeling Social Data, Lecture 1: OverviewModeling Social Data, Lecture 1: Overview
Modeling Social Data, Lecture 1: Overviewjakehofman
 
Assist Workshop 2016 - Phil Gray - Interactions
Assist Workshop 2016 - Phil Gray - InteractionsAssist Workshop 2016 - Phil Gray - Interactions
Assist Workshop 2016 - Phil Gray - Interactionsassist
 
From Ubisoft Montreal to Fantasia: Happy 15th Anniversary
From Ubisoft Montreal to Fantasia: Happy 15th AnniversaryFrom Ubisoft Montreal to Fantasia: Happy 15th Anniversary
From Ubisoft Montreal to Fantasia: Happy 15th AnniversaryUbisoft Montreal
 
Mapa Conceptual 11 6
Mapa Conceptual 11 6Mapa Conceptual 11 6
Mapa Conceptual 11 6guest7379f6
 
Conociendo Parte De La AmazoníA
Conociendo Parte De La AmazoníAConociendo Parte De La AmazoníA
Conociendo Parte De La AmazoníARufinaespi
 

Viewers also liked (20)

Computational Social Science, Lecture 08: Counting Fast, Part II
Computational Social Science, Lecture 08: Counting Fast, Part IIComputational Social Science, Lecture 08: Counting Fast, Part II
Computational Social Science, Lecture 08: Counting Fast, Part II
 
Computational Social Science, Lecture 11: Regression
Computational Social Science, Lecture 11: RegressionComputational Social Science, Lecture 11: Regression
Computational Social Science, Lecture 11: Regression
 
Computational Social Science, Lecture 07: Counting Fast, Part I
Computational Social Science, Lecture 07: Counting Fast, Part IComputational Social Science, Lecture 07: Counting Fast, Part I
Computational Social Science, Lecture 07: Counting Fast, Part I
 
Computational Social Science, Lecture 03: Counting at Scale, Part I
Computational Social Science, Lecture 03: Counting at Scale, Part IComputational Social Science, Lecture 03: Counting at Scale, Part I
Computational Social Science, Lecture 03: Counting at Scale, Part I
 
Computational Social Science, Lecture 10: Online Experiments
Computational Social Science, Lecture 10: Online ExperimentsComputational Social Science, Lecture 10: Online Experiments
Computational Social Science, Lecture 10: Online Experiments
 
Computational Social Science, Lecture 13: Classification
Computational Social Science, Lecture 13: ClassificationComputational Social Science, Lecture 13: Classification
Computational Social Science, Lecture 13: Classification
 
Computational Social Science, Lecture 06: Networks, Part II
Computational Social Science, Lecture 06: Networks, Part IIComputational Social Science, Lecture 06: Networks, Part II
Computational Social Science, Lecture 06: Networks, Part II
 
Computational Social Science, Lecture 05: Networks, Part I
Computational Social Science, Lecture 05: Networks, Part IComputational Social Science, Lecture 05: Networks, Part I
Computational Social Science, Lecture 05: Networks, Part I
 
Computational Social Science, Lecture 04: Counting at Scale, Part II
Computational Social Science, Lecture 04: Counting at Scale, Part IIComputational Social Science, Lecture 04: Counting at Scale, Part II
Computational Social Science, Lecture 04: Counting at Scale, Part II
 
Computational Social Science, Lecture 02: An Introduction to Counting
Computational Social Science, Lecture 02: An Introduction to CountingComputational Social Science, Lecture 02: An Introduction to Counting
Computational Social Science, Lecture 02: An Introduction to Counting
 
Modeling Social Data, Lecture 6: Regression, Part 1
Modeling Social Data, Lecture 6: Regression, Part 1Modeling Social Data, Lecture 6: Regression, Part 1
Modeling Social Data, Lecture 6: Regression, Part 1
 
Modeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to CountingModeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to Counting
 
Modeling Social Data, Lecture 1: Overview
Modeling Social Data, Lecture 1: OverviewModeling Social Data, Lecture 1: Overview
Modeling Social Data, Lecture 1: Overview
 
Assist Workshop 2016 - Phil Gray - Interactions
Assist Workshop 2016 - Phil Gray - InteractionsAssist Workshop 2016 - Phil Gray - Interactions
Assist Workshop 2016 - Phil Gray - Interactions
 
Ninos sabios
Ninos sabiosNinos sabios
Ninos sabios
 
From Ubisoft Montreal to Fantasia: Happy 15th Anniversary
From Ubisoft Montreal to Fantasia: Happy 15th AnniversaryFrom Ubisoft Montreal to Fantasia: Happy 15th Anniversary
From Ubisoft Montreal to Fantasia: Happy 15th Anniversary
 
практ8
практ8практ8
практ8
 
Mapa Conceptual 11 6
Mapa Conceptual 11 6Mapa Conceptual 11 6
Mapa Conceptual 11 6
 
Conociendo Parte De La AmazoníA
Conociendo Parte De La AmazoníAConociendo Parte De La AmazoníA
Conociendo Parte De La AmazoníA
 
лабар8
лабар8лабар8
лабар8
 

Similar to Computational Social Science, Lecture 09: Data Wrangling

Golang slidesaudrey
Golang slidesaudreyGolang slidesaudrey
Golang slidesaudreyAudrey Lim
 
Crafting Evolvable Api Responses
Crafting Evolvable Api ResponsesCrafting Evolvable Api Responses
Crafting Evolvable Api Responsesdarrelmiller71
 
Introduction to clojure
Introduction to clojureIntroduction to clojure
Introduction to clojureAbbas Raza
 
The Django Web Application Framework
The Django Web Application FrameworkThe Django Web Application Framework
The Django Web Application FrameworkSimon Willison
 
Breaking down data silos with the open data protocol
Breaking down data silos with the open data protocolBreaking down data silos with the open data protocol
Breaking down data silos with the open data protocolWoodruff Solutions LLC
 
JSON Fuzzing: New approach to old problems
JSON Fuzzing: New  approach to old problemsJSON Fuzzing: New  approach to old problems
JSON Fuzzing: New approach to old problemstitanlambda
 
Ado.Net Data Services (Astoria)
Ado.Net Data Services (Astoria)Ado.Net Data Services (Astoria)
Ado.Net Data Services (Astoria)Igor Moochnick
 
Native Phone Development 101
Native Phone Development 101Native Phone Development 101
Native Phone Development 101Sasmito Adibowo
 
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.GeeksLab Odessa
 
T3chFest 2016 - The polyglot programmer
T3chFest 2016 - The polyglot programmerT3chFest 2016 - The polyglot programmer
T3chFest 2016 - The polyglot programmerDavid Muñoz Díaz
 
Jackson beyond JSON: XML, CSV
Jackson beyond JSON: XML, CSVJackson beyond JSON: XML, CSV
Jackson beyond JSON: XML, CSVTatu Saloranta
 
Import web resources using R Studio
Import web resources using R StudioImport web resources using R Studio
Import web resources using R StudioRupak Roy
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDBantoinegirbal
 
2011 Mongo FR - MongoDB introduction
2011 Mongo FR - MongoDB introduction2011 Mongo FR - MongoDB introduction
2011 Mongo FR - MongoDB introductionantoinegirbal
 
The Django Web Framework (EuroPython 2006)
The Django Web Framework (EuroPython 2006)The Django Web Framework (EuroPython 2006)
The Django Web Framework (EuroPython 2006)Simon Willison
 
Python Code Camp for Professionals 4/4
Python Code Camp for Professionals 4/4Python Code Camp for Professionals 4/4
Python Code Camp for Professionals 4/4DEVCON
 
Http4s, Doobie and Circe: The Functional Web Stack
Http4s, Doobie and Circe: The Functional Web StackHttp4s, Doobie and Circe: The Functional Web Stack
Http4s, Doobie and Circe: The Functional Web StackGaryCoady
 

Similar to Computational Social Science, Lecture 09: Data Wrangling (20)

Golang slidesaudrey
Golang slidesaudreyGolang slidesaudrey
Golang slidesaudrey
 
Crafting Evolvable Api Responses
Crafting Evolvable Api ResponsesCrafting Evolvable Api Responses
Crafting Evolvable Api Responses
 
Introduction to clojure
Introduction to clojureIntroduction to clojure
Introduction to clojure
 
The Django Web Application Framework
The Django Web Application FrameworkThe Django Web Application Framework
The Django Web Application Framework
 
Breaking down data silos with the open data protocol
Breaking down data silos with the open data protocolBreaking down data silos with the open data protocol
Breaking down data silos with the open data protocol
 
JSON Fuzzing: New approach to old problems
JSON Fuzzing: New  approach to old problemsJSON Fuzzing: New  approach to old problems
JSON Fuzzing: New approach to old problems
 
Ado.Net Data Services (Astoria)
Ado.Net Data Services (Astoria)Ado.Net Data Services (Astoria)
Ado.Net Data Services (Astoria)
 
Native Phone Development 101
Native Phone Development 101Native Phone Development 101
Native Phone Development 101
 
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
Java/Scala Lab: Борис Трофимов - Обжигающая Big Data.
 
T3chFest 2016 - The polyglot programmer
T3chFest 2016 - The polyglot programmerT3chFest 2016 - The polyglot programmer
T3chFest 2016 - The polyglot programmer
 
Jackson beyond JSON: XML, CSV
Jackson beyond JSON: XML, CSVJackson beyond JSON: XML, CSV
Jackson beyond JSON: XML, CSV
 
Import web resources using R Studio
Import web resources using R StudioImport web resources using R Studio
Import web resources using R Studio
 
JavaScript Lessons 2023 V2
JavaScript Lessons 2023 V2JavaScript Lessons 2023 V2
JavaScript Lessons 2023 V2
 
Introduction to MongoDB
Introduction to MongoDBIntroduction to MongoDB
Introduction to MongoDB
 
2011 Mongo FR - MongoDB introduction
2011 Mongo FR - MongoDB introduction2011 Mongo FR - MongoDB introduction
2011 Mongo FR - MongoDB introduction
 
The Django Web Framework (EuroPython 2006)
The Django Web Framework (EuroPython 2006)The Django Web Framework (EuroPython 2006)
The Django Web Framework (EuroPython 2006)
 
Python Code Camp for Professionals 4/4
Python Code Camp for Professionals 4/4Python Code Camp for Professionals 4/4
Python Code Camp for Professionals 4/4
 
Http4s, Doobie and Circe: The Functional Web Stack
Http4s, Doobie and Circe: The Functional Web StackHttp4s, Doobie and Circe: The Functional Web Stack
Http4s, Doobie and Circe: The Functional Web Stack
 
huhu
huhuhuhu
huhu
 
Odp
OdpOdp
Odp
 

More from jakehofman

Modeling Social Data, Lecture 12: Causality & Experiments, Part 2
Modeling Social Data, Lecture 12: Causality & Experiments, Part 2Modeling Social Data, Lecture 12: Causality & Experiments, Part 2
Modeling Social Data, Lecture 12: Causality & Experiments, Part 2jakehofman
 
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1Modeling Social Data, Lecture 11: Causality and Experiments, Part 1
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1jakehofman
 
Modeling Social Data, Lecture 10: Networks
Modeling Social Data, Lecture 10: NetworksModeling Social Data, Lecture 10: Networks
Modeling Social Data, Lecture 10: Networksjakehofman
 
Modeling Social Data, Lecture 8: Classification
Modeling Social Data, Lecture 8: ClassificationModeling Social Data, Lecture 8: Classification
Modeling Social Data, Lecture 8: Classificationjakehofman
 
Modeling Social Data, Lecture 7: Model complexity and generalization
Modeling Social Data, Lecture 7: Model complexity and generalizationModeling Social Data, Lecture 7: Model complexity and generalization
Modeling Social Data, Lecture 7: Model complexity and generalizationjakehofman
 
Modeling Social Data, Lecture 4: Counting at Scale
Modeling Social Data, Lecture 4: Counting at ScaleModeling Social Data, Lecture 4: Counting at Scale
Modeling Social Data, Lecture 4: Counting at Scalejakehofman
 
Modeling Social Data, Lecture 3: Data manipulation in R
Modeling Social Data, Lecture 3: Data manipulation in RModeling Social Data, Lecture 3: Data manipulation in R
Modeling Social Data, Lecture 3: Data manipulation in Rjakehofman
 
Modeling Social Data, Lecture 8: Recommendation Systems
Modeling Social Data, Lecture 8: Recommendation SystemsModeling Social Data, Lecture 8: Recommendation Systems
Modeling Social Data, Lecture 8: Recommendation Systemsjakehofman
 
Modeling Social Data, Lecture 6: Classification with Naive Bayes
Modeling Social Data, Lecture 6: Classification with Naive BayesModeling Social Data, Lecture 6: Classification with Naive Bayes
Modeling Social Data, Lecture 6: Classification with Naive Bayesjakehofman
 
Modeling Social Data, Lecture 3: Counting at Scale
Modeling Social Data, Lecture 3: Counting at ScaleModeling Social Data, Lecture 3: Counting at Scale
Modeling Social Data, Lecture 3: Counting at Scalejakehofman
 
Modeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to CountingModeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to Countingjakehofman
 
Modeling Social Data, Lecture 1: Case Studies
Modeling Social Data, Lecture 1: Case StudiesModeling Social Data, Lecture 1: Case Studies
Modeling Social Data, Lecture 1: Case Studiesjakehofman
 
NYC Data Science Meetup: Computational Social Science
NYC Data Science Meetup: Computational Social ScienceNYC Data Science Meetup: Computational Social Science
NYC Data Science Meetup: Computational Social Sciencejakehofman
 
Technical Tricks of Vowpal Wabbit
Technical Tricks of Vowpal WabbitTechnical Tricks of Vowpal Wabbit
Technical Tricks of Vowpal Wabbitjakehofman
 
Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10jakehofman
 
Data-driven modeling: Lecture 09
Data-driven modeling: Lecture 09Data-driven modeling: Lecture 09
Data-driven modeling: Lecture 09jakehofman
 
Using Data to Understand the Brain
Using Data to Understand the BrainUsing Data to Understand the Brain
Using Data to Understand the Brainjakehofman
 

More from jakehofman (17)

Modeling Social Data, Lecture 12: Causality & Experiments, Part 2
Modeling Social Data, Lecture 12: Causality & Experiments, Part 2Modeling Social Data, Lecture 12: Causality & Experiments, Part 2
Modeling Social Data, Lecture 12: Causality & Experiments, Part 2
 
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1Modeling Social Data, Lecture 11: Causality and Experiments, Part 1
Modeling Social Data, Lecture 11: Causality and Experiments, Part 1
 
Modeling Social Data, Lecture 10: Networks
Modeling Social Data, Lecture 10: NetworksModeling Social Data, Lecture 10: Networks
Modeling Social Data, Lecture 10: Networks
 
Modeling Social Data, Lecture 8: Classification
Modeling Social Data, Lecture 8: ClassificationModeling Social Data, Lecture 8: Classification
Modeling Social Data, Lecture 8: Classification
 
Modeling Social Data, Lecture 7: Model complexity and generalization
Modeling Social Data, Lecture 7: Model complexity and generalizationModeling Social Data, Lecture 7: Model complexity and generalization
Modeling Social Data, Lecture 7: Model complexity and generalization
 
Modeling Social Data, Lecture 4: Counting at Scale
Modeling Social Data, Lecture 4: Counting at ScaleModeling Social Data, Lecture 4: Counting at Scale
Modeling Social Data, Lecture 4: Counting at Scale
 
Modeling Social Data, Lecture 3: Data manipulation in R
Modeling Social Data, Lecture 3: Data manipulation in RModeling Social Data, Lecture 3: Data manipulation in R
Modeling Social Data, Lecture 3: Data manipulation in R
 
Modeling Social Data, Lecture 8: Recommendation Systems
Modeling Social Data, Lecture 8: Recommendation SystemsModeling Social Data, Lecture 8: Recommendation Systems
Modeling Social Data, Lecture 8: Recommendation Systems
 
Modeling Social Data, Lecture 6: Classification with Naive Bayes
Modeling Social Data, Lecture 6: Classification with Naive BayesModeling Social Data, Lecture 6: Classification with Naive Bayes
Modeling Social Data, Lecture 6: Classification with Naive Bayes
 
Modeling Social Data, Lecture 3: Counting at Scale
Modeling Social Data, Lecture 3: Counting at ScaleModeling Social Data, Lecture 3: Counting at Scale
Modeling Social Data, Lecture 3: Counting at Scale
 
Modeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to CountingModeling Social Data, Lecture 2: Introduction to Counting
Modeling Social Data, Lecture 2: Introduction to Counting
 
Modeling Social Data, Lecture 1: Case Studies
Modeling Social Data, Lecture 1: Case StudiesModeling Social Data, Lecture 1: Case Studies
Modeling Social Data, Lecture 1: Case Studies
 
NYC Data Science Meetup: Computational Social Science
NYC Data Science Meetup: Computational Social ScienceNYC Data Science Meetup: Computational Social Science
NYC Data Science Meetup: Computational Social Science
 
Technical Tricks of Vowpal Wabbit
Technical Tricks of Vowpal WabbitTechnical Tricks of Vowpal Wabbit
Technical Tricks of Vowpal Wabbit
 
Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10Data-driven modeling: Lecture 10
Data-driven modeling: Lecture 10
 
Data-driven modeling: Lecture 09
Data-driven modeling: Lecture 09Data-driven modeling: Lecture 09
Data-driven modeling: Lecture 09
 
Using Data to Understand the Brain
Using Data to Understand the BrainUsing Data to Understand the Brain
Using Data to Understand the Brain
 

Computational Social Science, Lecture 09: Data Wrangling