
Data extraction tools

Digital Media Winter Institute. Smart Data Sprint: Interpreters of platform data.
29 Jan - 2 Feb 2018. Universidade Nova de Lisboa, Lisbon, Portugal.
Practical Lab introducing Social Media Methods, taking data extraction tools as its starting point.

Published in: Social Media

  1. Data Extraction Tools. #SMARTdatasprint, 2018. Cristian Ruiz (@CristianCJRuiz), M.A. Communication Sciences, SMART Research Member, iNova Media Lab – NOVA/FCSH
  2. Agenda: Data Extraction Tools. 1. SMART goals. 2. API relevance and debate. 3. Diversity of extraction tools. 4. How to use Netlytic. 5. How to use DMI: YouTube Data Tools. 6. How to use Netvizz. 7. Tools output.
  3. SMART goals: Social Media Methods. 1. What for: Social Sciences, Communication Sciences, Medical Sciences, Geography, Culture, Political Sciences, and many more. 2. They comprise a set of data processes: extraction, visualization, analysis, and critique. Example studies: Sakaki, T., Okazaki, M. & Matsuo, Y. - Earthquake Shakes Twitter Users (2010); Tremayne, M. - Anatomy of Protest in the Digital Era (2014); Burgess, J. et al. - Platform Studies (2017); Lampos, V., De Bie, T. & Cristianini, N. - Flu Detector (2010); Smith, R. & Sanderson, J. - I'm Going to Instagram It! (2014); Del Vicario et al. - The Anatomy of Brexit Debate on Facebook (2016).
  4. Where to find data? The social media API (Application Programming Interface): the part of the platform's software that exposes a specific library and set of functions to external applications. In Social Media Methods, a data extraction tool interacts with the platform API to retrieve the queried data.
  5. API debate: who decides what is public? Limitations for Social Media Methods: What digital objects are available for data extraction? What media content can be part of my analysis? How far back in time can data be retrieved? What are the standard output files? (Omena, 2016). Keep these questions in mind when developing an extraction tool.
  6. Tools for social media platforms. DMI Tools: https://wiki.digitalmethods.net/Dmi/ToolDatabase Netvizz: https://apps.facebook.com/netvizz/ NodeXL: http://nodexl.codeplex.com/ MédiaLab Tools: http://www.medialab.sciences-po.fr/tools/ SocioViz: http://socioviz.net/SNA/eu/sna/login.jsp And many more: http://socialmediadata.wikidot.com/start
  7. Tools: Netlytic. Image from: https://netlytic.org
  8. Data sources: Twitter, Facebook, Instagram, YouTube. Twitter requires linking your Twitter account; Instagram requires linking your Instagram account. Start by creating an account at https://netlytic.org
  9. Datasets. Types of accounts: https://netlytic.org/home/?page_id=10851 The dataset depends on the data source: https://netlytic.org/home/?page_id=10851
  10. Data source: extracting from Twitter. Link your Twitter account to the tool, then name your dataset. Enter one or more keywords, hashtags or @usernames of interest (and a language). To combine more than one search term, use the conjunctions "and" and "or", e.g. #SmartDataSprint and Data Sprint. Then click the "Preview" button to see your extracted data. You now have a dataset to export as .csv or to work with inside Netlytic!
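
The way search terms are combined on the Twitter slide can be sketched in a few lines of Python. This is only an illustration: `build_query` is a hypothetical helper, not part of Netlytic, and the exact query syntax Netlytic accepts is not confirmed by the slides.

```python
def build_query(terms, operator="and"):
    """Join search terms with a conjunction ("and" or "or").

    Hypothetical helper for illustration; not a Netlytic function.
    """
    operator = operator.lower()
    if operator not in ("and", "or"):
        raise ValueError("operator must be 'and' or 'or'")
    # Drop empty entries and surrounding whitespace before joining.
    cleaned = [t.strip() for t in terms if t.strip()]
    return f" {operator} ".join(cleaned)


# The example from the slide: a hashtag combined with a keyword phrase.
query = build_query(["#SmartDataSprint", "Data Sprint"])
print(query)
```

The resulting string would then be pasted into the search field when creating the dataset.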
  11. Data source: Facebook. Name your dataset. For groups, the ID is required; for pages, the URL is required. The tool retrieves posts and the comments on posts, but not replies to comments. An e-mail is sent once the data collection is done (or check the status of your dataset). You now have a dataset to export as .csv or to work with inside Netlytic!
  12. Data source: Instagram. Link your Instagram account to the tool, then name your dataset. Query by keyword (hashtag) or by location. An e-mail is sent once the data collection is done (or check the status of your dataset). You now have a dataset to export as .csv or to work with inside Netlytic!
  13. Data source: YouTube. Name your dataset and enter the YouTube video ID. The tool retrieves the comments on the video. You now have a dataset to export as .csv or to work with inside Netlytic!
  14. Where to find your datasets. Each entry shows: status (In progress / Complete / One-time collection), data source, name of dataset, Subset, Share with a colleague, Edit, Download.
  15. Tools: DMI YouTube Data Tools. A collection of simple tools for extracting data from the YouTube platform via the YouTube API v3. Created by Bernhard Rieder as part of the Digital Methods Initiative. Output: Gephi files (.gdf) and tab files (.tab).
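
To make "via the YouTube API v3" more concrete, here is a minimal sketch of what a request to that API can look like. It only builds the request URL for the `commentThreads` endpoint and sends nothing over the network. It is not how YouTube Data Tools is implemented; `comment_threads_url`, `VIDEO_ID` and `YOUR_API_KEY` are placeholders introduced for this example.

```python
from urllib.parse import urlencode

# Base URL of the YouTube Data API v3.
API_BASE = "https://www.googleapis.com/youtube/v3"


def comment_threads_url(video_id, api_key, max_results=20):
    """Build (but do not send) a commentThreads request URL.

    Hypothetical helper for illustration only.
    """
    params = {
        "part": "snippet",          # which resource parts to return
        "videoId": video_id,        # the video whose comments are requested
        "maxResults": max_results,  # page size
        "key": api_key,             # placeholder; a real key is required
    }
    return f"{API_BASE}/commentThreads?{urlencode(params)}"


url = comment_threads_url("VIDEO_ID", "YOUR_API_KEY")
print(url)
```

An extraction tool hides this step: the researcher supplies the video or channel ID, and the tool issues the requests and converts the responses into output files.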
  16. Getting the channel ID
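
The transcript keeps only the title of this slide. As an assumption about what it demonstrated, one common case is a channel URL of the form `https://www.youtube.com/channel/<ID>`, where the ID can be read straight from the path. The sketch below covers only that URL shape; `channel_id_from_url` is a hypothetical helper and the example URLs are placeholders.

```python
from urllib.parse import urlparse


def channel_id_from_url(url):
    """Return the ID from a /channel/<ID> URL, or None for other URL shapes.

    Hypothetical helper; custom or user-name URLs are not resolved here.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) >= 2 and parts[0] == "channel":
        return parts[1]
    return None


# Placeholder URLs, not real channels.
print(channel_id_from_url("https://www.youtube.com/channel/UC_EXAMPLE_ID"))
print(channel_id_from_url("https://www.youtube.com/user/someone"))
```

URLs that do not contain the ID in the path need another lookup step, which this sketch does not attempt.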
  17. Tools: Netvizz. Output: tab files (.tab), Gephi files (.gdf), tab-separated files (.tsv).
  18. Output: .gdf, .tab, .csv, etc. The extraction produces a file, normally .gdf, .tab or .csv. Depending on the research and analysis, these files have to be loaded into specific software. E.g. analysis: network analysis; software: Gephi (.gdf).
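
To show what sits inside a .gdf file before it is loaded into Gephi, here is a minimal sketch of a reader. It assumes the usual GDF layout: a `nodedef>` header followed by node rows, then an `edgedef>` header followed by edge rows. `parse_gdf` is a hypothetical helper and the sample graph is invented for illustration; it is not output from any of the tools above.

```python
import csv
import io


def parse_gdf(text):
    """Split GDF text into a list of node dicts and a list of edge dicts."""
    nodes, edges = [], []
    target, columns = None, []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        first = row[0]
        if first.startswith("nodedef>") or first.startswith("edgedef>"):
            # A header line switches between the node and edge sections.
            target = nodes if first.startswith("nodedef>") else edges
            row[0] = first.split(">", 1)[1]
            # Keep only the column name, dropping the type (e.g. VARCHAR).
            columns = [cell.strip().split(" ")[0] for cell in row]
            continue
        if target is not None:
            target.append(dict(zip(columns, row)))
    return nodes, edges


# Invented sample data, for illustration only.
sample = """nodedef>name VARCHAR,label VARCHAR
a,Alice
b,Bob
edgedef>node1 VARCHAR,node2 VARCHAR
a,b
"""
nodes, edges = parse_gdf(sample)
print(len(nodes), "nodes,", len(edges), "edges")
```

In practice Gephi opens .gdf files directly; a reader like this is only useful for a quick check of what an extraction returned.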
  19. What to do with the output? Open an analysis tool and load your files! Etc. Image from: https://netlytic.org Image from: http://santuan.github.io/stn/open-source/ Image from: http://gephi.org/
  20. Then: 1. Social media data (collected and stored by the social media platform). 2. Data supplied by the API (depends on social media site policies). 3. The extraction tool requires a query (keywords, hashtags, locations, etc.). 4. The extraction tool retrieves a dataset. 5. The extraction tool creates an output (.gdf, .tab, .csv, etc.). 6. Now visualize it! (Gephi, NodeXL, DMI tools, etc.)
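
The chain on this slide (query, dataset, output file) can be sketched end to end with in-memory data. Everything here is a stand-in: the `records` list plays the role of data supplied by an API, and `extract` and `to_tab` are hypothetical helpers, not functions from Netlytic, Netvizz or the DMI tools.

```python
import csv
import io


def extract(records, keyword):
    """Keep the records whose text contains the keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [r for r in records if keyword in r["text"].lower()]


def to_tab(dataset, fieldnames):
    """Serialize the dataset as tab-separated text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(dataset)
    return buffer.getvalue()


# Invented records standing in for data supplied by a platform API.
records = [
    {"id": "1", "text": "Opening the #SMARTdatasprint"},
    {"id": "2", "text": "Unrelated post"},
]
dataset = extract(records, "#smartdatasprint")
output = to_tab(dataset, ["id", "text"])
print(output)
```

Writing `output` to a file with a .tab extension would give something a spreadsheet or an analysis tool can open as the next step.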
  21. Data Extraction Tools. #SMARTdatasprint, 2018. Cristian Ruiz (@CristianCJRuiz), M.A. Communication Sciences, SMART Research Member, iNova Media Lab – NOVA/FCSH