Twitter_socioviz gephi_mspowerbi


How to create an interactive online report from Twitter data. The data is collected with Socioviz, the network is analyzed with Gephi, and the report itself is built with MS Power BI.

Workshop data is available online on request.

  2. MISSION OF THE WORKSHOP
     • Find different clusters in the discussion
       • By hashtag network
       • By user-mention network
     • Find opinion leaders in each of those clusters
     • Allow users to browse the most shared content from these opinion leaders
     • Have a look at a possible sentiment analysis workflow
  3. WORKFLOW
     Data preparation (15 mins)
     • We look at the Socioviz platform in order to understand the data source
     • The data is ready in a zip file
     • We create an Excel file for import into Power BI
     Network analysis (45 mins)
     • We create some network pictures
     • We create CSV exports for Excel -> Power BI
     Power BI reporting (45 mins)
     • Import the data and look at how the data model is created
     • Create some pages and visuals
  4. TOOLS
     • Socioviz is an online platform for Twitter data collection. It gives us ready-made network files and all the tweets for a given search as an Excel file.
     • Gephi is an open-source network analytics tool. It gives us the visualization of the network and also the metrics for MS Power BI.
     • MS Power BI is a platform for big data analysis. Basically, one can think of it as an interactive hybrid of Excel and PowerPoint that lives in the 'cloud'.
  5. SOCIOVIZ DATA SOURCE
     We get the following in a zip file:
     • All the tweets in Excel 97 format -> convert to xlsx
     • Top 10 Excel files for some attributes. These are nice to know, but we want to create our own report.
     • Network files for mentions, hashtags, words and emojis. We use the hashtag and mention networks.
     • Limitation in the data: no tweet-level metrics are present, such as retweets or favorites. We can, however, count how many times a given piece of media is shared.
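     Since tweet-level metrics are missing, counting shared media is the main popularity signal available. A minimal sketch of that count follows. The rows and the column name `media` are hypothetical stand-ins; the real column names in the Socioviz export may differ.

```python
from collections import Counter

# Hypothetical rows standing in for the tweets sheet of the export.
# The column name "media" is an assumption, not the real Socioviz header.
tweets = [
    {"user": "alice", "text": "Budget debate tonight", "media": "https://example.org/a.jpg"},
    {"user": "bob", "text": "RT budget debate", "media": "https://example.org/a.jpg"},
    {"user": "carol", "text": "No link here", "media": ""},
    {"user": "dave", "text": "Another view", "media": "https://example.org/b.jpg"},
]


def count_shared_media(rows, column="media"):
    """Count how many tweets carry each non-empty media value."""
    return Counter(row[column] for row in rows if row.get(column))


media_counts = count_shared_media(tweets)
top_media = media_counts.most_common(1)  # the single most shared item
print(top_media)
```

     The same count can be done in Excel or Power BI; the sketch only shows which grouping is meant.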
  6. GEPHI TO DO LIST
     • Create network images of
       • Mentions network
       • Hashtags network
     • Export node data with metrics for both of the above
     • Identify the clusters; in other words, find the set of top nodes for each cluster
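     Finding the top nodes of each cluster can also be done outside Gephi, from the exported node table. The sketch below assumes the export has the columns `Label`, `modularity_class` and `Degree`; the actual columns depend on which statistics were run in Gephi before exporting, and the sample rows are invented.

```python
import csv
import io
from collections import defaultdict

# Invented sample of a node export. Column names are assumptions and
# depend on which statistics were computed before the export.
SAMPLE_NODES_CSV = """Id,Label,modularity_class,Degree
a,alice,0,12
b,bob,0,7
c,carol,1,9
d,dave,1,15
e,erin,0,3
"""


def top_nodes_per_cluster(csv_text, metric="Degree", n=2):
    """Return the n highest-scoring node labels for every modularity class."""
    clusters = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        clusters[row["modularity_class"]].append((row["Label"], float(row[metric])))
    return {
        cluster: [
            label
            for label, _ in sorted(nodes, key=lambda item: item[1], reverse=True)[:n]
        ]
        for cluster, nodes in clusters.items()
    }


top_nodes = top_nodes_per_cluster(SAMPLE_NODES_CSV)
print(top_nodes)
```

     Passing a different `metric` column gives the top nodes by another measure, provided that column exists in the export.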
  7. SOCIOVIZ DATA PREPARATION IN EXCEL
     • Fix the date column bug
     • Add _nodata to empty cells in the attribute columns. This will be handy later in Power Query in Power BI.
     • Import the Gephi node CSVs into Excel and prepare cluster identifications; in other words, we give names to the modularity values of the hashtag and mention networks. This will also be handy later in MS Power BI.
     • In the mentions data, create an attribute column for node type, here party, president of the party, media, NGO etc. This is also used later in Power BI.
     • In the mentions data, create a ranking by metrics; in other words, each node's position in the top lists by different metrics.
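     The workshop does these steps by hand in Excel. For readers who prefer a script, the three row-level operations (fill empty cells with _nodata, name the modularity classes, rank by a metric) can be sketched as follows. All field names, cluster names and values are hypothetical, and ties in the metric simply get consecutive ranks.

```python
# Hypothetical node rows; field names and values are illustrative only.
nodes = [
    {"label": "alice", "modularity_class": "0", "node_type": "party", "degree": 12},
    {"label": "bob", "modularity_class": "1", "node_type": "", "degree": 7},
    {"label": "carol", "modularity_class": "0", "node_type": "media", "degree": 15},
]

# Hypothetical human-readable names for the modularity classes.
CLUSTER_NAMES = {"0": "Cluster A", "1": "Cluster B"}


def prepare(rows, cluster_names, metric="degree"):
    """Fill empty cells, attach a cluster name, and add a rank column."""
    ranked = sorted(rows, key=lambda row: row[metric], reverse=True)
    prepared = []
    for rank, row in enumerate(ranked, start=1):
        # Replace empty strings with the _nodata marker.
        new_row = {key: (value if value != "" else "_nodata") for key, value in row.items()}
        new_row["cluster_name"] = cluster_names.get(row["modularity_class"], "_nodata")
        new_row[f"rank_by_{metric}"] = rank  # 1 = highest value of the metric
        prepared.append(new_row)
    return prepared


prepared = prepare(nodes, CLUSTER_NAMES)
for row in prepared:
    print(row)
```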
  8. MS POWER BI TO DO LIST
     • Import data
     • Create the data model
     • Import some visuals
     • Create a filtering page
     • Create a tweet content page with drillthrough
     • Create a matrix with hashtag clusters on x and user clusters on y. As values we take the IDs from the tweets table -> look at content
     • Create a top items page by clusters (users, hashtags, words?)
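     To make the matrix item concrete: each cell holds the number of tweets that fall into one user cluster and one hashtag cluster. The sketch below reproduces that count in plain Python on invented rows. The field names are hypothetical, and counting distinct IDs is an assumption about which aggregation is chosen in the visual.

```python
from collections import defaultdict

# Invented rows standing in for the tweets table after cluster names
# have been joined in; field names are hypothetical.
tweets = [
    {"id": "1", "hashtag_cluster": "Economy", "user_cluster": "Cluster A"},
    {"id": "2", "hashtag_cluster": "Economy", "user_cluster": "Cluster A"},
    {"id": "3", "hashtag_cluster": "Health", "user_cluster": "Cluster A"},
    {"id": "4", "hashtag_cluster": "Economy", "user_cluster": "Cluster B"},
    {"id": "4", "hashtag_cluster": "Economy", "user_cluster": "Cluster B"},  # duplicate row
]


def cluster_matrix(rows):
    """Count distinct tweet IDs per (user cluster, hashtag cluster) cell."""
    cells = defaultdict(set)
    for row in rows:
        cells[(row["user_cluster"], row["hashtag_cluster"])].add(row["id"])
    return {cell: len(ids) for cell, ids in cells.items()}


matrix = cluster_matrix(tweets)
for (user_cluster, hashtag_cluster), count in sorted(matrix.items()):
    print(user_cluster, hashtag_cluster, count)
```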
  9. MS POWER BI VISUALS TIP 1
     • Enable this feature in settings
     • It will make creating many pages easier
     • All the visuals automatically filter each other
  10. POSSIBLE OUTCOME
     • A possible outcome is this Power BI report, which is now published online
     • There are 6 different pages
     • You can go and have a look prior to the workshop