The document outlines an analytic framework for effectively analyzing social media data. It discusses common pitfalls to avoid, such as relying too heavily on metrics without context. The framework involves data capture, reporting, and analysis in an iterative cycle. A case study applies the framework to understand public sentiment toward a new video game. Key steps included refining queries, tagging entities, visualizing reports, and generating hypotheses to improve analysis over time.
Building Effective Frameworks for Social Media Analysis | ikanow
This document outlines an analytic framework for effectively analyzing social media data. It discusses common pitfalls to avoid, such as relying too heavily on metrics without understanding context. The framework involves capturing data from multiple social media sources, reporting insights through visualizations, and iteratively analyzing the data to test hypotheses and make recommendations. A case study applies this framework to understand public sentiment toward a new video game. The document emphasizes adapting to changing data and focusing analysis on addressing specific operational needs.
Partner Webinar: Recommendation Engines with MongoDB and Hadoop | MongoDB
Personalized recommendations drive business, helping people find the products they want, the news they need, and the music they didn't know they would love. Despite the obvious advantages, many companies either don't have recommendations or don't leverage their data to make good ones. Too many recommendation engines are black-box algorithms that are hard to change or don't scale well. Using the same recommendation techniques as used at StubHub, Viacom, and AP, this technical webinar will show you how to load your data from MongoDB into Hadoop, generate recommendations, and then put those recommendations into MongoDB, ready to serve end-users. This webinar will prepare you to build a custom recommender for your company that is highly scalable, easy to understand, and built on open-source technology.
K Young: About the speaker
K Young is the CEO of Mortar Data. Mortar serves data scientists and engineers with a service that makes creating and operating high-scale data pipelines easy. Mortar contributes to several open source projects including Pig, Luigi, and the Mongo-Hadoop connector. Prior to founding Mortar Data, K built software that reaches one in ten public school students in the U.S. He holds a Computer Science degree from Rice University.
The document discusses personas and user archetypes for public libraries. It describes capturing narratives from library users to identify key characters, issues, behaviors, and needs. Workshops were held to generate anecdotes from users, which were then grouped into archetypes, themes, and values. This information aims to help libraries understand user expectations for services, content, and virtual interaction. The goal is to identify priority requirements for specific user groups represented by personas.
This document provides an overview of Donna Maurer's presentation on information architecture. She discusses conducting user research through methods like interviews, card sorting, and surveys. Key insights from research are analyzed using affinity diagramming and dimensional analysis. Different types of information structures are explored, including hierarchies, databases, faceted classification, and organic structures. Classification schemes like topic, task, audience, and geography are also examined. The goal is to organize content in a way that allows people to discover the information they need.
The Iceberg Approach - Power from what lies beneath in SEO for a mobile-first... | Dawn Anderson MSc DigM
In a mobile-first world, where time-poor, research-crazy users browse mainly with one eye and one hand, we need to consider a few new approaches to SEO. Information overload is all around. In some cases, we need to think about how to maintain relevance with less. We need to consider the Iceberg Theory and, at the same time, the Iceberg Syndrome, as well as where content and links are placed in the user's view.
Social Media & Hurricane Sandy: Crisis Case Studies | Sandra Fathi
Social Media & Hurricane Sandy: Crisis Communications Case Studies from ConEdison, JetBlue and the MTA. Before, during and after the hurricane that devastated much of the Northeast, these three organizations faced tremendous challenges in restoring service and communicating with constituents and media.
Moderator:
Sandra Fathi, President, Affect (@TeamAffect, @sandrafathi)
Panelists:
Michael Clendenin, Director of Media Relations, Con Edison (@ConEdison)
Aaron Donovan, Media Liaison, MTA (@Aaron_Donovan, @MTAInsider)
Eugene Ribeiro, Director, Promotions and Business Development, MTA New York City Transit (@MTAInsider)
Morgan Johnston, Manager of Corporate Communications, Social Media Strategist, JetBlue (@MHJohnston, @JetBlue)
Presented on February 20th, 2013 at Affect (www.affect.com)
Open Analytics: Building Effective Frameworks for Social Media Analysis | ikanow
The document outlines an analytic framework for effectively analyzing social media data. It discusses common pitfalls to avoid, such as analyzing what is said rather than why. The framework involves capturing relevant social media data, reporting on it, and analyzing it to answer operational questions. A case study applies the framework by analyzing Twitter sentiment about a new video game. It finds that the hashtags are generic and expands the query to better understand attitudes toward the product. The document concludes by recommending segmentation, graph analysis, and tying metrics to actions.
Josh Liss presented on building effective frameworks for social media analysis. He discussed how social media can be used from an intelligence perspective and common pitfalls to avoid, such as analyzing what is said instead of why. He provided a case study on analyzing social media data during Superstorm Sandy to help disaster response efforts. Key aspects included problem definition, data capture from Twitter, reporting trends and entities, and analyzing hashtags, sentiment, and networks to identify new questions and ways social media could supplement emergency systems.
Advanced Analytics and Data Science Expertise | SoftServe
An overview of SoftServe's Data Science service line.
- Data Science Group
- Data Science Offerings for Business
- Machine Learning Overview
- AI & Deep Learning Case Studies
- Big Data & Analytics Case Studies
Visit our website to learn more: http://www.softserveinc.com/en-us/
Lecture 5: Mining, Analysis and Visualisation | Marieke van Erp
This is the fourth lecture in the Social Web course at the VU University Amsterdam
Visit the website for more information: Social Web 2012
Advanced Use Cases for Analytics Breakout Session | Splunk
This document discusses Splunk's analytics capabilities and how to develop analytics for business users. It introduces personas as user types in a Splunk deployment beyond core IT. Requirements should be gathered for each persona, including their business problem, relevant data sources, and how they prefer to consume results. Searches and data models can then be developed and delivered through dashboards, visualizations, or third-party tools. Advanced analytics techniques discussed include anomaly detection, data visualization, and predictive analytics, accompanied by demos. The document encourages reaching out to Splunk technical teams for help in growing analytics beyond IT.
This document discusses the social media analysis solution space. It describes who the solution providers are (researchers, software, services), what they provide (social media analysis and analytics-infused advisory services), who they serve (business users), and how (through various technologies). The document also outlines some key business questions that social media analysis can help answer, and the different approaches taken by industry to work backwards from goals and insights to determine appropriate data, methods, and presentations.
Agile Data Science is a lean methodology adapted from Agile Software Development. At its core, it centers on people, interactions, and building minimally viable products to ship fast and often to solicit customer feedback. In this presentation, I describe how this work was done in the past, with examples. Get started today with our help by visiting http://www.alpinenow.com
How can we mine, analyse and visualise the Social Web?
In this lecture, you will learn about mining social web data for analysis, including data preparation and gathering basic statistics on your data.
Early Lessons from Building Sensor.Network: An Open Data Exchange for the Web... | benaam
The document describes Sensor.Network, an open data exchange platform for the Internet of Things. It discusses lessons learned from interactions with customers that highlighted common issues like data collection, sharing, and analysis. Sensor.Network aims to address these issues by providing an open API and tools for data-centric collaboration inspired by social networks. These include tags, annotations, access controls, and event notifications. The platform supports heterogeneous devices and data insertion/retrieval through a REST API. It provides views, visualizations, and the ability to integrate statistical analysis packages to enable insights from sensor data.
The document discusses social media monitoring and measurement. It defines monitoring as watching conversations to determine a course of action, while measurement is quantifying online activity to establish success. Key differences between monitoring and measurement data are explained. Recommendations are provided on tools for both monitoring and measurement, and how to set goals and select appropriate reports.
The document provides an overview of social media marketing and strategies for using various social media platforms. It discusses defining social media marketing and key platforms like social networks, blogs, Twitter, and multimedia content. It also outlines seven steps for planning and executing a social media campaign, including setting goals, defining team roles, branding and integrating elements, researching platforms, and documenting the process. Quantitative metrics for gauging success are also presented. The overall message is that an effective social media strategy requires ongoing maintenance across multiple platforms.
Univ. of AZ Global Racing Symposium 2015 - Digital Strategies | smfrisby
Provides a high-level view of how organizations can leverage Big Data in the digital space. Covers topics such as structured vs unstructured data, curating disparate data sources and exploiting the data correlation opportunities.
Amundsen is a metadata-driven application developed by Lyft to solve data discovery challenges. It provides a search-based UI and uses a distributed architecture with various microservices to index and serve metadata from multiple sources. Key components include a metadata service using Neo4j, a search service using Elasticsearch, and a frontend. The tool has been hugely successful at Lyft and is now open source. Future work includes expanding metadata coverage and integrating with other tools.
The document discusses the evolution of search engines from basic keyword search to semantic search using knowledge graphs and structured data. It provides examples of how search engines like Google are now able to provide direct answers to queries by searching structured data rather than just documents. It emphasizes the importance of representing web content as structured data using schemas like schema.org to be discoverable in semantic search and knowledge graphs.
Social Media Data Collection & Analysis | Scott Sanders
A non-technical primer on how to collect and analyze social media data. This was an invited lecture by the Biostatistics and Bioinformatics Department in the School of Public Health at the University of Louisville.
The document provides an overview of data science, big data, data mining, and data mining techniques. It defines data science as a multi-disciplinary field that uses scientific methods to extract knowledge from structured and unstructured data. Big data is described as diverse datasets that are too large for traditional databases to handle. Common data mining tasks like prediction, classification, clustering and association rule mining are summarized. Finally, specific techniques like decision trees, k-means clustering, and association rule mining are outlined.
Introduction to Competitive Intelligence Portals | Comintelli
The number of companies that are successfully deploying various kinds of Competitive Intelligence (CI) portal solutions is constantly growing. The phrases CI portals, Intelligence systems, CI tools, and MI portals are heard everywhere, but what do they really mean? And why should you really care?
Neo4j GraphTour Santa Monica 2019 - Amundsen Presentation | TamikaTannis
Tamika Tannis gave a presentation on Lyft's open source data discovery tool called Amundsen. She discussed Lyft's data ecosystem and the challenges of data discovery. Amundsen addresses these challenges through search, metadata, and visualization capabilities powered by a graph database backend. The tool has been hugely successful at Lyft and an active open source community is contributing to its ongoing development and new features.
This document discusses using open source tools to combine search engines, analytics, semantics, and content management systems (CMS) to enhance organizational knowledge management. It proposes a stack that uses Drupal as a semantic CMS with Apache Stanbol for content enhancement, reasoning, knowledge models, and persistence of semantic data. This integrated approach allows information and knowledge to be discovered, shared, reused, and retained within and across organizations.
This document summarizes cybersecurity policy issues before Congress from 2012-2014 following the Snowden leaks. It discusses key pillars debated in 2012 like critical infrastructure protection and information sharing between government and private sector. In 2013, an executive order focused on voluntary best practices and increased information sharing. The document outlines various cybersecurity bills introduced but not passed. It predicts lame duck issues in the Senate and changes in congressional committee leadership going forward. It also summarizes lessons from a crisis response exercise showing focus on critical infrastructure protection and developing cybersecurity job skills.
This document discusses how cyber intelligence can be used to combat advanced cyber adversaries. It notes that traditional computer network defense is no longer sufficient due to state-sponsored groups, hacktivists, and crime rings. Cyber intelligence involves fusing open source data, reports, and internal attack data to provide organizations threat profiles, attack timelines, and malware intelligence. This intelligence can be combined with network defense to give a broader view of adversaries and better arm organizations against advanced threats.
CDM….Where do you start? (OA Cyber Summit) | Open Analytics
The document discusses ForeScout's network access control solution. It provides visibility into networked devices and endpoints, including those that are and aren't corporate assets. It can control access based on compliance levels, perform continuous monitoring, and share information. The solution offers user and device authentication, posture assessment, policy-based enforcement across networks and infrastructure, and integration with existing enterprise tools through an open platform. It allows network access control to be implemented gradually over time through a staged approach.
An Immigrant’s view of Cyberspace (OA Cyber Summit) | Open Analytics
This document discusses different perspectives on cyberspace. It notes that cyberspace is constantly dynamic, pervades everything, and can be seen as another reality. The document separates cyberspace into geographic and persona layers and invites questions and comments on viewing cyberspace.
MOLOCH: Search for Full Packet Capture (OA Cyber Summit) | Open Analytics
Moloch is an open source packet capture system built using Elasticsearch for storage and indexing and a Node.js web interface for searching. It consists of a capture process that extracts session profile information from packets and writes it to Elasticsearch, allowing the packet data and metadata to be queried and browsed through a web GUI or APIs. It is designed for scalability, supporting clustering across multiple nodes to handle large packet volumes.
Observations on CFR.org Website Traffic Surge Due to Chechnya Terrorism Scare... | Open Analytics
The document summarizes website traffic data from the Council on Foreign Relations (CFR) website following the Boston Marathon bombings in April 2013. It found a significant surge in traffic on April 19th, with over 100,000 additional visits, focused on the page about Chechen terrorism. The traffic came from new sources like news sites and social media, and more visits from mobile devices and countries like the Netherlands and Australia. This showed that CFR was seen as an authoritative source for information on the suspected Chechen connection to the bombings.
Using Real-Time Data to Drive Optimization & PersonalizationOpen Analytics
This document discusses using real-time data and machine learning techniques to optimize, segment, and personalize digital experiences. It provides examples of optimization, segmentation, and personalization. It also describes building a platform that uses various technologies like Couchbase, Spring, and MongoDB to power a real-time engine that chooses offers for customers based on their data and business rules. This platform delivers personalized experiences and content to clients to increase conversions over time as it continuously learns from customer interactions and offer history.
The document discusses an upcoming tech summit hosted by Bois Capital, an investment bank focusing on the technology sector. Bois Capital's managing partners have extensive experience in the telecom big data analytics sector. The summit will provide an overview of the telco analytics market and applications across various stakeholders. Recent M&A transactions in the space are also analyzed, with revenue multiples typically between 3-5x for companies under $100m in revenue. The document concludes with a case study of Bois Capital advising a Swiss mobile analytics company in its sale to Gemalto.
The document discusses how businesses can compete in the digital economy. It covers topics like using big data and analytics to gain insights, delivering superior customer experiences, and the need to act on data insights. It provides examples of how various industries like healthcare, retail, automotive and insurance can leverage digital technologies and data to improve operations and customer value. The key message is that competing in the digital world involves using data and technology to improve quality of service while maintaining operational simplicity and price competitiveness.
Piwik: An Analytics Alternative (Chicago Summit)Open Analytics
The document discusses Piwik, an open-source web analytics platform. It provides an alternative to Google Analytics that gives users more control and independence over their behavioral data. The summary describes how Piwik is freely available, can be hosted anywhere, has a simple interface, provides real-time reporting, and is highly customizable. It also notes that an initial Piwik installation takes around 10-20 minutes.
Social Media, Cloud Computing, Machine Learning, Open Source, and Big Data An...Open Analytics
This document discusses using social media, cloud computing, machine learning, open source, and big data analytics to analyze Twitter data. It describes how to collect tweets using the Twitter API, classify tweets in real-time using machine learning models on AWS, store classified tweets in MongoDB on AWS, and present results. Cost estimates for real-time classification of 1 million tweets per day are provided. Use cases described include tracking food poisoning reports and disease occurrence. Future directions discussed include developing turnkey services and linking to additional open data sources.
Crossing the Chasm (Ikanow - Chicago Summit)Open Analytics
The document discusses the results of a study on the effects of a new drug on memory and cognitive function in older adults. The double-blind study involved 100 participants aged 65-80 who were given either the drug or a placebo daily for 6 months. Researchers found that those who received the drug performed significantly better on memory and problem-solving tests at the end of the study compared to those who received the placebo.
On the “Moneyball” – Building the Team, Product, and Service to Rival (Pegged...Open Analytics
This document discusses how a hospital system used big data analytics to reduce staff turnover rates and the associated costs of replacing employees. It provides data on turnover rates and replacement costs for nurses and non-nurses from 2009 to 2012. For nurses, the turnover rate decreased from 22.91% in 2009 to 24.01% in 2012, and the estimated replacement cost was over $14 million. For non-nurses, the turnover rate decreased from 21.49% to 24.53% over the same period, with a replacement cost of over $13 million in 2012. The total estimated cost of turnover for 2012 was over $27 million. The document also outlines best practices for using big data, including clearly defining objectives
Data evolutions in media, marketing, and retail (Business Adv Group - Chicago...Open Analytics
The document discusses evolutions in media, marketing, and retail. It notes that media content and operations are going digital, enabling individual distribution and programmatic selling. Marketing is becoming more integrated to enable demand discovery, touchpoint messaging, and product lifecycle relationship management. Retail operations are also going digital, enabling location-based messaging, offers, and product services both online and in physical stores. Data sources are expanding to include more location data, purchase behavior data, and data from sensors. Integration is becoming the cornerstone of real-time analytics across industries.
Characterizing Risk in your Supply Chain (nContext - Chicago Summit)Open Analytics
1. The document discusses characterizing risk in supply chains, with a focus on human trafficking in agriculture. It identifies key challenges in understanding risk, including lack of information about lower tier suppliers.
2. Using data analysis, a methodology was established to characterize risk from supplier survey responses at the country, state, and item level. This allows identification of high risk vendors and areas for mitigation.
3. Opportunities exist to enhance risk characterization by incorporating additional publicly available and paid data sources, and monitoring industry news and social media.
From Insight to Impact (Chicago Summit - Keynote)Open Analytics
This document discusses five critical pillars for the success of analytics and data science projects: 1) align with corporate strategy, 2) ignite stakeholder engagement, 3) sharpen team focus, 4) drive change management, and 5) recruit key talent. It provides guidance on each pillar, such as prioritizing analytics opportunities by their impact and horizon, understanding stakeholder incentives, avoiding "zombie" projects, enabling experiments to drive change, and pre-screening talent for technical skills and culture fit. Following these pillars can help organizations improve analytics project success rates and better compete through data-driven insights.
This document discusses how EasyBib uses data analytics to help students improve their research skills and citation quality. It analyzes student paper bibliographies and source usage over time to identify top sources and credibility trends. EasyBib developed features to warn students about source credibility, analyze citation quality, and provide analytics on source usage. This data-driven approach helped shift top sources from places like Wikipedia to more credible sources like The New York Times and CDC. The document discusses expanding these data analytics efforts through tools like Cloudant Search to further help students find better sources and evaluate source credibility in real-time.
The document discusses enabling information discovery by unifying search and data management. It provides a brief history of search engines and databases from the 1960s to present. It then proposes that search and databases could be unified by using a schema-agnostic, hierarchical data model with a universal index that can index structured and unstructured data alike. Examples of potential use cases are given, such as creating a 360-degree customer view or enabling fraud prevention. The presentation concludes by suggesting future areas could include more semantic technologies and graph traversal capabilities.
The caprate presentation_july2013_open analytics dc meetupOpen Analytics
This document discusses capitalization rates and how they relate to property income and investment returns. It mentions paying X amount for a property with Y income and how that translates into the return on the investment money. The document also notes there will be a demonstration related to capitalization rates.
Verifeed open analytics_3min deck_071713_finalOpen Analytics
The document discusses Verifeed, a company that analyzes social media conversations to provide insights for enterprises. It highlights large potential markets, example use cases showing benefits, plans to grow revenue through an initial product launch and expansion. Key points include:
- Verifeed's platform allows customers to filter social data to identify relevant information, engage customers, and make better business decisions.
- There is significant potential demand totaling billions from industries like consumer goods, sports, financial services, and more.
- Early pilots showed benefits like increased engagement for sports and identifying customer attitudes for a dog food brand.
- The company plans an initial product launch, adding customers, and expanding its capabilities and markets over time.
2. Agenda
• Social Media: An INT perspective
• Common Analytic Pitfalls
• An Analytic Framework
• Case Study: Brand Management
– Problem Definition
– Source Selection
– Data Capture
– Data Reporting
– Data Analysis
• Ways Forward, Future Analysis
• Questions?
3. Intelligence
• Intelligence is information that has been
transformed to meet an operational need
Data -> Operational Lens -> Intelligence
5. Social Media: The INT Perspective
Social Media gets the best
and worst of three disciplines:
– HUMINT
• Pros: Reveals intentions
• Cons: Can be unreliable
– OSINT
• Pros: Fast, Accessible
• Cons: Noise
– SIGINT
• Pros: Network, High Volume
• Cons: Noise
6. Social Media Analysis Goals
• Need to have an end-goal with value to the
organization (operational lens)
• Need to ensure cyclical feedback occurs across
collection, processing, analysis, and
consumption
• Need to make sure that a particular network is
the right source for the task
7. Common Misconceptions
• Social media is not a panacea
– Not everyone uses social media
– Users of social media use it unevenly
– User behavior changes based on situations
• Just because people can talk about anything
does not mean they talk about everything all the
time.
8. Common Pitfalls
• The important thing is often not what people are
saying… but why they are saying it.
• Reporting tools rarely help dig into the why.
• Many common tools, reports, and metrics are
actually misleading:
– Word clouds atomize message context
– Sentiment metrics are often highly inaccurate
– Information in aggregate hides more than it reveals
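The last point can be illustrated with a minimal sketch in Python. The segment names and sentiment scores below are hypothetical, invented for illustration and not taken from the case study; `segment_means` is likewise an assumed helper name.

```python
from statistics import mean

# Hypothetical sentiment scores per audience segment (illustrative only).
SCORES = {
    "franchise fans": [0.8, 0.6, 0.7],
    "new players": [-0.7, -0.6, -0.8],
}


def segment_means(scores):
    """Return (overall mean, per-segment means) for a dict of score lists."""
    overall = mean(s for group in scores.values() for s in group)
    by_segment = {name: mean(values) for name, values in scores.items()}
    return overall, by_segment


overall, by_segment = segment_means(SCORES)
print(overall)      # near zero: the aggregate looks neutral
print(by_segment)   # the segments are strongly positive and strongly negative
```

Here the aggregate score suggests indifference, while the per-segment view shows two groups with opposite attitudes.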
11. Dangers of Disintegration
Source: Matthew Auer, Policy Studies
Journal, Volume 39, Issue 4, pages 709–736, Nov
2011
12. Analytic Framework
• Data Capture (DC)
• Data Reporting (DR)
• Data Analysis (DA)
– 1. What to measure
– 2. What the data is saying
– 3. What should be done based on the data
Source: Avinash Kaushik, Occam’s Razor Blog
http://www.kaushik.net/avinash/web-analytics-consulting-framework-smarter-decisions/
14. Choosing a Platform
• Social media is still new and evolving, and so
is how we use it.
– Static approaches to social media are flawed
from the outset
– No one metric or set of metrics will always let
you know what is happening
• Need an adaptive platform to facilitate
data capture, reporting, and analysis
15. Case Study: Brand Management
• Industry: Gaming
– Experiencing 10% growth annually
– Overall revenue expected to exceed $80
billion by 2014
• In May, Zenimax Online Studios
announced Elder Scrolls Online
– Elder Scrolls V: Skyrim 2nd largest game of
2011
16. Problem Definition
• As a brand manager, how can I use social
media to track and understand public
attitudes toward my product?
• Challenge is getting relevant information
– Query too large = false positives
– Query too small = miss potential information
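The query trade-off above can be made concrete with precision (how many matches are relevant) and recall (how many relevant items are matched). The following is a minimal sketch in Python, assuming a small hand-labeled sample; the tweets, their labels, and the `evaluate_query` helper are hypothetical and not part of the original case study.

```python
def evaluate_query(tweets, terms):
    """Return (precision, recall) for a keyword query against labeled tweets.

    tweets: list of (text, is_relevant) pairs with hand-assigned labels.
    terms:  lowercase phrases; a tweet matches if it contains any of them.
    """
    matched = [(text, rel) for text, rel in tweets
               if any(term in text.lower() for term in terms)]
    true_pos = sum(1 for _, rel in matched if rel)
    total_relevant = sum(1 for _, rel in tweets if rel)
    precision = true_pos / len(matched) if matched else 0.0
    recall = true_pos / total_relevant if total_relevant else 0.0
    return precision, recall


# Hypothetical labeled sample (illustrative only).
SAMPLE = [
    ("Can't wait for Elder Scrolls Online!", True),
    ("ESO beta signups when? #elderscrolls", True),
    ("Replaying Skyrim again, still great #elderscrolls", False),
    ("Reading scrolls about elder care online", False),
]

broad = evaluate_query(SAMPLE, ["elder", "scrolls"])        # too large
narrow = evaluate_query(SAMPLE, ["elder scrolls online"])   # too small
print(broad, narrow)
```

On this sample the broad query finds every relevant tweet but half of its matches are false positives, while the narrow query returns only relevant tweets but misses half of them.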
17. Source: Twitter
• Twitter has some of the best
analytic potential
– High volume traffic
– High volume user-base
– Open API
• Not without limitations:
– 140 characters
– Limited historical / lookback
18. Platform: Infinit.e
Infinit.e is a scalable framework for collecting, storing,
enriching, retrieving, analyzing, and visualizing
unstructured documents and structured records.
19. Platform: Infinit.e
• Infinit.e supports the extraction of entities
and creation of associations using a
combination of built in enrichment libraries
and 3rd party NLP APIs.
20. Data Capture – Initial Query
• Twitter search for “Elder Scrolls Online”
– Simplest possible way to access information
– RSS feed for 10 days (Jun 27 – July 6 2012)
22. Data Capture – Entity Map
• Who: TwitterHandle
• What: Hashtags, Keywords (unstructured), URLs
• When: Time / Date Stamp
• Where: Geo (if available)
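A rough sketch of how a tweet might be mapped onto these entity types, written in Python with regular expressions. This is not how Infinit.e performs extraction (the slides attribute that to its enrichment libraries and 3rd party NLP APIs); the patterns, the `extract_entities` helper, and the sample tweet are assumptions made for illustration.

```python
import re

# Simplified patterns (assumptions); real tweets need more careful handling.
HASHTAG = re.compile(r"#(\w+)")
HANDLE = re.compile(r"@(\w+)")
URL = re.compile(r"https?://\S+")


def extract_entities(text, timestamp=None, geo=None):
    """Map one tweet onto the who / what / when / where entity types."""
    urls = URL.findall(text)
    # Strip URLs first so fragments such as "#top" in a link are not
    # mistaken for hashtags.
    remainder = URL.sub(" ", text)
    return {
        "who": HANDLE.findall(remainder),
        "what": {"hashtags": HASHTAG.findall(remainder), "urls": urls},
        "when": timestamp,
        "where": geo,  # often missing, hence "if available"
    }


entities = extract_entities(
    "@gamer42 excited for #ESO http://example.com/news#top",
    timestamp="2012-06-27T12:00:00",
)
print(entities)
```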
23. Data Reporting
• Used Infinit.e’s Flash U/I Widget Framework
– Document Browser (Individual Tweets)
– Entity Significance (Top Entities)
– Sentiment (Top Entities w/ Sentiment)
– Query Metrics (Breakdowns of Query Results)
• Framework allows for additional
visualizations to be constructed as needed
• Export options also available for manual
review (e.g. graphml, excel, pdf)
27. Data Analysis
• Analysis needs to be rooted in the
operational need:
“How can I use social media to track and
understand public attitudes toward my
product?”
• Emphasis on hypothesis generation,
testing, and experimentation
28. Data Analysis -> Capture
• Hashtags from an initial subset of Tweets
fed back into the initial query
[Diagram: Initial Query -> Twitter -> Results; Expanded Query -> Twitter -> Results]
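The feedback step on this slide could be sketched as follows in Python. The `expand_query` helper, the `top_n` cutoff, and the sample tweets are hypothetical; the slides do not say how many hashtags were fed back or how they were chosen.

```python
import re
from collections import Counter

HASHTAG = re.compile(r"#(\w+)")


def expand_query(initial_terms, tweets, top_n=2):
    """Append the most frequent hashtags from the results to the query terms."""
    counts = Counter(
        tag.lower() for text in tweets for tag in HASHTAG.findall(text)
    )
    existing = {term.lower() for term in initial_terms}
    new_tags = [f"#{tag}" for tag, _ in counts.most_common()
                if f"#{tag}" not in existing][:top_n]
    return list(initial_terms) + new_tags


# Hypothetical results of the initial query (illustrative only).
RESULTS = [
    "Elder Scrolls Online looks great #eso #mmo",
    "Elder Scrolls Online trailer #ESO",
    "Elder Scrolls Online hype #eso #skyrim #mmo",
]

expanded = expand_query(["Elder Scrolls Online"], RESULTS)
print(expanded)  # ['Elder Scrolls Online', '#eso', '#mmo']
```

Picking hashtags purely by frequency is a simplification; an analyst would likely review the candidates before rerunning the capture.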
29. Data Analysis - Hashtags
• Top hashtags were
almost all generic /
more abstract
– Undermines tracking and
understanding
– Top hashtags tied to
franchise, not to the
game
30. Data Analysis - Sentiment
• Converted URLs into derivative sources
• 35% additional sources
• Larger text sources offer potential value with
sentiment analysis that tweets alone cannot offer
31. Data Analysis - Sentiment
• Top negative and positive scores provided
glimpses into aggregate attitudes
• Provide starting points for additional analysis
32. Data Analysis - Recommendations
• Actionable recommendations allow
decision makers to make changes
33. Future Data Analysis
• Initial conclusions should be starting points
for new analysis
• Broad entity capture allows for:
– Key influencer identification
– Clustering of tweets for segmentation
– Map / Reduce for aggregate functions
35. Expandable Model
• Identify key influencers on specific topics
• Look at relationships between websites /
blogs and Twitter use (cross-network
analysis)
36. Counting and Summing
• “Traditional” business intelligence analytics
problems solved using aggregate functions:
– Sum
– Count
– Average
– Min
– Max
– Etc.
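As a minimal sketch of these aggregate functions in a map/reduce style, the Python below groups values by key and then reduces each group. The `aggregate_by_key` helper and the (entity, score) records are hypothetical, not data from the study.

```python
from collections import defaultdict


def aggregate_by_key(records):
    """Group (key, value) pairs by key, then reduce each group to aggregates."""
    groups = defaultdict(list)
    for key, value in records:          # "map": emit values under their key
        groups[key].append(value)
    return {                            # "reduce": summarize each group
        key: {
            "sum": sum(values),
            "count": len(values),
            "average": sum(values) / len(values),
            "min": min(values),
            "max": max(values),
        }
        for key, values in groups.items()
    }


# Hypothetical (entity, sentiment score) records (illustrative only).
RECORDS = [("graphics", 2), ("graphics", 4),
           ("story", -1), ("story", 3), ("story", 1)]

stats = aggregate_by_key(RECORDS)
print(stats["graphics"])
print(stats["story"])
```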
37. Clustering - Topic
• Topic Extraction
– Key words -> Categories
– Categories -> Related Categories
Keyword      Topic
graphics     graphics
screenshots  graphics
resolution   graphics
quests       story
zenimax      company
…            …

Key       Value
graphics  gameplay.pdf
story     gameplay.pdf
company   corporate.txt
…         …
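The keyword-to-category step might look like the following Python sketch. The keyword map reuses the example rows from the slide; the `topics_for` and `cluster_by_topic` helpers and the sample tweets are assumptions for illustration, and a simple dictionary lookup stands in for whatever topic extraction the platform actually uses.

```python
import re
from collections import defaultdict

# Keyword -> topic pairs taken from the example rows on the slide.
KEYWORD_TO_TOPIC = {
    "graphics": "graphics",
    "screenshots": "graphics",
    "resolution": "graphics",
    "quests": "story",
    "zenimax": "company",
}


def topics_for(text):
    """Return the sorted set of topics whose keywords appear in the text."""
    words = re.findall(r"\w+", text.lower())
    return sorted({KEYWORD_TO_TOPIC[w] for w in words if w in KEYWORD_TO_TOPIC})


def cluster_by_topic(tweets):
    """Group tweets under every topic they mention."""
    clusters = defaultdict(list)
    for text in tweets:
        for topic in topics_for(text):
            clusters[topic].append(text)
    return dict(clusters)


# Hypothetical tweets (illustrative only).
TWEETS = [
    "The screenshots show amazing resolution",
    "Zenimax promises epic quests",
]

clusters = cluster_by_topic(TWEETS)
print(clusters)
```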
39. Take-Aways
• All data providers can and do change their
formats; users flock to and abandon
platforms – what works today may not
work tomorrow.
• Whatever platform you choose to do
analysis, make sure it’s open and
adaptable or your investment may
degrade over time.
40. Take-Aways (Things to Avoid)
• Data puking (less is more)
• Metrics that cannot be tied to actions
• Visualizations / reports that remove
context
• Taking dashboards at face value
41. Take-Aways (Things to Do)
• Segment data rather than work in aggregate
• Look for the why behind the message
• Always return to the source material
• Explore alternative explanations
• Always consider the ultimate goal
Editor's Notes
Given my background, I come at the social media problem from an intelligence analysis perspective. This comes with a certain set of vocabulary and paradigms, but I believe they are useful for understanding how to build an effective analytic framework.