The document discusses data kindness on the internet. It covers topics like data valuation, dealing with data through active and passive collection methods, and examples of data kindness and unkindness. The presentation also introduces speakers working in areas like privacy and behavioral targeting. Overall it promotes the polite and transparent collection/use of data according to stated policies.
This document provides an overview of big data concepts and Hadoop. It discusses the characteristics of big data including volume, variety and velocity. It compares traditional data warehouses to Hadoop and explains when each is best suited. Use cases of big data from various companies are presented. The document also summarizes a survey on big data adoption trends and priorities across industries. Finally, it provides details on the Hadoop framework and its key components.
The document discusses big data and how it is often misunderstood and overhyped. It argues that big data needs to be objectively defined in order to make meaningful claims and measurements regarding its success. Jumping into big data without proper preparation often leads to the same dismal results as failed IT projects. The document advocates separating the signal from the noise to truly understand big data.
GGV Capital: Venture Investing and the Cloud (2012) GGV Capital
This document discusses venture investing in cloud computing. It provides an overview of why VCs continue to see opportunities in the cloud sector. The presentation agenda covers trends disrupting the cloud like mobile and big data, as well as opportunities in serving small and medium businesses. The document concludes with advice for cloud startups on effectively approaching VCs for funding, emphasizing differentiation, market size, scalability, financial model, and chemistry over legal terms.
The document provides an overview of big data concepts including definitions, statistics on data generation and internet usage, applications and examples, challenges, and data types. It discusses key big data concepts such as the 3Vs of volume, velocity and variety; more Vs including veracity, value and visualization; data science areas and skills; the data workflow; and examples from companies like UPS, Walmart, eBay, and Kaiser Permanente.
The document discusses big data, defining it as extremely large data sets that can be analyzed computationally to reveal patterns. It notes that advances in storage, processing power, and data availability have enabled the rise of big data. The key aspects of big data are described as the four V's: volume, velocity, variety, and veracity. Examples of how big data is used include optimizing business processes by analyzing social media, web search, and sensor data, and better understanding customers by combining traditional data sets with social media, browser, and sensor information to create predictive models.
Data-Ed Webinar: Demystifying Big Data DATAVERSITY
We are in the middle of a data flood and we need to figure out how to tame it without drowning. Most of what has been written about Big Data is focused on selling hardware and services. But what about a Big Data Strategy that guides hardware and software decisions? While virtually every major organization is faced with the challenge of figuring out the approach for and the requirements of this new development, jumping into the fray hastily and unprepared will only reproduce the same dismal IT project results as previously experienced. Join Dr. Peter Aiken as he debunks a number of misconceptions about Big Data as an atypical IT project. He will provide guidance on how to establish realistic Big Data management plans and expectations, and help demonstrate the value of such actions to both internal and external decision makers without getting lost in the hype.
Takeaways:
- The means by which Big Data techniques can complement existing data management practices
- The prototyping nature of practicing Big Data techniques
- The distinct ways in which utilizing Big Data can generate business value
- Bigger Data isn’t always Better Data
The document discusses how big data is changing business due to the massive increase in data creation in recent years. It notes that 90% of the data in the world was created in the last two years alone. The document then provides an overview of what big data means and the factors involved, including volume, velocity, variety, and value. It also reviews some case studies and discusses how big data is affecting software companies and creating new opportunities.
The document appears to contain links to pirated movie and game downloads along with cheat codes and instructions for installing and using pirated content. It includes links to download Ice Age 4, various adult films, the games Need for Speed: Most Wanted and Grand Theft Auto: San Andreas. It also provides extensive lists of cheat codes for both games.
DATA FORUM MICROPOLE 2015 - Forrester - Data Gouvernance Valuation Micropole Group
This document discusses data valuation and governance for Micropole. It recommends evaluating data usage internally for efficiency, revenue, strategic objectives, and customer value. Externally, data can be used in data marketplaces and by selling APIs or data. Effective data governance is still needed and should focus on quality, uniqueness, lifecycle, compliance, security and privacy. New data governance objectives include classification, transparency, and machine learning integration. The document recommends roadmapping data strategies by business outcomes and considering data governance applications to improve effectiveness using data valuation.
1) While some organizations measure the value of their data assets, most do not properly quantify, measure benefits, or inventory their data. Data is increasingly becoming a key asset but many organizations are focused on storage and access rather than business value.
2) There are various techniques to estimate the value of data including Delphi method, scorecards, statistical methods, and information markets. Quantifying value helps with competitive advantage, M&A valuations, and justifying security expenses.
3) APIs can increase data value by allowing access to third party data and enabling experimentation through external partners and developers. The purpose, type of access, and process accessed (data vs services) determine the API strategy around exploitation, public
The document discusses five pillars for data valuation: predictively spotting new opportunities, innovating in an agile way, demonstrating transparency and trust, providing unique personalized experiences, and always being on and operating in real time. It provides more details on the analytic lifecycle for predictively spotting opportunities, the need for data agility in application development, establishing a data ethic of transparency and trust, using data to provide personalized experiences, and reliably storing data to always be on and operating in real time.
This document provides an overview of various Web 2.0 tools that can be used in the language classroom to engage students and promote interactive, collaborative learning. It describes tools for polls (Polleverywhere), social media (Twitter, Instagram), document sharing (Google Docs), multimedia (QuickTime, iPad, Vine), games/quizzes (Sporcle, Quizlet), visual content (Pinterest, Infographics), blogs, and more. Potential applications are outlined like having students follow news in the target language, participate in discussions, collaborate on projects, practice skills like speaking, writing and listening.
JavaScript is the primary scripting language of the web and is used to make web pages interactive. It can dynamically write and modify HTML content, react to user events like clicks, validate form data, detect the browser, and more. JavaScript code is commonly embedded directly in HTML using <script> tags and can also be stored externally in .js files. Key JavaScript concepts include variables, operators, conditional statements, functions, loops, and events.
Spring Roo is a tool that provides rapid application development for Spring applications. It uses a shell-based approach where commands are entered to generate common code structures like controllers, services, and repositories. The shell can be integrated with Eclipse and Spring Tool Suite for development. Spring Roo aims to introduce neither lock-in nor dependencies. The document demonstrates Spring Roo by developing an online shop application to sell second-hand products.
This document discusses the capabilities of rich browsers and devices for building games, including new browser APIs for graphics, audio, and device access. It also touches on challenges like browser fragmentation and differences between desktop and mobile browsers. An example architecture is proposed using a Node.js server to synchronize game state over web sockets between HTML5 mobile and desktop browsers.
This document discusses how Web 2.0 tools can be used to simplify incorporating technology into the language classroom. It provides examples of free and easy-to-use tools for polls, discussions, multimedia sharing, collaboration and content creation. Specific tools highlighted include Polleverywhere, Twitter, Google Docs, Instagram, iPad applications, Quizlet and PowerPoint. Permissions, appropriate use policies and engagement strategies are also addressed.
This book discusses classroom action research as a means of improving the quality of education and teacher competence. It consists of 6 chapters covering the importance of scientific writing for teachers, the definition and aims of classroom action research, the steps for carrying it out and preparing it, monitoring techniques, and the preparation of the research report.
This document summarizes CTSO, a network of building consultants that offers franchise opportunities. Key points:
1. CTSO is a recognized brand focused on customer needs that offers attractive franchise fees with no royalties and exclusivity in defined zones.
2. Franchisees gain access to CTSO's expertise and proven methods through regular network meetings and ongoing training.
3. CTSO supports franchisees by checking their suitability, helping with business setup, providing training and legal support, and optimizing communications.
This book discusses effective teaching and learning, covering the concept of effective learning, the nine-dimension approach to learning, how pupils learn, classroom organization, instructional design that takes pupil characteristics into account, and educational problems as a whole. Its aim is to help teachers improve their teaching practice.
Web 2.0 Technology in the Language Classroom Barbara Hirsch
This document discusses using technology in the language classroom, including Web 2.0 tools. It provides examples of tools like online comic creators, photo books, interactive quizzes and recording tools that can be used to engage students. Challenges with technology integration like access and privacy are also addressed. The document promotes sharing resources through online forums and social media channels.
DataONE Education Module 10: Legal and Policy Issues DataONE
This document discusses legal, ethical and policy issues related to managing research data. It defines key concepts like copyright, licenses and waivers, and explains why identifying ownership and control is important. Restrictions on data use and sharing are discussed, including protecting privacy and following regulations. Open licensing is presented as a way to facilitate sharing while still giving credit. The importance of behaving ethically and respecting licenses is emphasized.
This year’s survey found that the global investment community in the U.S., the UK, Europe, and Asia continues to place a premium on companies that are best able to monetize the data they collect, even during a period of market instability. Since 2014, data monetization’s impact on investor decisions has increased by seven percent, with 41 percent of those surveyed indicating an effect. In addition, almost a fifth of analysts surveyed believe that a company’s ability to monetize data is the single most important driver of investments, a four percent increase since 2014.
Key highlights include:
• Rise in investment decisions based on the data premium
• Cybersecurity’s importance to M&A
• Data protection increasingly crucial
• Financial services at greatest cyber risk
• Data premium lags in Europe
For more information please contact:
Mark Seifert: www.brunswickgroup.com/people/directory/mark-seifert/
Sparky Zivin: www.brunswickgroup.com/people/directory/sparky-zivin/
Northeastern Ohio nonprofit innovators met for the first annual Big Data for a Better World conference on November 16 at Hyland Software's sprawling Westlake, Ohio campus. Leading Hands Through Technology (LHTT) and Workman’s Circle teamed up to offer the event so local nonprofits could discuss how analytics could be successfully used to keep their organizations profitable and ultimately improve the community.
Interesting ways Big Data is used today Daniel Sârbe
An overview of the Big Data field, interesting patterns in how data is used for data mining, predictive analytics and machine learning, and an overview of the jobs generated by the demand for Big Data.
This document discusses the power of small data compared to big data for marketers. It argues that small data, which involves filtering data by segments like time, location, or profiles and measuring important metrics, can provide valuable insights when analyzed properly. The key is to start small by identifying relevant data sources, defining metrics and goals, and testing and refining analytics on segmented slices of data rather than trying to analyze all data at once. Small, targeted data analysis focused on important metrics can allow businesses to make better real-time decisions.
This document discusses big data and its characteristics, applications, and market opportunities. It notes that big data involves large amounts of data from a variety of sources that require new techniques and tools to solve problems. Examples are provided of the large quantities of data generated daily from sources like social media, online transactions, and medical records. The document also outlines some applications of big data analytics in fields like healthcare, homeland security, finance, and manufacturing. It predicts substantial growth in the big data market and jobs required to manage and analyze increasingly large datasets.
Companies collect vast amounts of personal data from various sources such as social media logins, online quizzes, purchases, internet activity, and IoT devices. This data is used for targeted ads, product customization, insurance rates, and credit reports. However, data can also be abused, such as for political targeting, identity theft if breached, or changing credit scores based on behaviors. Individuals can take steps to secure their data by controlling profiles and opting out of data collection, but the US third party doctrine allows government collection of data shared with companies without warrants.
Will Bigger and Better Data Help Deliver More Major Donors? Azadi Sheridan
This document discusses how larger and better data can help non-profits identify more major donors. It notes that while bigger data presents opportunities, data integrity is key to finding new donors. Data-driven philanthropy allows organizations to be innovative in tracking donor behaviors and metrics that matter, like affinity, and in managing the donor pipeline more effectively. Data can also help measure the impact of donations and guide investment decisions. Overall, data analysis has significant potential but must be done carefully and ethically to truly help organizations in their missions.
A whistlestop tour through some of the ethical dilemmas and challenges that arise in this "Big Data Age" and the various approaches to considering them, if not solving them.
In this 10-minute "lightning talk", delegates will get insights into some of the research agenda and issues being considered in this area, touching on business analytics, data quality, analytic risks, ethics and evidence-based decision-making culture.
This document discusses big data, including its prerequisites, technologies, uses, and consequences. It notes that big data is enabled by Moore's law, the internet, cloud computing, and the internet of things, allowing vast amounts of data to be stored indefinitely. It is used by businesses for targeted ads, insurance rates, and fraud detection. Intelligence and law enforcement agencies like the NSA use it for surveillance, as revealed by Snowden. Research institutions use big data to study patterns in areas like climate, medicine, economics, and psychology. The consequences for individuals and society include loss of privacy and data being used beyond its original purpose. Solutions proposed include data minimization laws, and tools/websites that help resist online surveillance.
It has been said that Mobiles + Cloud + Social + Big Data = Better Run The World. IBM has invested over $20 billion since 2005 to grow its analytics business, and many companies will invest more than $120 billion by 2015 on analytics, hardware, software and services, which are critical in almost every industry, including healthcare, media, sports, finance and government.
It has been estimated that there is a shortage of 140,000 – 190,000 people with deep analytical skills to fill the demand of jobs in the U.S. by 2018.
Decoding the human genome originally took 10 years; now it can be achieved in one week with the power of analytics and BI (Business Intelligence). This lecture's key message is that analytics provide a competitive edge to individuals, companies and institutions, and that analytics and BI are often critical to the success of any organization.
The methodology is to teach analytic techniques through real-world examples and real data, with the goal of convincing the audience of the analytics edge and the power of BI, and inspiring them to use analytics and BI in their careers and lives.
The document discusses the business applications of big data across multiple topics. It begins with the significance of social network data, explaining concepts like social network analysis and sentiment analysis. It then covers applications in detecting financial fraud and insurance fraud. Finally, it discusses the use of big data in the retail industry. The document provides overviews of key areas where big data analytics can be applied in business.
IoT & Big Data - A privacy-oriented view of the future Facundo Mauricio
Understanding the future based on the current technology, with a focus on Big Data and Internet of Things (IoT). A discussion of privacy and personal information and how it affects us.
This document provides an overview of data science including why it is an exciting field, where data comes from, what data science is, how to do data science, and who data scientists are. It discusses the history of data analysis and examples of exciting new applications of data analytics like Google Flu Trends. It also covers sources of big data, the five V's of big data, contrasting data science with databases and business intelligence, and contrasting data science with machine learning.
Big data comes from many sources and is used in elections to analyze voter data and predict outcomes. Political parties gather large amounts of structured and unstructured data from social media, websites, and other online sources to build personalized voter profiles. They then use data mining algorithms and analytics to develop targeted campaign messages and strategies aimed at influencing different voter groups. While big data provides opportunities to better understand voters, it also raises privacy concerns if personal data is collected and shared without permission.
Data is the New Oil: Presented By Naveen Narayanan, Global Client Partner of ...InterCon
InterCon is a premier technology conference that brings together like-minded people on a common platform to share knowledge, present ideas, get recognition, and network. InterCon Dubai will offer knowledgeable sessions, informative content, extraordinary speakers, and an overall memorable experience.
This document discusses mining social data from online sources to gain insights. It defines social data and information, and notes that unstructured data found online provides a rich source of knowledge. It recommends developing skills in statistics, data processing, and data visualization to extract value from social data. Finally, it outlines best practices for social media analytics, including defining goals, selecting metrics, targeting data sources, using analytics tools, and delivering insights through dashboards, reports, and infographics.
1. Data Kindness on the Internet
Christan Grant
@cegme
Christan Grant • Tapia 2011 • Birds of a Feather
2. Data Kindness on the Internet
BoF Rules:
Ask questions, and address them to the group, not just me
Goal: Get researchers using (open) data in research and ventures.
The last chunk of minutes will be a group bonding activity
3. Data Kindness on the Internet
• Data Bandwagon
• Data Valuation
• Dealing with Data
• Good and Bad
• Open Data Breakout
4. People
Dr. Tyrone Grandison
An ACM Distinguished Engineer and program manager for IBM core healthcare services.
Dr. Grandison is a world-renowned leader in the area of information privacy and security.
5. People
Dr. Kun Liu
Dr. Liu is researching behavioral targeting issues on the Yahoo! display advertising team.
He has co-authored three book chapters on privacy-preserving social-network analysis and privacy-preserving data mining.
6. People
Dr. Christan Grant
A doctoral student at the University of Florida Database Research Center who has worked on projects to extract information from hidden databases and online collections.
7. Let's talk about data
• Web 2.0 is predicated on the fact that users will upload their data
• Data → Knowledge → Power
• The Market for Lemons (George Akerlof)
8. Valuation of Data
“If you are not paying for something, you are not the customer, you are the product.”
9. Valuation of Data
“Personal data is the oil of the Internet and the new currency of the digital world.”
-- Meglena Kuneva, European Consumer Commissioner
10. Valuation of Data
• Facebook
• 2010 ad sales: $1.21 billion
• 90 “pieces of content” created per month per user (3 per day)
16. Valuation of Data
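Facebook
$1.21 billion ÷ (90 data/(month × user) × 12 months)
≈ $1.12 million per (piece of content per user)
500 million users!! Why The Face??
Spread across those 500 million users, that is roughly $0.002 per piece of content.
Note: Inaccurate; we need to know how many ads are shown per user action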
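The back-of-the-envelope calculation on this slide can be reproduced in a few lines of Python. This is only a sketch using the slide's own figures (2010 ad sales, 90 pieces of content per user per month, 500 million users); the variable names are illustrative, and as the note says, the result ignores how many ads are actually shown per user action.

```python
# Figures taken from the slides; the names are illustrative assumptions.
AD_SALES_2010 = 1.21e9        # dollars
CONTENT_PER_USER_MONTH = 90   # pieces of content per user per month
USERS = 500e6                 # users

# Pieces of content one user creates in a year.
content_per_user_year = CONTENT_PER_USER_MONTH * 12

# Ad sales divided by one user's yearly content count (the slide's $1.12 million).
revenue_per_content_per_user = AD_SALES_2010 / content_per_user_year

# Dividing by the number of users gives dollars per single piece of content.
revenue_per_content = revenue_per_content_per_user / USERS

print(f"${revenue_per_content_per_user:,.0f} per (piece of content per user)")
print(f"${revenue_per_content:.4f} per piece of content")
```

The second figure is the more intuitive one: under these assumptions, each individual upload is worth a fraction of a cent in ad sales.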
17. Valuation of Data
• Google
• ~97% of revenue from ads
• $28 billion from ads in 2010
18. Valuation of Data
• Yahoo!
• 90% of revenue from ads
• Revenue: $6.460 billion (2009)
• 6 Million Users
20. Valuation of Data
• Color.com
New location-based photo-sharing “elastic” social network.
Initial seed investment of $41 million
How?? Bubble??
21. Valuation of Data
• Color.com
• Answer: Data!!
• Grabs more data points per picture; patents for grabbing user location and proximity.
• “sound levels, Bluetooth readings, light readings, antenna strength, the time - even the direction you're pointing your phone - and more and uses it all to determine your proximity to other users.” - rww
• “It is a research company and a data mining company” - Bill Nguyen
• “Our data is so accurate that we know where you are” - Bill Nguyen
22. Valuation of Data
• Data is valuable for people who know how to use it
• Part of the value calculation is the cost of a data breach
• Buyers are largely advertisers
(People say “we shouldn’t put value on data” ... to me that is like saying don’t put value on crude oil)
23. Dealing with Data
• Categories of data?
• How to collect internet oil?
• What is data kindness?
• Examples
24. Dealing with Data
• Categories of Data
• Geographic
• Demographic
• Behavioral
25. Dealing with Data
• Collecting internet oil
• Active
• Passive
26. Dealing with Data
• Active data collection
• Collecting data by proactively seeking and obtaining it
• Common Examples:
• Scraping websites
• Credit card applications
• Medical procedure information
27. Dealing with Data
• Passive data collection
• Information is obtained through monitoring streams
• Common Examples:
• Cookies
• Telecom call logs (Just press record)
28. Dealing with Data
• Data Kindness
• A polite method to collect/use/provide/destroy data. Say what you are going to do, do it, and do only that.
• The “kind data” transaction model is a two-way gentleperson's handshake.
• If a website has a privacy policy, follow it.
• APIs/protocols make this formal
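One concrete example of a protocol that makes the handshake formal is robots.txt, where a site states what crawlers may fetch. The sketch below uses Python's standard `urllib.robotparser`; the robots.txt content, the `kind-bot` user agent, and the example.com URLs are made up for illustration. Note that robots.txt only covers crawling rules; a site's privacy policy and terms of use still have to be read by a person.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real scraper would fetch it from the site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

def may_fetch(url: str, user_agent: str = "kind-bot") -> bool:
    """Return True only if the site's stated rules allow fetching this URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)

# Check before collecting: say what you will do, then do only that.
print(may_fetch("https://example.com/public/data.html"))
print(may_fetch("https://example.com/private/users.html"))
```

A scraper built this way skips any URL for which `may_fetch` returns False instead of collecting first and asking later.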
29. Dealing with Data
• Data Kindness: Insurance companies
• Active
• Insurers have long used blood and urine tests to assess people's health
• Instead, information from online and offline activities is now used to predict life expectancies
• Information comes from warranty cards
• Registering at websites (weight-loss tip websites, bungee jumping websites)
• Insurers find higher-risk information by looking for lifestyle clues that play into obesity, hypertension and diabetes. They are looking for healthy people, people who exercise a lot.
30. Dealing with Data
• Data Kindness: Credit card companies
• Passive
• Visa Europe is looking into using your phone location to prevent fraud
• This is used to help reduce false positives
• By 2015, more than 15% of cards are projected to be validated by mobile location
31. Dealing with Data
• Data Kindness: Streetline Networks
• Put sensors in city parking spaces to help people find parking spots.
• Active or Passive?
32. Dealing with Data
• Data (un)Kindness: Google Street View
• Active
• 240K people opted out of being included in Google Street View.
• Houses were blurred... some were not
• A German court said it was legal.
• Fined €100K in France for its activities
33. Dealing with Data
• Data (un)Kindness: Websites
• Passive
• First-party cookies -- text files that provide useful services (e.g. shopping cart)
• Third-party cookies -- text files with a unique ID to track the websites you visit
• Beacons -- small programs, cookies on steroids (e.g. a key logger)
• Flash cookies -- hard to delete, may replace deleted cookies; mostly used for volume settings
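The difference between first-party and third-party cookies comes down to whether the cookie's domain belongs to the site you are actually visiting. The following is a minimal sketch of that comparison; the function names and example domains are hypothetical, and the "last two labels" rule is a simplifying assumption (real browsers consult the Public Suffix List, so this is wrong for suffixes such as .co.uk).

```python
from urllib.parse import urlsplit

def site_of(host: str) -> str:
    """Naive 'site' of a host: its last two labels (simplifying assumption)."""
    return ".".join(host.lower().lstrip(".").split(".")[-2:])

def cookie_party(page_url: str, cookie_domain: str) -> str:
    """Label a cookie as first- or third-party relative to the visited page."""
    page_host = urlsplit(page_url).hostname or ""
    same_site = site_of(page_host) == site_of(cookie_domain)
    return "first-party" if same_site else "third-party"

# A shopping-cart cookie set by the shop itself versus a tracker's cookie.
print(cookie_party("https://shop.example.com/cart", ".example.com"))
print(cookie_party("https://shop.example.com/cart", "ads.tracker.example.net"))
```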
34. Dealing with Data
• Data (un)Kindness: Websites
• Passive
• dictionary.com → 234
• msn.com → 207
• comcast.net → 151
• aol.com → 133
• Some phone apps send your phone ID and other identifiable information to third parties.
• A large number of these apps do not have privacy policies
35. Dealing with Data
• How to use data
• Sharing data can have a huge impact
• $60 million of Alzheimer's research → 160 papers, 80 in the pipeline
• Publish your data as linked data so it can be semantically combined with other information
• Be data stewards and declare data ownership
36. Dealing with Data
• Protect sensitive information
• Anonymize
• HITECH Act breach notification -- must notify individuals when their health information is breached
• 2009 US data breach cost: $6.75 million
• 2009 US detection/escalation cost: $0.26 million
• 2009 US notification cost: $0.5 million
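As a rough illustration of the "Anonymize" bullet, the sketch below replaces direct identifiers with a keyed hash (HMAC-SHA256) from Python's standard library. The record fields, the key, and the function names are hypothetical. This is pseudonymization rather than full anonymization: quasi-identifiers such as zip code or birth date can still re-identify people, so treat it as a first step only.

```python
import hashlib
import hmac

# Hypothetical key; in practice keep it secret and stored apart from the data.
SECRET_KEY = b"replace-with-a-secret-key"

def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Replace an identifier with a keyed hash so records can still be joined."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()

def scrub(record: dict, identifiers=("name", "email")) -> dict:
    """Return a copy of the record with direct identifiers pseudonymized."""
    return {
        field: pseudonymize(value) if field in identifiers else value
        for field, value in record.items()
    }

patient = {"name": "Jane Doe", "email": "jane@example.com", "diagnosis": "hypertension"}
print(scrub(patient))
```

Because the hash is keyed and deterministic, the same person maps to the same token across data sets, which keeps the data useful for research while keeping names out of it.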
38. Dealing with Data
• Obtaining data
• Open data sets: infochimps.org
• dev.twitter.com
• developers.facebook.com
• ScraperWiki
• developers.nytimes.com
• HealthTap data: http://h4h.healthtap.com/pages/data
39. Dealing with Data
• Tools
• Graduate students
• Scripting languages - Python/bash/awk
• Google Refine
• Yahoo Pipes
• Bing Maps
• ...
40. Dealing with Data
• Data Quality
• Most data is meh - restaurant data is 3.5-4 stars
• Aggregated/fused data is better information
• “90% of time is spent modeling data”
41. Dealing with Data
• Data Quality
• Integrating data is difficult (information integration is a billion-dollar business)
• Semantic data helps
• The larger and more diverse the data sets, the more difficult it gets
• Dates and Unicode are a headache
• Separate teams must clearly define downstream data
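To show why dates and Unicode are a headache when integrating data sets, here is a minimal sketch that normalizes both before comparing records. The list of date formats is an assumption about what the merged sources might contain, and it reads 04/03/2011 in US month/day order; the function names are illustrative.

```python
import unicodedata
from datetime import date, datetime

# Date formats assumed to appear in the merged sources (illustrative list).
# "%m/%d/%Y" assumes US month/day order, which is itself a guess.
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d %B %Y")

def normalize_text(text: str) -> str:
    """Put text in Unicode NFC form and casefold it so equal strings compare equal."""
    return unicodedata.normalize("NFC", text).casefold().strip()

def parse_date(raw: str) -> date:
    """Try each known format in turn; fail loudly rather than guess."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(raw.strip(), fmt).date()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date format: {raw!r}")

# "Café" typed with a combining accent versus a precomposed character.
print(normalize_text("Cafe\u0301") == normalize_text("Caf\u00e9"))
print(parse_date("2011-04-03"), parse_date("04/03/2011"), parse_date("3 April 2011"))
```

Agreeing on one normal form and one date convention up front is part of what "clearly define downstream data" means for separate teams.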
42. Open Data Breakout
• Choose or make up a topic
• Get together in groups of 5-6
• Discuss a plan to attack this topic
• With ~15 minutes left we will start presentations
43. Open Data Breakout!
• Projects
• Vision → How many Facebook pictures, on average, will it take to create a 3D model of a person?
• Algos/Statistics → How would you count the number of spam bots on Twitter?
• NLP → How can you use language-specific wording to predict user location/demographics?
• Healthcare → How can you use open health data to improve diagnosis?
• Robotics → How can a robot use web or social network data to entertain a human companion?
• Web/Algos → Web-based privacy risk score
44. Links
• Links to resources are at: http://goo.gl/TCF3x
45. Thank you
• Questions?!
Christan Grant
christangrant.com
University of Florida