This is my experience of going to my first data hackathon, GovHack 2015, and what it taught me.
A hackathon is an event where you gather a heap of resources and people, form small teams and try to deliver a fully realised solution to a set theme or problem in a short, intense period of time.
Normally a hackathon is focused on delivering working software, but in a data hackathon you work from a heap of datasets and try to deliver something of value; that can be working software, but it is often something else. For this reason, non-coders can easily participate in a data hack.
Another difference is that a hackathon normally revolves around creating some sort of business idea (be it for-profit or non-profit) and validating it.
Data hackathons are about understanding and realising value from data, and that value can often just be delivering better access to the information the data represents.
1. Going to a Data Hackathon – Or How I Learned To Stop Worrying About Perfect And Learned To Love Good Enough
Andrew Saul – Gaming Technology
2. What is a data hackathon?
Hackathons:
• Entrepreneurship
• Focus on working code & the commercial viability of product
• Need at least one coder in a team
• Great code gets you a long way
Data hackathon:
• Focus on the datasets & the value from them
• Value = Something people will use (not buy)
• Don’t need any coders
• Need at least one person with understanding of how to use datasets
• Great code won’t get you very far
3. What is GovHack
• Lots of govt data freely available (open data)
• Lack of awareness and usage
• GovHack aims to:
• Identify issues with datasets & access
• Increase public awareness
• Inspire ventures using govt data (profit and non-profit)
• Run by data.gov.au and Open Knowledge Australia
4. GovHack datasets
• Datasets from federal, state and local govt
• Wide range of data. Some examples
• Traffic offences
• Drinking fountain locations
• Mining exploration expenditure
• ABC Broadcast Data Archive 1978-2011
• Census 2011
• National Drugs Strategy Household Survey
• Landsat 5 and 7 surface reflectance archive
• WWI diary and letter transcripts
Plus many, many more
5. GovHack prizes
• Lots of prizes from lots of different govt bodies
• Can win more than one prize
• A few general prizes
e.g. Best Digital Transformation Hack
• Most prizes relate to a specific dataset
e.g. The intellectual property data bounty: Must use IP Government Open Data
6. What do you make?
From GovHack website:
A “hack” can be anything that uses government data in a clever or creative way. It might be an application, an analysis, a data viz, a 3D printed or laser cut project, a digitisation project, artworks or anything else that fits the spirit of GovHack.
Examples of 2015 winners
• Story telling/art – Remembrance
• Augmented maps - AusTrails.org
• API driven apps - Health Buddy
• Games - Question Time: A game of policy
• Specialised search tools - Neuron
• Data visualisations. With code (Gender Equality) and existing tools (Synergising Synergies for Sitizens)
7. Timeline: Before the event
My team before the event:
• Looked at previous winners - Hindsight: too much
• Looked at datasets - Hindsight: not enough
• Posted ideas in a Trello board
8. Timeline: Friday night
1. Opening ceremony and announcement of prizes
2. Venue = Gorgeous, internet = atrocious
3. Listed project ideas we had previously and came up with new ones as a team.
4. Discussion & voting to cut list down. Our criteria: Working code
Hindsight: Too much traditional hackathon thinking
9. Timeline: Saturday
1. Lost best web dev on Friday to illness
2. Decided on estimation game web app and assigned team roles
3. Plan: Front end in D3.js, backend (database) in Node.js (a tiny D3 sketch follows this slide)
4. Node.js backend crashed that night.
5. Late on, the backend was saved via Google Fusion Tables – dancing ensued
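Since the plan hinged on D3.js for the front end, here is a rough idea of what that means in practice: a few lines of D3 drawing other players' guesses as marks on a results screen. This is only a sketch written against the current D3 v7 API, with made-up data, sizes and colours; it is not the code from the actual entry.

```javascript
// Illustrative only: a results-screen style view in D3 (v7 API), assuming d3 is
// already loaded via a <script> tag. The data, sizes and colours are made up;
// this is not the original Consensus front-end code.
const guesses = [18, 25, 31, 40, 22, 35, 29]; // other players' guesses (fake)
const myGuess = 30;                           // the current player's guess

const width = 400;
const height = 120;
const barWidth = 6;

// Map the 0-100 slider range onto screen pixels.
const x = d3.scaleLinear().domain([0, 100]).range([0, width]);

const svg = d3.select('body')
  .append('svg')
  .attr('width', width)
  .attr('height', height);

// One small bar per other player's guess.
svg.selectAll('rect')
  .data(guesses)
  .join('rect')
  .attr('x', (d) => x(d) - barWidth / 2)
  .attr('y', 20)
  .attr('width', barWidth)
  .attr('height', height - 40)
  .attr('fill', 'steelblue');

// A contrasting marker showing where the current player's guess sits.
svg.append('line')
  .attr('x1', x(myGuess))
  .attr('x2', x(myGuess))
  .attr('y1', 10)
  .attr('y2', height - 10)
  .attr('stroke', 'crimson')
  .attr('stroke-width', 2);
```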
10. Timeline: Sunday
1. Manually populated question dataset
2. Finalised value offering: data visualisation on the results screen & input method. Discarded a lot of game elements.
Hindsight: Should’ve had this focus all along
3. Couldn’t get player responses writing to the Fusion Table. Used fake user responses via a random number generator (sketched after this slide)
Hindsight: Should’ve done this as our first iteration
4. Started video
Hindsight: Should’ve started planning this Friday and used it as a key piece of our value statement
5. Submitted. Team was tired but really proud of what we’d made.
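Since "fake user responses via a random number generator" sounds hand-wavy, here is a minimal JavaScript sketch of the idea. The function name, the noise distribution and the clamping are assumptions made for illustration, not the code actually written that Sunday.

```javascript
// Illustrative sketch: generate fake slider guesses clustered around the true
// answer so the results screen has something to visualise. The names, spread
// and clamping here are made up, not the real GovHack entry's code.
function fakeResponses(trueAnswer, count, spread) {
  const responses = [];
  for (let i = 0; i < count; i++) {
    // Summing a few uniform draws gives a rough bell curve centred on zero.
    const noise = (Math.random() + Math.random() + Math.random() - 1.5) * spread;
    responses.push(Math.max(0, Math.round(trueAnswer + noise)));
  }
  return responses;
}

// Example: 50 fake guesses for a question whose true answer is 42.
console.log(fakeResponses(42, 50, 10).slice(0, 5));
```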
11. Consensus: What it does
• Estimation game based on the idea of the Wits & Wagers board game
• Players move a slider to guess a statistic and get points for how close they are to the correct answer (a scoring sketch follows this slide)
• Players see how their answer compares to other players’
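To make the "points for how close they are" idea concrete, here is one way such scoring could look. The linear falloff and the 100-point scale are assumptions chosen for the example, not the scoring rules Consensus actually used.

```javascript
// Illustrative scoring sketch: more points the closer a guess lands to the
// correct answer, normalised by the slider's range. The linear falloff and
// 100-point maximum are assumptions, not the real Consensus rules.
function scoreGuess(guess, correctAnswer, sliderMin, sliderMax) {
  const range = sliderMax - sliderMin;
  const error = Math.abs(guess - correctAnswer) / range; // 0 = perfect, 1 = worst possible
  return Math.round(Math.max(0, 1 - error) * 100);
}

// Example: guessing 30 when the answer is 25 on a 0-100 slider scores 95 points.
console.log(scoreGuess(30, 25, 0, 100));
```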
12. Consensus: How it’s made
• The frontend (user interface) is made with D3.js.
• Backend (database) is in a Google Drive app called Fusion Tables (a read-query sketch follows this slide).
We couldn’t get the write-back from the web app working in time, but Fusion Tables can read and write data much as a normal database does. Super easy to set up and use.
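For a feel of why Fusion Tables behaved "much as a normal database", here is roughly what reading rows through its SQL-style REST API looked like. Treat it as a hedged illustration: the table ID, API key and column names are placeholders, the endpoint shape is recalled from memory rather than taken from the submission, and Google retired the Fusion Tables service in 2019.

```javascript
// Rough historical illustration of reading rows from a Google Fusion Table via
// its SQL-style REST API (the service was retired in 2019). TABLE_ID, API_KEY
// and the column names are placeholders; the endpoint shape is an assumption
// from memory, not code taken from the actual GovHack entry.
const TABLE_ID = 'YOUR_FUSION_TABLE_ID';
const API_KEY = 'YOUR_API_KEY';

const sql = encodeURIComponent(`SELECT question, answer FROM ${TABLE_ID}`);
const url = `https://www.googleapis.com/fusiontables/v2/query?sql=${sql}&key=${API_KEY}`;

fetch(url)
  .then((response) => response.json())
  .then((data) => {
    // The response carried column names plus an array of row arrays.
    console.log(data.columns, data.rows);
  })
  .catch((err) => console.error('Query failed', err));
```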
13. Consensus: The value it gives
• Originally wanted to make a great game; dataset engagement was to be a result of playing the game.
• Ended up focusing on how to engage people with data and then fitted this into a game experience.
• Consensus value proposition:
• The player can see how well they know an area of society
• Players see how their knowledge compares to others
• Govt departments can infer public perceptions & awareness of areas of society
19. What we won
1st prize State & Local awards:
• Best Use of Science Data on the Queensland Open Data Portal
• BCC Mashup Prize
Highly Commended (4th place) National:
• Best Open Government Data Hack
20. What I learned
I’m better at defining the problem that we need to solve, rather than fitting a solution to a problem
I’m better at identifying and eliminating scope
Gained a new appreciation of how much you can learn just by “faking it”
21. What I’d do differently aka “How to win a data hack”
• Understand the datasets: they are your problem
22. What I’d do differently aka “How to win a data hack”
• Focus on a fully thought out solution NOT working code
• Don’t try to fit a solution to datasets
23. What I’d do differently aka “How to win a data hack”
• Decide on your tools in advance. Which tools you pick doesn’t matter nearly as much as that you can all contribute
• Use the video as your main communication tool: it takes the load off working code
24. More info (read: go to GovHack 2016)
GovHack 2016 is on 29-31 July. Registrations open soon.
Find out more at: www.govhack.org/
Editor's Notes
https://www.govhack.org/
If you can’t read the fine print then you’re missing out on a movie reference that will most likely be lost on those under a certain age.
A hackathon is an event where you gather a heap of resources and people, form small teams and try to deliver a fully realised solution to a set theme or problem in a short, intense period of time.
Normally a hackathon is focused on delivering working software, but in a data hackathon you work from a heap of datasets and try to deliver something of value; that can be working software, but it is often something else. For this reason, non-coders can easily participate in a data hack.
Another difference is that a hackathon normally revolves around creating some sort of business idea (be it for-profit or non-profit) and validating it.
Data hackathons are about understanding and realising value from data, and that value can often just be delivering better access to the information the data represents.
A push is being made to make govt data open to the public which has resulted in lots of datasets now being available
But to date there has been low awareness of these datasets and not much value has been generated from these resources
GovHack was formed to:
Help the departments clean up their datasets and identify issues that were stopping people using them more frequently
Increase public awareness of these resources
Try to kick-start some for profit and non-profit ventures around these resources so the govt could start to see some return from this initiative
https://www.data.gov.au/ is the home of the overall open data initiative and there are state based sites such as QLD’s https://data.qld.gov.au/
Open Knowledge Australia is the Australian arm of the global Open Knowledge network, a non-profit that supports the open data movement all over the world. They are the ones who run GovHack. http://au.okfn.org/
Only a subsection of the datasets available are selected for GovHack each year. That said there was well over a hundred for GovHack 2015 (https://www.govhack.org/2015-data/).
There’s a huge range of data in both source and composition.
Data from all levels of govt bodies
Wide range of types of datasets. Some examples are: sensor readings, satellite imagery, questionnaires, letters, images, video, financial records, GPS co-ordinates, etc.
Data available many different ways: APIs, web portals, simple flat files, queries
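If you want to poke at the portals programmatically rather than through the web interfaces, something like the sketch below can list datasets. This assumes data.gov.au exposes the standard CKAN package_search action (common for portals like this); the query term and fields are illustrative, not from the slides.
    // Search the open data portal for datasets matching a keyword.
    // Assumes the standard CKAN action API used by portals like data.gov.au.
    var url = 'https://data.gov.au/api/3/action/package_search?q=' +
              encodeURIComponent('health');

    fetch(url)
      .then(function (res) { return res.json(); })
      .then(function (body) {
        // CKAN wraps results in { success, result: { count, results: [...] } }
        body.result.results.forEach(function (dataset) {
          console.log(dataset.title);
        });
      });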
Lots of prizes are on offer. Can win more than one prize for each entry.
Best Digital Transformation Hack: Open category for how government can be brought into the 21st Century via digital services.
The intellectual property data bounty: Develop an easy way for non-experts to access and use the IP Government Open Data on data.gov.au to find out where, who and what IP exists in Australia.
Find the winners from 2015 here: https://www.govhack.org/2015-winners/
Remembrance: A website that retells the ANZAC story, both at the front and at home. https://hackerspace.govhack.org/content/remembrance
AusTrails.org: Aggregator of community-contributed trail data, run on OpenStreetMap. https://hackerspace.govhack.org/content/austrailsorg
Health Buddy: Helps you find health services matched to your needs and plan trips. Govt departments can use it to see usage stats. https://hackerspace.govhack.org/content/health-buddy
Question Time: A game of policy: 2-player game where you try to guess where an MP’s voting record lies in relation to a portfolio area. https://hackerspace.govhack.org/content/question-time-game-policy
Neuron: Mind-map of inter-connected IP data to aid discovery. Links to a purpose built IP social network. https://hackerspace.govhack.org/content/neuron-connecting-minds
Gender Equality: Data visualisation of gender equality data from the Workplace Gender Equality Act dataset using D3. https://hackerspace.govhack.org/node/542
Synergising Synergies for Sitizens: Tool in Tableau Public to work out which suburbs are best-placed for rooftop solar investment. https://hackerspace.govhack.org/content/synergising-synergies-sitizens
Do: Get familiar with the datasets and thinking about what tools you’ll use
Don’t: Start building something to take in with you. You are painting yourself into a corner to start with.
Before the event you should also make triple sure that whatever tech you are taking in is working well and has all the updates you need. A couple of our team members spent far too long getting their laptops working for these very reasons.
Whilst you shouldn’t start building something before you go in, at least have a conversation about common tools. You should get set up with at least a common messaging service and project management tool. We used Skype and Trello, but again wasted some time getting these set up at the event.
Just because you might not have decided yet what you are going to build doesn’t mean you don’t have at least a shortlist of development tools you might use. It’s a good idea to install all of these and get them working before you head in. Much easier to just not use something than it is to install and configure something for the first time at the event.
The UQ engineering block is a cathedral of wood, steel and glass. The UQ wifi is by contrast a poorly built mud hut. We struggled a lot with it over the course of the event. I ended up only being able to get it working on my phone, which I then had to tether to my laptop.
Consider backup internet options you could use before you go in and try to limit how much you will rely upon the internet at the venue.
Our main criterion was getting working code, and this was a mistake. This is more traditional hackathon thinking and didn’t fit well with the idea behind GovHack. I’ll talk more about this later on in the presentation.
Our best web developer was struck down by sickness on Friday night. We had originally started selecting ideas based on her considerable skills. In hindsight we didn’t readjust enough to take into account losing her from the team.
We ended up with two ideas:
an estimation game (based on the board game Wits & Wagers). This is an idea we’d had before the event
A map-based tool to help you assess the safety of a neighbourhood. This is an idea we came up with on Friday night.
To decide we quickly sketched out a dev plan for each to see which was most achievable. We looked at:
the time we had
the skills in the team
what the datasets we would use were like and what prizes we’d be eligible for
In the end the estimation game won out and we went to work.
The Node.js backend crashed and burned because we’d selected it based on the skills of the original team. The developer working on it was toast, so we sent him home. The rest of the team regrouped to think about what we could salvage from the wreckage. After some research we came across Google Fusion Tables. After an hour of hacking about, our remaining dev raised his hands and announced it was working. We danced a bit, then called it a day as it was really late and we were really tired.
One of the very best moments at Govhack was when our dejected dev, who’d been working on the node.js backend, came through the door and we surprised him with a working website. Even though it wasn’t his fault he had felt like he’d let the team down, so he was extra happy to see we were back in the game.
The whole team had been given a shot of energy from this brush with disaster and it carried us through the rest of the day.
We discarded many of the game elements we’d originally planned and focused instead on delivering value from the data. In hindsight this should’ve always been the focus as it was true to the data we had.
We couldn’t get the player responses to write back to our Fusion Table database, even though that functionality does exist, so we used fake response data. In hindsight it was way too optimistic to think we were going to have time to get hundreds of players through the game before submission, and not having response data earlier held up designing the results screen. Even though it was fake data we still needed to tweak it to make it believable and have it produce interesting results. Even this took a little more time than we had anticipated.
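For anyone curious what “faking it” looked like, here is a minimal sketch of generating plausible responses, assuming the correct answer is known and guesses should cluster loosely around it. The function and numbers are illustrative, not the exact generator we wrote.
    // Generate `count` fake guesses scattered around the correct answer.
    // Summing two uniform draws gives a rough bell shape, which reads as
    // more "human" than a flat uniform spread.
    function fakeResponses(correctAnswer, count, spread) {
      var responses = [];
      for (var i = 0; i < count; i++) {
        var noise = (Math.random() + Math.random() - 1) * spread;
        responses.push(Math.max(0, Math.round(correctAnswer + noise)));
      }
      return responses;
    }

    var fakePlayers = fakeResponses(42, 200, 30); // 200 guesses around 42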
We started the video late as we’d been told by the organisers not to focus on it. In reality you should start planning it on Friday. It’s one of your best tools to explain what you’ve built, and the better you can make it, the less you need working code.
How to play Wits & Wagers: https://www.youtube.com/watch?v=6FbGRahrAhI
Our entry
Hackerspace entry: http://2015.hackerspace.govhack.org/content/consensus
Website (which is now broken): http://www.consensusquiz.com/
Video: https://www.youtube.com/watch?v=Q994vWVLNRM
Google Fusion tables: https://sites.google.com/site/fusiontablestalks/stories
Data Driven Documents (D3.js): https://d3js.org/
D3.js is a really flexible and easy-to-use data visualisation tool for the web. It’s great at one thing: bespoke data vizs. It can’t be relied upon for dashboard-like functionality though.
Google Fusion Tables are even easier to set up and use, and many people use them to host the data behind their web vizs for this reason (there’s a rough sketch of wiring the two together after the links below). Now that Google have expanded their data offerings you have a wide range of free, robust, and easy-to-use options:
Google big data tools: https://cloud.google.com/products/big-data/
Google cloud data tools: https://cloud.google.com/products/storage/
There’s also a heap of other tools that you can use to get a prototype working fast. A good place to start is the resources at the The GovHack Developer Toolkit: http://govhack-toolkit.readthedocs.io/
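To make the D3-plus-Fusion-Tables setup concrete, here’s a rough sketch of reading rows into the front end, assuming the (since retired) Fusion Tables v2 SQL query endpoint and D3 v3’s callback-style d3.json. The table ID, column names and API key are placeholders, not what Consensus actually used.
    // Pull question data out of a Fusion Table with a simple SQL-style query.
    var sql = encodeURIComponent('SELECT Question, Answer FROM TABLE_ID');
    var url = 'https://www.googleapis.com/fusiontables/v2/query?sql=' + sql +
              '&key=API_KEY';

    d3.json(url, function (error, result) {
      if (error) { return console.error(error); }
      // The response carries column names plus an array of row arrays.
      var questions = result.rows.map(function (row) {
        return { question: row[0], answer: +row[1] };
      });
      console.log(questions.length + ' questions loaded');
    });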
We had originally wanted to make a great game that used datasets in an interesting way. The engagement with those datasets, we thought, would come as a result of people enjoying the game. We had this the wrong way around.
We got a much better result when we focused on how to engage people with the datasets in a way where they could understand them better, and we added game elements as a way to facilitate this process. We created another layer of data (in this case player estimations) that could help explain more fully an existing statistic.
Consensus helps players see how well their own perceptions of areas of society line up with the reality. They also get to see how their perceptions relate to other players.
For govt this is a tool they can use to see how the general public perceives areas of society.
Slider: We used a slider for these reasons:
You don’t have to come up with answers, so you can add questions much faster
You get a much more accurate measure of what people’s understanding of a statistic is. It’s not just whether they get the right answer or not; you get to peek inside their perception of a statistic and see what they think about a subject
A slider feels much closer to the feeling of estimating than selecting a value or typing one in. We still have a number come up so it’s easy to see what you have selected.
We debated about the slider start and end, and the starting position of the selector on the range. All these things can bias a player’s answer, especially if they are particularly unsure of it. In the end we had two types of ranges: whole numbers and 0-100%. The whole number range goes super high, and if we’d had time we wanted instead to hide everything but the selector and have values come up down the bottom. This way we could have different ranges, but they would’ve been hidden from the player so as not to bias them.
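As a minimal sketch of the “number comes up as you drag” behaviour, assuming a plain HTML range input rather than whatever widget we actually built (the element IDs here are made up):
    // <input id="guess" type="range" min="0" max="100" value="50">
    // <span id="guess-value"></span>
    var slider = document.getElementById('guess');
    var label = document.getElementById('guess-value');

    slider.addEventListener('input', function () {
      // Echo the selected number so the player can always see their estimate.
      label.textContent = slider.value + '%';
    });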
Your score: Originally this was to be the most important part of the game. In the revised version it plays a much smaller role. Your score is the number of players whose guesses were further from the answer than yours.
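In code, that scoring rule comes down to counting the players you beat; this is just an illustrative sketch, not our actual implementation.
    // Score = how many other players were further from the correct answer.
    function score(playerGuess, otherGuesses, correctAnswer) {
      var myDistance = Math.abs(playerGuess - correctAnswer);
      return otherGuesses.filter(function (g) {
        return Math.abs(g - correctAnswer) > myDistance;
      }).length;
    }

    // Example: correct answer 42, you guessed 40, so you beat three of the four others.
    console.log(score(40, [10, 55, 41, 90], 42)); // -> 3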
“How close your answer was” bar: In the results screen we double up on the information about how close you were (the same info is in the histogram at the bottom), but we do so because how close you were is really important to visualise for the player so they start to get an idea of how their knowledge relates to the actual statistic.
It’s a really simple visualisation with a bar which is the same width as the histogram at the bottom, but it strips away all the other data and just gives the distance to the player so they can clearly see it. To reinforce this we also added a label to qualify how close they got to the answer.
Question scrolling text: As we were running out of screen real estate, and some of the questions were quite long, we put the question into a scrolling text box. It’s a hacky solution, but it gives a “breaking news” feel to the results screen.
The histogram: We have already shown the player how close they were to the correct answer and we show that again in the histogram but now we add in all the other player answers to the mix.
Here we want the player to focus on where their answer falls in relation to all the other answers. We help them by highlighting the range their answer falls within either side of the correct answer, and as a result we also highlight all those that fell outside that range.
We tell the player how many players they did better than. We added a random descriptive stat to give another insight into the histogram they are looking at.
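A small sketch of that highlight logic, assuming the highlighted band either side of the correct answer is defined by the player’s own distance from it (our real D3 code isn’t reproduced here):
    // Split all guesses into those as close (or closer) than the player, and the rest.
    function splitByHighlight(allGuesses, playerGuess, correctAnswer) {
      var playerDistance = Math.abs(playerGuess - correctAnswer);
      return {
        inside: allGuesses.filter(function (g) {
          return Math.abs(g - correctAnswer) <= playerDistance;
        }),
        outside: allGuesses.filter(function (g) {
          return Math.abs(g - correctAnswer) > playerDistance;
        })
      };
    }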
We were impressed that many of the people we got to try the game after the event, even though we told them it was faked response data beforehand, ended up thinking it was real responses when they got to the results screen. That really validated for us the design decisions we had taken.
If we’d had more time we would’ve liked to have had the option for players to see how their answers fared across their region, state, and nationally. We’d also have liked to add separate categories for players to estimate questions in.
Here’s Dale and me receiving the local and state awards.
Here are all the winners: https://www.govhack.org/2015-winners/
I got a much better understanding of what defining a problem is and how to identify and eliminate scope.
I gained a much better appreciation of how far you can get “faking it” before you have to actually get something working.
How Paypal and Reddit faked their way to traction
https://medium.com/platform-thinking/how-paypal-and-reddit-faked-their-way-to-traction-9411fb583205#.y8yyoli0d
You’re there to realise value from the datasets the govt has. They are valuable data.
If you understand the value intrinsic to the datasets you then can much more easily match this to a data product which is valuable to someone else.
Also, what a dataset is like heavily influences what you need to know to be able to work with it. Ruling out datasets is more important than selecting ones you’d like to work with. There is no chance you won’t have at least a couple of datasets you can work with.
Don’t be fooled into thinking you won’t be able to come up with anything. You’ll have more than enough ideas that will be achievable. Better to spend the time finding the solution that delivers meaningful value, then working out a way to quickly prototype it, than searching through hundreds of datasets trying to find something that fits into what you have already built or decided to build.
The real fun is finding a way to hack together a great solution and that’s where you use your creativity.
You’re not deploying production code, so just decide on tools everyone can contribute with and get set up before the event. Too much time is wasted trying to pick the “perfect” tool, time that could be spent coming up with a kick-ass solution.
If you don’t have to have it working perfectly that opens up heaps of options for you to hack together something to show the functionality via the video. The video helps you be more creative and innovative and attempt more ambitious solutions.
https://www.youtube.com/watch?v=Q994vWVLNRM