Data Models And Details About Open Data

Datasets, Data Models, and Useful
Facts In Drawing Conclusions
If there's a delay and you have your phone, keep in mind that
Facebook has a language setting called English (Pirate).
Banging your head against a wall burns 150 calories an hour.
By Michael Bostwick

 You've had some experience with technology.
 We can't cover it all, so be sure to read the manual.
 Google is a great search engine!
“Never memorize something that you can look up.”
― Albert Einstein
Feel free to ask questions or share insight.
Assumptions

Introduction
Data Sources
Example Models
Mistakes Using Data
Some Tools And Solutions
Overview

The study of where
information comes
from, what it
represents and how it
can be turned into a
valuable resource in
the creation of
business and IT
strategies.
Data Science

 The Great Chain Of Being By Plato
 Over time we moved to the Porphyrian tree
 An example is someone's family tree
History Of Human Knowledge
Described in more detail
very well in a TED Talk by
Manuel Lima

More and More things are based
on Networks
http://cpan-explorer.org/
Networks Metaphor

From a TED Talk by Eric Berlow and Sean Gourley
(this model expands to show connections; it processed language)
TED Talk Map

More Maps At socialexplorer.com
Mapping Of Social Ties
From a TED Talk by Dave Troy about social mapping

 kdnuggets.com listing (an industry site)
 caesar0301 awesome public datasets
 data.gov
 code for kansas city github wiki
 reddit.com datasets
 blog.visual.ly data sources
 OpenStreetMap data
 opensecrets.org a watchdog group
 wiki's linked data
 Wikidata, a NoSQL listing of different relationships
 trulia local map
Data Sets

 Freedom of Information Act (FOIA)
 Missouri Sunshine Law
 Many states have similar laws.
 We the people petitions
 Media and Other Public Posting Options
 Investigative journalism postings
 craigslist
Getting More And New Data

A theory that seeks
to explain how,
why, and at what
rate new ideas and
technology spread
through cultures.
Diffusion Of Innovation

 Innovation
 Any idea, practice, or object that is perceived as new by an
individual or other unit of adoption could be considered an
innovation available for study.
 Adopters
 In most cases adopters are individuals, but they can also be
organizations
 Communication channels
 Time
 Social system
Components Of Diffusion Of
Innovations

 Knowledge
 The individual's first exposure to the innovation
 Persuasion
 The individual is interested in the innovation and actively seeks related information/details.
 Decision
 The individual takes the concept of the change and weighs the advantages/disadvantages of
using the innovation and decides whether to adopt or reject the innovation.
 This stage is very hard to get metrics on.
 Implementation
 The individual uses the innovation; how much and where depends on the situation. During this stage the individual also
determines the usefulness of the innovation and may search for further information about it.
 Confirmation
 The individual finalizes his/her decision to continue using the innovation. This stage is both intrapersonal
(may cause cognitive dissonance) and interpersonal, confirming that the group has made the right
decision.
Adoption Process For Diffusion
Of Innovation
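
As a rough sketch, the five stages above can be treated as an ordered sequence that an individual moves through. The `Stage` enum and `advance` helper are hypothetical names introduced only for illustration, and the sketch assumes the individual keeps adopting; it does not model rejection at the Decision stage.

```python
from enum import IntEnum


class Stage(IntEnum):
    """The five adoption stages, in the order listed on the slide."""
    KNOWLEDGE = 1
    PERSUASION = 2
    DECISION = 3
    IMPLEMENTATION = 4
    CONFIRMATION = 5


def advance(stage: Stage) -> Stage:
    """Move to the next stage; CONFIRMATION is treated as the final stage."""
    if stage is Stage.CONFIRMATION:
        return stage
    return Stage(stage + 1)


# Walk one hypothetical adopter through the whole process.
path = [Stage.KNOWLEDGE]
while path[-1] is not Stage.CONFIRMATION:
    path.append(advance(path[-1]))
```

Using an ordered enum makes it easy to count how many people sit at each stage, although, as the slide notes, the Decision stage is hard to measure in practice.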

 We heard about it on the news
 Officials began to push the use of it
 Some people couldn't get it to work, so they contacted
their friends and family
 Issues made some people upset
 Now it's working, and lots of people use it
 Some people still just use their work insurance
Example Change.gov

We all should remember this from science class
Empirical Research Model

 The model describes the meaning of its instances.
 A conceptual schema is a high-level description of a
business's informational needs.
 Normally described with entity-relationship (ER)
diagrams
 Common in software development
Semantic Data Model
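
A minimal sketch of how a conceptual schema might look once it is written down in code: two entities and a one-to-many relationship between them. The `Customer` and `Order` entities and all of their fields are hypothetical, chosen only to illustrate the idea of entities and relationships.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Order:
    """Hypothetical entity: one purchase made by a customer."""
    order_id: int
    total: float


@dataclass
class Customer:
    """Hypothetical entity: a customer places zero or more orders."""
    customer_id: int
    name: str
    # One-to-many relationship: a Customer "has many" Orders.
    orders: List[Order] = field(default_factory=list)

    def lifetime_value(self) -> float:
        # Sum over the related Order instances.
        return sum(order.total for order in self.orders)


alice = Customer(customer_id=1, name="Alice")
alice.orders.append(Order(order_id=100, total=25.0))
alice.orders.append(Order(order_id=101, total=17.5))
```

In an ER diagram the same structure would appear as two boxes joined by a line marked one-to-many.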

 A very good explanation is given
on TED
 The Math is lengthy
 Cities are very hard to kill
 All companies die
 Companies and cities are like animals
 Growth is linked to population
 The growth of population is linked to
other factors
Geoffrey West Urban Scaling Model

 Pace of life decreases
as you get bigger
 Double the size of a
city and you get a 15%
increase in things
Growth Is Hard To Sustain
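
The 15% rule of thumb above compounds with each doubling. The sketch below reads the slide literally and assumes the gain applies to a per-person quantity and repeats with every doubling of population. Both are assumptions made for illustration, and `per_capita_multiplier` is a hypothetical helper name.

```python
import math

GAIN_PER_DOUBLING = 0.15  # the 15% figure from the slide


def per_capita_multiplier(small_pop: float, large_pop: float,
                          gain: float = GAIN_PER_DOUBLING) -> float:
    """Multiplier on a per-person quantity when going from small_pop to large_pop.

    Assumes the gain compounds once per doubling of population.
    """
    doublings = math.log2(large_pop / small_pop)
    return (1.0 + gain) ** doublings


# Made-up city sizes: one doubling, then two doublings.
one_doubling = per_capita_multiplier(100_000, 200_000)
two_doublings = per_capita_multiplier(100_000, 400_000)
```

Under these assumptions a city four times the size sees roughly a 32% increase rather than 30%, because the gain compounds.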

 To avoid collapse, major
innovations must take place
 This creates cycles of innovation:
you have to keep
innovating or collapse
Cycles Of Innovation

Please Feel Free To Share Models Or
Patterns
Anti-patterns, approaches that
shouldn’t be taken, are also welcome!
Open Invitation To Share Models

 UML
 Unified Modeling Language
 PlantUML
 draw.io
 Different subjects have their own models!
 Algorithms were excluded
 Open MIT Courses
 Khan Academy Algorithms
Making New Models

If we have data, let’s look at data. If all we have
are opinions, let’s go with mine.
– Jim Barksdale, former Netscape CEO
Torture the data, and it will confess to anything.
– Ronald Coase, Nobel Prize laureate in Economics
Data Can Be BAD!

 Business Failures
 Systemic Failure
 Loss Of Trust
 An example is the bad data used in the
autism drug study
Possible Harms

 False Correlation
 Emotional Appeal
 Question demographics
 Look at relationships instead
 Confirmation Bias
 Post hoc ergo propter hoc
 (after this, therefore because of this)
 Something can happen after an event without being caused by it
Common Mistakes
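
To make the false-correlation mistake concrete, here is a small sketch showing that two unrelated quantities that both grow over time can have a very high correlation coefficient. All of the figures are made up for illustration, and `pearson` is a plain implementation of the standard Pearson coefficient.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up yearly figures for two unrelated things that both simply grew.
smartphone_sales = [10, 14, 19, 23, 30]   # hypothetical
organic_food_sales = [5, 6, 8, 9, 11]     # hypothetical

r = pearson(smartphone_sales, organic_food_sales)
```

The coefficient comes out close to 1 even though neither series causes the other; the shared upward trend over time is doing all the work.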

 Show your work so others can see the
mistakes
 Showing your work also shows what you
might have missed
 Data has limits
 Data doesn't create meaning, people do
Use Critical Thinking

 Should get its own talk
 Good Starting Tour of M.I.
 Microsoft has a very cool cloud framework
out
 Python and R are very common tools for
machine learning
 Machine learning is great for making
predictive calls
Machine Learning
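
As a minimal sketch of a "predictive call", the example below fits a straight line by ordinary least squares and uses it to predict an unseen value. This is the simplest possible stand-in for a machine learning model, not what any particular framework does, and the data and the names `fit_line` and `predict` are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: hours of usage vs. support tickets.
hours = [1, 2, 3, 4]
tickets = [3, 5, 7, 9]   # constructed as 2 * hours + 1

model = fit_line(hours, tickets)
forecast = predict(model, 5)
```

Real projects would normally reach for an existing Python or R library instead of hand-written fitting code; the point here is only the train-then-predict shape.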

 Some databases have this feature built in!
 QGIS
 An open source GIS viewer
 ArcGIS
 A very common GIS application that requires a license
 Google Earth
 KML Files
 OpenStreetMap
 Accepts user-submitted information
 Also allows downloading all of its data
Spatial Mapping
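
Since KML files come up above, here is a minimal sketch that writes a one-point KML document using only Python's standard library. The `placemark_kml` helper is a hypothetical name and the coordinates are approximate, for illustration only; the namespace is the KML 2.2 one.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def placemark_kml(name: str, lon: float, lat: float) -> str:
    """Return a minimal KML document containing one point placemark."""
    ET.register_namespace("", KML_NS)  # serialize KML as the default namespace
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    pm = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
    ET.SubElement(pm, f"{{{KML_NS}}}name").text = name
    point = ET.SubElement(pm, f"{{{KML_NS}}}Point")
    # KML writes coordinates as longitude,latitude (not latitude,longitude).
    ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


# Approximate location, used only as sample input.
kml_text = placemark_kml("Kansas City", -94.5786, 39.0997)
```

Saving `kml_text` to a `.kml` file should give something that Google Earth or QGIS can open as a single point.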

 Relational Database Management System
 ACID (Atomicity, Consistency, Isolation, Durability)
 Commonly SQL
 Structured Query Language
 Most RDBMSs have a bulk loading tool
 Lots Of Examples
 Oracle
 MS SQL Server
 PostgreSQL
RDBMS
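
A small sketch of two of the points above, bulk loading and the atomicity part of ACID. It uses SQLite only because it ships with Python, not because the slides name it; the `city` table and the population figures are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE city (name TEXT PRIMARY KEY, population INTEGER NOT NULL)"
)

# Bulk load: executemany inserts many rows in one call.
rows = [("Kansas City", 500000), ("St. Louis", 300000)]  # made-up figures
with conn:  # commits on success, rolls back if an exception is raised
    conn.executemany("INSERT INTO city VALUES (?, ?)", rows)

# Atomicity: the second insert violates the primary key, so the whole
# transaction, including the first insert, is rolled back.
try:
    with conn:
        conn.execute("INSERT INTO city VALUES (?, ?)", ("Columbia", 120000))
        conn.execute("INSERT INTO city VALUES (?, ?)", ("Kansas City", 1))
except sqlite3.IntegrityError:
    pass

count = conn.execute("SELECT COUNT(*) FROM city").fetchone()[0]
names = [r[0] for r in conn.execute("SELECT name FROM city ORDER BY name")]
```

After the failed transaction the table still holds only the two bulk-loaded rows, which is the all-or-nothing behavior that atomicity promises.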

 These databases often duplicate data
 They're great for large data sets
 Big Data work is normally done with them
 They're great for distributed information
 They can scale out to handle very large volumes of data
 Lots of Open Source Options
 MongoDB
 CouchDB
Non Relational Databases
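
The data duplication mentioned above can be sketched with plain Python dictionaries standing in for documents. No real database client is used here; the order documents, their fields, and the `move_customer` helper are all hypothetical.

```python
import json

# Two hypothetical order documents. The customer's details are copied into
# each one instead of being stored once and joined, as an RDBMS would do.
orders = [
    {"_id": 1, "item": "map", "customer": {"name": "Alice", "city": "Kansas City"}},
    {"_id": 2, "item": "atlas", "customer": {"name": "Alice", "city": "Kansas City"}},
]

# Reading one order needs no join: everything is in the document.
first_order_city = orders[0]["customer"]["city"]


def move_customer(docs, name, new_city):
    """The cost of duplication: an update must touch every copy."""
    changed = 0
    for doc in docs:
        if doc["customer"]["name"] == name:
            doc["customer"]["city"] = new_city
            changed += 1
    return changed


updated = move_customer(orders, "Alice", "St. Louis")
serialized = json.dumps(orders[0])  # documents are typically stored as JSON-like text
```

The trade-off is faster, self-contained reads in exchange for more work, and more room for inconsistency, on writes.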

 R
 Python
 Perl
 Bash
 Windows PowerShell
 PHP and JavaScript
Scripting Languages

 C#
 C++ (and C)
 Java
There are a lot of solutions on GitHub and SourceForge
that can be changed and expanded.
Some are frameworks, like the Eclipse frameworks.
Compiled Programming
Languages

Blender
Molecular flipbook
YouTube
Khan Academy
Amazon Web Services
Google's BigQuery
Other Tools
