+
Open Data
500 Study
Contributed by SupStat Inc.
Data Scientist Team
+
Outline
 Open Data Now
 Open Data 500
 List of Cities Which Have Open Data Portals
 Companies: including
10 companies in NY
6 companies in the United States excluding NY
5 companies in the U.K.
1 company in Shanghai, China
1 company in Taiwan, China
+
Open Data Now
“Today’s Open Data
revolution is rapidly leading
us into new territory.”
“Open Data is becoming a
secret to success for smart
business leaders around
the world.”
+
Open Data Now
 “Open Data can best be described as accessible
public data that people, companies, and
organizations can use to launch new ventures,
analyze patterns and trends, and solve complex
problems.”
 Given the similarities and differences between Big
Data and Open Data, it is clear that an
introduction to Big Data cannot stand in for a
description of Open Data.
+
Open Data Now
 “This book describes the business applications of
Open Data with examples from dozens of
companies.”
 This book reflects the vision, insights, knowledge,
and advice of the leaders in this field who have been
interviewed by the author.
 This book also “contains several resources to help
readers explore the possibilities of Open Data”.
+
Open Data 500
 Open Data 500 is an initiative to research the U.S. companies
and organizations that make use of Open Data published by
government in an innovative way to develop new businesses.
 The research is funded by the Knight Foundation, a foundation
that aims to promote journalism and media innovation, advance
community engagement, and foster the arts.
 The Governance Lab ("GovLab"), located at New York University, is
responsible for conducting the research. GovLab is a platform
that brings together innovators from different backgrounds to
collaboratively seek new technology-based solutions for better
governance.
 OpenData500.com is the website that publishes research
outcomes and collects new information about such companies.
+
Open Data 500
 Three established goals of the Open Data 500 research
a. Provide a basis for assessing the economic value of
government open data
b. Encourage the development of new open data
companies
c. Foster a dialogue between government and business
on how government data can be made more useful
+
Open Data 500
 Blueprint
a. Domestically, initiate a roundtable series that involves both
government (the Open Data providers) and businesses and
organizations (the Open Data users) in discussing
potential improvements to Open Data.
b. Internationally, cooperate with international
organizations and governments of other countries to
replicate the U.S. paradigm worldwide.
+
Open Data 500
 Strategies for the ongoing incorporation of other companies
that use Open Data
a. Outreach campaign: through mass emails, professional
meetings, and various social media
b. Expert advice: through industry practitioners and lab
advisors
c. Research: through other sources of identification, such as the
Online Open Data Userbase
+
Open Data 500
 OpenData500.com can filter the recorded companies by
industry, by state, and by the agency that supplies their
data.
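The three filters described above can be sketched in a few lines of Python. This is a minimal illustration only, not the site's actual implementation: the `Company` record, the `filter_companies` function, and the industry labels are all assumptions made for the example. The states and agencies are taken from the company slides in this deck.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Company:
    """Hypothetical record; the field names are assumptions for this sketch."""
    name: str
    industry: str
    state: str
    agencies: Tuple[str, ...]


def filter_companies(
    companies: List[Company],
    industry: Optional[str] = None,
    state: Optional[str] = None,
    agency: Optional[str] = None,
) -> List[Company]:
    """Keep the companies that match every criterion that was supplied."""
    return [
        c for c in companies
        if (industry is None or c.industry == industry)
        and (state is None or c.state == state)
        and (agency is None or agency in c.agencies)
    ]


# Illustrative sample; the industry labels are made up for the example.
SAMPLE = [
    Company("Calcbench", "Finance", "NY", ("SEC",)),
    Company("Zillow", "Real estate", "WA",
            ("Census Bureau", "Bureau of Labor Statistics")),
    Company("GetRaised", "Employment", "NY",
            ("Bureau of Labor Statistics",)),
]

ny_companies = filter_companies(SAMPLE, state="NY")
bls_users = filter_companies(SAMPLE, agency="Bureau of Labor Statistics")
```

Leaving a criterion as `None` means "do not filter on this field", so the same function covers any combination of the three filters.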
+
What we will cover
 Company’s background
 Data sources
 What do they do with the data
 How do they serve the clients with the data
 Profit model
 Awards/current condition
+
Companies to be introduced
NY: Enigma, ZocDoc, Honest Buildings, Capital Cube, Bloomberg, Aidin, Calcbench, Consumer Reports, GetRaised, Palantir
US: Archimedes Inc., SigFig, PsychSignal, CARFAX, Zillow, Brightscope
UK: Mastodon C, Locatable, OpenCorporates, DueDil, CarbonCulture
China: Big Data Bureau (Shanghai), Open Data Alliance (Taiwan)
+
Enigma.io
company’s background
 The origin of the company’s name:
“Both in honor of the code-breaking machine developed
by computer pioneer Alan Turing during World War II and
because they were finding that too much public data was
more enigmatic than it should be.”
 Team members and size:
Now a little over a dozen people, who are all based in
New York.
+
Enigma.io
data sources
 Their datasets included both government data and
data from clients, such as Nike’s public list of its
suppliers.
 Their data deluge had grown to 100,000 datasets
and more than 20 billion individual data points.
+
Enigma.io
what do they do with the data
 Take valuable public data and make it much more
usable.
 Make it possible to search through the entire
dataset at a rapid rate.
 “Develop a robust data resource and an impressive
set of tools”.
+
Enigma.io
how do they serve the clients with the
data
 Help clients find the publicly-accessible information
that is relevant to their corporations or economic
accessibility.
 Take data in all kinds of formats and put it into
an easily usable form covering entire datasets.
 Turn public sector information into Open Data.
+
Enigma.io
profit model
 For now, Enigma’s revenue comes mainly from charging
for access to data, with higher fees for
hedge funds and lower fees for academics, nonprofits, and
government agencies.
 Eventually, Enigma wants to make its data
accessible to the public and earn its profit from analytics
and other premium services.
+
Enigma.io
awards
+
ZocDoc
company’s background
 Founded in 2007 with a mission of improving access
to healthcare, ZocDoc is a free service that allows
patients to find a nearby doctor or dentist.
 Cyrus Massoumi, CEO of ZocDoc said:
"After I ruptured my eardrum on a flight, I couldn't find a
doctor for four days. I knew that there had to be an easier
way for patients to find doctors. That was when I had the
idea for ZocDoc."
+
ZocDoc
data sources
 They collect information for patients to share with
health providers.
 At the same time, they obtain information from doctors,
such as location, specialty, insurance preferences,
and patient reviews, to assist patients with decision
making.
+
ZocDoc
what do they do with the data
 Encrypt data to the same standards that banks use
to safeguard your financial information.
 Constantly analyze data to better understand who
uses ZocDoc and how they can improve it.
+
ZocDoc
how do they serve the clients with the
data
 Help find nearby doctors and dentists who accept
their insurance, see their real-time availability, and
instantly book an appointment via ZocDoc.com or
ZocDoc’s free apps for iPhone or Android.
 Guarantee that patients get access to care within
24 to 72 hours.
+
ZocDoc
profit model
 The healthcare providers who partner with ZocDoc
pay a subscription fee for ZocDoc's service, since
it helps increase the efficiency of their practices.
+
ZocDoc
awards
 Time Magazine 50 Best Websites 2012
 Business Insider App 100: World's Greatest Apps 2013
 Fortune Magazine Best Small & Medium Companies
2012, 2013
 Crain’s Best Places to Work in NYC 2010, 2011, 2012,
2013
 Modern Healthcare Best Places to Work 2011, 2012, 2013
 Arizona Business Magazine Most Admired Companies
2013
+
Honest Buildings
company’s background
 Founded in 2011, Honest Buildings now has around
30 employees.
 Honest Buildings is a commercial real estate
marketplace that connects top building professionals
with building owners, decision makers and project
managers.
 It is a social media application for real estate.
+
Honest Buildings
data sources
 Honest Buildings collects information that is posted
by building professionals, owners, tenants, and other
stakeholders.
+
Honest Buildings
what do they do with the data
 “Big data is great, but you need to do something with
it. Platforms that contextualize data are in great
demand.” - HonestBuildings.com
 Honest Buildings collects and posts all sorts of
relevant data that is hard for users to find, such as
energy costs and walkability.
+
Honest Buildings
how do they serve the clients with the
data
 Launched a matching service that gives service
providers step-by-step guidance on winning a matched
project:
a. Register your company
b. Build a portfolio
c. Get found, add tags & connections
d. Win new business.
+
Honest Buildings
profit model
 Honest Buildings provides matching services to
developers for free.
 It charges candidates, i.e. builders and contractors,
a fee to be matched with developers.
+
Honest Buildings
awards
 Grand Prize for Vrban, Disrupt NY 2014 Hackathon
+
Capital Cube
company’s background
 AnalytixInsight was founded in 2010, with about 10
employees.
 As the online portal of AnalytixInsight,
CapitalCube.com is a global investor portal for
comprehensive company analysis.
 It offers on-demand fundamental research, portfolio
evaluation, and screening tools covering over 40,000
global equities.
+
Capital Cube
data sources
 Securities and Exchange Commission (SEC)
 Third-party data providers
a. FactSet
b. Thomson Reuters
c. Capital IQ
+
Capital Cube
what do they do with the data
 Develop software that captures data on 40,000 public
companies operating internationally on a daily basis.
 Transform the data into written reports and graphics
that investors can use to compare companies and
make investment decisions.
+
Capital Cube
how do they serve the clients with the
data
 Investment tools
 Analytical service
a. Dividend analysis
b. Earnings quality analysis
 Strengths
a. Broad coverage of 40,000 companies
b. Daily updates
+
Capital Cube
profit model
 License its content and capabilities to
distribution partners. Two major deals of this kind are with:
a. Samsung: every time Samsung users download
CapitalCube’s quote, AnalytixInsight gets revenue
share.
b. Dow Jones: AnalytixInsight gets paid for every page
view of its content that is posted on Dow Jones’s
published pages.
+
Capital Cube
awards
 Oct. 28, 2013: recognized by the White House Office of
Science and Technology Policy (OSTP) and the
Science and Technology Policy Institute (STPI) as a
company using federal open data in innovative
and exciting ways.
+
Bloomberg
company’s background
 Founded in 1982.
 A global leader in business and financial information and
news, Bloomberg gives influential decision makers a
critical edge by connecting them to a dynamic
network of information, people and ideas.
 Strength – delivering data, news and analytics
through innovative technology, quickly and
accurately.
+
Bloomberg
data sources
 Large public corporations’ reports about their
sustainability data.
+
Bloomberg
what do they do with the data
 ESG (Environmental, Social, and Governance)
data is fully integrated into all of the
Bloomberg Terminal’s analytics.
+
Bloomberg
how do they serve the clients with the
data
 A large proportion of public companies now report
their sustainability data, a set of metrics developed by
the Global Reporting Initiative, to the public. Investors
increasingly care about this data.
+
Bloomberg
profit model
 As a data-driven media company, Bloomberg has
journalists around the world gathering real-time
financial and business information.
 The company operates over 300,000 terminals
that receive the financial data it
releases.
 This business generated an estimated 6.3 billion US
dollars in 2008.
+
Bloomberg
awards
 Bloomberg is on track for record revenues of $8.3
billion in 2013 and profits of about $2.7 billion
+
Aidin
company’s background
 Aidin was founded in 2011 and has fewer
than 10 employees.
 The company was inspired by the difficulties that the
founder’s family faced in handling the many small tasks
involved when a family member was being discharged from
hospital.
 Its mission is to bring transparency to healthcare
facilities and provide patients with the information
they need when making healthcare-related decisions.
+
Aidin
data sources
 Government Open Data.
 Patients’ reviews of hospitals.
 “First with weather and GPS data and now with
health data, the U.S. government has defined its
responsibility as defining, gathering, and
presenting data on important subjects in easily
usable forms.”
+
Aidin
what do they do with the data
+
Aidin
how do they serve the clients with the
data
 Aidin provides post-acute
healthcare facilities with
data to improve their
services.
 Patients can also choose
facilities based on Aidin’s
information.
+
Aidin
profit model
 Aidin profits from its products, including the full
Aidin solution, Aidin Lite, and Aidin Provider.
+
Aidin
awards
 Grand Prize Winner, GE's Hospital Quest
competition.
+
Calcbench
company’s background
 Founded in 2011, Calcbench is the first company of
its kind to fully harness the power of the new
government-mandated data standard XBRL
(Extensible Business Reporting Language), yielding
an unprecedented direct line into the SEC’s
corporate financial data repository.
 Calcbench currently has fewer than 10 employees.
+
Calcbench
data sources
 XBRL is a freely available and global standard for
exchanging business information
 XBRL reports
+
Calcbench
what do they do with the data
 Calcbench processes publicly accessible but
hard-to-use XBRL (Extensible Business Reporting
Language) data into more detailed and insightful
data, thus adding value to the data.
+
Calcbench
how do they serve the clients with the
data
 Calcbench processes the raw data that the government
requires companies to file in XBRL and turns it
into usable information for the financial industry.
 Calcbench also uses technology to help the SEC
improve data quality, for example through error
identification and correction.
+
Calcbench
profit model
 Calcbench serves corporate reporting professionals,
corporate finance leaders, auditors, investment
researchers, and academics.
 It generates revenue by charging fees for its service
and by offering a premium suite.
+
Calcbench
awards
 Grand Prize Winner, XBRL Challenge at the XBRL
and Financial Analysis Technology Conference,
February 29, 2012.
+
Consumer Reports
company’s background
 Founded in 1936, Consumer Reports is an expert,
independent, nonprofit organization whose mission
is to work for a fair, just, and safe marketplace for all
consumers and to empower consumers to protect
themselves.
 It has over 500 employees.
+
Consumer Reports
data sources
 Hospital Compare, a federal website
 Data from various states in the U.S.
+
Consumer Reports
what do they do with the data
 Consumer Reports produces safety ratings for more
than two thousand hospitals, based on Open Data
sources.
+
Consumer Reports
how do they serve the clients with the
data
 Improve American people’s lives by using publicly
accessible data.
 Consumer Reports’ safety ratings can help
customers decide which hospital to
go to.
+
Consumer Reports
profit model
 Consumer Reports is a non-profit organization. It
neither accepts advertisements nor has
shareholders, and depends solely on subscription
fees.
+
Consumer Reports
awards
 Sigma Delta Chi Award for Public Service, Society of
Professional Journalists, 2008
 Honorable Mention, National Press Club for Caution:
The Secret Score Behind Your Auto Insurance, 2007
 People’s Voice Award, International Academy of
Digital Arts and Sciences, 2006
 Golden Triangle Award, American Academy of
Dermatology, 2005
+
GetRaised
company’s background
 GetRaised was founded in 2010 and now has fewer than 10
employees.
 GetRaised aspires to be: a bunch of complicated
data that is hidden behind a very simple, easy-to-use
interface that can help narrow the wage gap and
help people get paid more.
+
GetRaised
data sources
 Data from the U.S. Bureau of Labor Statistics
 Users’ input
 Various online job postings
+
GetRaised
what do they do with the data
 GetRaised uses the collected data to develop a
salary engine.
 It creates raise requests based on analysis by
experts from related fields, such as HR and research
institutes.
+
GetRaised
how do they serve the clients with the
data
 GetRaised tells customers how much they should be
paid, whether they are underpaid, and whether they
should ask for an increase in salary.
 A significant proportion of the women who requested a
raise based on information from GetRaised
eventually got one.
+
GetRaised
profit model
 As a non-profit, GetRaised provides free service to
users.
 It receives support from two organizations and four
individuals.
+
GetRaised
awards
 According to Dave Clarke, a communications
strategist, “81% of women that have used GetRaised
have, in fact, earned a raise. The average raise
amount across all users (male and female) is
$6,473”.
+
Palantir
company’s background
 Palantir was founded in 2004 and has over 500
employees today.
 It develops software with which professionals from
different industries and sectors can analyze
massive, disparate data.
 Its products and services help combat terrorism,
prosecute crimes, fight fraud, and eliminate
waste.
+
Palantir
data sources
 Government Open Data
+
Palantir
what do they do with the data
 Users enter various types of data into
Palantir’s software, which generates reports and
analysis that they can understand directly. It is not
only artificial intelligence but also what is called
intelligence augmentation.
+
Palantir
how do they serve the clients with the
data
 Help government analyze its open data to
reveal government expenditure and possible flaws.
 Combat human trafficking by analyzing data.
+
Palantir
profit model
 Investment from shareholders
 Sale of software
 Reported to be valued at $9 billion by the end of
2013
+
Palantir
awards
 VAST2009 Visual Analytics Award, 2009
 Hall of Innovation Award, J. P. Morgan Chase
Technology Innovation Symposium, October 2010
+
Archimedes Inc.
company’s background
 Began as part of Kaiser Permanente in 1993 and
split off as a separate company in 2006.
 It has decades of experience in developing
algorithms and predictive models for healthcare.
+
Archimedes Inc.
data sources
 Centers for Medicare & Medicaid Services (CMS).
 Databases of clinical trials.
 National Health and Nutrition Examination Survey.
+
Archimedes Inc.
what do they do with the data
 They use the Archimedes Model as their core tool to
analyze data.
 Archimedes scientists have analyzed a wide variety of
relevant data to derive hundreds of equations that
represent the effects of multiple diseases, tests,
and treatments.
 The equations are integrated into a single, large-
scale simulation model using object-oriented
programming.
+
Archimedes Inc.
how do they serve the clients with the
data
 They provide a suite of online healthcare simulation
and analytics tools (ARCHeS) that answers
questions about the health outcomes and
economic effects of different interventions.
 They also have a product (IndiGO) that generates
individual guidelines that identify and help prioritize
the best preventive care for each patient.
+
Archimedes Inc.
profit model
 They make a profit by selling their products:
ARCHeS (Archimedes Healthcare Simulator) and
IndiGO (Individualized Guidelines and Outcomes).
 They also provide modeling and consulting services,
including modeling diseases and new interventions
and analyzing different prevention strategies.
+
Archimedes Inc.
awards
 Jun 6, 2012: IndiGO Receives Best of Care
Applications Award at the 2012 Health Data
Initiative III.
+
SigFig
company’s background
+
SigFig
data sources
 Historical fundamentals & Price data
 Data on load fees comes from Lipper
 Portions of their advice rely on third-party market
data from companies like Lipper, Thomson Reuters,
Interactive Data, and Xignite.
+
SigFig
what do they do with the data
+
SigFig
how do they serve the clients with the
data
 A single Portfolio Dashboard
+
SigFig
profit model
 “The company doesn’t charge a management fee.
Instead, it earns its revenue through publishing
arrangements with several websites and referral
fees when users go to a new broker.”
+
SigFig
awards
+
PsychSignal
company’s background
 Launched by SmogFarm, another startup that
analyzes large open datasets.
+
PsychSignal
data sources
 Data is pulled from social media conversations.
+
PsychSignal
what do they do with the data
 Scour the online conversation looking for distinct
psychological expressions of emotion or attitude.
 Aggregate millions of expressions to arrive at a
picture of crowd mood in real time.
 Built an advanced proprietary sentiment engine.
 Use NASA developed signal processing.
+
PsychSignal
how do they serve the clients with the
data
+
PsychSignal
profit model
 NASA recently open sourced an algorithm our
engineers found particularly useful.
 Day to day sentiment is extremely noisy, making it
hard to determine subtle changes in trending
sentiment data.
 These NASA algorithms help uncover the trend
beneath the noise.
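The idea of uncovering a trend beneath noisy day-to-day sentiment can be illustrated with a simple trailing moving average. This is only a stand-in: the slides do not say which NASA algorithm PsychSignal uses, so the `moving_average` function and the sample values below are assumptions made for illustration.

```python
from typing import List


def moving_average(values: List[float], window: int) -> List[float]:
    """Trailing mean over the last `window` points (fewer at the start)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the point that just left the window.
            running -= values[i - window]
        smoothed.append(running / min(i + 1, window))
    return smoothed


# Made-up daily sentiment scores, not real PsychSignal data.
daily_sentiment = [1.0, 3.0, 2.0, 6.0, 4.0]
trend = moving_average(daily_sentiment, window=2)
```

A wider window smooths more aggressively but reacts more slowly to real shifts in sentiment, which is the basic trade-off any noise-reduction method has to manage.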
+
PsychSignal
awards
+
CARFAX
company’s background
 Provide vehicle history information, used by
millions of consumers each year.
 Receive millions of visitors each month.
 Have thousands of auto dealer subscribers
nationwide.
+
CARFAX
data sources
 U.S. motor vehicle agencies
 Canadian provincial motor vehicle agencies
 Auto auctions
 Collision repair facilities
 Service/ maintenance facilities
 Insurance companies
 Salvage auctions
 Automotive recyclers
 Rental/fleet vehicle companies
 State inspection stations
+
CARFAX
what do they do with the data
 Format the data into reports containing the following
information:
a. Title information, including salvaged or junked titles
b. Flood damage history
c. Total loss accident history
d. Odometer readings
e. Lemon history
f. Number of owners
g. Accident indicators, such as airbag deployments
h. State emissions inspection results
i. Service records
j. Vehicle use (taxi, rental, lease, etc.)
+
CARFAX
how do they serve the clients with the
data
 Provide reports at a price.
 Provide information on used cars for sale and on
car deals.
 Also provide car facts research on various makes
and models of different cars.
+
CARFAX
profit model
 Make a profit mainly by
selling their reports.
 Also offer CARFAX Hot
Listings™ and Safety &
Reliability Ratings™ products.
+
CARFAX
awards
 Has the most comprehensive vehicle history
database available in North America
 One of the top five websites that consumers turn to
for vehicle information
+
Zillow
company’s background
 Founded in 2005, the company is headquartered in
Seattle with offices in New York, San Francisco,
Chicago, Irvine, Calif. and Lincoln, Neb.
 It provides consumers with information and tools to
make smart decisions about homes, real estate,
and mortgages.
 “At Zillow, we built our business taking public real
estate information that was previously only
accessible by spending hours in dusty registry of
deeds making it easily accessible to consumers,
for free.”
+
Zillow
data sources
 Freely available data from:
a. The Bureau of Labor Statistics.
b. The Federal Housing Finance Agency.
c. The Census Bureau.
+
Zillow
what do they do with the data
 Collect open government data and build a living
database of more than 110 million U.S. homes.
 Republish the data on their website to provide
users with various information regarding homes,
real estate, and mortgages.
+
Zillow
how do they serve the clients with the
data
 Provide search for houses for sale, for rent, and
pre-market, as well as information for buyers and
lenders.
 Provide an open, anonymous, and free marketplace
for borrowers and lenders.
 Provide the Zestimate, Zillow's estimated market value
for an individual home, as a starting point in determining a
home's value.
+
Zillow
profit model
 Zillow earns its revenue mainly from advertising.
 It operates the largest real estate and rental
advertising networks in the U.S. in partnership with
Yahoo! Homes.
 “The company was founded in 2005, had over $66
million in revenue when they launched an IPO in
2011, and had a valuation of $2.3 billion in 2013.”
+
Zillow
awards
In 2013
 November 17: Zillow CEO Spencer Rascoff receives the EY National
Entrepreneur of the Year™ 2013 Services Award.
 April 30: Zillow sweeps the real estate category in the 17th Annual Webby
Awards, winning the People's Voice Award and the overall Webby Award in
the Real Estate category.
 October 10: Zillow wins the Mortgage Technology Award for its Zillow
Mortgage Marketplace iPhone App.
 June 13: Zillow honored at the Webby Awards as the People's Voice
Winner and the overall Webby Award winner for the Real Estate category.
 May 4: Zillow wins Webby Award as Best Real Estate site.
+
Brightscope
company’s background
 BrightScope®, Inc. is a financial information
company that uses data to drive better decision-making
for individual investors, corporate plan sponsors,
asset managers, etc.
 It primarily operates in two major segments:
Retirement Plans and Wealth Management.
+
Brightscope
data sources
 Data on Form 5500 from Department of Labor.
 The founders, Mike and Ryan Alfred, actually
persuaded the Department of Labor to begin
collecting and publishing the Form 5500 data on
the Internet.
+
Brightscope
what do they do with the data
 They gather and republish the data to make it
clearer, more informative, and more available to the
public.
+
Brightscope
how do they serve the clients with the
data
 Provide ratings of 401(k) and 403(b) plans across
critical metrics.
 Launched the first comprehensive and publicly
available directory of Financial Advisors.
 Provide free white papers for the public that
examine trends in the Defined Contribution (DC)
market.
+
Brightscope
profit model
 Make a profit by selling their market intelligence
product called Beacon and their sales intelligence
product for retirement plans called Spyglass.
 Sell research papers with detailed investment
information and provide data on 50,000 Defined
Contribution (DC) plans.
+
Brightscope
awards
 The first to convince the Department of Labor to
begin collecting and publishing the Form 5500 data
and utilized that data to make a profit.
 The first comprehensive and publicly available
directory of Financial Advisors.
+
Mastodon C
company’s background
 Mastodon C is a Big Data analytics company.
 It was the first of its kind to join the ODI (Open
Data Institute) incubator.
 CEO Fran Bennett spent years working for search
engines, helping them turn data into money.
+
Mastodon C
data sources
 Their main data source is the large sets of Open
Data from the U.K.’s National Health
Service (NHS).
+
Mastodon C
what do they do with the data
 Use cloud computing to analyze data.
 Use Hadoop and Cassandra technologies to
integrate real time sensor data, web service data
and spreadsheets from their clients.
 One of their main signatures is analyzing data on
zero carbon infrastructure.
+
Mastodon C
how do they serve the clients with the
data
 Turn messy data, either open data or clients’
proprietary data, into useful insights.
 Deliver data in a format that makes sense.
+
Mastodon C
profit model
 “Mastodon C has just done a government-funded
analytics of variations in prescribing patterns
across the United Kingdom, finding areas where
expensive drugs are being prescribed for no
apparent reason when generics would work as
well.”
+
Mastodon C
awards
 Winner of Best Hack and the PeopleFund.it Award
at the London Green Hackathon 2012.
+
Locatable
company’s background
 Another company in the ODI incubator.
 At first they wanted to help people find where to
live and built a website to provide data for such
decisions. They have since changed their website to
help people manage their homes.
 Founders Vasanth Subramanian and David Prime
are both former physics students.
+
Locatable
data sources
+
Locatable
what do they do with the data
 Collect data from all sources and integrate it into
one dashboard.
+
Locatable
how do they serve the clients with the
data
 Provide their clients with a single dashboard to
manage their home and optimize its
performance.
 Provide visibility across all the most expensive costs,
including mortgage, insurance, utility bills and
maintenance costs.
 Help their clients find cheaper deals for all these
services.
+
Locatable
profit model
 A similar business model to a property portal which
generates leads for estate agents.
 Steps to search:
a. Customers log on and enter the locations that they
want to live near.
b. The results show customers the places which best
fit the bill and the properties available.
c. The website refers customers to sites and charges
an affiliate fee.
+
Locatable
current condition
 According to ODI, their site “currently caters for the
London area but the team are working on rolling it
out across the UK”.
 For example, “in 2 months, Locatable has attracted
more than 4,000 unique visitors, with users offering
some great feedback.”
+
OpenCorporates
company’s background
 The company behind OpenCorporates is Chrinon
Ltd, and the people who founded it are Chris
Taggart and Rob McKinnon.
 Rob built TheyWorkForYou.nz and
WhosLobbying.com.
 Chris built OpenlyLocal.com and OpenCharities,
and sits on the UK Government's Local Public Data
Panel, the UK's Tax Transparency Board, and Open
Knowledge Foundation's open government working
group.
+
OpenCorporates
data sources
 They source the information in their databases
from government and other sources through a
variety of means including:
a. directly from government websites and APIs,
b. from publicly available datasets,
c. or through Freedom of Information requests.
+
OpenCorporates
what do they do with the data
 As Chris explains, “we take messy data from
government websites, company registers, official
filings and data released under the Freedom of
Information Act, clean it up and using clever code
make it available to people.”
+
OpenCorporates
how do they serve the clients with the
data
+
OpenCorporates
profit model
 While OpenCorporates releases all its findings
for free as Open Data under an open license, its
business model includes offering additional paid
services.
+
OpenCorporates
current condition
+
DueDil
company’s background
 Launched in 2011, this startup called DueDil
(derived from “due diligence”) “is building an
ambitious database on the other side of the
Atlantic”.
 Their goal is to provide requisite information for
lenders to invest in small and medium-size
companies with confidence.
+
DueDil
data sources
 “Much of Duedil is built on Open Data, like data
from Companies House, the United Kingdom’s
central corporate registry.”
 They also use some proprietary data sources.
+
DueDil
what do they do with the data
 They aggregated more than 20 years of digitized
financial records and made them available on their
website.
+
DueDil
how do they serve the clients with the
data
 They provide data on small-to-medium-sized
companies in the hope of encouraging hundreds of
billions of dollars in new investment.
+
DueDil
profit model
 They use Open Data to fill the information gap
between small business owners and the investors.
 They “add value through its analysis and
functionality rather than by having exclusive rights
to any dataset”.
 They also “serve as a platform where SMEs can
provide information about themselves, look for
potential business partners, and develop the
groundwork for productive deals”.
+
DueDil
awards
 They have been nominated for numerous startup
company awards, including:
a. being shortlisted for two Guardian Digital Innovation
2012 awards,
b. finalist in the Orange Innovation Award in the
National Business Awards 2012,
c. named as one of '31 to Watch' in Outsell’s
Information Industry Outlook 2012: Break and
Reset report.
+
CarbonCulture
company’s background
 It is “a digital start-up that was launched in 2009”.
 “Luke Nicholson, the founder and Director of
CarbonCulture, is a social entrepreneur with a
background in design communications and
sustainable innovation.”
 Now, “the team is made up of four full-time
employees and several part-time staff, with a broad
network of associates, partner businesses and
NGOs.”
+
CarbonCulture
data sources
 Data is collected from an organization’s Building
Management Systems and its Automated Meter
Reading.
 Luke Nicholson says: “We can integrate with any
system, as well as with buildings that do not have
automated meter readings. We publish open data
for a number of government departments, local
authorities and universities, and are working now
with corporate customers to do the same for them.”
+
CarbonCulture
what do they do with the data
 They were “inspired by a huge global challenge –
to accelerate sustainable transformation at a large
scale and to use digital technology and great
design to make it happen.”
 They use high-tech metering to monitor carbon use
in the workplace.
+
CarbonCulture
how do they serve the clients with the
data
 It helps clients use that data to make better
decisions around energy usage and sustainability,
enabling them to realise cash savings.
 It also places great emphasis on design and user
interface, enabling people within an organisation to
connect, so that employees develop a shared
understanding of sustainability.
+
CarbonCulture
profit model
 Measure and report on clients’ carbon and energy
performance, publishing it in real time online as
well as in workplace receptions and intranets.
 Develop apps for clients that allow people and
buildings to work together to make savings.
+
CarbonCulture
awards
 CarbonCulture delivered much higher engagement and
energy savings than expected (40% staff take-up and a
10% saving in gas usage), leading to its deployment
in seven more government departments.
+
Big Data Bureau
Shanghai, China
 The Shanghai Municipal Commission of Economy and
Informatization is now preparing a Big Data Bureau
to share government data and information.
 On April 30, 2014, the Shanghai Public Credit &
Information Platform was opened to the public for the
first time.
 But there are still many challenges for the
government.
+
Taiwan’s Open Data Alliance
+
Taiwan’s Open Data Alliance
 The UK’s Open Data Institute (ODI) and Taiwan’s
Open Data Alliance (ODA) signed a Letter of Intent
on 11 December 2013.
 They will collaborate on a range of activities:
 Sharing expertise, knowledge and best practice
 Carrying out collaborative projects
 Designing support and collaboration systems for
open data driven businesses
 Developing open data technologies
Open data now english

  • 1. + Open Data 500 Study Contributed by SupStat Inc. Data Scientist Team
  • 2. + Outlines  Open Data Now  Open Data 500  List of Cities Which Have Open Data Portals  Companies: including 10 companies in NY 6 companies in the United States excluding NY 5 companies in the U.K. 1 company in Shanghai, China 1 company in Taiwan, China
  • 3. + Open data Now “Today’s Open Data revolution is rapidly leading us into new territory.” “Open Data is becoming a secret to success for smart business leaders around the world.”
  • 4. + Open data Now  “Open Data can best be described as accessible public data that people, companies, and organizations can use to launch new ventures, analyze patterns and trends, and solve complex problems.”  In terms of the similarity and difference between Big Data and Open Data, it is obviously that the introduction about Big Data could not represent the scenario of Open Data.
  • 5. + Open data Now  “This book describes the business applications of Open Data with examples from dozens of companies.”  This book reflects the vision, insights, knowledge, and advice of the leaders in this field who have been interviewed by the author.  This book also “contains several resources to help readers explore the possibilities of Open Data”.
  • 6. + Open Data 500  Open Data 500 is an initiative to research the U.S. companies and organizations that make use of Open Data published by government in an innovative way to develop new businesses.  The research is funded by Knight Foundation, a foundation aiming at promote journalism & media innovation, advance community engagement, and forster the arts.  Governance Lab ("GovLab"), located at New York University, is responsible for conducting the research. GovLab is a platform that brings togother innovators from different backgrounds to collabratively seek new-technology based solutions for better governance.  Open Data 500.com is the website that publishes research outcomes and collects new information about such companies.
  • 7. + Open Data 500  Three established goals of the Open Data 500 research a. Provide a basis for assessing the economic value of government open data b. Encourage the development of new open data companies c. Foster a dialogue between government and business on how government data can be made more useful
  • 8. + Open Data 500  Blueprint a. Domestically, initiate a roundtable series to invovle both government, Open Data providers, and businesses and organizations, Open Data users, to communicate on potential improvement of Open Data. b. Internationally, cooperate with international organizations and governments from other countries to copy U.S. paradiam to worldwide.
  • 9. + Open Data 500  Strategies of ongoing incoporation of other companies that use Open Data a. Outreach campaign: through mass emails, professional meetings, various social media b. Expert advice: through industry practioners and lab advisors c. Research: through other sources of identification, like Online Open Data Userbase
  • 10. + Open Data 500  OpenData500.com has filtering functionality that can filter recorded companies based on industry, location of state, data sources of agency.
  • 11. + What will we cover  Company’s background  Data sources  What do they do with the data  How do they serve the clients with the data  Profit model  Awards/current condition
  • 12. + Companies to be introduced NY: US UK China Enigma Archimedes Inc. Mastodon C Big Data Bureau ZocDoc SigFig Locatable (Shanghai) Honest Buildings PsychSignal Open Corporates Open Data Alliance Capial Cube CARFAX DueDil (Taiwan) Bloomberg Zillow CarbonCulture Aidin Brightscope Calcbench Consumer Reports GetRaised Palantir
  • 13. + Enigma.io company’s background The connotation of the company’s name: “Both in honor of the code-breaking machine developed by computer pioneer Alan Turing during World War II and because they were finding that too much public data was more enigmatic than it should be.”  Team members and size: Now a little over a dozen people, who are all based in New York.
  • 14. + Enigma.io data sources  Their datasets included both government data and data from clients, such as Nike’s public list of its suppliers.  Their data deluge had grown to 100,000 datasets and more than 20 billion individual data points.
  • 15. + Enigma.io what do they do with the data  Take valuable public data and make it much more usable.  Make it possible to search through the entire dataset at a rapid rate.  “Develop a robust data resource and an impressive set of tools”.
  • 16. + Enigma.io how do they serve the clients with the data  Help clients find the publicly-accessible information that is relevant to their corporations or economic accessibility.  Take data in all kinds of formats and put them into easily usable form of entire datasets.  Turn public sector information into Open Data.
  • 17. + Enigma.io profit model  For now, Enigma’s profit mainly come from charges for accessing to data, though they charge more for hedge funds and less for academics, nonprofits, or government agencies.  Enigma wants to, eventually, make their data accessible to public, and gain profit from analytic and other premium service.
  • 19. + ZocDoc company’s background  Founded in 2007 with a mission of improving access to healthcare, ZocDoc is a free service that allows patients to find a nearby doctor or dentist.  Cyrus Massoumi, CEO of ZocDoc said: "After I ruptured my eardrum on a flight, I couldn't find a doctor for four days. I knew that there had to be an easier way for patients to find doctors. That was when I had the idea for ZocDoc."
  • 20. + ZocDoc data sources  They collect information for patients to share with health providers.  In the mean time obtain information from doctors such as location, specialty, insurance preferences, and patient reviews to assist patients with decision making.
  • 21. + ZocDoc what do they do with the data  Encrypt data to the same standards that banks use to safeguard your financial information.  Constantly analyze data to better understand who uses ZocDoc and how they can improve it.
  • 22. + ZocDoc how do they serve the clients with the data  Help find nearby doctors and dentists who accept their insurance, see their real-time availability, and instantly book an appointment via ZocDoc.com or ZocDoc’s free apps for iPhone or Android.  Guarantee the patients to get access to cure within 24 – 72 hours.
  • 23. + ZocDoc profit model  The healthcare providers who partner with ZocDoc pay a subscription fee for ZocDoc's service. Since they help increase the efficiency of their practices.
  • 24. + ZocDoc awards  Time Magazine 50 Best Websites 2012  Business Insider App 100: World's Greatest Apps 2013  Fortune Magazine Best Small & Medium Companies 2012, 2013  Crain’s Best Places to Work in NYC 2010, 2011, 2012, 2013  Modern Healthcare Best Places to Work 2011, 2012, 2013  Arizona Business Magazine Most Admired Companies 2013
  • 25. + Honest Buildings company’s background  Founded in 2011, Honest Buildings now has around 30 employees.  Honest Buildings is a commercial real estate marketplace that connects top building professionals with building owners, decision makers and project managers.  It is a social media application for real estate.
  • 26. + Honest Buildings data sources  Honest Buildings collects information that is posted by building professionals, owners, tenants, and other stakeholders.
  • 27. + Honest Buildings what do they do with the data  “Big data is great, but you need to do something with it. Platforms that contextualize data are in great demand.” - HonestBuildings.com  Honest Buildings collects and posts all sorts of relevant data that is hard for users to find, like energy costs, walkability.
  • 28. + Honest Buildings how do they serve the clients with the data  Launches matching service that offers service providers step-by-step guidance to win a matched project a. Register your company b. Build a portfolio c. Get founded, add tags & connections d. Win new business.
  • 29. + Honest Buildings profit model  Honest Buildings’ provides matching services to developers for free.  It charges candidates, i.e. builders and contractors for a fee to match for developers.
  • 30. + Honest Buildings awards Grand Prize for Vrban, Disrupt NY 2014 Hackathon
  • 31. + Capital Cube company’s background  AnalytixInsight was founded in 2010, with about 10 employees.  Being the online portal of AnalytixInsight, CapitalCube.com is a global investor portal for comprehensive company analysis  Operate on-demand fundamental research, portfolio evaluation, and screening tools on over 40,000 global equities.
  • 32. + Capital Cube data sources  Securities and Exchange Commission (SEC)  Third-party data providers a. FactSet b. Thomson Reuters c. Capital IQ
  • 33. + Capital Cube what do they do with the data  Develop a software that can capture 40,000 internationally-operated public companies’ data on everyday basis.  Transform the data into word reports and graphics that investors can use to compare companies and make investment decision.
  • 34. + Capital Cube how do they serve the clients with the data  Investment tools  Analytical service a. Dividend analysis b. Earnings quality analysis  Strength a. Large coverage of 40,000 companies b. Timeliness on daily basis
  • 35. + Capital Cube profit model  Licenses its content and capabilities to distributors. Two major deals of this kind are with: a. Samsung: every time a Samsung user downloads a CapitalCube quote, AnalytixInsight gets a revenue share. b. Dow Jones: AnalytixInsight gets paid for every page view of its content posted on Dow Jones’s published pages.
  • 36. + Capital Cube awards  Oct. 28, 2013: recognized by the White House Office of Science and Technology Policy (OSTP) and the Science and Technology Policy Institute (STPI) as a company using federal open data in innovative and exciting ways.
  • 37. + Bloomberg company’s background  Founded in 1982.  The global business and financial information and news leader, gives influential decision makers a critical edge by connecting them to a dynamic network of information, people and ideas.  Strength – delivering data, news and analytics through innovative technology, quickly and accurately.
  • 38. + Bloomberg data sources  Large public corporations’ reports about their sustainability data.
  • 39. + Bloomberg what do they do with the data  ESG (Environmental, Social, and Governance) data on the Bloomberg Terminal is fully integrated into all of the Bloomberg Terminal’s analytics.
  • 40. + Bloomberg how do they serve the clients with the data  A large proportion of public companies now report their sustainability data, a set of metrics developed by the Global Reporting Initiative, to the public. Investors increasingly care about this data.
  • 41. + Bloomberg profit model  As a data-driven media company, Bloomberg has journalists around the world gathering real-time financial and business information.  The company operates over 300,000 terminals, which receive the financial data it releases.  This business generated an estimated 6.3 billion US dollars in 2008.
  • 42. + Bloomberg awards  Bloomberg is on track for record revenues of $8.3 billion in 2013 and profits of about $2.7 billion
  • 43. + Aidin company’s background  Aidin was founded in 2011 and has fewer than 10 employees.  The founding of the company was inspired by the troublesome experience of the founder’s family, who had to handle a host of trivial matters when a family member was being discharged from hospital.  Its mission is to bring transparency to healthcare facilities and provide patients with the information they need when making healthcare-related decisions.
  • 44. + Aidin data sources  Government Open Data.  Patients’ reviews of hospitals.  “First with weather and GPS data and now with health data, the U.S. government has defined its responsibility as defining, gathering, and presenting data on important subjects in easily usable forms.”
  • 45. + Aidin what do they do with the data
  • 46. + Aidin how do they serve the clients with the data  Aidin provides post-acute healthcare facilities with data to improve their services.  Patients can also choose facilities based on Aidin’s information.
  • 47. + Aidin profit model  Aidin profits from its products, including the full Aidin solution, Aidin Lite, and Aidin Provider.
  • 48. + Aidin awards  Grand Prize Winner, GE's Hospital Quest competition.
  • 49. + Calcbench company’s background  Founded in 2011, Calcbench is the first company of its kind to fully harness the power of the new government-mandated data standard XBRL (Extensible Business Reporting Language), yielding an unprecedented direct line into the SEC’s corporate financial data repository.  Calcbench currently has fewer than 10 employees.
  • 50. + Calcbench data sources  XBRL is a freely available and global standard for exchanging business information  XBRL reports
  • 51. + Calcbench what do they do with the data  Calcbench processes publicly accessible but hard-to-use XBRL (Extensible Business Reporting Language) data into more detailed and insightful data, thereby adding value to it.
  • 52. + Calcbench how do they serve the clients with the data  Calcbench processes the raw data that the government mandates be collected through XBRL and turns it into usable information for the financial industry.  Calcbench also uses technology to help the SEC raise data quality, for example through error identification and correction.
  • 53. + Calcbench profit model  Calcbench serves corporate reporting professionals, corporate finance leaders, auditors, investment researchers, and academics.  It generates revenue by charging fees for its service and by offering a premium suite.
  • 54. + Calcbench awards  Grand Prize Winner, XBRL Challenge at the XBRL and Financial Analysis Technology Conference, February 29, 2012.
  • 55. + Consumer Reports company’s background  Founded in 1936, Consumer Reports is an expert, independent, nonprofit organization whose mission is to work for a fair, just, and safe marketplace for all consumers and to empower consumers to protect themselves.  It has over 500 employees.
  • 56. + Consumer Reports data sources  Hospital Compare, a federal website  Data from various states in the U.S.
  • 57. + Consumer Reports what do they do with the data  Consumer Reports produces safety ratings for more than two thousand hospitals, based on Open Data sources.
  • 58. + Consumer Reports how do they serve the clients with the data  Improve American people’s lives by using publicly accessible data.  Consumer Reports’ safety ratings can help customers decide which hospital to go to.
  • 59. + Consumer Reports profit model  Consumer Reports is a non-profit organization. It neither accepts advertisements nor has shareholders, and depends solely on subscription fees.
  • 60. + Consumer Reports awards  Sigma Delta Chi Award for Public Service, Society of Professional Journalists, 2008  Honorable Mention, National Press Club for Caution: The Secret Score Behind Your Auto Insurance, 2007  People’s Voice Award, International Academy of Digital Arts and Sciences, 2006  Golden Triangle Award, American Academy of Dermatology, 2005
  • 61. + GetRaised company’s background  GetRaised was founded in 2010 and now has fewer than 10 employees.  GetRaised aspires to be: a bunch of complicated data that is hidden behind a very simple, easy-to-use interface that can help narrow the wage gap and help people get paid more.
  • 62. + GetRaised data sources  Data from the U.S. Bureau of Labor Statistics  Users’ input  Various online job postings
  • 63. + GetRaised what do they do with the data  GetRaised uses the collected data to develop a salary engine.  It creates raise requests based on analysis by experts from related fields, such as HR and research institutes.
  • 64. + GetRaised how do they serve the clients with the data  GetRaised tells customers how much they should be paid, whether they are underpaid, and whether they should ask for a salary increase.  A significant proportion of women who requested a raise based on GetRaised’s information eventually got it.
  • 65. + GetRaised profit model  As a non-profit, GetRaised provides free service to users.  It receives support from two organizations and four individuals.
  • 66. + GetRaised awards  According to Dave Clarke, a communications strategist, “81% of women that have used GetRaised have, in fact, earned a raise. The average raise amount across all users (male and female) is $6,473”.
  • 67. + Palantir company’s background  Palantir was founded in 2004 and has over 500 employees today.  It develops software with which professionals from different industries and sectors can perform analysis on massive, disparate data.  Its products and services help combat terrorism, prosecute crimes, fight fraud, and eliminate waste.
  • 69. + Palantir what do they do with the data  Users enter various types of data into Palantir’s software, which generates reports and analyses that they can understand directly. It’s not only artificial intelligence, but something called intelligence augmentation.
  • 70. + Palantir how do they serve the clients with the data  Help government analyze its open data in order to reveal government expenditure and possible flaws.  Combat human trafficking by analyzing data.
  • 71. + Palantir profit model  Investment from shareholders  Sale of software  Reported to be valued at $9 billion by the end of 2013
  • 72. + Palantir awards  VAST2009 Visual Analytics Award, 2009  Hall of Innovation Award, J. P. Morgan Chase Technology Innovation Symposium, October 2010
  • 73. + Archimedes Inc. company’s background  Began as part of Kaiser Permanente in 1993 and split off as a separate company in 2006.  It has decades of experience in developing algorithms and predictive models for healthcare.
  • 74. + Archimedes Inc. data sources  Centers for Medicare & Medicaid Services (CMS).  Databases of clinical trials.  National Health and Nutrition Examination Survey.
  • 75. + Archimedes Inc. what do they do with the data  They use the Archimedes Model as their core tool to analyze data.  Archimedes scientists have analyzed a wide variety of relevant data to derive hundreds of equations that represent the effects of multiple diseases, tests, and treatments.  The equations are integrated into a single, large-scale simulation model using object-oriented programming.
  • 76. + Archimedes Inc. how do they serve the clients with the data  They provide a suite of online healthcare simulation and analytics tools (ARCHeS) that provides answers to questions about the health outcomes and economic effects of different interventions.  They also have a product (IndiGO) that generates individual guidelines that identify and help prioritize the best preventive care for each patient.
  • 77. + Archimedes Inc. profit model  They make a profit by selling their products: ARCHeS (Archimedes Healthcare Simulator) and IndiGO (Individualized Guidelines and Outcomes).  They also provide modeling and consulting services, including modeling diseases and new interventions and analyzing different prevention strategies.
  • 78. + Archimedes Inc. awards  Jun 6, 2012: IndiGO Receives Best of Care Applications Award at the 2012 Health Data Initiative III.
  • 80. + SigFig data sources  Historical fundamentals & price data  Data on load fees comes from Lipper  Portions of their advice rely on third-party market data from companies like Lipper, Thomson Reuters, Interactive Data, and Xignite.
  • 81. + SigFig what do they do with the data
  • 82. + SigFig how do they serve the clients with the data  A single Portfolio Dashboard
  • 83. + SigFig profit model  “The company doesn’t charge a management fee. Instead, it earns its revenue through publishing arrangements with several websites and referral fees when users go to a new broker.”
  • 85. + PsychSignal company’s background  Launched by SmogFarm, another startup that analyzes large open datasets.
  • 86. + PsychSignal data sources  Data is pulled from social media conversations.
  • 87. + PsychSignal what do they do with the data  Scour online conversations looking for distinct psychological expressions of emotion or attitude.  Aggregate millions of expressions to arrive at a picture of crowd mood in real time.  Built an advanced proprietary sentiment engine.  Use NASA-developed signal processing.
  • 88. + PsychSignal how do they serve the clients with the data
  • 89. + PsychSignal profit model  NASA recently open sourced an algorithm our engineers found particularly useful.  Day to day sentiment is extremely noisy, making it hard to determine subtle changes in trending sentiment data.  These NASA algorithms help uncover the trend beneath the noise.
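
The idea of uncovering a trend beneath noisy day-to-day sentiment can be illustrated with a minimal sketch. The slides do not say which NASA algorithm PsychSignal uses, so the example below swaps in a plain exponential moving average as a stand-in. The function name `smooth_sentiment`, the `alpha` parameter, and the sample scores are all hypothetical.

```python
def smooth_sentiment(scores, alpha=0.2):
    """Exponential moving average over a sequence of daily sentiment scores.

    A stand-in for the unspecified NASA signal-processing algorithm:
    smaller alpha values smooth more aggressively.
    """
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    for score in scores:
        if not smoothed:
            # The first observation seeds the average.
            smoothed.append(float(score))
        else:
            # Blend the new score with the previous smoothed value.
            smoothed.append(alpha * score + (1 - alpha) * smoothed[-1])
    return smoothed


# Hypothetical noisy daily sentiment scores (not real PsychSignal data).
daily = [0.1, 0.9, -0.4, 0.8, 0.0, 1.0, -0.2, 0.9]
trend = smooth_sentiment(daily, alpha=0.2)
```

The smoothed series swings far less than the raw scores, which is the property the slide describes; a production system would presumably use a more sophisticated filter.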
  • 91. + CARFAX company’s background  Provide vehicle history information, used by millions of consumers each year.  Receive millions of visitors each month.  Thousands of auto dealers nationwide are subscribers.
  • 92. + CARFAX data sources  U.S. motor vehicle agencies  Canadian provincial motor vehicle agencies  Auto auctions  Collision repair facilities  Service/ maintenance facilities  Insurance companies  Salvage auctions  Automotive recyclers  Rental/fleet vehicle companies  State inspection stations
  • 93. + CARFAX what do they do with the data  Format the data into reports containing the following information: a. Title information, including salvaged or junked titles b. Flood damage history c. Total loss accident history d. Odometer readings e. Lemon history f. Number of owners g. Accident indicators, such as airbag deployments h. State emissions inspection results i. Service records j. Vehicle use (taxi, rental, lease, etc.)
  • 94. + CARFAX how do they serve the clients with the data  Provide reports at a price.  Provide information regarding used cars to sell and car deals.  Also provide car facts research on various makes and models of different cars.
  • 95. + CARFAX profit model  Make a profit mainly by selling their reports.  Also offer CARFAX Hot Listings™ and Safety & Reliability Ratings™ products.
  • 96. + CARFAX awards  Has the most comprehensive vehicle history database available in North America  One of the top five websites that consumers turn to for vehicle information
  • 97. + Zillow company’s background  Founded in 2005, the company is headquartered in Seattle with offices in New York, San Francisco, Chicago, Irvine, Calif. and Lincoln, Neb.  It provides consumers with information and tools to make smart decisions about homes, real estate, and mortgages.  “At Zillow, we built our business taking public real estate information that was previously only accessible by spending hours in dusty registry of deeds making it easily accessible to consumers, for free.”
  • 98. + Zillow data sources  Freely available data from: a. The Bureau of Labor Statistics. b. The Federal Housing Finance Agency. c. The Census Bureau.
  • 99. + Zillow what do they do with the data  Collected open government data and built a living database of more than 110 million U.S. homes.  Republished the data on their website to provide users with various information regarding homes, real estate, and mortgages.
  • 100. + Zillow how do they serve the clients with the data  Provide search for houses for sale, for rent, and pre-market, as well as information for buyers and lenders.  Provide an open, anonymous, and free marketplace for borrowers and lenders.  Offer Zestimate, Zillow's estimated market value for an individual home, as a starting point in determining a home's value.
  • 101. + Zillow profit model  Zillow earns its revenue mainly from advertising.  It operates the largest real estate and rental advertising network in the U.S. in partnership with Yahoo! Homes.  “The company was founded in 2005, had over $66 million in revenue when they launched an IPO in 2011, and had a valuation of $2.3 billion in 2013.”
  • 102. + Zillow awards In 2013  November 17: Zillow CEO Spencer Rascoff named winner of the EY National Entrepreneur of the Year™ 2013 Services Award.  April 30: Zillow sweeps the real estate category in the 17th Annual Webby Awards, winning the People's Voice Award and the overall Webby Award in the Real Estate category.  October 10: Zillow wins the Mortgage Technology Award for its Zillow Mortgage Marketplace iPhone App.  June 13: Zillow honored at the Webby Awards as the People's Voice Winner and the overall Webby Award winner for the Real Estate category.  May 4: Zillow wins Webby Award as Best Real Estate site.
  • 103. + Brightscope company’s background  BrightScope®, Inc. is a financial information company that uses data to drive better decision-making for individual investors, corporate plan sponsors, asset managers, etc.  It primarily operates in two major segments: Retirement Plans and Wealth Management.
  • 104. + Brightscope data sources  Data on Form 5500 from Department of Labor.  The founders, Mike and Ryan Alfred, actually persuaded the Department of Labor to begin collecting and publishing the Form 5500 data on the Internet.
  • 105. + Brightscope what do they do with the data  They gathered and republished the data to make it clearer, more informative, and more available to the public.
  • 106. + Brightscope how do they serve the clients with the data  Provide ratings of 401k and 403b plans across critical metrics.  Launched the first comprehensive and publicly available directory of Financial Advisors.  Provides free white papers for the public that examine trends in the Defined Contribution (DC) market.
  • 107. + Brightscope profit model  Make a profit by selling their market intelligence product called Beacon and their sales intelligence product for retirement plans called Spyglass.  Sell research papers with detailed investment and provide data on 50,000 Defined Contribution (DC) plans.
  • 108. + Brightscope awards  The first to convince the Department of Labor to begin collecting and publishing the Form 5500 data, and to use that data to make a profit.  The first comprehensive and publicly available directory of Financial Advisors.
  • 109. + Mastodon C company’s background  Mastodon C is a Big Data analytics company.  It was the first of its kind to join the ODI (Open Data Institute) incubator.  CEO Fran Bennett spent years working for search engines, helping them turn data into money.
  • 110. + Mastodon C data sources  Their main data source is the large sets of Open Data from the U.K.’s National Health Service (NHS).
  • 111. + Mastodon C what do they do with the data  Use cloud computing to analyze data.  Use Hadoop and Cassandra technologies to integrate real time sensor data, web service data and spreadsheets from their clients.  One of their main signatures is analyzing data on zero carbon infrastructure.
  • 112. + Mastodon C how do they serve the clients with the data  Turn messy data, either open data or clients’ proprietary data, into useful insights.  Serve data in a format that makes sense.
  • 113. + Mastodon C profit model  “Mastodon C has just done a government-funded analytics of variations in prescribing patterns across the United Kingdom, finding areas where expensive drugs are being prescribed for no apparent reason when generics would work as well.”
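
To make the idea of finding variations in prescribing patterns concrete, here is a minimal sketch. It is not Mastodon C's method, which the slides do not describe. It simply flags areas whose share of branded (as opposed to generic) prescriptions sits well above the average. The function name `flag_high_branded_share`, the `threshold` parameter, and the prescription counts are all hypothetical.

```python
from statistics import mean, pstdev


def flag_high_branded_share(areas, threshold=1.0):
    """Return areas whose branded-drug share is more than `threshold`
    population standard deviations above the mean share.

    `areas` maps an area name to (branded_count, generic_count);
    each area is assumed to have at least one prescription.
    """
    shares = {
        name: branded / (branded + generic)
        for name, (branded, generic) in areas.items()
    }
    values = list(shares.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # Every area prescribes identically, so nothing stands out.
        return []
    return sorted(
        name for name, share in shares.items()
        if (share - mu) / sigma > threshold
    )


# Hypothetical prescription counts per area: (branded, generic).
sample = {
    "Area A": (10, 90),
    "Area B": (12, 88),
    "Area C": (11, 89),
    "Area D": (40, 60),
}
flagged = flag_high_branded_share(sample)
```

With these made-up counts, only the area prescribing 40% branded drugs is flagged. A real analysis would also need to control for differences in patient populations before concluding that the prescribing is unjustified.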
  • 114. + Mastodon C awards  Winner of Best Hack and the PeopleFund.it Award at the London Green Hackathon 2012.
  • 115. + Locatable company’s background  Another company in the ODI incubator.  At first they wanted to help people find where to live and built a website to provide data for such decisions; they have since changed their website to help people manage their homes.  Founders Vasanth Subramanian and David Prime are both former physics students.
  • 117. + Locatable what do they do with the data  Collect data from all sources and integrate it into one dashboard.
  • 118. + Locatable how do they serve the clients with the data  Provide their clients with a single dashboard to manage their home and optimize its performance.  Provide visibility across the most expensive costs, including mortgage, insurance, utility bills and maintenance costs.  Help their clients find cheaper deals for all these services.
  • 119. + Locatable profit model  A similar business model to a property portal which generates leads for estate agents.  Steps to search: a. Customers log on and enter the locations that they want to live near. b. The results show customers the places which best fit the bill and the properties available. c. The website refers customers to sites and charges an affiliate fee.
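
The search steps above can be sketched as a simple ranking problem: given the places a customer wants to live near, order candidate areas by how close they are. This is only an illustration under stated assumptions; the slides do not describe Locatable's actual matching logic. The function names `haversine_km` and `rank_areas`, and the area names and coordinates, are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (a[0], a[1], b[0], b[1]))
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * asin(sqrt(h))  # 6371 km: mean Earth radius


def rank_areas(candidates, wanted):
    """Sort candidate area names by total distance to the wanted locations."""
    return sorted(
        candidates,
        key=lambda name: sum(haversine_km(candidates[name], w) for w in wanted),
    )


# Hypothetical candidate areas and one location the customer wants to be near.
candidates = {"Area X": (51.50, -0.12), "Area Y": (51.60, -0.30)}
wanted = [(51.51, -0.13)]
ranking = rank_areas(candidates, wanted)
```

Straight-line distance is the simplest possible fit measure; a real service would more plausibly weigh travel time, using public transport data of the kind mentioned in the Editor's Notes.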
  • 120. + Locatable current condition  According to ODI, their site “currently caters for the London area but the team are working on rolling it out across the UK”.  For example, “in 2 months, Locatable has attracted more than 4,000 unique visitors, with users offering some great feedback.”
  • 121. + OpenCorporates company’s background  The company behind OpenCorporates is Chrinon Ltd, and the people who founded it are Chris Taggart and Rob McKinnon.  Rob built TheyWorkForYou.nz and WhosLobbying.com.  Chris built OpenlyLocal.com and OpenCharities, and sits on the UK Government's Local Public Data Panel, the UK's Tax Transparency Board, and Open Knowledge Foundation's open government working group.
  • 122. + OpenCorporates data sources  They source the information in their databases from government and other sources through a variety of means including: a. directly from government websites and APIs, b. from publicly available datasets, c. or through Freedom of Information requests.
  • 123. + OpenCorporates what do they do with the data  As Chris explains, “we take messy data from government websites, company registers, official filings and data released under the Freedom of Information Act, clean it up and using clever code make it available to people.”
  • 124. + OpenCorporates how do they serve the clients with the data
  • 125. + OpenCorporates profit model  While OpenCorporates releases all its findings for free as Open Data under an open license, its business model includes offering additional paid services.
  • 127. + DueDil company’s background  Launched in 2011, this startup called Duedil - derived from “due diligence” – “is building an ambitious database on the other side of the Atlantic”.  Their goal is to provide requisite information for lenders to invest in small and medium-size companies with confidence.
  • 128. + DueDil data sources  “Much of Duedil is built on Open Data, like data from Companies House, the United Kingdom’s central corporate registry.”  They also use some proprietary data sources.
  • 129. + DueDil what do they do with the data  They aggregated more than 20 years of digitized financial records and made them available on their website.
  • 130. + DueDil how do they serve the clients with the data  They provide data on small-to-medium-sized companies in the hope of encouraging hundreds of billions of dollars in new investment.
  • 131. + DueDil profit model  They use Open Data to fill the information gap between small business owners and investors.  They “add value through its analysis and functionality rather than by having exclusive rights to any dataset”.  They also “serve as a platform where SMEs can provide information about themselves, look for potential business partners, and develop the groundwork for productive deals”.
  • 132. + DueDil awards  They have been nominated for numerous startup company awards, including: a. being shortlisted for two Guardian Digital Innovation 2012 awards, b. finalist in the Orange Innovation Award in the National Business Awards 2012, c. named as one of '31 to Watch' in Outsell’s Information Industry Outlook 2012: Break and Reset report.
  • 133. + CarbonCulture company’s background  It is “a digital start-up that was launched in 2009”.  “Luke Nicholson, the founder and Director of CarbonCulture, is a social entrepreneur with a background in design communications and sustainable innovation.”  Now, “the team is made up of four full-time employees and several part-time staff, with a broad network of associates, partner businesses and NGOs.”
  • 134. + CarbonCulture data sources  Data is collected from an organization’s Building Management Systems and its Automated Meter Reading.  Luke Nicholson says: “We can integrate with any system, as well as with buildings that do not have automated meter readings. We publish open data for a number of government departments, local authorities and universities, and are working now with corporate customers to do the same for them.”
  • 135. + CarbonCulture what do they do with the data  They were “inspired by a huge global challenge – to accelerate sustainable transformation at a large scale and to use digital technology and great design to make it happen.”  They use high-tech metering to monitor carbon use in the workplace.
  • 136. + CarbonCulture how do they serve the clients with the data  It helps clients use that data to make better decisions around energy usage and sustainability, enabling them to realise cash savings.  It also places great emphasis on design and user interface, enabling people within an organisation to connect, so that employees develop a shared understanding of sustainability.
  • 137. + CarbonCulture profit model  Measure and report on clients’ carbon and energy performance, publishing it in real time online as well as in workplace receptions and intranets.  Develop apps for clients that allow people and buildings to work together to make savings.
  • 138. + CarbonCulture awards  It delivered much higher engagement and energy savings than expected - 40% staff take up and a 10% saving in gas usage - leading to CarbonCulture being deployed in seven more government departments.
  • 139. + Big Data Bureau Shanghai, China  Shanghai Municipal Commission of Economy and Informatization is now preparing a Big Data Bureau to share government data and information.  On April 30, 2014, the Shanghai Public Credit & Information Platform was opened to the public for the first time.  But there are still many challenges for the government.
  • 141. + Taiwan’s Open Data Alliance  The UK’s Open Data Institute (ODI) and Taiwan’s Open Data Alliance (ODA) signed a Letter of Intent on 11 December 2013  They will collaborate on a range of activities:  Sharing expertise, knowledge and best practice  Carrying out collaborative projects  Designing support and collaboration systems for open data driven businesses  Developing open data technologies

Editor's Notes

  1. A lemon is a car, often new, that is found to be defective only after it has been bought. Any vehicle with numerous, severe issues can be termed a "lemon," and, by extension, any product with flaws too great or severe to serve its purpose can be described as a "lemon".
  2. Hot listing is
  3. The biggest data sets the Locatable team currently use are public transport related: National Rail, London Underground and Tramlink. Their next step is working on integrating schools data and crime statistics which are all open data sets.