Successfully reported this slideshow.
Your SlideShare is downloading. ×

Open Corporate Data: not just good, better

Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Loading in …3
×

Check these out next

1 of 46 Ad

Open Corporate Data: not just good, better

Download to read offline

Presentation given by Chris Taggart, CEO and Co-Founder of OpenCorporates at Open Knowledge Festival, Geneva, September 2013

Discussing benefits and quality of open corporate hierarchy (network) data

Presentation given by Chris Taggart, CEO and Co-Founder of OpenCorporates at Open Knowledge Festival, Geneva, September 2013

Discussing benefits and quality of open corporate hierarchy (network) data

Advertisement
Advertisement

More Related Content

Similar to Open Corporate Data: not just good, better (20)

More from Chris Taggart (17)

Advertisement

Recently uploaded (20)

Open Corporate Data: not just good, better

  1. 1. Open Data Not Just Good. Better
  2. 2. Open Data is Good! http://www.flickr.com/photos/stolidsoul/433129708/sizes/o/in/photostream/
  3. 3. But we’re not the ones we need to convince http://okfestival.org/open-government-data-camp/
  4. 4. Most people don’t care about ‘open’ http://www.flickr.com/photos/erlin1/9312646298/sizes/l/in/photostream/
  5. 5. Even though open data is better (than closed/proprietary)
  6. 6. Even though open data is better (than closed/proprietary) • Better for innovation
  7. 7. Even though open data is better (than closed/proprietary) • Better for innovation • Better for competition
  8. 8. Even though open data is better (than closed/proprietary) • Better for innovation • Better for competition • Better for efficiency
  9. 9. Even though open data is better (than closed/proprietary) • Better for innovation • Better for competition • Better for efficiency • Better for sharing (esp cross- organisation or cross-border)
  10. 10. But open has a secret weapon http://www.flickr.com/photos/x-ray_delta_one/8493335701/sizes/l/in/photostream/
  11. 11. It’s better quality too http://www.flickr.com/photos/infusionsoft/4484373179/sizes/l/in/photostream/
  12. 12. Problem Cause Data accuracy Data is re-keyed. Few eyeballs. Often little downside to lying Gaps in data High (& often duplicated) cost of data entry. Limited to payers Lack of granularity Legacy systems/data models hard to reengineer in closed world Errors go uncorrected Few feedback mechanisms Black box/No provenance Can’t reveal (sometimes dubious) sources. Limits usefulness/trust Isolated Proprietary IDs are internal identifiers & are barriers to sharing & improved data quality Common proprietary data quality issues
  13. 13. Problem Cause Data accuracy Data is re-keyed. Few eyeballs. Often little downside to lying Gaps in data High (& often duplicated) cost of data entry. Limited to payers Lack of granularity Legacy systems/data models hard to reengineer in closed world Errors go uncorrected Few feedback mechanisms Black box/No provenance Can’t reveal (sometimes dubious) sources. Limits usefulness/trust Isolated Proprietary IDs are internal identifiers & are barriers to sharing & improved data quality Common proprietary data quality issues
  14. 14. Problem Cause Data accuracy Data is re-keyed. Few eyeballs. Often little downside to lying Gaps in data High (& often duplicated) cost of data entry. Limited to payers Lack of granularity Legacy systems/data models hard to reengineer in closed world Errors go uncorrected Few feedback mechanisms Black box/No provenance Can’t reveal (sometimes dubious) sources. Limits usefulness/trust Isolated Proprietary IDs are internal identifiers & are barriers to sharing & improved data quality Common proprietary data quality issues
  15. 15. Problem Cause Data accuracy Data is re-keyed. Few eyeballs. Often little downside to lying Gaps in data High (& often duplicated) cost of data entry. Limited to payers Lack of granularity Legacy systems/data models hard to reengineer in closed world Errors go uncorrected Few feedback mechanisms Black box/No provenance Can’t reveal (sometimes dubious) sources. Limits usefulness/trust Isolated Proprietary IDs are internal identifiers & are barriers to sharing & improved data quality Common proprietary data quality issues
  16. 16. Problem Cause Data accuracy Data is re-keyed. Few eyeballs. Often little downside to lying Gaps in data High (& often duplicated) cost of data entry. Limited to payers Lack of granularity Legacy systems/data models hard to reengineer in closed world Errors go uncorrected Few feedback mechanisms Black box/No provenance Can’t reveal (sometimes dubious) sources. Limits usefulness/trust Isolated Proprietary IDs are internal identifiers & are barriers to sharing & improved data quality Common proprietary data quality issues
  17. 17. Problem Cause Data accuracy Data is re-keyed. Few eyeballs. Often little downside to lying Gaps in data High (& often duplicated) cost of data entry. Limited to payers Lack of granularity Legacy systems/data models hard to reengineer in closed world Errors go uncorrected Few feedback mechanisms Black box/No provenance Can’t reveal (sometimes dubious) sources. Limits usefulness/trust Isolated Proprietary IDs are internal identifiers & are barriers to sharing & improved data quality Common proprietary data quality issues
  18. 18. Problem Cause Data accuracy Data is re-keyed. Few eyeballs. Often little downside to lying Gaps in data High (& often duplicated) cost of data entry. Limited to payers Lack of granularity Legacy systems/data models hard to reengineer in closed world Errors go uncorrected Few feedback mechanisms Black box/No provenance Can’t reveal (sometimes dubious) sources. Limits usefulness/trust Isolated Proprietary IDs are internal identifiers & are barriers to sharing & improved data quality Common proprietary data quality issues
  19. 19. A concrete example: corporate networks
  20. 20. Hugely important (and valuable) • The dataset we need to understand the corporate world • Who we (or the government) is really doing business with • Political influence/donations/lobbying • Tax/resource extraction • Corporate Governance • Credit risk
  21. 21. But proprietary datasets on this are problematic • Expensive, so relatively few users • Huge gaps in data • Uses proprietary IDs (so not clear what it’s refers to) • Restrictive licences • Opaque – no info re calculations, provenance or confidence
  22. 22. But proprietary datasets on this are problematic • Expensive, so relatively few users • Huge gaps in data • Uses proprietary IDs (so not clear what it’s refers to) • Restrictive licences • Opaque – no info re calculations, provenance or confidence Result: low-quality data
  23. 23. The open data alternative
  24. 24. The open data alternative Enabled by a grant from the Alfred P Sloan Foundation
  25. 25. Data from disparate public sources
  26. 26. finding new insights
  27. 27. no such company
  28. 28. ...and errorstoo no such company
  29. 29. What a modern financial company looks like (highly simplified & truncated views)
  30. 30. What a modern financial company looks like (highly simplified & truncated views)
  31. 31. What a modern financial company looks like (highly simplified & truncated views)
  32. 32. What a modern financial company looks like (highly simplified & truncated views) private unlimited company
  33. 33. Crowd-sourcing?
  34. 34. Ninja-sourcing! http://www.flickr.com/photos/danielygo/5531024732/sizes/l/in/photostream/
  35. 35. The company that wants to know your network... every friend... every interaction http://www.flickr.com/photos/jeffmcneill/5260815552/sizes/l/ why bother?
  36. 36. Facebook, Inc This is what we got from their SEC filings as text
  37. 37. Facebook, Inc (and turned into data) This is what we got from their SEC filings as text
  38. 38. Facebook, Inc Pinnacle Sweden AB Vitesse LLC Facebook Operations LLC Facebook Ireland Limited Edge Network Services Limited Andale Acquisition Corp (and turned into data) This is what we got from their SEC filings as text
  39. 39. Facebook Ireland Limited Edge Network Services Limited Pinnacle Sweden AB Vitesse LLC Facebook Operations LLC Andale Acquisition Corp Then we started investigating Facebook, Inc
  40. 40. Facebook Ireland Limited Edge Network Services Limited Then we started investigating Facebook, Inc
  41. 41. Facebook, Inc Facebook Ireland Limited Edge Network Services Limited
  42. 42. Facebook, Inc Facebook Ireland Limited Edge Network Services Limited Facebook Cayman Holdings Unlimited IV Facebook Cayman Holdings Unlimited II Facebook Cayman Holdings Unlimited lll Facebook Ireland Holdings Randomus Investments Limited Facebook International Holdings II Ltd Facebook International Holdings I Ltd Facebook Cayman Holdings Unlimited I
  43. 43. Want to help? jobs@opencorporates.com investigators@opencorporates.com

×