Data Strategy
Aija Leiponen, Cornell University, aija.leiponen@cornell.edu
Aija Leiponen's presentation at the Aalto University seminar on data governance, Nov 1, 2016


  1. Data Strategy. Aija Leiponen, Cornell University, aija.leiponen@cornell.edu
  2. Which technology fields are the most influential?
     [Figure: Predicted control patent citations with co-listed technology fields. Y-axis: patent citations (logged coefs), ticks from -1 to 4. X-axis: 1920 to 2010. Series: Computer Control; Digicomms Control; Data process Control; Semiconductor Control; Thermal ap. Control; Transport Control; Material Control; Machine Control; All control. Annotation: patents listed in both digital comm & control.]
     Source: Koutroumpis-Leiponen-Thomas, “Invention Machines: How Instruments and Information Technologies Drive Global Technological Progress”
  3. Why? Invention machines:
     • Applicable in many sectors
     • Facilitate invention in other sectors
     • A broad and catalytic impact by enabling follow-on invention in many application sectors
     • Generate massive knowledge spillovers over long periods of time
     The technologies:
     • Control instruments
     • Digital communication
     • Computer technologies
     Instruments enable manipulation of material; computers enable manipulation of information.
     Automation requires instrumentation.
     Internet of Things
  4. Web 3.0
     Control instruments (sensors, indicators, logic devices, actuators)
     + Data (social, administrative, industrial, personal)
     + Artificial Intelligence (algorithms, machine learning, prescriptive analytics)
     = “Second Wave of the Second Machine Age” (Erik Brynjolfsson/MIT)
  5. Earlier communication revolutions
     • Printing press
     • Steam engine
     • Telegraph
     • Telephone
     • Radio
     • Television
     • Networked data?
  6. Printing press
     • Johannes Gutenberg (Germany 1452)
     • Reading became accessible to common people
     • Fiction, entertainment, propaganda
     • Mass education
     • Network externalities: availability of books → incentives to learn to read → demand for books
  7. Impact of the printing press
     • 30 years later, a printing shop in Florence run by nuns charged 3 florins for 1000 copies of Plato’s Dialogues, while a scribe would have charged 1 florin for 1 copy
     • Availability of paper from China → prices fell, demand increased
     • Number of books produced in the 50 years following the invention = number of books produced by European scribes in the preceding 1000 years!
     • Fust was suspected in Paris of being in league with the devil: fear of novelty
     ⇒ How did the printing press change society, lifestyles, economy?
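The price comparison on the printing-press slide can be worked through as simple arithmetic. The sketch below uses only the two prices quoted on the slide; the variable names are illustrative, and it assumes the quoted prices are directly comparable per copy.

```python
from fractions import Fraction

# Prices quoted on the slide, in florins.
press_price_per_run = Fraction(3)     # 3 florins for one print run
press_copies_per_run = 1000           # a run of 1000 copies
scribe_price_per_copy = Fraction(1)   # 1 florin for one hand-written copy

# Per-copy price of the printed book and the resulting price ratio.
press_price_per_copy = press_price_per_run / press_copies_per_run
cost_ratio = scribe_price_per_copy / press_price_per_copy

print(f"Printed copy: {float(press_price_per_copy):.3f} florins")
print(f"A scribe's copy costs about {float(cost_ratio):.0f} times as much")
```

On these figures a scribe's copy costs roughly 333 times as much as a printed one, a price drop of more than two orders of magnitude.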
  8. Expect societal changes due to Web 3.0
     • Privacy needs to be defined
     • Ownership of data
     • Right to be forgotten – in/alienability
     • Intellectual property for data
     • Data security
     • Legal framework
     • Radical transparency
     • Real-time visibility
     • Data integration, inference, prediction
     • New business models, new platforms, new winners
  9. Information Economy vs. Data Economy
     Are data information goods?

     | Information good property | Does it hold for data? |
     |---|---|
     | Nonrival? | Yes |
     | Partially excludable? | NOT excludable |
     | Experience good? | Yes if no metadata; no if proper metadata |
     | High fixed cost/low or constant marginal cost? | Varies: exhaust data vs. data collected for a purpose |
  10. PUZZLE: How can data be commercially exploited?
     • Data are not intellectual property
       – Individual data points have no legal protection
     • Data essentially need to be controlled contractually (secrecy, organization forms, product design, non-compete and confidentiality contracts)
       – Not via intellectual property rights
     • How can something so “leaky” be valuable and commercialized?
  11. Money vs. Data
     • Data is viewed as the “new oil”, an asset class
     • Digital currency is data on a fundamental level: streams of bits
     • Currencies rely on trust in the medium; data have intrinsic value
     • Increasing subjectivity of data goods as we go from raw data to tagging/cleaning, aggregating, combining, processing
     • Provenance is hard to prove for data; currencies are verifiable
     • Non-exchangeability of data: there is no quantum of data with a minimum value
     • Nevertheless, CS researchers are starting to consider “data as money” and are developing conceptual models of a “central bank for data”
  12. Content vs. Data
     • Both have (some) intrinsic value
     • Both governed by copyright
     • On a fundamental level, content IS data and subject to analytics (Natural Language Processing)
     • But the value of record data largely comes from combination with other data and algorithms (models, statistics, prediction, deep learning…)
     • And as a result, copyright is very weak on data
  13. Economic features of digital goods – all controversial and legally contested

     | | Record Data | Content | Software | Currency |
     |---|---|---|---|---|
     | Information Type | Raw records or structured databases | Knowledge (insights) | Knowledge (instructions) | Pure value |
     | Good Type | Intermediate/Final | Final | Final | Final |
     | Alienability | Variable | Medium | High | High |
     | Inferability | High | Low | Low | Zero |
     | Excludability | None | Variable | Variable | High |
     | Fungibility | Variable | Low | Low | High |
     | Protection Method | Secrecy | Copyright | Copyright or patents in some cases | Blockchain or other verification technology |
     | Protection Aspect | Reuse | Expression (patterns) | Expression (patterns) or insight (invention) | Transaction value? |
  14. Characteristics of different data sources

     | Source of data | Privacy implications | Alienability | Duration/useful life | Sampling frequency | Inferability |
     |---|---|---|---|---|---|
     | Health care | High | Low (health, retail, social network, locational) | >50 years | Very low | Low |
     | Public sector administration | Medium | Medium (public sector): these usually have specific data protection protocols (confidential, etc.) | >50 years | Low | Low |
     | Manufacturing/Operations (sensor networks) | Medium | Medium (manufacturing): these usually have specific data protection protocols (confidential, etc.) | 10-20 years | Medium | Low |
     | Individual behavior | High | Low (health, retail, social network) | 1-5 years | High | High |
     | Personal Locational Data | Medium | Medium | 1-5 years | Very high | Medium |
  15. Summary I
     • The economics of data goods depend on an analysis of data characteristics
     • Data are very heterogeneous
     • Description, classification of data and its institutional framework is necessary for understanding its commercialization potential
     • Overall, data goods substantially differ from other information goods
       – Excludability (protection)
       – Transparency (metadata)
       – Alienability (ongoing implications for individuals)
       – Inferability (implications of data integration for individuals)
  16. Emergence of data markets?
     • Data markets will work differently in different industries
     • The legal framework is evolving → data attributes
     • Competitive strategies & outcomes will depend particularly on the fungibility, excludability, alienability/inferability of the data in question
     • Business model design will determine profit potential of fungible, poorly excludable, alienable data
  17. Types of market matching mechanisms

     | Matching | Marketplace design | Terms of exchange | Examples |
     |---|---|---|---|
     | One-to-one | Bilateral | Negotiated | Data brokers |
     | One-to-many | Dispersal | Standardized | Twitter API |
     | Many-to-one | Harvest | Implicit barter | Google Services |
     | Many-to-many | Multilateral | Standardized or negotiated | InfoChimps, Microsoft Azure |

     Source: “The (unfulfilled) promise of Data Marketplaces”, P. Koutroumpis, A. Leiponen, L. Thomas
  18. Bilateral: Proprietary data vs. other IP licenses

     | | Data | Patents | Trademarks | Copyrights |
     |---|---|---|---|---|
     | License duration | 1-2 years | 10-20 years | Up to 20 years | 1-5 years |
     | Exclusivity | Rare | Frequent | Often regional | Rare |
     | Confidentiality | Frequent | Rare | Rare | Rare |
     | Use restrictions | Abundant | Concise | Specific | Concise |
     | Warranty | ‘As is’ | Frequent | -- | -- |
     | Obligation & remedy | Correct/refund/replace/update | -- | -- | -- |
     | Audit | Frequent | -- | -- | -- |
     | Modal fee schedule | Annual subscription | % of sales or flat fee | NA | Per device |

     Source: “Data Contracts”, P. Koutroumpis, A. Leiponen, L. Thomas & J. Wu (2016)
  19. Dispersal: 366 Open Data Contracts (T&C)
     [Figure: three bar charts on a 0% to 100% scale plus a pie chart.]
     • Contract type: Proprietary License; Open Database Commons; GNU; FOI / Open Government
     • Commercial use: Not Noted; No Commercial Use Permitted; Commercial Use Permitted
     • Data sharing: Sharing Permitted; Share Alike; Not Noted; No Sharing
     • Pie chart shares: Academic 37%; Commercial 19%; Government 21%; Non-Profit 17%; Personal 4%; International 2%
  20. Multilateral: Centralized Data Platform
     • Selling data outside the firm through the platform
     • Platform provider takes the risk, provides services, takes a cut
     • Technical challenges in standardization, rights management
     • Strategic challenges in revenue sharing, chicken & egg, etc.
     [Diagram labels: Data Marketplace; Data Providers; Algorithm Providers; Expert Advice; Customer (×3); Complement (×2); Supply; Demand]
  21. Common Pool Resources (Ostrom 1990)
     • Costly but not impossible to exclude potential beneficiaries from obtaining benefits from use
     • CPR → Tragedy of the Commons
     • Collective action resolves TOTC and maintains resource if
       – Clearly defined boundaries identify legitimate users
       – Rules define how CPR should be used; metarules to change rules
       – Effective monitoring to enforce rules, boundaries
  22. Decentralized Data Platform – blockchain for data?
     • “Bottom-up” approach in information exchange
     • Users and sensors collect data
     • Aggregators can buy/sell data for profit; data owners get paid and have control over future uses
     • Processing, analysis and insights are separate
     [Diagram labels: User content & sensor data; Tagging & Cleaning; Aggregators; Trading; Processing; Public Ledger (… transactionXX1, transactionXX2, transactionXX3, transactionXX4, transactionXX5 …); nodes A to I]
     Source: “The (unfulfilled) promise of data marketplaces”, P. Koutroumpis, A. Leiponen & L. Thomas (2016)
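The slide does not specify how the public ledger of transactions would be built. As a rough illustration only, the sketch below shows the general idea of a hash-chained ledger: each trade records a fingerprint of the data and the hash of the previous entry, so earlier entries cannot be altered unnoticed. The function names, field names and party names are hypothetical, and a real blockchain would add consensus, signatures and payment, none of which is modelled here.

```python
import hashlib
import json


def entry_hash(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of an entry body."""
    encoded = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_transaction(ledger: list, seller: str, buyer: str, dataset: bytes) -> dict:
    """Append a data trade; each entry commits to the previous entry's hash."""
    body = {
        "seller": seller,
        "buyer": buyer,
        # Only a fingerprint of the data goes on the ledger, not the data itself.
        "data_fingerprint": hashlib.sha256(dataset).hexdigest(),
        "prev_hash": ledger[-1]["hash"] if ledger else "0" * 64,
    }
    entry = {**body, "hash": entry_hash(body)}
    ledger.append(entry)
    return entry


def verify(ledger: list) -> bool:
    """Recompute every hash and check the links between consecutive entries."""
    prev = "0" * 64
    for entry in ledger:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev_hash"] != prev or entry_hash(body) != entry["hash"]:
            return False
        prev = entry["hash"]
    return True


# Hypothetical parties: a sensor owner sells to an aggregator, who resells.
ledger = []
append_transaction(ledger, "sensor_owner_A", "aggregator_D", b"temperature readings")
append_transaction(ledger, "aggregator_D", "analyst_G", b"cleaned temperature readings")
print(verify(ledger))                 # checks the untouched chain
ledger[0]["buyer"] = "someone_else"   # tamper with an earlier entry
print(verify(ledger))                 # checks the chain again after tampering
```

Storing only a fingerprint keeps the data itself off the ledger, which fits the slide's point that processing, analysis and insights are separate from trading.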
  23. Decentralization tasks
  24. Marketplace and data typology

     | Matching | Marketplace design | Transaction costs | Provenance | Boundary definition | Rules definition | Effective monitoring | Characteristics of data |
     |---|---|---|---|---|---|---|---|
     | One-to-one | Bilateral | High | High | High | High | High | High value, high privacy |
     | One-to-many | Dispersal | Low | Low | Low | Low | Minimal | Low value, low privacy |
     | Many-to-one | Harvest | Low | Low | Low | Low | Minimal | Low value, low privacy |
     | Many-to-many | Multilateral Centralized | Low | Medium | Medium | Medium | Low | Medium value, medium privacy |
     | Many-to-many | Multilateral Decentralized | Medium | High | Low | High | High | High value, medium privacy |

     Data is no longer a Common Pool Resource!
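The typology on the slide above can also be read as a small lookup structure. The sketch below encodes the slide's rows as records and filters them by attribute; the class name, field names and the `designs_with` helper are illustrative assumptions, and the ratings are copied from the slide rather than computed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MarketDesign:
    matching: str
    design: str
    transaction_costs: str
    provenance: str
    boundary_definition: str
    rules_definition: str
    effective_monitoring: str
    data_characteristics: str


# Rows transcribed from the slide's typology table.
TYPOLOGY = [
    MarketDesign("One-to-one", "Bilateral",
                 "High", "High", "High", "High", "High",
                 "High value, high privacy"),
    MarketDesign("One-to-many", "Dispersal",
                 "Low", "Low", "Low", "Low", "Minimal",
                 "Low value, low privacy"),
    MarketDesign("Many-to-one", "Harvest",
                 "Low", "Low", "Low", "Low", "Minimal",
                 "Low value, low privacy"),
    MarketDesign("Many-to-many", "Multilateral Centralized",
                 "Low", "Medium", "Medium", "Medium", "Low",
                 "Medium value, medium privacy"),
    MarketDesign("Many-to-many", "Multilateral Decentralized",
                 "Medium", "High", "Low", "High", "High",
                 "High value, medium privacy"),
]


def designs_with(**criteria: str) -> list:
    """Return the marketplace designs whose attributes match every criterion."""
    return [
        d.design
        for d in TYPOLOGY
        if all(getattr(d, field) == value for field, value in criteria.items())
    ]


# Which designs offer high provenance, and which are cheap but lightly monitored?
print(designs_with(provenance="High"))
print(designs_with(transaction_costs="Low", effective_monitoring="Minimal"))
```

Filtering on `provenance="High"` picks out the bilateral and the decentralized multilateral designs, the two the slide associates with high-value data.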
  25. Performance of centralized and decentralized market designs

     | | Centralized | Decentralized |
     |---|---|---|
     | Thickness | Variable depending on the rules and membership/usage fees | Assumed to have full participation |
     | Congestion | Assumed to have minimal effect | Assumed to have minimal effect |
     | Transaction costs | Very low | Increased friction for each transaction (can be limited by using trusted third-party licensing) |

     Decentralized: Trading off market thickness against increasing (technical) transaction costs
     http://hackingdistributed.com/2016/08/04/byzcoin/
     https://www.technologyreview.com/s/600781/technical-roadblock-might-shatter-bitcoin-dreams/
  26. Conclusions
     • Platforms/multisided markets bring together multiple different types of parties
     • There are complementarities among the parties
     • Need to engage the different sides
     • Pricing and integration strategies may help in reaching critical mass for the platform
     • Successful platforms benefit from strong network effects and scale economies and can become very profitable …and very powerful
     • Monopolization of communication and information platforms can be societally harmful
     • Algorithmic transparency/monitoring will be necessary
     • How digital platforms are operationalized depends on the nature of the service/good provided, institutional setting, Digital Rights Management – IoT
  27. Summary II
     • Data really is a different kind of an intellectual asset
       – Careful attention to technical, institutional detail is required!
     • Trading regimes: secrecy & trust or verification technology (blockchain?) – or ‘FREE’
       – Bilateral trading sets up a complex relationship with remedies, audits, subscriptions as contractual features
       – Multilateral based on verification tech could be anonymous and one-off – probably for more high-value data due to computing cost
     • Continuing evolution in control technologies and Artificial Intelligence will be the “invention machines” of the 21st century – data will be the lubricant
