Challenges Ahead for Converging Financial Data

Consumers of financial information come in many guises, from personal investors looking for a value-for-money share, to government regulators investigating corporate fraud, to business executives seeking an advantage over their competitors. While the particular analysis performed by each of these information consumers will vary, they all have to deal with the explosion of information available from multiple sources, including SEC filings, corporate press releases, market press coverage, and expert commentary. Recent economic events have begun to bring sharp focus on the activities and actions of financial markets, institutions and, not least, regulatory authorities. Calls for enhanced scrutiny will bring increased regulation and information transparency. While extracting information from individual filings is relatively easy when a machine-readable format is used (for example XBRL, the eXtensible Business Reporting Language), cross-comparison of extracted financial information can be problematic because descriptions and accounting terms vary across companies and jurisdictions. Across multiple sources the problem becomes the classical data integration problem, where a common data abstraction is necessary before functional data use can begin. In this paper we discuss the challenges in converging financial data from multiple sources. We concentrate on the abstraction, linking, and consolidation activities needed to integrate the data before more sophisticated analysis algorithms can examine it for the objectives of particular information consumers (e.g. competitive analysis, regulatory compliance, or investor analysis). We base our discussion on several years of researching and deploying data integration systems in both web and enterprise environments.

E. Curry, A. Harth, and S. O’Riain, “Challenges Ahead for Converging Financial Data,” in Proceedings of the XBRL/W3C Workshop on Improving Access to Financial Data on the Web, 2009.

Challenges Ahead for Converging Financial Data

  1. Digital Enterprise Research Institute, www.deri.ie
      Challenges Ahead For Converging Financial Data
      Edward Curry (1), Andreas Harth (2), Sean O’Riain (1)
      (1) DERI, NUI Galway, Ireland
      (2) Institut für Angewandte Informatik und Formale Beschreibungsverfahren (AIFB), Karlsruher Institut für Technologie (KIT)
      W3C Workshop on Improving Access to Financial Data on the Web, October 2009, Arlington, Virginia, USA
      © Copyright 2009 Digital Enterprise Research Institute. All rights reserved.
  2. Agenda
      • Motivation - Financial Data Ecosystem
        ◦ Data Providers
        ◦ Data Formats
        ◦ Data Consumers
      • Converging Financial Data from Multiple Sources
        ◦ Entity Centric Approach
        ◦ Architecture
        ◦ Identity Mismatch
        ◦ Data Query
      • Data Integration Challenges
      • Recommendations
  3. Financial Data Ecosystem
      • Information Providers
      • ?
      • Information Consumers
  4. Financial Information Providers
      • Individuals: e.g. CEOs reporting equity sale
      • Companies: e.g. 10-K filing
      • NGOs: e.g. sector-wide lobbying groups
      • Government: e.g. regulators, central banks, statistics offices
      • Worldwide organisations: UN, OECD
      • Academics: various economists, public policy
      • ...
      • Publicly available datasets, purchased datasets or in-house sources
  5. Various Data Formats
      • Unstructured Text
        ◦ News articles, press releases, raw transcripts of investor calls
      • Hypertext
        ◦ Corporate websites, government websites, ...
      • Spreadsheets, et al.
        ◦ CSV files, Word documents, PDF, PowerPoint, ...
      • Structured Data
        ◦ XML, XBRL, CSV, SDMX, ...
      • Graph Structured Data in RDF
        ◦ DBpedia, CrunchBase, RSS-CB, ...
  6. Financial Information Consumers
      • Competitive Analysis
        ◦ Mash-up of financial figures and analyst commentary for decision support
      • Regulatory Compliance
        ◦ Forensic Economics
        ◦ Spotting patterns or conditions that support fraud or money laundering
      • Investment Analysis
        ◦ Individual/Institutional investors
        ◦ Transparent fund comparisons
        ◦ Evaluate potential fund return
  7. Goal
      • Integrate data for:
        ◦ Central access
        ◦ Cross document analysis
      • Our group works in data integration and has applied its approach to pilots in the financial services industry
      • Report on experiences and lessons learned
  8. Converging Financial Data from Multiple Sources
      • Provide common data platform for search, browsing, analysis, and interactive visualisations across sources
      • Entity centric approach
        ◦ Single data view allowing information filtering and cross analysis
        ◦ Consolidate data into coherent graph mashed up from potentially thousands of sources
      • Key challenge is semantic integration of structured and unstructured data from the open Web and internal corporate data sources
  9. Converging Financial Data from Multiple Sources
      • Large graph of RDF entities
      • Entities typed according to what they describe
        ◦ People, locations, organizations, publications as well as documents
        ◦ Inter-relations and structured descriptions of entities
      • Entities have specified relations to other entities
        ◦ People can work for companies, people know other people, people author documents, organisations are based in locations, and so on
  10. Data Integration Approach
      • Lifting data sources to a common format, in our case RDF (Resource Description Framework)
      • Integrating the disparate datasets into a holistic dataset by aligning entities and concepts
      • Run domain/task specific analysis algorithms on integrated data
      • Interactive browsing and exploration of integrated data or results of algorithmic analysis
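
The lifting step on slide 10 can be illustrated with a minimal sketch. It assumes a CSV source and represents RDF triples as plain Python tuples; the `http://example.org/` namespace, the column names, and the `lift_rows` helper are all hypothetical and are not taken from the slides.

```python
import csv
import io

# Hypothetical namespace, used only for this illustration.
BASE = "http://example.org/"


def lift_rows(csv_text, entity_type):
    """Turn each CSV row into (subject, predicate, object) triples."""
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        subject = BASE + entity_type + "/" + row["id"]
        # Every row becomes one typed entity.
        triples.append((subject, "rdf:type", BASE + entity_type))
        # Every non-empty column becomes one property of that entity.
        for column, value in row.items():
            if column != "id" and value:
                triples.append((subject, BASE + column, value))
    return triples


# Made-up sample data; the second row has no city.
SAMPLE = "id,name,city\nc1,Example Corp,Arlington\nc2,Sample Ltd,\n"
triples = lift_rows(SAMPLE, "Company")
for triple in triples:
    print(triple)
```

Once every source is expressed in the same triple shape, the later steps (alignment, analysis, browsing) can work on one graph instead of many formats.
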
  11. Data Integration Approach
  12. Architecture
  13. Identity Mismatch
      • Need to connect sources that may describe the same data on a particular entity
      • Case studies analyzing connections between people and organizations
        ◦ SEC filings (Form 4) identified 69K people connected to 80K organizations
        ◦ Same analysis on database describing companies produced 122K people connected to 140K organizations
        ◦ Data needed to be enriched and interlinked using entity consolidation (a.k.a. object consolidation) to avoid having the knowledge split over numerous instances
        ◦ Ontology-based disambiguation
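
Entity consolidation as described on slide 13 can be sketched as follows. This is one common way to do it, not necessarily the method used in the case studies: records are merged when they share a value for an identifying field, using a union-find structure. The record ids, field names, and values are invented for the example.

```python
from collections import defaultdict


def consolidate(records, key_fields):
    """Group record ids that share a value for any identifying field."""
    parent = {rid: rid for rid in records}

    def find(x):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    seen = {}  # (field, value) -> first record id that carried it
    for rid, record in records.items():
        for field in key_fields:
            value = record.get(field)
            if value is None:
                continue
            key = (field, value)
            if key in seen:
                union(seen[key], rid)
            else:
                seen[key] = rid

    clusters = defaultdict(set)
    for rid in records:
        clusters[find(rid)].add(rid)
    return sorted(sorted(cluster) for cluster in clusters.values())


# Hypothetical records from three sources; all identifiers are made up.
records = {
    "filing:1": {"cik": "0000000001", "name": "Jane Doe"},
    "db:42": {"cik": "0000000001", "homepage": "http://example.org/jane"},
    "web:7": {"homepage": "http://example.org/jane"},
    "db:43": {"cik": "0000000002", "name": "Jane Doe"},
}
clusters = consolidate(records, key_fields=("cik", "homepage"))
print(clusters)
```

The name field is deliberately left out of `key_fields`: two different people can share a name, so only fields assumed to identify an entity uniquely are allowed to trigger a merge.
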
  14. Data Query
      • SPARQL, the semantic query language, allows queries/questions to be asked:
        ◦ What do the companies ‘Microsoft’ and ‘IBM’ have in common?
        ◦ What competitors of ‘HP’ are in ‘Arlington’?
        ◦ What’s the relationship between ‘Microsoft’ and ‘IBM’?
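
To make the first question on slide 14 concrete, the sketch below answers "what do two companies have in common?" over a tiny triple set. It uses plain Python set operations rather than a SPARQL engine, and the `ex:` names and facts are placeholders, not real data about any company.

```python
# Placeholder graph; entities and facts are invented for illustration.
TRIPLES = {
    ("ex:CompanyA", "ex:sector", "ex:Software"),
    ("ex:CompanyA", "ex:basedIn", "ex:CityX"),
    ("ex:CompanyB", "ex:sector", "ex:Software"),
    ("ex:CompanyB", "ex:basedIn", "ex:CityY"),
    ("ex:PersonP", "ex:boardMemberOf", "ex:CompanyA"),
    ("ex:PersonP", "ex:boardMemberOf", "ex:CompanyB"),
}


def in_common(triples, a, b):
    """Return what entities a and b share, looking in both directions."""

    def outgoing(entity):
        return {(p, o) for s, p, o in triples if s == entity}

    def incoming(entity):
        return {(p, s) for s, p, o in triples if o == entity}

    # The shared outgoing part roughly corresponds to the SPARQL pattern
    #   SELECT ?p ?o WHERE { ex:CompanyA ?p ?o . ex:CompanyB ?p ?o . }
    # (prefix declarations omitted).
    return {
        "shared_properties": outgoing(a) & outgoing(b),
        "shared_referrers": incoming(a) & incoming(b),
    }


result = in_common(TRIPLES, "ex:CompanyA", "ex:CompanyB")
print(result)
```

In this toy graph the two companies share a sector and a board member, but not a location.
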
  15. Data Integration Challenges
      • Text/Data Mismatch
        ◦ Human language is often ambiguous
        ◦ The same company might be referred to in several variations (e.g. IBM, International Business Machines, Big Blue)
        ◦ Ambiguity makes cross-linking with structured data difficult
      • Object Identity and Separate Schema
        ◦ Sources differ in how they state the same fact
        ◦ Differences on the level of individual objects and schema
        ◦ SEC uses the Central Index Key (CIK) to identify people (CEOs, CFOs), companies, and financial instruments
        ◦ DBpedia uses URIs to identify the same entities
        ◦ Methods have to be in place for reconciling different representations of objects and schema
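
The text/data mismatch on slide 15 can be illustrated with a small alias lookup. This is only a sketch under simple assumptions: the alias table is hand-written, the canonical identifier `ex:org/ibm` is a placeholder (not a real CIK or DBpedia URI), and the `normalise` and `resolve` helpers are hypothetical.

```python
import re

# Hypothetical alias table mapping surface forms to one canonical identifier.
ALIASES = {
    "ibm": "ex:org/ibm",
    "international business machines": "ex:org/ibm",
    "big blue": "ex:org/ibm",
}


def normalise(mention):
    """Lower-case, drop punctuation and common legal suffixes, squeeze spaces."""
    text = mention.lower()
    text = re.sub(r"[^\w\s]", " ", text)
    text = re.sub(r"\b(corp|corporation|inc|ltd)\b", " ", text)
    return " ".join(text.split())


def resolve(mention):
    """Return the canonical identifier for a mention, or None if unknown."""
    return ALIASES.get(normalise(mention))


for mention in ["IBM Corp.", "International Business Machines Corporation",
                "Big Blue", "Blue Bird"]:
    print(mention, "->", resolve(mention))
```

A lookup like this only covers variants someone has already listed; nicknames and ambiguous mentions in free text generally need context to resolve, which is why the slide treats this as an open challenge.
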
  16. Data Integration Challenges
      • Abstraction Levels (Data Context)
        ◦ Financial data sources provide data at incompatible levels of abstraction
        ◦ Classify data in taxonomies pertinent to a certain sector
        ◦ Differences in legislation on book-keeping (e.g. indicators from Euro regulators may not be directly comparable with indicators from US-based regulators)
        ◦ Differences in geographic aggregation (e.g. region data from one source and country-level data from another; IBM Ireland Ltd, IBM Europe, IBM Global, …)
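
The geographic aggregation mismatch on slide 16 can be sketched as a roll-up: regional figures from one source are summed to country level so they can be compared with a country-level source. The region-to-country mapping and all figures are invented, and the approach assumes an additive indicator (it would not be valid for ratios or averages).

```python
from collections import defaultdict

# Hypothetical mapping from region to country code.
REGION_TO_COUNTRY = {"Connacht": "IE", "Leinster": "IE", "Bavaria": "DE"}


def roll_up(region_values):
    """Sum an additive regional indicator up to country level."""
    totals = defaultdict(float)
    for region, value in region_values.items():
        totals[REGION_TO_COUNTRY[region]] += value
    return dict(totals)


# Made-up figures: source A reports by region, source B by country.
source_a = {"Connacht": 1.5, "Leinster": 4.0, "Bavaria": 3.0}
source_b = {"IE": 5.5, "DE": 3.25}

rolled = roll_up(source_a)
differences = {country: rolled[country] - source_b[country] for country in rolled}
print(rolled)
print(differences)
```

Going the other way (splitting country-level data into regions) is not possible without extra information, so integration usually has to settle on the coarser of the two levels.
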
  17. Data Integration Challenges
      • Data Quality
        ◦ General challenge integrating data from multiple sources
        ◦ Errors in signage, amounts, labelling, and classification can seriously impede utility of systems operating on such data
        ◦ Combining erroneous data aggravates the problem
        ◦ Within open environment data aggregator has little or no influence on the data publisher
        ◦ Challenge for data publishers/consumers to coordinate to fix problems in data or blacklist sites providing unreliable data
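
The kinds of error listed on slide 17 (sign, amount, labelling) can be caught early with simple checks before data is merged. The sketch below is illustrative only: the label vocabulary, the set of labels assumed to be non-negative, and the `check_fact` helper are hypothetical and do not come from any XBRL taxonomy.

```python
# Hypothetical vocabulary; real taxonomies are far larger.
KNOWN_LABELS = {"Revenue", "NetIncome", "TotalAssets"}
NON_NEGATIVE = {"Revenue", "TotalAssets"}  # assumed never to be negative


def check_fact(fact, known_labels=KNOWN_LABELS, non_negative=NON_NEGATIVE):
    """Return a list of problems found in one reported fact."""
    problems = []
    label = fact.get("label")
    if label not in known_labels:
        problems.append("unknown label")
    value = fact.get("value")
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        problems.append("non-numeric amount")
    elif label in non_negative and value < 0:
        problems.append("unexpected negative sign")
    return problems


facts = [
    {"label": "Revenue", "value": 1200.0},
    {"label": "Revenue", "value": -1200.0},
    {"label": "Revenu", "value": 1200.0},
    {"label": "NetIncome", "value": "n/a"},
]
reports = [check_fact(fact) for fact in facts]
print(reports)
```

Checks like these flag suspicious facts but cannot repair them; as the slide notes, fixing the data still depends on coordination with the publisher.
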
  18. Recommendations
      • Agree on an approach to the specification and use of common identifiers, or at least their mappings
      • Adhering to a common publishing method reduces integration effort and facilitates data reuse
        ◦ Linked Data principles
      • Convergence between data providers requires coordination and time
        ◦ No need for “Big Bang” integration
        ◦ Follow a pay-as-you-go iterative approach to integration
  19. Thank you for listening
      E. Curry, A. Harth, and S. O’Riain, “Challenges Ahead for Converging Financial Data,” in Proceedings of the XBRL/W3C Workshop on Improving Access to Financial Data on the Web, 2009.
