The document discusses building data science capabilities at Thomson Reuters. It introduces the Data Innovation Lab, which works on agile data science projects using lean sprints. The lab focuses on delivering insights to customers through proof-of-concepts and analytical models. It also discusses challenges in data monetization like finding reliable data sources and establishing rights to externalize data. Recent projects demonstrated include linking disparate data using PermIDs, predicting patent litigation risk through machine learning, and visualizing relationships between food insecurity and political instability.
Prof Shane Greenstein of Harvard Business School talks about his new book, How the Internet Became Commercial, at the Digital Initiative's Future Assembly.
Looking at what is driving Big Data. Market projections to 2017 plus what is are customer and infrastructure priorities. What drove BD in 2013 and what were barriers. Introduction to Business Analytics, Types, Building Analytics approach and ten steps to build your analytics platform within your company plus key takeaways.
To view recording of this webinar please use the below URL:
http://wso2.com/library/webinars/2016/08/analytics-as-your-business-edge/
Data is the new oil! For most enterprises, data is the oil you’ve been sitting on without realizing its value. You can gain many useful insights from data that lead to new and better products and operations, enables new user experiences, allows better understanding of customers, makes interactions seamless and enables new pay per use business models and dynamic pricing models. Furthermore, data itself can be monetized. Enterprises can broker interactions between end users as done in digital advertising or sell insights to third parties in anonymized forms. Just like in Google and Facebook, data can be a primary asset that organizations collect as part of their operations.
Prof Shane Greenstein of Harvard Business School talks about his new book, How the Internet Became Commercial, at the Digital Initiative's Future Assembly.
Looking at what is driving Big Data. Market projections to 2017 plus what is are customer and infrastructure priorities. What drove BD in 2013 and what were barriers. Introduction to Business Analytics, Types, Building Analytics approach and ten steps to build your analytics platform within your company plus key takeaways.
To view recording of this webinar please use the below URL:
http://wso2.com/library/webinars/2016/08/analytics-as-your-business-edge/
Data is the new oil! For most enterprises, data is the oil you’ve been sitting on without realizing its value. You can gain many useful insights from data that lead to new and better products and operations, enables new user experiences, allows better understanding of customers, makes interactions seamless and enables new pay per use business models and dynamic pricing models. Furthermore, data itself can be monetized. Enterprises can broker interactions between end users as done in digital advertising or sell insights to third parties in anonymized forms. Just like in Google and Facebook, data can be a primary asset that organizations collect as part of their operations.
Big Data : From HindSight to Insight to ForesightSunil Ranka
When it comes to Analytics and Reporting , There is a fine line between HindSight to Insight to Foresight . With the evolution of BigData technology, there is a need in deriving value out of the larger datasets, not available in the past. Even before we can start using the new shiny technologies, there is a need of understanding what is categorized as reporting or business intelligence or Big Data and Analytics. Based on my experience, people struggle to distinguish between reporting, Analytics, and Business Intelligence.
Why Everything You Know About bigdata Is A LieSunil Ranka
As a big data technologist, you can bet that you have heard it all: every crazy claim, myth, and outright lie about what big data is and what it isn't that you can imagine, and probably a few that you can't.If your company has a big data initiative or is considering one, you should be aware of these false statements and the reasons why they are wrong.
Analytics 3.0 Measurable business impact from analytics & big dataMicrosoft
Presentación del evento de Harvard Business Review sobre Analítica y Big Data
(15 de Octubre 2013)
"Featuring analytics expert Tom Davenport, author of Competing on Analytics, Analytics at Work, and the just-released Keeping Up with the Quants" 
You probably have heard about Big Data, but ever wondered what it exactly is? And why should you care?
Mobile is playing a large part in driving this explosion in data. The data are also created by the apps and other services in the background. As people are moving towards more digital channels, tons of data are being created. This data can be used in a lot of ways for personal and professional use. Big Data and mobile apps are converging in an enterprise and interacting; transforming the whole mobile ecosystem.
Big Data and The Future of Insight - Future FoundationForesight Factory
As Big Data sweeps through consumer-facing businesses, we ask:
- If Big Data is truly a revolution, then what (and whom) will it eliminate or elevate?
- What value will still be derived from conventional market research and brand-building techniques?
- If every brand is backed by Big Data, can every brand prosper?
For more information please contact info@futurefoundation.net or visit www.futurefoundation.net
This was first part of the presentation on "Road Map for Careers in Big Data" in Conjunction with Hortonworks/Aengus Rooney on 17th August 2016 in London. For those contemplating moving to Big Data from often Relational Background
This presentation shows how Predictive Analytics can be more futuristic than BI in using past events to predict the future.
Furthermore, we explore the best practices in Predictive Analytics, the challenges in deployment and how this solution can be used to create business value for the organization.
Presented by Ajay Gopikrishnan, our expert in Predictive Analytics and Data Mining at the BA4ALL (Business Analytics Insight 2014) event in the Netherlands.
http://www.capgemini.com/big-data-analytics
The current challenges and opportunities of big data and analytics in emergen...IBM Analytics
Big data and analytics present many possibilities for emergency management specialists and first responders. Some of these benefits include pinpointing vulnerabilities, bringing in the right resources and maximizing existing resources to pave the way to adoption. However, these opportunities are not without challenges. Emergency management experts Adam Crowe, Director, Emergency Preparedness at Virginia Commonwealth University; William Moorhead, President of All Clear Emergency Management Group; and Gary Nestler, Associate Partner and Global Leader, Emergency Management solutions at IBM discuss these challenges and opportunities in this slideshare—which is intended to help disaster management stakeholders achieve the most accurate situational awareness using analytics.
Discover analytics solutions for emergency management http://ibm.co/emergencymgmt
With many organisations considering getting on the Hadoop bandwagon, this document provides an overview of the planned use cases for Hadoop, an illustration of some of the common technology components, suggestions on when Hadoop is worth considering, some the challenges organisations are experiencing, cost considerations and finally, how an organisation should position for a Big Data initiative. Any organisation considering a Big Data initiative with Hadoop should thoroughly consider each of these areas before embarking on a course of action.
Big data & analytics for banking new york lars hambergLars Hamberg
BIG DATA & ANALYTICS FOR BANKING SUMMIT, New York, 1 Dec 2015.
Keynote address: "How Predictive Analytics will change the Financial Services Sector”
Speaker : Lars Hamberg
http://www.specialistspeakers.com/?p=8367
Overview & Outlook: Why Big Data will over-deliver on its hype and transform Financial Services; Use cases with Advanced Analytics and Big Data Analytics in Financial Services, in Production & Distribution of banking products; new opportunities for incumbents in tomorrow’s ecosystem; big data, bigdata, analytics, smart data, data analytics, digitization, digitalization, predictive analytics, sentiment analysis, financial services, banking, asset management, distribution, retail, trading, technology, innovation, fintech, wealth, asset management, investment industry, robo advisory, social investing, behavior, profiling, client segmentation, alias matching, semantic memory models, unstructured data, machine learning, pattern recognition
Big Data : From HindSight to Insight to ForesightSunil Ranka
When it comes to Analytics and Reporting , There is a fine line between HindSight to Insight to Foresight . With the evolution of BigData technology, there is a need in deriving value out of the larger datasets, not available in the past. Even before we can start using the new shiny technologies, there is a need of understanding what is categorized as reporting or business intelligence or Big Data and Analytics. Based on my experience, people struggle to distinguish between reporting, Analytics, and Business Intelligence.
Why Everything You Know About bigdata Is A LieSunil Ranka
As a big data technologist, you can bet that you have heard it all: every crazy claim, myth, and outright lie about what big data is and what it isn't that you can imagine, and probably a few that you can't.If your company has a big data initiative or is considering one, you should be aware of these false statements and the reasons why they are wrong.
Analytics 3.0 Measurable business impact from analytics & big dataMicrosoft
Presentación del evento de Harvard Business Review sobre Analítica y Big Data
(15 de Octubre 2013)
"Featuring analytics expert Tom Davenport, author of Competing on Analytics, Analytics at Work, and the just-released Keeping Up with the Quants" 
You probably have heard about Big Data, but ever wondered what it exactly is? And why should you care?
Mobile is playing a large part in driving this explosion in data. The data are also created by the apps and other services in the background. As people are moving towards more digital channels, tons of data are being created. This data can be used in a lot of ways for personal and professional use. Big Data and mobile apps are converging in an enterprise and interacting; transforming the whole mobile ecosystem.
Big Data and The Future of Insight - Future FoundationForesight Factory
As Big Data sweeps through consumer-facing businesses, we ask:
- If Big Data is truly a revolution, then what (and whom) will it eliminate or elevate?
- What value will still be derived from conventional market research and brand-building techniques?
- If every brand is backed by Big Data, can every brand prosper?
For more information please contact info@futurefoundation.net or visit www.futurefoundation.net
This was first part of the presentation on "Road Map for Careers in Big Data" in Conjunction with Hortonworks/Aengus Rooney on 17th August 2016 in London. For those contemplating moving to Big Data from often Relational Background
This presentation shows how Predictive Analytics can be more futuristic than BI in using past events to predict the future.
Furthermore, we explore the best practices in Predictive Analytics, the challenges in deployment and how this solution can be used to create business value for the organization.
Presented by Ajay Gopikrishnan, our expert in Predictive Analytics and Data Mining at the BA4ALL (Business Analytics Insight 2014) event in the Netherlands.
http://www.capgemini.com/big-data-analytics
The current challenges and opportunities of big data and analytics in emergen...IBM Analytics
Big data and analytics present many possibilities for emergency management specialists and first responders. Some of these benefits include pinpointing vulnerabilities, bringing in the right resources and maximizing existing resources to pave the way to adoption. However, these opportunities are not without challenges. Emergency management experts Adam Crowe, Director, Emergency Preparedness at Virginia Commonwealth University; William Moorhead, President of All Clear Emergency Management Group; and Gary Nestler, Associate Partner and Global Leader, Emergency Management solutions at IBM discuss these challenges and opportunities in this slideshare—which is intended to help disaster management stakeholders achieve the most accurate situational awareness using analytics.
Discover analytics solutions for emergency management http://ibm.co/emergencymgmt
With many organisations considering getting on the Hadoop bandwagon, this document provides an overview of the planned use cases for Hadoop, an illustration of some of the common technology components, suggestions on when Hadoop is worth considering, some the challenges organisations are experiencing, cost considerations and finally, how an organisation should position for a Big Data initiative. Any organisation considering a Big Data initiative with Hadoop should thoroughly consider each of these areas before embarking on a course of action.
Big data & analytics for banking new york lars hambergLars Hamberg
BIG DATA & ANALYTICS FOR BANKING SUMMIT, New York, 1 Dec 2015.
Keynote address: "How Predictive Analytics will change the Financial Services Sector”
Speaker : Lars Hamberg
http://www.specialistspeakers.com/?p=8367
Overview & Outlook: Why Big Data will over-deliver on its hype and transform Financial Services; Use cases with Advanced Analytics and Big Data Analytics in Financial Services, in Production & Distribution of banking products; new opportunities for incumbents in tomorrow’s ecosystem; big data, bigdata, analytics, smart data, data analytics, digitization, digitalization, predictive analytics, sentiment analysis, financial services, banking, asset management, distribution, retail, trading, technology, innovation, fintech, wealth, asset management, investment industry, robo advisory, social investing, behavior, profiling, client segmentation, alias matching, semantic memory models, unstructured data, machine learning, pattern recognition
Los mercados de la Unión Europea con mayor potencial para las empresas mexica...Lourdes Gómez
Presentamos los resultados de un análisis que permite identificar a los países de la Unión Europea con mayor potencial comercial para México. También se presentan algunos de los sectores y productos que tienen mejor perfil de inserción en mercados de la región.
A presentation by Scott Abel, TheContentWrangler.com, to technical communication professionals at the Documentation and Training Conference Boston 2006.
Enabling data scientists within an enterprise requires a well-thought out approach from an organization, technology, and business results perspective. In this talk, Tim and Hussain will share common pitfalls to data science enablement in the enterprise and provide their recommendations to avoid them. Taking an example, actionable use case from the financial services industry, they will focus on how Anaconda plays a pivotal role in setting up big data infrastructure, integrating data science experimentation and production environments, and deploying insights to production. Along the way, they will highlight opportunities for leveraging open source and unleashing data science teams while meeting regulatory and compliance challenges.
SmartData Webinar Slides: How to analyze 72 billion messages a day to find tr...DATAVERSITY
There is an overload of stream data that has led to interest in Big Data, while mostly resulting in a signal-to-noise problem. There is not enough attention in the world, nor enough analyst time to keep up with this deluge of data. Most Big Data tools available today are not up to the task. A radical new form of information retrieval is called for. In this webinar, we will show how we envision the future of automated insight discovery. We will show a very fast interactive analytics engine that allows for slicing and dicing data in many ways. We then go a step further to systematically walk through all these analytics - brute force style - to generate what we call "trends."
Forecast to contribute £216 billion to the UK economy via business creation, efficiency and innovation, and generate 360,000 new jobs by 2020, big data is a key area for recruiters.
In this QuickView:
- Big data in numbers
- Top 10 industries hiring big data professionals
- Top 10 qualifications sought by hirers
- Top 10 database and BI skills sought by hirers
- Getting started in big data: popular big data techniques and vendors
Discussion Forum data, sourced from sites like Reddit and other social media platforms, as well other sources of textual information, provides tremendous opportunity for insight and innovation. This presentation focuses on how an analysis of unstructured data can be used to innovate in Life/Health Science organizations
Overview of major factors in big data, analytics and data science. Illustrates the growing changes from data capture and the way it is changing business beyond technology industries.
As the strategic importance of data has increased, new approaches to customer analytics have emerged as well. As customer interactions with companies grow and diversify, the need to integrate data faster and deliver real-time insights is critical. This presentation explores the underlying trends driving companies to become more data-driven and invest in customer analytics. And, it outlines three types of approaches to capturing, managing, analyzing, and activating customer knowledge and insights.
¿En qué se parece el Gobierno del Dato a un parque de atracciones?Denodo
Watch full webinar here: https://bit.ly/3Ab9gYq
Imagina llegar a un parque de atracciones con tu familia y comenzar tu día sin el típico plano que te permitirá planificarte para saber qué espectáculos ver, a qué atracciones ir, donde pueden o no pueden montar los niños… Posiblemente, no podrás sacar el máximo partido a tu día y te habrás perdido muchas cosas. Hay personas que les gusta ir a la aventura e ir descubriendo poco a poco, pero cuando hablamos de negocios, ir a la aventura puede ser fatídico...
En la era de la explosión de la información repartida en distintas fuentes, el gobierno de datos es clave para garantizar la disponibilidad, usabilidad, integridad y seguridad de esa información. Asimismo, el conjunto de procesos, roles y políticas que define permite que las organizaciones alcancen sus objetivos asegurando el uso eficiente de sus datos.
La virtualización de datos, herramienta estratégica para implementar y optimizar el gobierno del dato, permite a las empresas crear una visión 360º de sus datos y establecer controles de seguridad y políticas de acceso sobre toda la infraestructura, independientemente del formato o de su ubicación. De ese modo, reúne múltiples fuentes de datos, las hace accesibles desde una sola capa y proporciona capacidades de trazabilidad para supervisar los cambios en los datos.
En este webinar aprenderás a:
- Acelerar la integración de datos provenientes de fuentes de datos fragmentados en los sistemas internos y externos y obtener una vista integral de la información.
- Activar en toda la empresa una sola capa de acceso a los datos con medidas de protección.
- Cómo la virtualización de datos proporciona los pilares para cumplir con las normativas actuales de protección de datos mediante auditoría, catálogo y seguridad de datos.
Innovative Data Leveraging for Procurement AnalyticsTejari
This webinar will explore the types of problems and questions faced by procurement executives that can benefit most through the application of analytical solutions (e.g. innovation, strategic cost management, risk mitigation, etc.). In addition, we will cover the different forms of cognitive solutions that are emerging to drive real-time decision-making and predictive sourcing capabilities.
Frontiers in Alternative Data : Techniques and Use CasesQuantUniversity
QuantUniversity Summer School 2020 (https://qusummerschool.splashthat.com/)
https://quspeakerseries10.splashthat.com/
Lecture 1: Alexander Denev
In this talk, Alexander will introduce Alternative Data and discuss it's uses from his book, The Book of Alternative Data
- What is alternative data?
- Adoption of alternative data
- Information value chain
- Risks associated with alternative data
- Processes required to develop signals
- Valuation of alternative data
Lecture 2: Saeed Amen
In this talk, Saeed will discuss use cases in Alternative Data
-Deciphering Federal Reserve communications
- Using CLS flow data to trade FX
- Geospatial Insight satellite data to estimate retailers' EPS
- Saving "alpha" with transaction cost analysis
- Using Bloomberg News data to trade FX
Similar to Blocks & Bots - Digital Summit Harvard Business School 2015 (20)
Data Centers - Striving Within A Narrow Range - Research Report - MCG - May 2...pchutichetpong
M Capital Group (“MCG”) expects to see demand and the changing evolution of supply, facilitated through institutional investment rotation out of offices and into work from home (“WFH”), while the ever-expanding need for data storage as global internet usage expands, with experts predicting 5.3 billion users by 2023. These market factors will be underpinned by technological changes, such as progressing cloud services and edge sites, allowing the industry to see strong expected annual growth of 13% over the next 4 years.
Whilst competitive headwinds remain, represented through the recent second bankruptcy filing of Sungard, which blames “COVID-19 and other macroeconomic trends including delayed customer spending decisions, insourcing and reductions in IT spending, energy inflation and reduction in demand for certain services”, the industry has seen key adjustments, where MCG believes that engineering cost management and technological innovation will be paramount to success.
MCG reports that the more favorable market conditions expected over the next few years, helped by the winding down of pandemic restrictions and a hybrid working environment will be driving market momentum forward. The continuous injection of capital by alternative investment firms, as well as the growing infrastructural investment from cloud service providers and social media companies, whose revenues are expected to grow over 3.6x larger by value in 2026, will likely help propel center provision and innovation. These factors paint a promising picture for the industry players that offset rising input costs and adapt to new technologies.
According to M Capital Group: “Specifically, the long-term cost-saving opportunities available from the rise of remote managing will likely aid value growth for the industry. Through margin optimization and further availability of capital for reinvestment, strong players will maintain their competitive foothold, while weaker players exit the market to balance supply and demand.”
As Europe's leading economic powerhouse and the fourth-largest hashtag#economy globally, Germany stands at the forefront of innovation and industrial might. Renowned for its precision engineering and high-tech sectors, Germany's economic structure is heavily supported by a robust service industry, accounting for approximately 68% of its GDP. This economic clout and strategic geopolitical stance position Germany as a focal point in the global cyber threat landscape.
In the face of escalating global tensions, particularly those emanating from geopolitical disputes with nations like hashtag#Russia and hashtag#China, hashtag#Germany has witnessed a significant uptick in targeted cyber operations. Our analysis indicates a marked increase in hashtag#cyberattack sophistication aimed at critical infrastructure and key industrial sectors. These attacks range from ransomware campaigns to hashtag#AdvancedPersistentThreats (hashtag#APTs), threatening national security and business integrity.
🔑 Key findings include:
🔍 Increased frequency and complexity of cyber threats.
🔍 Escalation of state-sponsored and criminally motivated cyber operations.
🔍 Active dark web exchanges of malicious tools and tactics.
Our comprehensive report delves into these challenges, using a blend of open-source and proprietary data collection techniques. By monitoring activity on critical networks and analyzing attack patterns, our team provides a detailed overview of the threats facing German entities.
This report aims to equip stakeholders across public and private sectors with the knowledge to enhance their defensive strategies, reduce exposure to cyber risks, and reinforce Germany's resilience against cyber threats.
Levelwise PageRank with Loop-Based Dead End Handling Strategy : SHORT REPORT ...Subhajit Sahu
Abstract — Levelwise PageRank is an alternative method of PageRank computation which decomposes the input graph into a directed acyclic block-graph of strongly connected components, and processes them in topological order, one level at a time. This enables calculation for ranks in a distributed fashion without per-iteration communication, unlike the standard method where all vertices are processed in each iteration. It however comes with a precondition of the absence of dead ends in the input graph. Here, the native non-distributed performance of Levelwise PageRank was compared against Monolithic PageRank on a CPU as well as a GPU. To ensure a fair comparison, Monolithic PageRank was also performed on a graph where vertices were split by components. Results indicate that Levelwise PageRank is about as fast as Monolithic PageRank on the CPU, but quite a bit slower on the GPU. Slowdown on the GPU is likely caused by a large submission of small workloads, and expected to be non-issue when the computation is performed on massive graphs.
Explore our comprehensive data analysis project presentation on predicting product ad campaign performance. Learn how data-driven insights can optimize your marketing strategies and enhance campaign effectiveness. Perfect for professionals and students looking to understand the power of data analysis in advertising. for more details visit: https://bostoninstituteofanalytics.org/data-science-and-artificial-intelligence/
Opendatabay - Open Data Marketplace.pptxOpendatabay
Opendatabay.com unlocks the power of data for everyone. Open Data Marketplace fosters a collaborative hub for data enthusiasts to explore, share, and contribute to a vast collection of datasets.
First ever open hub for data enthusiasts to collaborate and innovate. A platform to explore, share, and contribute to a vast collection of datasets. Through robust quality control and innovative technologies like blockchain verification, opendatabay ensures the authenticity and reliability of datasets, empowering users to make data-driven decisions with confidence. Leverage cutting-edge AI technologies to enhance the data exploration, analysis, and discovery experience.
From intelligent search and recommendations to automated data productisation and quotation, Opendatabay AI-driven features streamline the data workflow. Finding the data you need shouldn't be a complex. Opendatabay simplifies the data acquisition process with an intuitive interface and robust search tools. Effortlessly explore, discover, and access the data you need, allowing you to focus on extracting valuable insights. Opendatabay breaks new ground with a dedicated, AI-generated, synthetic datasets.
Leverage these privacy-preserving datasets for training and testing AI models without compromising sensitive information. Opendatabay prioritizes transparency by providing detailed metadata, provenance information, and usage guidelines for each dataset, ensuring users have a comprehensive understanding of the data they're working with. By leveraging a powerful combination of distributed ledger technology and rigorous third-party audits Opendatabay ensures the authenticity and reliability of every dataset. Security is at the core of Opendatabay. Marketplace implements stringent security measures, including encryption, access controls, and regular vulnerability assessments, to safeguard your data and protect your privacy.
Adjusting primitives for graph : SHORT REPORT / NOTESSubhajit Sahu
Graph algorithms, like PageRank Compressed Sparse Row (CSR) is an adjacency-list based graph representation that is
Multiply with different modes (map)
1. Performance of sequential execution based vs OpenMP based vector multiply.
2. Comparing various launch configs for CUDA based vector multiply.
Sum with different storage types (reduce)
1. Performance of vector element sum using float vs bfloat16 as the storage type.
Sum with different modes (reduce)
1. Performance of sequential execution based vs OpenMP based vector element sum.
2. Performance of memcpy vs in-place based CUDA based vector element sum.
3. Comparing various launch configs for CUDA based vector element sum (memcpy).
4. Comparing various launch configs for CUDA based vector element sum (in-place).
Sum with in-place strategies of CUDA mode (reduce)
1. Comparing various launch configs for CUDA based vector element sum (in-place).
2. INTRODUCE THE DATA INNOVATION LAB
HOW WE ARE RESPONDING TO TRENDS AND ISSUES
WHAT ARE THE ISSUES AND CHALLENGES THAT YOU ARE FACING?
DATA MONETIZATION BUSINESS MODELS
DEMONSTRATION OF RECENT DATA SCIENCE PROJECTS
GETTING VALUE OUT OF DATA
1
3. ABOUT THE DATA INNOVATION LAB
We are located in Boston’s Innovation District and closely
connected to the MIT and Cambridge Startup Ecosystem
Currently, 10 data scientists and visualization experts
Mission
• Deliver insights for customers with cutting-edge data science
proof-of-concepts
• Create value with all the content from across Thomson Reuters
enterprise, and from external partners, including open data
2
4. DATA INNOVATION LAB: HOW WE WORK
Project Team
• Data Scientists
(Data Innovation Lab)
• Business Lead & Content
Experts
(Thomson Reuters)
• External partner team
member(s)
(customers, vendors)
Deliverables
• Proof-of-concepts
to gain customer
feedback
• Analytical models
• Interactive demos
and visualizations
3
5. LEAN DATA SCIENCE SPRINTS
Agile Data Science Projects, iterative, 2-week sprints
to build MVPs (Minimum Viable Products (UI, API, Visualization)
ACQUIRE
DATA
CURATE
DATA
MODEL AND
ANALYZE
BUILD MVP
GET
CUSTOMER
FEEDBACK
4
6. DATA INNOVATION LABS: PROJECTS
Sample Projects:
• Integration of data
across Thomson
Reuters
Patents, News, Legal, Tax &
Accounting, etc. via PermID
• Mining new data
sources for Finance
Industry partnerships
Open data
• Knowledge Graphs
Graph databases
Semantic web
Visualization
• Startup Ecosystem
Collaborating and co-developing
solutions for customers
5
7. DATA INNOVATION LAB: CAPABILITIES
Big Data
Hadoop, Spark, Hive,
Graph DB, etc.
Data Science
machine learning,
classification, regression,
anomaly detection
Text Mining and NLP
Rapid Prototyping
& Data
Visualization
Quantitative
Modeling &
Financial Research
6
8. Data Science Skills In the Lab
7
ZacBrian U Brian R Henry Josh Joe Liz
Data Visualization
Math / Stats
Programming
Business
(Finance, Risk, Legal)
Big Data
NLP / Semantic Web
Machine Learning
Dave
9. INTRODUCE THE DATA INNOVATION LAB
GETTING VALUE OUT OF DATA
HOW WE ARE RESPONDING TO TRENDS AND ISSUES
WHAT ARE THE ISSUES AND CHALLENGES THAT YOU ARE FACING?
DATA MONETIZATION BUSINESS MODELS
DEMONSTRATION OF RECENT DATA SCIENCE PROJECTS
8
10. We define the following terms as…
• Raw Data: Data as it is collected from the source – not manipulated
or processed (e.g. sensor data)
• Time Series: Measurements of prices, economic data through time
• Reference Data: Terms and conditions, financials statements
• Analytics: Discovery and communication of meaningful patterns in
data
• Predictive Analytics: Extracting information from data to determine
patterns and predict future outcomes and trends
• Data Exhaust: Data generated as information byproducts resulting
from digital or online activities
• Entity Analytics: The analysis of the connections/linkages between
different entities or information
1. Gartner 9
11. Data is generated at an unprecedented rate
10
2.5 quintillion bytes
(2.3 trillion gigabytes) of data
created each day
40 zettabytes (43 trillion
gigabytes) of data will be created
by 2020 – 300 times
increase from 2005
At least 100 terabytes of
data stored by most companies in
the US
1 terabyte of trade
information captured by the New
York Stock Exchange in each
trading sessionSource: IBM
90%
of data in the
world created in
the last 2 years
12. Big data and analytics market is expected to reach
$125 billion in 2015
11
Source: IDC Worldwide Big Data and Analytics Predictions for 2015
70%
of large organizations
purchase external
data today
100%
of large organizations
will purchase external
data by 2019
More organizations will begin to monetize their data
Applications with advanced and
predictive analytics will grow
65% faster than applications
without predictive functionality
IoT analytics is expected to
grow at a 5-year CAGR of 30%
13. Data has value outside traditional business
operations
12
DATA
INTERNAL
ANALYTICS
CORE BUSINESS
OPERATIONS
DATA MONETIZATION
MODELS
Analytics to drive strategy,
business intelligence, and
marketing analytics. Enable
deeper customer
engagement, optimize
operations, unlock cross-sell
and up-sell opportunities.
Data generated and used as part of
traditional business operations
Data or data-driven insights
(external-facing analytics)
provided to customers.
Enter adjacent and new
markets.
14. There are 3 main data monetization models
13
Provide raw data directly
to customers or through
distributors
Sell Raw Data Provide Data Analytics
Develop new analytics from
internal data alone or combined
with other data sources. Raw data
may or may not be provided in the
solution.
Easy to accomplish and
short time-to-market
Value is a function of
exclusivity or
differentiated access
Need data scientists to develop
the capabilities in-house
Differentiation via data-driven
decision making internally or as a
service
Develop Data Platform
Provide a marketplace and
platform for multiple data sources
and analytics applications.
Need data science, software
development and business
expertise in platform business
strategy
Multi-sided network effects
Platform
15. Selling raw data is most valuable when the data is
unique to the providing firm
14
• Target provides downloads of its point of sale
(POS) data to suppliers who use this information
to monitor inventory levels and customer
demand in real-time.
• Walmart provides all of the sell though data to
their suppliers by SKU, hour and store with their
Retail Link system.
• Nielsen sells customer shopper behavior in
250,000 households in 25 countries to
consumer goods suppliers and retailers, which
in turn use this data to increase marketing and
sales effectiveness.
• INRIX provides real-time traffic information to
Ford Motor Company to be integrated in its in-
car navigation system.
Value of raw data
increases when the
data is inimitable.
Firms may also
provide raw data free
or in a freemium
model to increase
customer
engagement.
16. Providing data analytics
15
• ARPA-E’s Project TERRA collects performance
data from plants in the field, incorporates
genomic data and builds algorithms to correlate
a gene to a specific trait or plant performance.
• Interactions Marketing combines retail POS data
and regional weather data to provide insights
into customer behavior to retailers and
manufacturers.
• Weather Underground provides weather data to
businesses to look for patterns in their sales with
respect to weather changes.
• Boston-based start-up CargoMetrics combines
location of ships, reported through tracking
systems, with the contents of ships to sell
commodities movement information to hedge
funds.
Powerful
analytics
applications
connect and link
data from
different sources.
17. Example: Analytics adds significant value to
commodities trading
16
Boston-based analytics startup CargoMetrics raised $2.14M in 2014
Novel trading
strategies utilized in
internal hedge fund
Proprietary metadata
Advanced cargo-modeling
algorithms and analytics
Maritime movement
of commodities data
Maritime trade accounts
for more than 80% of
the global commerce
18. Data platforms can harness powerful network
effects
17
• Airbnb, Uber, Salesforce provide platforms for
data and service providers to connect with
users.
• Thomson Reuters App Studio connects app
builders to the Eikon platform customers
• Riskpulse combines analytics with its platform
where third parties can supply hazard data to
help businesses identify risks related to weather
conditions, natural disasters (earthquakes,
volcano eruptions, etc.) and port strikes.
Platform
Algorithms that
match customers
with the data they
need is a key
component for
creating value in data
platforms.
19. Common Issues and Challenges
• Finding the initial hypothesis
– eg crop yields affects commodity prices, affects farmer profitability, affects
equipment purchases, affects John Deere Sales, affects John Deere EPS, effects
JD share price performance.
• Finding a reliable, repeatable method to collect the data with the required
frequency
• Establishing history where possible
• Establishing rights to the data /
• Establishing Internal Agreement to Externalize the data
• Striking balance between source protection and maintaining value
• Confirming the data is not trumped by another better, faster, cheaper source
• Concording the data
• Linking to a tradable security
18
20. INTRODUCE THE DATA INNOVATION LAB
GETTING VALUE OUT OF DATA
HOW WE ARE RESPONDING TO TRENDS AND ISSUES
WHAT ARE THE ISSUES AND CHALLENGES THAT YOU ARE FACING?
DATA MONETIZATION BUSINESS MODELS
DEMONSTRATION OF RECENT DATA SCIENCE/DATA
MONETIZATION PROJECTS
19
21. Geopolitical Risk Analytical Framework
• Objectives:
•To demonstrate the power of Thomson Reuters content, on a single
platform, by linking and analyzing internal and external content
•Rapidly prototype an analytical framework to assess the inherent
geopolitical risks associated with a company or instrument as a POC for
more generic thematic investing for the Buy-Side
20
22. Predicting Patent Litigation Risk
• Objectives:
• To combine legal (patent litigation) data and IP&S data to model the
likelihood that a particular patent will be the subject of a lawsuit in its lifetime.
• Machine Learning model combines intrinsic and acquired features of patents.
Number of claims, type of assignee, etc.
21
23. Visualizing Food and Political Instability
• Objectives:
• To visualize relationship between food insecurity and political instability
for company-wide outward facing Thomson Reuters Feed the World site.
Work in progress.
22
24. Linking Data with PermID
23
• The Open PermID database is licensed under the Creative Commons licensing
framework: allowing for free reuse of the ID and certain data (with attribution),
and more restrictive licensing to some data items to prevent product
cannibalization.