Gaming SEC filings with machine
learning to detect vectors and
sentiment in reporting language
Steven Cyphers
PRESENTATION AGENDA
1. Background
2. Data from / Data to
3. NLP and Python tricks
4. Graphing outcomes
Background
Frame the question
What do we want to know: detect change
Is there data to support: yes
Has it been done before: yes (blog)
Can FME improve the process: absolutely
Data from / Data to
Dataflow
Fetch & Prep: collect text/HTML-based data.
FME to Automate: extract numeric values for SAP; extract and tidy text for sentiment.
Pass the result: summary info and binning to lessen the information load.
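The "extract and tidy text" step can be sketched in plain Python. This is only an illustration of the idea, not the FME workspace used in the talk: it relies on the standard library's html.parser, and the function name html_to_dense_text is an assumption made for this example.

```python
import re
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collect text nodes, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__(convert_charrefs=True)  # decode &amp;, &nbsp;, etc.
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def html_to_dense_text(raw_html):
    """Hypothetical helper: strip markup and collapse whitespace."""
    parser = _TextOnly()
    parser.feed(raw_html)
    parser.close()
    # Joining with a space keeps adjacent paragraphs apart; it can also split
    # a word that is broken up by inline tags, which is accepted here.
    text = " ".join(parser.parts)
    # \s also matches non-breaking spaces left behind by &nbsp;.
    return re.sub(r"\s+", " ", text).strip()
```

The output is one long line of text per filing, which is the shape the sentiment and vectoring steps later in the deck expect.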
NLP & Python tricks
FME 2019.0: NLP will be built in*
Python libraries
nltk – corpus clean-up, word vectoring
boto3 – interacting with AWS services
json – handy dictionary object handling
keras / tensorflow (Python 3.6*)
0..1..2..
Python tips
It works in the terminal and PowerShell…
but how do I?
Python path tricks
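One common path trick is to extend the interpreter's import search path at run time so that externally installed libraries become importable. The sketch below assumes nothing about FME itself: add_library_path is a hypothetical helper name, and appending to sys.path is only one option.

```python
import sys
from pathlib import Path


def add_library_path(directory, search_path=None):
    """Append `directory` to the import search path if it exists and is new.

    Returns True when the path was added, False otherwise.
    """
    search_path = sys.path if search_path is None else search_path
    resolved = str(Path(directory).resolve())
    if not Path(resolved).is_dir() or resolved in search_path:
        return False
    # Appending (rather than prepending) avoids shadowing modules that the
    # host application already ships.
    search_path.append(resolved)
    return True
```

Calling add_library_path with the site-packages folder of a separate Python 3.6 install before the import statements is the intended use. The libraries in that folder still have to match the interpreter version that runs the script.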
Python tips
And how do you set up the AWS CLI?
Create an IAM role/group policy and credentials
Run: aws configure
https://aws.amazon.com/comprehend/pricing/
json
json extracting
Now we can pass text_line_data to AWS
Comprehend and receive sentiment
scoring. No training data required.
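A rough sketch of that call, assuming boto3 is installed and credentials were set up with aws configure. The 5000-byte chunk size comes from the speaker notes. The helper names (split_by_bytes, flatten_sentiment, score_text) are made up for this example, and the flattened key names are one possible layout, not the one used in the talk.

```python
import json

MAX_BYTES = 5000  # request size limit quoted in the speaker notes


def split_by_bytes(text, max_bytes=MAX_BYTES):
    """Split text on whitespace into chunks whose UTF-8 size fits the limit."""
    chunks, current, size = [], [], 0
    for word in text.split():
        word_size = len(word.encode("utf-8"))
        if word_size > max_bytes:
            raise ValueError("single token exceeds the byte limit")
        extra = word_size + (1 if current else 0)  # +1 for the joining space
        if size + extra > max_bytes:
            chunks.append(" ".join(current))
            current, size = [word], word_size
        else:
            current.append(word)
            size += extra
    if current:
        chunks.append(" ".join(current))
    return chunks


def flatten_sentiment(response):
    """Reduce a DetectSentiment response dict to flat, attribute-like keys."""
    flat = {"sentiment": response["Sentiment"]}
    for label, score in response["SentimentScore"].items():
        flat["score_" + label.lower()] = score
    return flat


def score_text(text_line_data, language_code="en"):
    """Send each chunk to Amazon Comprehend and return the results as JSON."""
    import boto3  # imported lazily so the helpers above work without it

    client = boto3.client("comprehend")
    results = []
    for chunk in split_by_bytes(text_line_data):
        response = client.detect_sentiment(Text=chunk, LanguageCode=language_code)
        results.append(flatten_sentiment(response))
    return json.dumps(results)
```

Flattening the nested response before handing it on keeps each score as a simple key/value pair, which is easier to expose as attributes downstream.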
Python libraries
nltk & sklearn: transform the raw text into a
dense text corpus, then into word
groupings and vectors
Graphing outcomes
NLP tell me more
Semantic search
● Key phrases, entities & sentiment*
Identify topics
● Organize by market sector, similarity
NLP tell me more
Word vectoring
● Unique word counts, similarity indices,
representing bodies of text as
numerical arrays.
Measure similarity
● For a given company, build word vectors for each document
● Detect change within a company but across documents
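The idea of representing filings as numerical arrays and comparing them can be shown with a minimal bag-of-words sketch. It uses only the standard library as a stand-in for nltk and sklearn, so the tokenizer and function names are assumptions, not the presenter's pipeline.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Lower-case the text and count alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse count vectors (0.0 to 1.0)."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def change_between_filings(filings):
    """Similarity of each filing to the previous one for the same company."""
    vectors = [word_counts(text) for text in filings]
    return [cosine_similarity(p, c) for p, c in zip(vectors, vectors[1:])]
```

A value near 1.0 means the wording barely moved between two consecutive filings. A drop in similarity is a candidate for a closer read, not proof that anything material changed.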
NLP small swings
While mostly Neutral
● Some Positive/Negative characteristics are there
NLP what can it offer
Word vectoring
● Mathematical representation of relatedness
Identify topics
● Flagged terms
NLP swings in neutral
Graph sentiment changes
● Flag to investigate
Graph other metrics
● Graph EPS
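Flagging a swing for investigation could be as simple as comparing each period's score with the previous one. The sketch below assumes an ordered list of (period, score) pairs. The function name and the 0.05 threshold are placeholders chosen for illustration, not values from the talk.

```python
def flag_swings(scores, threshold=0.05):
    """Return (period, delta) pairs where the score moved more than threshold.

    `scores` is an ordered list of (period_label, score) tuples.
    """
    flagged = []
    for (_, previous), (period, current) in zip(scores, scores[1:]):
        delta = current - previous
        if abs(delta) > threshold:
            flagged.append((period, round(delta, 4)))
    return flagged
```

Because filings are mostly Neutral, the threshold needs to be small enough to catch modest moves. The flagged periods can then be plotted next to another metric such as EPS.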
THANK YOU!
steven.cyphers@ghd.com / www.ghd.com/digital

Editor's Notes

  1. What we want to know: how much they are spending on X (what their reserves are); their overall financial indicators (EPS, share price, revenues, etc.); what they are spending money on (investments); sustainability and social responsibility; how companies compare across a sector.
  2. Quite a lot of data in fact
  3. FME is great at building an API, retrieving the text files, parsing them for numeric financial data, and cleaning the text files to create a unified corpus. Image sources: https://aws.amazon.com/architecture/icons/ https://www.sec.gov/index.htm https://upload.wikimedia.org/wikipedia/commons/5/59/SAP_2011_logo.svg https://upload.wikimedia.org/wikipedia/commons/0/0a/Python.svg
  4. Use FME to grab the catalog files, use those catalogs to retrieve the filing data files you are actually after, then clean the resulting files for hierarchy and maintain the CIK (Central Index Key) identifier information.
  5. Use FME to clean the resulting files, rebuild the hierarchy, and maintain the CIK (Central Index Key) identifier information.
  6. Use FME to chase out the HTML encoding and prep dense text data.
  7. Use FME to assemble the CIK-based file organization to start the clean catalog.
  8. By the time we present at WT, the batteries-included version of NLP will be in the latest official release. Is it still important to mess around with external libraries and cloud services? What can you accomplish with external libraries and service calls?
  9. Messing with Python libraries in FME is extremely useful. It can prove tricky to maintain symlinks and version parity, but it is worth understanding how to drive external services from within FME.
  10. FME 2019.0 ships with Python 3.6 and 3.7. As of this writing, TensorFlow/Keras does not work above 3.6.x, so it may require a 3.6.x install on your FME machine. When symlinks fail, manually copy the libraries to achieve library parity. KB article: https://knowledge.safe.com/questions/49484/how-do-i-set-the-import-path-in-fmes-python-interp.html
  11. The JSON dictionary object returned by AWS needs to be caught and decomposed before passing it to an FME object (stumbled across that one). The AWS API accepts only a (very) limited byte length, so loop accordingly. Use the JSON Extractor to expose the returned attributes.
  12. Limitations: 5000-byte limit, pay-as-you-go usage. No training required, reasonably priced.
  13. Sentiment, or lack of it: Neutral (degrees of neutrality).
  14. Degrees around the Neutral sentiment? Or lack thereof.