How To Download and Process SEC XBRL Data Directly from EDGAR
XBRL Technology Webinar Series
1
Alexander Falk, CEO, Altova, Inc.
Agenda
• Introduction
• Downloading all XBRL data from EDGAR
– Accessing the SEC’s EDGAR archive of all XBRL files via RSS feeds
– Downloading the ZIP file enclosures for each filing
– Special considerations for early years
• Organizing your downloaded files
– You now have 105,840 ZIP files totaling 14.1 GB
– Making them accessible by date, CIK, ticker
• Processing and validating the XBRL filings
• Extracting useful information, e.g., financial ratios
2
Introduction
• The SEC's EDGAR System holds a wealth of XBRL-formatted
financial data for over 9,700 corporate entities.
• Accessing XBRL-formatted SEC data can seem like a daunting task, and most people think it requires creating a database before the first data point can be pulled.
• In this webinar we will show you how to pull all the data directly from the SEC archive, download it to your computer, process it, and perform financial analysis.
• This will hopefully spark ideas that you can use in your own
applications to take advantage of the computer-readable XBRL
data now freely available.
3
For all source code examples we will be using Python 3.3.3, since it is widely available on all operating system platforms and also provides for easily readable code.
Obviously the approach shown in this webinar can easily be implemented in other languages, such as Java, C#, etc.
Why don’t you just store it in a database?
• Processing and validating the XBRL files once and then storing the extracted data in a relational database is not necessarily a bad approach… BUT:
• Any data quality analysis, as well as validation statistics and documentation of data inconsistencies with respect to the calculation linkbase, requires the data to remain in XBRL format
• New XBRL technologies, like XBRL Formula and the XBRL Table Linkbase, provide the potential for future applications that require the original data to be in XBRL format and not shredded and stored in a database
• Similarly, the ability to query the data with an XQuery 3.0 processor only exists if the data remains in its original XML-based XBRL format
4
Clearly, if the only goal of processing the filings is to derive numerical data from the facts and build financial analytics, then pre-processing and shredding the data into a database may still be your best approach…
The XBRL.US database by Phillip Engel, as well as Herm Fischer's talk at the last XBRL.US conference about using Arelle to populate a PostgreSQL database, are good starting points for the database-based approach.
Downloading the data – accessing the RSS feeds
• In addition to the nice web-based user interface for searching
XBRL filings, the SEC website offers RSS feeds to all XBRL filings
ever received:
– http://www.sec.gov/spotlight/xbrl/filings-and-feeds.shtml
• In particular, there is a monthly, historical archive of all filings
with XBRL exhibits submitted to the SEC, beginning with the
inception of the voluntary program in 2005:
– http://www.sec.gov/Archives/edgar/monthly/
• This is our starting point, and it contains one RSS file per
month from April 2005 until the current month of March 2014.
5
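As a minimal sketch of this starting point, the snippet below enumerates the monthly feed URLs in that archive. It assumes the files follow the xbrlrss-YYYY-MM.xml naming pattern visible in the archive listing; adjust the pattern if the listing differs.

EDGAR_MONTHLY = "http://www.sec.gov/Archives/edgar/monthly/"

def monthly_feed_urls(start=(2005, 4), end=(2014, 3)):
    # Yield one feed URL per month, April 2005 through March 2014.
    year, month = start
    while (year, month) <= end:
        yield "{0}xbrlrss-{1:04d}-{2:02d}.xml".format(EDGAR_MONTHLY, year, month)
        month += 1
        if month > 12:
            year, month = year + 1, 1

for url in monthly_feed_urls():
    print(url)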
Downloading the data – understanding the RSS feed
6
Downloading the data – loading the RSS feed
7
In Python we can easily use the urlopen function from the urllib.request module to open a file from a URL and read its data.
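A minimal sketch of this step (the slide itself shows a code screenshot); the feed URL used here is just an example:

from urllib.request import urlopen

feed_url = "http://www.sec.gov/Archives/edgar/monthly/xbrlrss-2014-03.xml"
response = urlopen(feed_url)
rss_data = response.read()   # raw bytes of the RSS document
response.close()
print(len(rss_data), "bytes downloaded")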
Downloading the data – parsing the RSS feed to
extract the ZIP file enclosure filename
8
In Python we can easily use the feedparser 5.1.3 package (https://pypi.python.org/pypi/feedparser) to parse the RSS feed and extract the ZIP file name and CIK#.
Please note that I construct the local filename by inserting the CIK in front of the actual ZIP file name.
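A sketch of this step, under the assumption that feedparser exposes the edgar:cikNumber element under the flattened key 'edgar_ciknumber' (verify the key against your parsed feed before relying on it):

import os
import feedparser  # https://pypi.python.org/pypi/feedparser

feed = feedparser.parse("http://www.sec.gov/Archives/edgar/monthly/xbrlrss-2014-03.xml")

for entry in feed.entries:
    if not entry.get("enclosures"):
        continue                                   # some early filings have no ZIP enclosure
    zip_url = entry.enclosures[0]["href"]          # URL of the ZIP file enclosure
    cik = entry.get("edgar_ciknumber", "unknown")  # assumed key for edgar:cikNumber
    # Insert the CIK in front of the ZIP file name for the local copy.
    local_name = cik + "-" + os.path.basename(zip_url)
    print(zip_url, "->", local_name)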
Downloading the data – loading the ZIP file enclosure
9
Please note that it is prudent to first check whether we already have a local copy of that particular filing. We should only download the ZIP file if we don't find it on our local machine yet.
Also, please note that common courtesy dictates that if you plan to download all years 2005-2014 of XBRL filings from the SEC's EDGAR archive, you should do so during off-peak hours, e.g. night-time or weekends, in order not to tax the servers during normal business hours, as you will be downloading 105,840 files or 14.1 GB of data!
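A minimal sketch of the download step, including the check for an existing local copy described above (the local directory layout is assumed; see the organization slides later on):

import os
from urllib.request import urlopen

def download_filing(zip_url, local_path):
    """Download the ZIP enclosure unless a local copy already exists."""
    if os.path.exists(local_path):
        return False                      # already downloaded on a previous run
    target_dir = os.path.dirname(local_path)
    if target_dir:
        os.makedirs(target_dir, exist_ok=True)
    response = urlopen(zip_url)
    with open(local_path, "wb") as out:
        out.write(response.read())
    response.close()
    return True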
Downloading the data – special considerations for
early years
• For years 2005-2007 most filings do not yet contain a ZIP file
enclosure.
• Even in 2008-2009 some filings can occasionally be found that
are not yet provided in a ZIP file.
• If you are interested in analyzing data from those early years, a
little bit of extra work is required to download all the
individual XBRL files from a filing and then ZIP them up locally.
• If done properly, all future analysis can then access all the
filings in the same manner directly from the ZIP files.
10
Downloading the early years – ZIPping the XBRL files
on our local machine
11
If we want to download data from the early years, we need to use two additional Python packages:
(a) The ElementTree XML parser, because feedparser cannot handle the multiple nested elements for the individual filings
(b) The zipfile package, so that we can ZIP the downloaded files up ourselves
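The sketch below illustrates the idea for one feed item. The edgar namespace URI and the edgar:url attribute name are assumptions based on the feed; check them against an actual monthly feed before use.

import zipfile
import xml.etree.ElementTree as ET
from urllib.request import urlopen

EDGAR_NS = "{http://www.sec.gov/Archives/edgar}"   # assumed namespace URI

def zip_filing_from_feed_item(item, local_zip_path):
    """item: an ElementTree element for one <item> of the monthly RSS feed."""
    with zipfile.ZipFile(local_zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for xbrl_file in item.iter(EDGAR_NS + "xbrlFile"):
            url = xbrl_file.get(EDGAR_NS + "url")  # assumed attribute name
            data = urlopen(url).read()
            zf.writestr(url.rsplit("/", 1)[-1], data)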
Demo time
12
Organizing the downloaded files – file system
structure
13
For my purposes I have organized the files at the same time as I downloaded them. Since the RSS feeds group the filings nicely by year and month, I have created one subdirectory for each year and one subdirectory for each month.
In order to easily process the filings of one particular reporting entity, I have also inserted the CIK# in front of the ZIP file name, since the SEC-assigned accession number does not really help me when I want to locate all filings for one particular filer.
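A small sketch of that layout; the root directory name and the arguments are purely illustrative:

import os

def local_path(root, year, month, cik, zip_name):
    # One subdirectory per year, one per month, CIK prepended to the ZIP file name.
    return os.path.join(root, "{0:04d}".format(year), "{0:02d}".format(month),
                        cik + "-" + zip_name)

# e.g. edgar/2014/03/<CIK>-<original-zip-name>.zip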
Organizing the downloaded files – making them
accessible by date, CIK, ticker
14
Selecting files for further processing by date is trivial due to our directory structure. Similarly, selecting filings by CIK is easy, since the filenames of all filings now begin with the CIK.
The only part that needs a little work is making them accessible by ticker – fortunately the SEC provides a web service interface to look up company information and filings by ticker, and the resulting XML also contains the CIK, which we can retrieve via simple XML parsing.
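A hedged sketch of the ticker lookup: the browse-edgar query parameters and the location of the CIK element in the returned Atom XML are assumptions, so inspect one response before relying on them.

from urllib.request import urlopen
import xml.etree.ElementTree as ET

def cik_for_ticker(ticker):
    # The CIK parameter of browse-edgar is assumed here to also accept a ticker symbol.
    url = ("http://www.sec.gov/cgi-bin/browse-edgar?action=getcompany"
           "&CIK={0}&type=10-K&output=atom".format(ticker))
    root = ET.fromstring(urlopen(url).read())
    for elem in root.iter():
        if elem.tag.lower().endswith("cik"):   # find the CIK element, namespace-agnostic
            return elem.text
    return None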
Processing and validating the XBRL filings
• Now that we have all the data organized, we can use Python to process, e.g., all filings from one filer for a selected date range
• For this webinar we are going to use RaptorXML® Server to process and validate the XBRL filings
• RaptorXML can directly process the filings inside ZIP files, so no manual extraction step is necessary
• We can also pass an entire batch of jobs to RaptorXML at once:
• We can do this either via a direct call as shown here, or over the HTTP API provided by RaptorXML Server.
15
Shameless Plug: RaptorXML® is built from the ground up to be optimized for the latest standards and parallel computing environments. Designed to be highly cross-platform capable, the engine takes advantage of today's ubiquitous multi-CPU computers to deliver lightning fast processing of XML and XBRL data.
Therefore, we can pass an entire batch of jobs to RaptorXML to process and validate in parallel, maximizing CPU utilization and available system resources.
Processing and validating the filings – building the job
list based on dates and CIK
16
This directory contains all filings for one particular year and month.
We iterate over all the files in that directory.
If a list of CIKs was provided, then we make sure the filename starts with the CIK.
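A minimal sketch of that iteration (the directory layout and CIK-prefixed filenames are as organized earlier; the function name is ours):

import os

def build_job_list(month_dir, ciks=None):
    """Collect the ZIP filings in one year/month directory, optionally filtered by CIK."""
    jobs = []
    for name in os.listdir(month_dir):
        if not name.endswith(".zip"):
            continue
        # Filenames start with the CIK, so a simple prefix test selects one filer.
        if ciks and not any(name.startswith(cik) for cik in ciks):
            continue
        jobs.append(os.path.join(month_dir, name))
    return jobs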
Demo time – validating all 2010-2014 filings for ORCL
17
Demo time – validating all 2010-2014 filings for AAPL
18
Extracting useful information, e.g., financial ratios
• While it is interesting to discuss data quality of XBRL filings,
and to muse over inconsistencies for some filers or whether
the SEC should put more stringent validation checks on the
data it accepts, we really want to do more here…
• Can we extract useful financial ratios from these XBRL filings?
• For example, from the balance sheet:
19
Extracting useful information – passing Python script
to the built-in Python interpreter inside RaptorXML
• We can ask RaptorXML to execute some Python script code if
the XBRL validation has succeeded.
• From our outer Python code we pass
the script to RaptorXML:
• Then, whenever validation succeeds,
RaptorXML will execute that script
using its built-in Python interpreter:
20
RaptorXML is written in C++ and available on all major operating system platforms, including Windows, Linux, MacOS, etc.
To facilitate easy customization and building powerful solutions on top of RaptorXML, it includes a built-in Python interpreter that makes the entire DTS, XBRL instance, schema, and other relevant information accessible to 3rd-party developers.
Extracting useful information – calculating ratios
using Python script inside RaptorXML
• The RaptorXML Python API makes available all necessary
components of the XBRL instance document and the DTS
(=Discoverable Taxonomy Set)
• To make the code more easily understandable for this webinar,
we’ve created a few helper functions to locate relevant facts
and print them, e.g., for the Current Ratio:
21
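The RaptorXML Python API itself is not reproduced here; the standalone sketch below only illustrates the arithmetic, using a plain dictionary of concept names to values in place of the webinar's helper functions. AssetsCurrent and LiabilitiesCurrent are the US GAAP concepts typically used; the sample numbers are made up.

def current_ratio(facts):
    """facts: dict mapping US GAAP concept names to numeric fact values."""
    assets = facts.get("AssetsCurrent")
    liabilities = facts.get("LiabilitiesCurrent")
    if assets is None or not liabilities:
        return None                      # missing facts or zero denominator
    return assets / liabilities

print(current_ratio({"AssetsCurrent": 200.0, "LiabilitiesCurrent": 100.0}))  # 2.0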
Extracting useful information – not all financial ratios
are easy to calculate
• Not all ratios are calculated from elements that can be easily
found in the US GAAP taxonomy
• Even for those ratios where an exact match exists in the
taxonomy, it is often necessary to walk through the calculation
linkbase chain and identify appropriate matches in order to
calculate ratios across filings from different entities
• For further information please see Roger Debreceny, et al., “Feeding the Information Value Chain: Deriving Analytical Ratios from XBRL filings to the SEC”, draft research paper, December 2010: http://eycarat.faculty.ku.edu//myssi/_pdf/2-Debreceny-XBRL%20Ratios%2020101213.pdf
22
Extracting useful information – Quick Ratio
• One such example is the Cash fact needed to calculate the
Quick Ratio: we have to try three different facts depending on
what is available in the actual XBRL filing:
23
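A sketch of that fallback logic. The slide does not list the three concept names, so the ones below are plausible US GAAP candidates and should be treated as assumptions:

CASH_CONCEPTS = (
    "CashCashEquivalentsAndShortTermInvestments",
    "CashAndCashEquivalentsAtCarryingValue",
    "Cash",
)

def cash_fact(facts):
    """Return the first cash-like fact present in the filing, or None."""
    for concept in CASH_CONCEPTS:
        if facts.get(concept) is not None:
            return facts[concept]
    return None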
Demo time – calculating financial ratios for some
companies in my investment portfolio
24
In this example, we are calculating the Current Ratio, Quick Ratio, and Cash Ratio for a set of companies that happen to be part of my investment portfolio…
In addition to printing the ratios on the screen, we also store them in a CSV file for further processing and graphing…
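A minimal sketch of the CSV output step (column order and file handling are illustrative):

import csv

def write_ratio_row(csv_path, company, period_end, current, quick, cash):
    # Append one row per filing; the file can later be opened for graphing.
    with open(csv_path, "a", newline="") as out:
        csv.writer(out).writerow([company, period_end, current, quick, cash])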
Financial Ratio Results
25
[Chart: Current Ratio from 12/1/09 through 12/1/13 (y-axis 0–7) for APPLE INC, CATERPILLAR INC, CISCO SYSTEMS INC, COCA COLA CO, CORNING INC /NY, EXXON MOBIL CORP, Google Inc., HARLEY DAVIDSON INC, Intel Corp, LOCKHEED MARTIN CORP, MICROSOFT CORP, Oracle Corporation, PFIZER INC, RAYTHEON CO/]
Q&A and next steps
• Thank you for your time and for watching this webinar! Time
for some Q&A now…
• For more information on the XBRL data available from the SEC,
please visit http://xbrl.sec.gov/
• We also plan to post the Python scripts shown here on GitHub
in the near future.
• If you would like to learn more about RaptorXML®, please visit
the Altova website at http://www.altova.com/raptorxml.html
• For all other Altova® XBRL solutions, including taxonomy
development, data mapping, and rendering tools, please visit
http://www.altova.com/solutions/xbrl.html
• Free 30-day evaluation versions of all Altova products can be
downloaded from http://www.altova.com/download.html
26