This document provides a summary of a presentation about solving geophysics problems with Python. The presentation introduces geophysics and lists popular Python libraries for working with geophysical data. It discusses the history of technologies in the oil and gas industry like well logs, seismography, and how advances in data acquisition and analysis have improved oil discovery. The presentation outlines the iterative workflow for subsurface characterization and notes how data impacts the entire oil and gas value chain. It predicts the current decade will be one of increased sensing and data collection as mobility, IoT, and analytics bring more value to the industry.
Solving Geophysics Problems with Python - Speaker Notes
Slide 1
SOLVING
GEOPHYSICS
PROBLEMS
WITH PYTHON
PAIGE BAILEY
SEPTEMBER 29, 2015
STRATA + HADOOP WORLD 2015
Slide 2 YOUR MISSION, SHOULD YOU CHOOSE TO ACCEPT IT
So you’ll have a good idea on whether you
want to stick around or not… ;)
- General overview of what Geophysics is
- Listing of some of my favorite Python and
geophysics libraries
- People are doing great work, and deserve
to be recognized
- The final topic is going to be what I know
best (I guess) – the progression of data
throughout the life cycle of the oil industry
Slide 3
WARNING!
…OR DISCLAIMER, RATHER
Slide 4 PAIGE BAILEY
@DynamicWebPaige
Employed by a truly rad technology-focused
O&G company by day, MS Earth Sciences
graduate student at Rice University by night,
founder of PyLadies-HTX (though these
sometimes all bleed into one another)
Background: degrees are ABA, BA Sociology,
BS Geophysics – which is the weirdest combo
anyone could ever have
Slide 5
WHAT IS
“GEOPHYSICS”?
Slide 6 Adrian Lenardic’s first class.
Magritte actually had a series of paintings of
curiously-shaped rocks suspended in space,
or in natural settings. Arches national park;
other curious geologic formations. How did
they get there? What processes shaped
them?
Hydrology and the Talking Heads.
Slide 13
WHAT IS
“GEOPHYSICS”?
Slide 14
THEMES
• Gravity
• Heat flow
• Electricity
• Fluid dynamics
• Magnetism
• Radioactivity
• Mineral Physics
• Vibration
…handshakes with atmospheric sciences, geology, engineering,
hydrology, planetary sciences, global positioning systems…
Huge concepts, right?
Slide 15
GRAVITY
Bouguer anomaly
Geoid
Geopotential
Gravity anomaly
Undulation of the geoid
Slide 16 20,000 feet tall
Cathedral sized. More than a cathedral. For
context, the Empire State Building is like 1300
feet.
Slide 17 And they’re all over the dang place. Mention
the Lake Peigneur salt mine fiasco.
Slide 18
HEAT FLOW
Geothermal gradients and internal heating
Subsurface heat flow – whole earth
geophysics
Heating of hydrocarbons – if the organic
material is too deeply buried, it turns into gas
or “overcooks” entirely
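To make the burial-depth point concrete, here is a minimal sketch. It assumes a constant geothermal gradient and approximate maturity windows. The gradient (25 °C per km), the surface temperature (15 °C) and the window boundaries (roughly 60 to 120 °C for oil, up to about 200 °C for gas) are illustrative assumptions, not numbers from the talk, and real basins vary a lot.

```python
def temperature_at_depth(depth_km, surface_temp_c=15.0, gradient_c_per_km=25.0):
    """Linear geotherm: temperature rises at a constant rate with depth."""
    return surface_temp_c + gradient_c_per_km * depth_km


def maturity_window(temp_c, oil_min_c=60.0, oil_max_c=120.0, gas_max_c=200.0):
    """Classify a source-rock temperature into an assumed maturity window."""
    if temp_c < oil_min_c:
        return "immature"
    if temp_c <= oil_max_c:
        return "oil window"
    if temp_c <= gas_max_c:
        return "gas window"
    return "overcooked"


if __name__ == "__main__":
    for depth_km in (1.0, 3.0, 6.0, 9.0):
        temp_c = temperature_at_depth(depth_km)
        print(f"{depth_km:4.1f} km  {temp_c:6.1f} C  {maturity_window(temp_c)}")
```

With these assumed numbers, the same organic material moves from immature to oil, then gas, then overcooked as burial depth increases.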
Slide 19
FLUID DYNAMICS
Isostasy
Post-glacial rebound
Mantle convection
Geodynamo
Rate of lithospheric uplift due to Postglacial
Rebound, as modelled by Paulson, A., S.
Zhong, and J. Wahr. Inference of mantle
viscosity from GRACE and relative sea level
data, Geophys. J. Int. (2007) 171, 497–508.
doi: 10.1111/j.1365-246X.2007.03556.x
Slide 20 This layered beach at Bathurst
Inlet, Nunavut is an example of post-glacial
rebound after the last Ice Age. Little to no
tide helped to form its layer-cake look.
Isostatic rebound is still underway here.
Canada.
Slide 21
MAGNETISM
The Earth’s poles sometimes reverse
direction – and we don’t know why. North at
the bottom, south at the top. What’s
interesting is that as the seafloor spreads,
cools, and lithifies, certain minerals in the
rock orient themselves to align with Earth’s
current polarity. This means that as you
check magnetism readings along the bottom
of the seafloor, you see these wonderful
bands
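The banding can be sketched with a toy model: crust gets older with distance from the ridge, and its polarity depends on how many reversals have happened since it formed. The constant spreading half-rate (25 km per million years) and the reversal ages below are approximate, illustrative assumptions; a published polarity timescale should be used for real work.

```python
from bisect import bisect_right

# Approximate ages (million years ago) of recent polarity reversals.
# Illustrative values only, assumed for this sketch.
REVERSAL_AGES_MA = [0.78, 0.99, 1.07, 1.78, 1.95, 2.58]


def crust_age_ma(distance_km, half_rate_km_per_myr):
    """Age of the seafloor at a given distance from the ridge axis."""
    return abs(distance_km) / half_rate_km_per_myr


def polarity(distance_km, half_rate_km_per_myr=25.0):
    """Return 'normal' or 'reversed' for crust at this distance from the ridge."""
    age = crust_age_ma(distance_km, half_rate_km_per_myr)
    flips = bisect_right(REVERSAL_AGES_MA, age)  # reversals since the rock formed
    return "normal" if flips % 2 == 0 else "reversed"


def stripe_profile(max_distance_km=60, step_km=5, half_rate_km_per_myr=25.0):
    """One character per sample: '#' for normal polarity, '.' for reversed."""
    distances = range(-max_distance_km, max_distance_km + 1, step_km)
    return "".join(
        "#" if polarity(d, half_rate_km_per_myr) == "normal" else "."
        for d in distances
    )


if __name__ == "__main__":
    print(stripe_profile())
```

Because age depends only on the absolute distance from the ridge, the printed profile is a mirror image on either side of the axis, which is the signature of seafloor spreading.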
Slide 22 Whole earth perspective: Earth’s magnetic
field
Slide 23
MINERAL PHYSICS
Basically materials science – researching how
structures change based on differential
heating, pressure, compaction. Same
chemical makeup, different expressions and
structures.
Slide 24 VIBRATION
(A.K.A., SEISMIC)
A great resource for this is USGS’s
earthquakes website.
Slide 25 VIBRATION
(A.K.A., SEISMIC)
…WE’LL TALK ABOUT THIS MORE SOON
Slide 26
…AND UNEXPECTED USE CASES
3D-printing Geology with Python
Joe Kington’s presentation on 3D-printing
cubes of geology (to get a better feel for the
stratigraphy) and seismic
Slide 27 LIBRARIES / SOFTWARE
MENTIONED
Madagascar
PySIT
Segpy
segpy-py
SLIMpy
Fatiando a Terra
ObsPy
PyGMI
SimPEG
Seismic Handler
sgp4
Py-ART
SgFm
laspy
ParaView Geo
3ptScience
Agile Geoscience
- Bruges
- Modelr
- Pick This
- G3.js
- Striplog
ArcPy
PyQGIS
…so many other geospatial libraries
Madagascar – multi-dimensional data
analysis, including seismic processing
PySIT – imaging and inversion
Segpy – reading and writing SEG-Y files
segpy-py – reading SEG-Y files
SLIMpy – processing front end
Fatiando a Terra – geophysical modeling and
inversion; extensive cookbook
ObsPy – seismology toolbox
PyGMI – 3D interpretation and modelling of
magnetic and gravity data
SimPEG – simulation and parameter
estimation in geophysics; great learning
utility
Seismic Handler – signal processing for
earthquakes
sgp4 – tracking earth satellites
Py-ART – Python ARM Radar Toolkit (weather
data)
SgFm – sediment transport at geologic scale
Laspy – LAS file conversion
ParaView Geo – 3D geoscience visualization
3ptScience – Rowan Cockett’s website
Bruges – modelling and post-processing
seismic reflection data
Modelr – seismic forward modeling on the
web
Pick This – social image interpretation
G3.js – coming soon, a geoscience wrapper
for D3.js
Striplog – wrangling 1D data, usually core
with varying sample rates
ArcPy – geospatial processing tools for ArcGIS
PyQGIS – the same, for the open-source
mapping alternative QGIS
University of British Columbia
SEG-Y is one of the standards developed by
SEG for storing geophysical data
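For a feel of what the libraries above deal with, here is a small sketch of the SEG-Y file layout using only the standard library. It reads three fields from the binary file header at the byte positions given in the SEG-Y rev 1 layout. The helper names are hypothetical, the dummy file is synthetic, and the trace count assumes fixed-length traces, 4-byte samples and no extended textual headers. For real files, use one of the SEG-Y readers listed above.

```python
import struct

TEXT_HEADER_BYTES = 3200    # "card image" text header (EBCDIC or ASCII)
BINARY_HEADER_BYTES = 400   # big-endian integer fields describing the file
TRACE_HEADER_BYTES = 240    # header in front of every trace's samples


def read_binary_header(segy_bytes):
    """Read three commonly used fields from the binary file header."""
    # 1-based file bytes 3217-3218, 3221-3222 and 3225-3226 correspond to
    # zero-based offsets 3216, 3220 and 3224; each is a 2-byte big-endian int.
    (interval_us,) = struct.unpack_from(">h", segy_bytes, 3216)
    (n_samples,) = struct.unpack_from(">h", segy_bytes, 3220)
    (format_code,) = struct.unpack_from(">h", segy_bytes, 3224)
    return {
        "sample_interval_us": interval_us,
        "samples_per_trace": n_samples,
        "format_code": format_code,  # e.g. 1 = IBM float, 5 = IEEE float
    }


def count_traces(segy_bytes, bytes_per_sample=4):
    """Trace count, assuming fixed-length traces and no extended headers."""
    n_samples = read_binary_header(segy_bytes)["samples_per_trace"]
    trace_bytes = TRACE_HEADER_BYTES + bytes_per_sample * n_samples
    data_bytes = len(segy_bytes) - TEXT_HEADER_BYTES - BINARY_HEADER_BYTES
    return data_bytes // trace_bytes


def make_dummy_segy(interval_us=4000, n_samples=1500, format_code=5, n_traces=2):
    """Build an in-memory buffer with the SEG-Y layout and all-zero samples."""
    header = bytearray(TEXT_HEADER_BYTES + BINARY_HEADER_BYTES)
    struct.pack_into(">h", header, 3216, interval_us)
    struct.pack_into(">h", header, 3220, n_samples)
    struct.pack_into(">h", header, 3224, format_code)
    trace = bytes(TRACE_HEADER_BYTES + 4 * n_samples)
    return bytes(header) + trace * n_traces


if __name__ == "__main__":
    dummy = make_dummy_segy()
    print(read_binary_header(dummy), count_traces(dummy))
```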
Slide 28
ALMOST ALL
OF THAT IS
OPEN-SOURCE
BUT HERE’S THE KICKER:
Slide 29
ALMOST ALL
OF THAT IS
OPEN-SOURCE
(AND SO IS THE DATA)
BUT HERE’S THE KICKER:
USGS puts out scads of data sets; so does
NASA
Mention the importance of Python in
geoscience research (and science research in
general) because there’s a move toward
reusable code and repeatable experiments
“Github for scientists is just… Github.”
Slide 30 GEOPHYSICS-FOCUSED
SCIPY TALKS
2012
ALGES: Geostatistics and Python
Py-ART: Python for Remote Sensing Science
Building a Solver Based on PyClaw for the Solution of the Multi-Layer Shallow Water Equations
2013
Modeling the Earth with Fatiando a Terra
2014
The Road to Modelr: Building a Commercial Web Application on an Open-Source Foundation
Measuring Rainshafts: Bringing Python to Bear on Remote Sensing Data
The History and Design Behind the Python Geophysical Modeling and Interpretation (PyGMI) Package
Prototyping a Geophysical Algorithm in Python
2015
(and an entire Geophysics Track)
Using Python to Span the Gap Between Education, Research, and Industry Applications in Geophysics
Practical Integration of Processing, Inversion, and Visualization of Magnetotelluric Geophysical Data
Striplog: Wrangling 1D Subsurface Data
Geodynamic Simulations in HPC with Python
SEG Hackathon – sponsored by Agile
Geoscience, I believe it’s their third
Saturday and Sunday, October 17th
and 18th
so you can go to this without going to the
SEG Conference as a whole, if you can’t get
off work.
Slide 31
LET’S TALK ABOUT ENERGY
…but now for something completely different
And apologies for focusing on the oil and gas
aspects of energy.
Slide 32
FIRST WELL LOG?
Slide 33 - 1927 by Conrad Schlumberger, though
he’d been formulating the idea since 1919
- He sent down a sonde (sensor attached to
a wire) into a 500m deep well in the Alsace
region of France and started collecting
information
- “Electrical resistivity log”
- All measurements were made by hand
Slide 34
FIRST
SEISMOGRAPH?
Slide 35 - 1921 by J. Clarence Karcher, who was an
Electrical Engineer
- This is the means by which the majority of
the world’s oil reserves have been
discovered
- Founded Geophysical Service Incorporated
in 1930, which eventually turned into
Texas Instruments
- Got the idea because his assignment in
World War I, the assignment that took him
out of grad school, was to locate heavy
artillery batteries in France by studying the
acoustic waves the guns generated in the
air.
- He noticed an unexpected event in his
research and switched his concentration to
seismic waves in the earth
- He thought it would be possible to
determine the depths of the underlying
geologic strata by vibrating the earth’s
surface while precisely recording and
timing the waves of energy
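Karcher's timing idea boils down to simple arithmetic: a reflection's two-way travel time, multiplied by the wave speed and halved, gives the depth of the reflector. The sketch below applies that layer by layer. The travel times and interval velocities are made-up illustrative values, and the model assumes flat layers with vertical ray paths.

```python
def reflector_depths(two_way_times_s, interval_velocities_m_s):
    """Depth of each reflector from two-way times and per-layer velocities.

    two_way_times_s: surface-to-reflector-and-back times, shallowest first.
    interval_velocities_m_s: velocity of the layer above each reflector.
    """
    depths = []
    depth_m = 0.0
    previous_time_s = 0.0
    for time_s, velocity in zip(two_way_times_s, interval_velocities_m_s):
        # The wave crosses each layer twice, so halve the travel time.
        depth_m += velocity * (time_s - previous_time_s) / 2.0
        depths.append(depth_m)
        previous_time_s = time_s
    return depths


if __name__ == "__main__":
    times_s = [0.5, 1.0, 1.6]            # hypothetical picked reflection times
    velocities = [1800.0, 2500.0, 3200.0]  # hypothetical layer velocities, m/s
    for t, z in zip(times_s, reflector_depths(times_s, velocities)):
        print(f"{t:.1f} s two-way time -> about {z:.0f} m deep")
```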
Slide 36
FIRST OIL WELL?
Slide 37 - Earliest known oil wells were drilled in
China, in 347 AD
- These wells had depths of up to about 790
feet, and were drilled using bits attached
to bamboo poles
Asphalt was in use more than 4000
years ago, in the construction of the walls of
Babylon. Ancient Persians were using
petroleum for medicinal and lighting uses.
The first streets of Baghdad were paved with
tar.
Befuddled “shoot the ground and gusher
comes up” situations. Producing dozens of
barrels a day, maybe hundreds, but recovery
rates were exceptionally low, and you weren’t
really finding anything interesting.
Slide 38
Drilling has been around for a long
time, but its success is due to
improved data acquisition and
data analysis methods.
I guess the point that I’m trying to make is
that…
[read slide]
Advances in technology create a marked step
change in petroleum exploration. Those
advances are primarily in terms of better
hardware / equipment, which give explorers
better data about the subsurface. The data is
the key.
Slide 39
NOW
Slide 40 Now, I’m a geophysicist – so those advances
are the ones I’m best at spotting.
- Point out the upticks for 2D seismic, better
resolution for 3D seismic
80’s: 2D data acquired, pre-stack and post-
stack imaging, Cray supercomputers
90’s: 3D narrow azimuth data, 3D post-stack
and pre-stack imaging, Unix
00’s: 3D wide azimuth data, imaging, reverse
time migration; Linux clusters
Now: coil shooting, continuous machine-
generated sensory data
Mathematical insights – mention that last
night you found out that the guy who first
discovered the FFT was a Chevron employee,
ain’t no thing
Slide 41 Point out fracking boom, mention that the
crazy upward tick has continued, though the
steepness of the slope has decreased a bit
due to the drop in oil prices
Slide 42
WORLD’S LARGEST PUBLIC, STATE-OWNED,
AND PRIVATE BUSINESSES
Shamelessly stolen from Wikipedia
Slide 43
WORLD’S LARGEST PUBLIC, STATE-OWNED,
AND PRIVATE BUSINESSES
7 out of 10
7 out of 10 of the largest public, state-owned,
and private businesses – and a huge
proportion of the overall list. Trillions of
dollars of revenue.
Direct link to reserves and success of a
company. We’re selling a thing; the margins
on the beef jerky you buy in a gas station are
higher than the margins for a barrel of oil
Slide 44
Profitability for oil companies is
directly tied to reserves.
Oil companies are all in the business of
getting barrels out of the ground – so
characterizing the subsurface is incredibly
important. Both of those bits of data that I
mentioned before – that came so late in the
game – were huge technological step changes
for the industry, and drastically impacted oil
discovery.
Improved resolution within the reservoir is
critical because deepwater wells cost a lot -
$100 million or more – and fully exploiting
assets is essential
Slide 45
UPSTREAM BIG DATA
(Seshadri M., 2013)
Slide 46
Mapping
Reservoir
Characterization
Cross-sections
Petrophysics
Reservoir Simulation
Well Planning &
Drilling Simulation Stratigraphic Modeling
Seismic Interpretation
The oil industry is a bit like an ecosystem. This
particular piece is subsurface characterization
– the earth science-y and engineering bits
- Every image you see here has a data type
(or more!) associated with it, and, though
it’s getting better, a shortage of standards
Slide 47
Mapping
Reservoir
Characterization
Cross-sections
Petrophysics
Reservoir Simulation
Well Planning &
Drilling Simulation Stratigraphic Modeling
Seismic Interpretation
So these components of the energy
ecosystem, and this subsurface data
workflow can be grouped into “earth science-
y bits” and “engineering bits” with this kind
of fuzzy area in between with petrophysics
Earth scientists record millions and billions of
data points called “seismic” and they don’t
trust any of them unless you put them all
together
Engineers trust pressure readings in the well,
the stuff they can measure with sensors –
and trust it everywhere, and extrapolate
everywhere
Slide 48
Mapping
Reservoir
Characterization
Cross-sections
Petrophysics
Reservoir Simulation
Well Planning &
Drilling Simulation Stratigraphic Modeling
Seismic Interpretation
Something that I should also mention is that
this is an iterative process. I put a loop here,
but in reality, all of these steps can feed back
into one another – and a change to one
component of the subsurface model
drastically impacts all other components
New sorts of geology: horizontal drilling and
hydraulic fracturing combined have been
revolutionary
Slide 49
Data impacts the entire value chain.
All that I mentioned before was earth
sciences or drilling related – impacting the
“upstream” components of the oil industry.
But in reality, data impacts every single
component of the oil and gas value chain.
And what’s more: it’s a variety of data,
coming in at asynchronous rates.
Slide 50 How we get it, how we transport it, how we
process it, how we use it – and all of these
components have the opportunity to be
honed by analytics insights.
Streamlining the transport, refinement, and
distribution of O&G is vital.
Slide 51
THE
FUTURE
Slide 52
2000 – 2010 :
Decade of “Big Data”
So this past decade, the first one of the
2000s, 2000 – 2010, has been the decade
of “big data”.
Kind of a buzzword, right? Like “in the cloud”.
Slide 53
2000 – 2010 :
Decade of “Big Data”
2010 – 2020 :
Decade of Sensing
- and if you thought there was a lot of data
in this first decade, you realize there's
going to be a heck of a lot more in the
second.
- Mobility, infrastructure, and collaboration
technologies currently are the biggest
investment areas
- In the next three to five years,
investments are expected to increase in
big data, the industrial IoT, and
automation
- In a recent study (May 2015) from
Microsoft and Accenture, 86 – 90% of
respondents said that increasing their
analytical, mobile, and internet of things
capabilities would increase the value of
their business
- In the near term during the current low
crude price cycle, approximately 3 out of 5
respondents said they plan to invest the
same amount (32%) or more or
significantly more (25%) in digital
technologies
89% noted that leveraging more analytics
capabilities would add business value
90% felt more mobile tech in the field would
add business value
86% felt that leveraging more IIoT and automation
would boost value
That’s near unanimous. I’ve never seen
management be unanimous about
*anything*.
Slide 54
“The oil and gas upstream sector is
a complex, data-driven business
with data volumes growing
exponentially.”
(Feblowitz, 2012)
Structured and Unstructured Data
Slide 55
V’S
Data scientists seem to really like alliteration,
for whatever reason.
Slide 56
V’S
VOLUME – VARIETY – VELOCITY – VERACITY
…and all supposedly leading up to “Value”
Slide 57
VOLUME
Seismic data acquisition (wide-azimuth)
Seismic processing
5D interpolated data sets
Fiberoptics
Slide 58
How big is “big”?
In the 80’s, seismic was gigabytes in size;
some people were still hand-interpreting on
paper
Static
5D interpolation: can produce file sets that
exceed 100 TB in size. Some seismic surveys
I’ve seen – regional studies – can reach
petabytes.
This is partially due to the way that the
seismic is acquired
Coil seismic has replaced lines and grids –
explain why, and explain why that impacts
the size of the data that you’re looking at
Real-Time
Shell is using fiberoptic cables created in a
special partnership with HP for their sensors,
and this data is transferred to AWS servers –
1TB / day
And it’s not just in the engineering realm. On
the business side:
Chevron’s internal IT traffic alone exceeds 1.5
TB a day – and that’s 2013 numbers.
Slide 59 CAT scanning of cores
What you’re seeing here is a subsection of
the well
Pore-scale imaging (.01 to 10 microns) can
generate large data sets, as well: a
centimeter cubed can exceed 10GB, and
when you take into account that you’re
measuring 1000 meters of core, that’s 1
exabyte
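A back-of-the-envelope version of that estimate is sketched below. The core diameter is not given in the notes, so the 10 cm default is an assumption, and the result is very sensitive to it and to the assumed gigabytes per cubic centimeter.

```python
import math


def core_volume_cm3(length_m, diameter_cm):
    """Volume of a cylindrical core in cubic centimeters."""
    radius_cm = diameter_cm / 2.0
    length_cm = length_m * 100.0
    return math.pi * radius_cm ** 2 * length_cm


def scan_size_petabytes(length_m, diameter_cm=10.0, gb_per_cm3=10.0):
    """Estimated scan size in decimal petabytes (1 PB = 1e6 GB)."""
    return core_volume_cm3(length_m, diameter_cm) * gb_per_cm3 / 1e6


if __name__ == "__main__":
    for diameter_cm in (5.0, 10.0, 15.0):  # assumed core diameters
        size_pb = scan_size_petabytes(1000.0, diameter_cm)
        print(f"{diameter_cm:4.1f} cm core: about {size_pb:,.0f} PB")
```

With the assumed 10 cm diameter, 1000 meters of core at 10 GB per cubic centimeter comes out in the tens of petabytes. A figure near an exabyte would need a wider core or more bytes per cubic centimeter. Either way, it is far more than fits in a traditional data warehouse.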
Reducing the approximations, improving the
equations
Images taken from Schlumberger
Slide 60
STRUCTURED
Handled with specific applications used to
manage surveying, processing and imaging,
exploration planning, reservoir modeling,
production, and other upstream activities
The structured stuff’s (mostly) easy to deal
with. You might not have standard naming
conventions, and it might not always be as
complete as you’d like, but (for the most part)
you know what you’re getting and you know
what it’s intended for
Slide 61
UNSTRUCTURED
Unstructured or semi-structured such as:
• Emails
• Word processing documents
• Spreadsheets
• Images
• Voice recordings
• Multimedia
• Data market feeds
• Pictures of well logs
• PDF’s
This all makes it difficult and costly to store in
traditional data warehouses or routinely
query and analyze. Enter Hadoop (or other
large-scale unstructured databases)
Slide 62
VARIETY
• Structured
• Standard data models
• SEG-Y
• WITSML
• RESQML
• PRODML
• LAS
• .shp, .lyr, other GIS files
• Unstructured
• Images (maps, embedded well logs in .PDF’s)
• Audio, video
• …and more, on both fronts
And a note – even though data is structured,
it can come in a variety of formats. There’s
no such thing as a pristine data set, out of the
box.
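As a small illustration of "structured but not pristine", here is a sketch of a parser for a well log in LAS format, assuming LAS in the list above means the Log ASCII Standard (version 2.0 style). The sample file contents, curve names and function name are made up for illustration, and the parser handles only the unwrapped, whitespace-delimited case. The point is that even a tidy standard file carries null placeholders that have to be cleaned before analysis.

```python
import math

# Hypothetical miniature LAS 2.0 file; values are invented for illustration.
SAMPLE_LAS = """\
~VERSION INFORMATION
 VERS.   2.0 : CWLS LOG ASCII STANDARD - VERSION 2.0
 WRAP.   NO  : ONE LINE PER DEPTH STEP
~WELL INFORMATION
 NULL.   -999.25 : NULL VALUE
~CURVE INFORMATION
 DEPT.M      : DEPTH
 GR  .GAPI   : GAMMA RAY
 RHOB.G/C3   : BULK DENSITY
~A  DEPT     GR      RHOB
 1000.0   85.2    2.45
 1000.5   -999.25 2.47
 1001.0   90.1    -999.25
"""


def parse_las(text):
    """Return {curve name: list of floats}, with nulls replaced by NaN."""
    section = None
    null_value = -999.25  # common default, overridden by the ~W section
    curve_names = []
    rows = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        if line.startswith("~"):
            section = line[1].upper()  # V, W, C, A, ...
            continue
        if section == "A":
            rows.append([float(token) for token in line.split()])
            continue
        mnemonic = line.split(".", 1)[0].strip()
        if section == "C":
            curve_names.append(mnemonic)
        elif section == "W" and mnemonic.upper() == "NULL":
            # Text between the first '.' and the ':' holds the value.
            value_part = line.split(".", 1)[1].split(":", 1)[0]
            null_value = float(value_part.split()[-1])
    curves = {name: [] for name in curve_names}
    for row in rows:
        for name, value in zip(curve_names, row):
            curves[name].append(math.nan if value == null_value else value)
    return curves


if __name__ == "__main__":
    for name, values in parse_las(SAMPLE_LAS).items():
        print(name, values)
```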
Slide 63
VELOCITY
Real-time streaming data
Drilling equipment (EDR, LWD, MWD, mud logging…)
Sensors (flow, pressure, ROP, etc.)
Real-time streaming data: offshore, onshore;
pipelines, refineries, in the wellbore, on
machinery at the wellsite, in office buildings…
But, again, it’s that variety in the velocity
that’s important. We have some data that
comes in immediately, and some that comes
in three months later via spreadsheet. How
can we consolidate and use both?
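One possible starting point, sketched below with the standard library, is to key every record on the time the measurement was taken rather than the time it arrived, then merge the sources into a single timeline. The record fields, values and source names are hypothetical, and each source is assumed to be sorted by time already.

```python
import heapq
from datetime import datetime


def merge_by_event_time(*sources):
    """Merge time-sorted record lists into one list ordered by event time."""
    return list(heapq.merge(*sources, key=lambda record: record["time"]))


# Hypothetical real-time sensor stream (arrives immediately).
SENSOR_STREAM = [
    {"time": datetime(2015, 6, 1, 0, 0), "source": "sensor", "pressure_psi": 3010},
    {"time": datetime(2015, 6, 1, 1, 0), "source": "sensor", "pressure_psi": 3025},
    {"time": datetime(2015, 6, 1, 2, 0), "source": "sensor", "pressure_psi": 2990},
]

# Hypothetical spreadsheet rows that show up months later,
# but describe the same period.
LATE_SPREADSHEET = [
    {"time": datetime(2015, 6, 1, 0, 30), "source": "spreadsheet", "oil_bbl": 120},
    {"time": datetime(2015, 6, 1, 1, 30), "source": "spreadsheet", "oil_bbl": 118},
]


if __name__ == "__main__":
    for record in merge_by_event_time(SENSOR_STREAM, LATE_SPREADSHEET):
        print(record["time"].isoformat(), record["source"])
```

This only handles the ordering problem. Reconciling units, names and sampling rates between the sources is a separate cleaning step.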
Slide 64
VERACITY
…in other words, data quality.
It’s not that great
“success rate” for exploration is very low
Slide 65
VERACITY
…in other words, data quality.
…IT’S NOT
THAT GREAT.
It’s not that great
“success rate” for exploration is very low
Slide 66
VALUE
…ALL LEADING UP TO
Studies show that a gradual shift to a data
and technology-driven oilfield is expected to
tap into 125 billion barrels of oil, equal to the
current estimated reserves of Iraq
Currently, recovery rates are only about 50%.
The biggest risk is finding the oil; the second
biggest risk is getting it out of the ground
safely.
Increased speed to first oil
Enhanced production
Reduced costs, such as non-productive time
Reduced risks, especially in the area of
health, environment, and safety
Slide 67
“Analytic advantages could help oil
and gas companies improve
production by 6% to 8%.”
(Bain Energy Report)
Our survey of more than 400 executives in
many sectors revealed that companies with
better analytics capabilities were twice as
likely to be in the top quartile of financial
performance in their industry, five times
more likely to make decisions faster than
their peers and three times more likely to
execute decisions as planned.
The evidence is compelling.
…which leads to more alliteration.
Slide 68
C’S
Remember what I said about data scientists
loving alliteration?
So you’ve got all this data. How can you use
it?
Slide 69
C’S
CREATING – CLEANING – CURATING DATASETS
The business of a data scientist.
Slide 70
C’S
CREATING – CLEANING – CURATING DATASETS
…CHALLENGES
And making sure that data from all sectors is
integrated.
Slide 71
BIG 3
ADVANCED ANALYTICS TODAY
And there are opportunities for so many
others – everything from HR Analytics, to
looking at social media to detect political
unrest, to machine learning on seismic to
detect channels or slug models – things that
geologists usually hunt for
Slide 72
UNCONVENTIONALS
Huge number of wells operating simultaneously
Operators need to make decisions very quickly, and are far
removed from central business units – autonomy
• Geology interpretation – comparing geology to production
• New well delivery – improving drilling and completions,
reducing lag time and minimizing the number of wells in
process at any given moment in time
• Well and field optimization – well spacing and completions
techniques (cluster spacing, number of stages, proppants
and fluids used, etc.)
“Unconventional resources” such as shale gas
and tight oil supply 20% of the gas used in
the USA and are expanding rapidly around the
globe.
Mention the tech talk that you went to that
was sponsored by the SPE – Randy LaFollette,
Baker Hughes
flat time
which crews are most efficient
bit economics
when to use different bits
mud-motor optimization
Slide 73
CONVENTIONALS
Fewer wells in this scenario
Can still spot trends from the constant streams of
information, particularly sensors – spotting where a piece of
equipment might fail
Reducing the potential for environmental disasters
Not any of the fancy horizontal drilling.
Deepwater wells are key here; onshore is less
complex.
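Spotting trouble in a constant sensor stream can start with something as simple as a rolling z-score: compare each new reading with the mean and spread of the last few readings. This is a minimal sketch of that generic technique, not the method any particular operator uses. The window size, threshold and readings are assumptions chosen for illustration.

```python
from collections import deque
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings that sit far outside the recent window."""
    history = deque(maxlen=window)
    flagged = []
    for index, value in enumerate(readings):
        if len(history) == window:
            mu = mean(history)
            sigma = stdev(history)
            # Flag readings more than `threshold` standard deviations away.
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                flagged.append(index)
        history.append(value)
    return flagged


if __name__ == "__main__":
    # Hypothetical pump temperature readings with one spike.
    temps_c = [50.0, 50.2, 49.9, 50.1, 50.0, 50.1, 49.8, 58.0, 50.0, 50.1]
    print(flag_anomalies(temps_c))
```

One design caveat: a spike stays in the window for a few readings and inflates the spread, so a production system would likely use a more robust statistic or exclude flagged points from the history.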
Slide 74 MIDSTREAM /
DOWNSTREAM
Monitoring pipelines and equipment for a more predictable
and precise approach to maintenance
Preventing shutdowns and launching interventions to
prevent spills
Ideally, we would have as few people operating in hazardous
locations as possible
Refineries have limited capacity, and fuel
needs to be produced as close as possible to
its point of end use to minimize
transportation costs. Complex algorithms
take into account the cost of producing the
fuel as well as diverse data such as economic
indicators and weather patterns to determine
demand, allocate resources and set prices at
the pumps.
Slide 75
Historically, oil companies relied on
operating models that focused on
functional excellence and clear hand-
offs from one function to the next.
This process takes time, and it breaks down when you have
to make decisions quickly.
Functional excellence isn’t something that
can be sacrificed, by any means – it’s just that
companies are going to have to leverage
technologies in more ways to accelerate the
decision making process.
Consider, for example, the new well delivery
process, where performance metrics such as
the time from spud to hookup or the dead
time between steps require visibility into
activity data from each function involved. If
the functions (including land, regulatory, pad
construction, drilling, completions and
operations) run on different systems and rely
on differently constructed data models, it
becomes very difficult to have a clear,
integrated view of what is happening in the
field.
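Once the activity data from each function sits in one place, the metrics themselves are simple date arithmetic. The sketch below computes spud-to-hookup time and the dead time between steps. The step names, dates and record layout are hypothetical.

```python
from datetime import date

# Hypothetical activity records, one per function: (step, start, end).
WELL_STEPS = [
    ("pad construction", date(2015, 1, 5), date(2015, 1, 20)),
    ("drilling", date(2015, 2, 1), date(2015, 2, 18)),
    ("completions", date(2015, 3, 10), date(2015, 3, 22)),
    ("hookup", date(2015, 4, 2), date(2015, 4, 6)),
]


def dead_time_days(steps):
    """Idle days between the end of each step and the start of the next."""
    ordered = sorted(steps, key=lambda step: step[1])
    return {
        f"{prev[0]} -> {nxt[0]}": (nxt[1] - prev[2]).days
        for prev, nxt in zip(ordered, ordered[1:])
    }


def spud_to_hookup_days(steps):
    """Days from the start of drilling (spud) to the end of hookup."""
    by_name = {name: (start, end) for name, start, end in steps}
    return (by_name["hookup"][1] - by_name["drilling"][0]).days


if __name__ == "__main__":
    print("spud to hookup:", spud_to_hookup_days(WELL_STEPS), "days")
    for handoff, days in dead_time_days(WELL_STEPS).items():
        print(f"{handoff}: {days} idle days")
```

The arithmetic is the easy part. The hard part is getting the start and end dates out of separate systems into one table like `WELL_STEPS`.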
Slide 76
Each individual function may have a
wealth of data, but unless your model
can put it all in a single location,
analyze it, and place that information
in the right hands at the right time, it’s
difficult to improve performance.
(Bain Energy Report)
(and I’m paraphrasing)
Companies that build better analytics
capabilities concentrate their efforts in three
areas: technology architecture, interaction
between IT and the business, and hiring and
retaining strong analytic talent.
Slide 77
THANKS!
Any questions?