Presentation given at the Canadian Hydrographic Conference 2020
Dates: Mon., Feb. 24, 2020 – Thu., Feb. 27, 2020
Location: Quebec City, Canada
Authors: M. Smith, G. Masetti, L. Mayer, M. Malik, J.-M. Augustin, C. Poncelet, I. Parnum
3. Data Challenges
Seafloor Backscatter
Images from Marine Geophysical Research Volume 39 Issue 1-2 (2018)
Hard to obtain quantitative and comparable
data sets.
• Calibration of multibeam
echosounder radiation pattern
• Results vary from system to system
• Data is sensitive to changes in
parameters
• No standard for processing and
reporting the data
6. OpenBST – Transparent Algorithms and
Community Discussion
How do we address the problem?
7. But Really, How?
Python, Python, HydrOffice, and more Python
• Back End: Python Library and NetCDF
based project structure
• Front End: Jupyter Notebooks and a
public API
8. Backend – Python Library and NetCDF
NetCDF with Nodal Processing Structure
• Unique hash identifies each processing step and
‘remembers’ prior step
• Allows for reuse of prior workflows
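The nodal hash idea can be sketched in plain Python: each node's identifier is derived from the step name, its parameters, and its parent's hash, so a node 'remembers' the chain that produced it. This is an illustrative sketch only; the function and field names are assumptions, not OpenBST's actual API.

```python
import hashlib
import json


def node_hash(step_name, params, parent_hash=None):
    """Derive a deterministic identifier for a processing step.

    The hash covers the step name, its parameters, and the parent
    node's hash, so identical steps applied to identical inputs
    always resolve to the same node identifier.
    """
    payload = json.dumps(
        {"step": step_name, "params": params, "parent": parent_hash},
        sort_keys=True,  # key order must not change the hash
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Chain two hypothetical steps: raw decoding, then a static-gain removal.
raw = node_hash("raw_decode", {"file": "line_001.s7k"})
corrected = node_hash("static_gain", {"gain_db": -6.0}, parent_hash=raw)

# Re-running the same step on the same parent yields the same node,
# which is what allows a prior workflow to be recognized and reused.
assert corrected == node_hash("static_gain", {"gain_db": -6.0}, parent_hash=raw)
```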
9. Front End – Jupyter Notebooks
Beam-width limited area
correction
Pulse length limited area
correction
Combined area correction
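The three corrections listed above can be illustrated with the standard first-order ensonified-area formulas for a flat seafloor (as found in textbook treatments, e.g. Lurton): the footprint is beam-width limited near nadir and pulse-length limited off nadir, and the combined term takes whichever mechanism is smaller. This is a sketch under those assumptions, not the notebook's actual code, and the parameter values are illustrative.

```python
import math

SOUND_SPEED = 1500.0  # m/s, nominal water sound speed (assumed)


def ensonified_area(r, theta, phi_along, phi_across, pulse_len):
    """First-order ensonified-area terms for one beam.

    r          slant range (m)
    theta      incidence angle (rad)
    phi_along  along-track beamwidth (rad)
    phi_across across-track beamwidth (rad)
    pulse_len  effective pulse length (s)
    """
    # Beam-width limited: the full beam footprint, stretched off nadir.
    a_beam = (phi_along * r) * (phi_across * r) / math.cos(theta)
    # Pulse-length limited: the range annulus swept by the pulse.
    if theta > 0.0:
        a_pulse = (SOUND_SPEED * pulse_len / (2.0 * math.sin(theta))) * (phi_along * r)
    else:
        a_pulse = float("inf")  # undefined at nadir; the beam term governs
    # Combined: the smaller mechanism limits the ensonified area.
    return {"beam": a_beam, "pulse": a_pulse, "combined": min(a_beam, a_pulse)}


# Example: 100 m slant range, 45 deg incidence, 1 deg beamwidths, 100 us pulse.
areas = ensonified_area(
    r=100.0, theta=math.radians(45.0),
    phi_along=math.radians(1.0), phi_across=math.radians(1.0),
    pulse_len=100e-6,
)
```

Off nadir, the pulse-limited term is the smaller of the two in this example, which is why the combined correction matters: applying only one term would bias the backscatter level across the swath.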
10. Front End – Application
Programming Interface (API)
Processing methods are independent
functions:
• Inputs are clearly and explicitly named
and should be self-descriptive
• Algorithms are written clearly and are
simple to follow
• Outputs are packaged in a self-descriptive
dictionary
The coded algorithms should be simple to
implement and to merge with other
computing environments.
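A minimal sketch of that style, using a hypothetical two-way transmission-loss correction (the function name, signature, and values are illustrative assumptions, not OpenBST's actual API): every input is explicitly named, the body reads like the formula, and the output is a self-descriptive dictionary.

```python
import math


def apply_transmission_loss(backscatter_db, slant_range_m, absorption_db_per_m):
    """Add back two-way transmission loss (spherical spreading + absorption).

    Inputs are explicitly named with units in the name; the output
    dictionary labels every term so other computing environments can
    consume the result without guesswork.
    """
    spreading_db = 40.0 * math.log10(slant_range_m)          # 2 x 20 log10(R)
    absorption_db = 2.0 * absorption_db_per_m * slant_range_m  # two-way path
    corrected_db = backscatter_db + spreading_db + absorption_db
    return {
        "corrected_backscatter_db": corrected_db,
        "spreading_correction_db": spreading_db,
        "absorption_correction_db": absorption_db,
        "units": "dB",
    }


result = apply_transmission_loss(
    backscatter_db=-95.0, slant_range_m=100.0, absorption_db_per_m=0.03,
)
```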
11. Where we are and where we are going
• Version 0:
• To be released on GitHub within the next two months
• Primarily the software architecture with limited functionality
• Works with Reson .s7k
• Version 1:
• Add more sonar data formats
• Add to the algorithm base
• Begin establishing trusted workflows for commonly used systems
• Add comparison tools to directly compare workflow results in a single notebook
12. What does OpenBST mean for you?
• Collaboratively discuss and develop processing algorithms
• Software algorithms implemented in commercial tools can be bench tested
• Identify why results at key processing stages differ between software suites
• Increase the confidence and understanding of backscatter processing results
Thank you, and I am excited to start off the session: Backscatter – the dark side of data.
Seafloor backscatter data certainly has a dark side, and how best to handle the collection and processing of backscatter data has become a topic of great interest in the hydrographic community.
With that said, I am hoping that what I present today can serve as a small light in the dark and offer a way to address some of the issues we face with this data.
So, with that said, I would like to present our current work on the Open Backscatter Toolchain, a community-vetted workflow for backscatter processing.
When I talk about backscatter, I am referring to the measured acoustic intensity related to the scattering of the sound wave, by the seafloor, back in the direction of the MBES.
Backscatter has been shown to be a powerful tool in the characterization of the seafloor.
Standard backscatter products such as backscatter mosaics and angular response curves allow for the identification and segmentation of an environment into distinct acoustic facies, as well as giving first-order estimates of the sediment classes present.
However, the dark side of backscatter is the data challenges we face in obtaining quantitative and comparable measurements.
Further, a lack of standards in both data collection and processing hampers the inter-comparison of surveys.
As the image on the right shows, backscatter surveys can be notoriously difficult to compare.
A quantitative dataset requires calibration of the MBES, which is difficult at best.
Collected data varies from system to system.
Changes in parameters have direct and observable effects.
Even using the same system, the end product can vary based on how the data was reported and how it was processed.
Much of the effort spent on addressing the aforementioned issues has centered on the hardware.
How do we calibrate MBES for their complex radiation patterns?
How do we ensure clear and accurate reporting of backscatter measurements?
However, how backscatter is processed and handled by various software suites has received little attention, although a recent project set out to address this.
The BSIP was an international effort formed to check for consistency across backscatter products produced by various software suites.
The results of the project can best be explained in the figure shown here.
Common datasets were provided to various software manufacturers
The vendors were tasked with processing the data and outputting results at various stages of the processing chain
Shown here are the results for:
BL0, the first step, where the data is raw decoded
BL3, after the data has been processed but before it has been made into an ARA or mosaic
We can see that at BL0, raw decoding and the first stage of processing, median values already range over more than 5 dB.
We can also see that by BL3 some results have moved in opposite directions at the end of the chain.
This presents a real issue when trying to use the data quantitatively. Further, why these discrepancies occur is unclear, and which result is the most accurate is impossible to know, because we do not know what is done at each step.
Our suggestion to help address and mitigate some of the highlighted discrepancies is OpenBST.
At its core, OpenBST is a set of open-source algorithms centered around the processing of backscatter data.
The project is designed to be collaborative in nature, encouraging vendors, researchers, and end users to discuss the relative merits of different algorithms and to converge on the best processing methodologies based on the current state of the art.
The project does not aim to be a commercial competitor; rather, it aims to provide a framework for testing and implementing new algorithms.
The project is being built as a Python library within the HydrOffice framework and makes use of NetCDF files for data storage, manipulation, and metadata coupling.
On the front end, the project uses Jupyter notebooks, and a public API will be available so the algorithms can be independently tested and used.
Expanding further on the backend:
We are very excited to use NetCDF as a data management and organization structure.
The benefits:
The CF convention is easily understood by computing environments and software
Files carry georeferencing metadata, enabling easy drag-and-drop into GIS
Data is stored on disk, not in memory: read only what you need, modify, store, done
Extensive metadata coupling
Additionally, we recognize that the backscatter processing workflow aligns nicely with trees from graph theory.
We keep track of parent and child processes to minimize recalculation and to allow the user to efficiently explore various algorithms and processing workflows.
What this ultimately means is that OpenBST projects are easy to share and can be explored by the user in their preferred software or language.
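The parent/child bookkeeping that avoids recalculation can be sketched as a store keyed by node identity: a step already computed for the same parameters and the same parent is returned from the store instead of being recomputed. All names here are illustrative assumptions, not the library's API, and a real project would persist the store in the NetCDF file rather than in memory.

```python
import hashlib
import json

# Illustrative in-memory store, keyed by a node's identity hash.
_store = {}


def run_step(step_name, params, parent_hash, compute):
    """Run `compute` only if this exact node has not been seen before."""
    key = hashlib.sha256(
        json.dumps({"step": step_name, "params": params,
                    "parent": parent_hash}, sort_keys=True).encode("utf-8")
    ).hexdigest()
    if key not in _store:       # new child node: compute and remember it
        _store[key] = compute()
    return key, _store[key]     # otherwise reuse the stored result


calls = []
key1, _ = run_step("despeckle", {"win": 3}, "root", lambda: calls.append(1) or "out")
key2, _ = run_step("despeckle", {"win": 3}, "root", lambda: calls.append(1) or "out")
# The second call hits the store: same hash, no recomputation.
```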
In lieu of a GUI, we decided to leverage Jupyter notebooks:
Jupyter notebooks are interactive computing environments
Code can be directly written and executed within the notebook on the fly
This makes adjusting parameters and running tests a simple matter
Extensive annotation ability allows for note-taking as various methods are tested
Inline plotting makes it easy to visualize and compare results
We envision providing a notebook for each system for which we have well constrained test data.
Additionally, we want to provide an in-depth tutorial notebook and a simple template notebook for creating your own
We have designed the project so that algorithms follow an explicit structure:
Inputs are clearly named and follow convention
Algorithms are easily interpretable
Outputs are clearly labeled and packaged
Adding algorithms, or using them in your own software, should be simple.
So currently we have been working on V0, which has primarily focused on laying the groundwork of the project. We hope this version will be released within two months.
Our sights are then set on V1. This is where we hope to add more sonar formats and more processing algorithms into the mix. This is where I hope the conversation and collaboration can really grow.
Speaking to collaboration,
What ultimately is OpenBST trying to do for you?
For the end user, we hope OpenBST will be:
an environment well suited to testing and developing new algorithms
A place to benchmark software algorithms
A tool to identify where and why discrepancies arise
And ultimately, a way to increase our confidence in backscatter processing results.
So the code is available on GitHub at this link.
Thank you for your time