Slides by Stavros Papadopoulos (TileDB) and Jason Brown (Capella Space) from the joint TileDB-Capella Space webinar held in April 2022 on SAR and LiDAR data analytics.
1. Analyzing LiDAR & SAR data with Capella Space and TileDB
TileDB webinars - April 12, 2022
Stavros Papadopoulos, Founder & CEO of TileDB, Inc.
2. Deep roots at the intersection of HPC, databases and data science
Traction with telecoms, pharmas, hospitals and other scientific organizations
45+ members with expertise across all applications and domains
Who we are
TileDB was spun out from MIT and Intel Labs in 2017
WHERE IT ALL STARTED
Raised over $20M from world-class investors
INVESTORS
3. The Problem
Low productivity for data analysts and scientists
Huge TCO for organizations
Organizations are drowning in a data infrastructure mess
Too many domain-specific file formats
Difficult to handle data beyond tables and SQL
Overly complex metadata handling and data sharing
Numerous vendors and in-house solutions
Difficult to govern all data holistically
4. The Solution | Universal Database
All Data. Faster. Cheaper.
Securely manage all your data assets and
supercharge your analytics, data science and
machine learning with a universal database
All data in one place
Superior performance, at a lower cost
Analytics, data science and ML
Holistic governance and collaboration
5. The Universal Database Pillars
All data in one place: Manage any type of data – tables, files, images, video, genomics, ML features, metadata, even flat files and folders – in a single powerful database.
Superior performance, at a lower cost: Structure all your data in a canonical, multi-dimensional array format, which adapts to any data shape and workload for maximum performance and minimum cost.
Analytics, data science and ML: Run data science and machine learning workloads in a single platform that unifies data management with analytics and scientific workloads.
Holistic governance and collaboration: Securely control access over all your data assets, and enable collaboration and reproducibility, while monitoring all activity in a centralized way.
6. The Secret Sauce | The Data Model
Store everything as dense or sparse multi-dimensional arrays
[Figure: example of a dense array and a sparse array]
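The dense/sparse distinction above can be illustrated with a minimal sketch in plain Python (not the TileDB API; names are illustrative): a dense array materializes a value for every cell in its domain, while a sparse array stores only the non-empty cells as coordinate–value pairs.

```python
# Dense 2D array: every cell in the 3x3 domain has a value.
dense = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]

# Sparse 2D array: a mapping from coordinates to values;
# cells absent from the mapping are simply empty.
sparse = {(0, 2): 3, (2, 0): 7, (2, 2): 9}

def read_sparse(arr, i, j):
    """Return the cell value, or None for an empty cell."""
    return arr.get((i, j))

print(dense[0][2])                # 3
print(read_sparse(sparse, 0, 2))  # 3
print(read_sparse(sparse, 1, 1))  # None (empty cell)
```

The sparse representation pays the cost of storing coordinates explicitly, which is why it suits data such as point clouds where most of the domain is empty.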
8. Applications
What can be modeled as an array
LiDAR (3D sparse)
SAR (2D or 3D dense)
Population genomics (3D sparse)
Single-cell genomics (2D dense or sparse)
Biomedical imaging (2D or 3D dense)
Even flat files!!! (1D dense)
Time series (ND dense or sparse)
Weather (2D or 3D dense)
Graphs (2D sparse)
Video (3D dense)
Key-values (1D or ND sparse)
Tables (1D dense or ND sparse)
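As one concrete case from the list above, here is a sketch of how a table maps to an array (plain Python, not the TileDB API; the sample data is made up): as a 1D dense array the row number is the implicit dimension, while as an ND sparse array chosen columns become the dimensions and the rest become attributes.

```python
# Hypothetical table rows: (name, age, city).
rows = [("alice", 34, "NYC"), ("bob", 27, "SF")]

# 1D dense modeling: the row id is the implicit integer dimension.
table_1d_dense = list(rows)

# 2D sparse modeling: dimensions (name, age) -> attribute (city),
# which enables direct slicing on those two columns.
table_2d_sparse = {(name, age): city for name, age, city in rows}

print(table_1d_dense[1])               # ('bob', 27, 'SF')
print(table_2d_sparse[("alice", 34)])  # 'NYC'
```

The choice of which columns become dimensions determines which predicates can be answered by slicing rather than scanning.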
9. How we built a Universal Database
Applications: SQL, ML & data science, distributed computing
APIs: efficient APIs and tool integrations with zero-copy techniques
TileDB Cloud – unified data management and easy serverless compute at global scale:
Access control and logging
Serverless SQL, UDFs, task graphs
Jupyter notebooks and dashboards
TileDB Embedded – open-source interoperable storage with a universal open-spec array format:
Parallel IO, rapid reads and writes
Columnar, cloud-optimized
Data versioning and time traveling
10. TileDB Embedded
Open source: https://github.com/TileDB-Inc/TileDB
Superior performance: built in C++, fully parallelized, columnar format, multiple compressors, R-trees for sparse arrays
Rapid updates & data versioning: immutable writes, lock-free, parallel reader/writer model, time traveling
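The interplay of immutable writes and time traveling can be sketched as a toy model (plain Python, not TileDB internals; the fragment structure is simplified): each write lands as a new immutable "fragment" tagged with a timestamp, and a read at time t merges all fragments with timestamp <= t, later fragments winning on overlapping cells.

```python
fragments = []  # append-only list of (timestamp, {coords: value}) pairs

def write(timestamp, cells):
    # A write never mutates existing data; it appends a new fragment.
    fragments.append((timestamp, dict(cells)))

def read_at(timestamp):
    # Merge all fragments up to the requested timestamp,
    # with newer fragments overriding older ones on overlap.
    view = {}
    for ts, cells in sorted(fragments, key=lambda f: f[0]):
        if ts <= timestamp:
            view.update(cells)
    return view

write(1, {(0, 0): "a", (0, 1): "b"})
write(2, {(0, 1): "B"})  # an update is just another fragment

print(read_at(1))  # {(0, 0): 'a', (0, 1): 'b'}  -- the array as of t=1
print(read_at(2))  # {(0, 0): 'a', (0, 1): 'B'}  -- the latest view
```

Because fragments are never modified in place, readers and writers need no locks, and any past version remains reconstructible.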
13. TileDB Cloud
Built to work anywhere: works as SaaS (https://cloud.tiledb.com) or on premises; currently on AWS, soon on any cloud
It is completely serverless: slicing, SQL, UDFs, task graphs
Can launch Jupyter notebooks: on-demand JupyterHub instances
It is geo-aware: compute is sent to the data
It is secure: authentication, compliance, etc.
14. TileDB Cloud
Everything is monetizable: full marketplace (via Stripe)
Everything is shareable at global scale: access control inside and outside your organization; make any data and code public; discover any public data and code (central catalog)
Everything is an array!: Jupyter notebooks, UDFs and task graphs, ML models, dashboards (e.g., R Shiny apps), all types of data (even flat files)
Everything is logged: full auditability (data, code, any action)
15. SAR in TileDB
SAR data is stored in TileDB as 3D dense arrays
Rapid dense array slicing via implicit indexing on dimensions
Width, height, time are the dimensions
Cloud-native (rapid writes and reads)
Versioning and time traveling
Integration with GDAL
Visualization on TileDB Cloud
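The "implicit indexing on dimensions" above can be sketched in plain Python (not the TileDB API; the extents are made up): in a dense layout no per-cell coordinates are stored, because the position of cell (width, height, time) in the flat buffer follows directly from the dimension extents.

```python
W, H, T = 4, 3, 2  # hypothetical extents of a tiny (width, height, time) array

def offset(w, h, t):
    """Row-major flat offset of cell (w, h, t) -- no index lookup needed."""
    return (w * H + h) * T + t

# Flat buffer holding all W*H*T cell values.
buf = list(range(W * H * T))

def slice_all_times(w, h):
    """Slice one (w, h) pixel across the whole time dimension."""
    return [buf[offset(w, h, t)] for t in range(T)]

print(offset(1, 2, 1))        # (1*3 + 2)*2 + 1 = 11
print(slice_all_times(1, 2))  # [10, 11]
```

This is why dense slicing is so fast: the cells of any (width, height, time) window occupy computable positions, so a read translates straight into buffer ranges.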
16. LiDAR in TileDB
LiDAR data is stored in TileDB as 3D sparse arrays
Efficient indexing with R-trees and Hilbert curves
Native float indexing - e.g., A[123.34:124.22, 30.23:31.00, :]
Cloud-native (rapid writes and reads)
Versioning and time traveling
Schema evolution
Integration with PDAL and PCL
Visualization on TileDB Cloud
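The float-indexed query shown above, A[123.34:124.22, 30.23:31.00, :], can be illustrated with a plain-Python sketch (not the TileDB API; the point data is made up): points are (x, y, z) tuples and the slice keeps those whose x and y fall in the given inclusive ranges, with z unconstrained.

```python
# Hypothetical LiDAR points as (x, y, z) coordinates.
points = [
    (123.50, 30.50, 12.0),
    (123.90, 30.90, 15.5),
    (125.00, 30.50, 10.0),  # x outside the slice
    (123.40, 31.50,  9.0),  # y outside the slice
]

def slice_points(pts, x_rng, y_rng):
    """Keep points inside the inclusive x/y ranges; z is unconstrained."""
    (x0, x1), (y0, y1) = x_rng, y_rng
    return [p for p in pts if x0 <= p[0] <= x1 and y0 <= p[1] <= y1]

hits = slice_points(points, (123.34, 124.22), (30.23, 31.00))
print(hits)  # [(123.5, 30.5, 12.0), (123.9, 30.9, 15.5)]
```

In the real system this filter is not a linear scan: the R-tree and Hilbert-curve ordering described above prune most of the data before any point is touched.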
17. Indexing
Given the non-empty domain, the space tile extents and the tile order, we can easily find that this slice overlaps the second and fourth tile.
A slicing query would just traverse the R-tree top-down, visiting only nodes/MBRs that intersect the slice.
[Figure: a 4x4 domain of cells numbered 1-16, partitioned into 2x2 space tiles in row-major tile order; a sparse fragment with col-major tile order, row-major cell order and tile capacity 2, whose four MBRs (MBR1-MBR4) form an R-tree stored in the fragment metadata]
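The pruning described on this slide can be sketched in plain Python (a simplified model, not TileDB internals): for a 4x4 domain split into 2x2 space tiles in row-major tile order, integer division locates the tiles a slice overlaps, and an axis-aligned overlap test decides which R-tree MBRs a query must visit.

```python
TILE = 2           # space tile extent per dimension
TILES_PER_ROW = 2  # a 4x4 domain of 2x2 tiles has 2 tiles per row

def overlapping_tiles(r0, r1, c0, c1):
    """1-based ids of space tiles intersecting rows r0..r1, cols c0..c1."""
    tiles = set()
    for tr in range(r0 // TILE, r1 // TILE + 1):
        for tc in range(c0 // TILE, c1 // TILE + 1):
            tiles.add(tr * TILES_PER_ROW + tc + 1)  # row-major tile order
    return sorted(tiles)

def intersects(mbr, slc):
    """Axis-aligned overlap test used while walking the R-tree top-down."""
    (a0, a1, b0, b1), (s0, s1, t0, t1) = mbr, slc
    return a0 <= s1 and s0 <= a1 and b0 <= t1 and t0 <= b1

# A slice over rows 0-3 and cols 2-3 overlaps the second and fourth tile,
# matching the example on the slide.
print(overlapping_tiles(0, 1, 2, 3))  # [2]
print(overlapping_tiles(0, 3, 2, 3))  # [2, 4]
```

Only the tiles and MBRs that survive these checks are fetched and decompressed, which is what keeps sparse slicing cheap.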
18. Machine Learning in TileDB
Fusion of SAR with LiDAR data in a single platform
Integration with TensorFlow, PyTorch and more
Storage of ML models on TileDB Cloud
A full-fledged platform for exploration, analytics and ML
21. SAR: A Window to See What Others Can't
Optical: only observable 25% of the time
With SAR: observable 100% of the time
High Revisit
Low Latency
Cloud & smoke piercing visibility
Night vision for the planet’s activity
22. Capella Space is Changing Access to Earth Information
Any Time, Any Weather
Frequent Revisit: high-cadence revisit with multiple imaging opportunities per day at various times of day/night
Very High-Resolution Imaging: Very High Resolution (VHR) and radiometrically enhanced multi-looked imagery with low noise
Fastest From Order to Delivery: fully automated tasking & data processing with the fastest order-to-delivery times available in the market
23. Capella SAR Imaging
Central Frequency: X-Band
Polarization: Single-Pol, HH or VV
Imaging Bandwidth: up to 500 MHz
Acquisition Direction: ascending + descending orbit direction; left + right look direction
Imaging Modes: Spotlight, Sliding Spotlight, Stripmap
SAR Imagery Products: Spot (spotlight imaging mode), Site (sliding spotlight imaging mode), Strip (stripmap imaging mode)
24. SAR Imagery Product Scenes
VERY HIGH RESOLUTION: Spot | 5 km x 5 km | 0.5 m
VERY HIGH RESOLUTION: Site | 5 km x 10 km | 1.0 m
HIGH RESOLUTION: Strip | 5 km x 20 km | 1.2 m
26. Capella Console
Simple-to-Use GUI
Task or purchase archived imagery via coordinates, AOI creation tool or
shapefile upload.
Fully automated and secure operations: satellite ops, SAR processing and data storage are cloud-based, fully confidential.
Real-Time Status Updates
New tasking scheduling in ≤ 15 minutes and users are
provided real-time status updates.
Predicted time of collection displayed to enable
timely post-imaging operations.
27. Capella API Integration
● Tip-and-cue scenario for immediate responsiveness via API integration. Existing system alerts can push task requests.
● React to emergencies in real time. Deliver data to teams on the ground hours after image capture.
[Diagram: Task the Capella Constellation / Queue from Existing Systems / Pull Scenes & Metadata]