Earthquake hazard estimation requires systematic investigation of past records as well as fundamental processes that cause the quake. However, detailed long-term records of earthquakes at all scales (magnitude, space and time) are not available. Hence a synthetic method based on first principals could be employed to generate such records to bridge this critical gap of missing data. RSQSim is such a simulator that generates seismic event catalogs for several thousand years at various scales. This synthetic catalog contains rich detail about the earthquake events and associated properties.
Exploring this data is of vital importance to validate the simulator as well as to identify features of interest such as quake time histories, conduct analyses such as calculating mean recurrence interval of events on each fault section. This work1 describes and demonstrates a prototype web based visual tool that enables domain scientists and students explore this rich dataset, as well as discusses refinement and streamlining of data management and analysis that is less error prone and scalable.
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
PEARC17: Visual exploration and analysis of time series earthquake data
1. Visual exploration and analysis of time series
earthquake data
Amit Chourasia1, Keith Richards-Dinger2, James Dietrich2 & Yifeng Cui1
1 SDSC/UC San Diego, 2 UC Riverside
PEARC 17, New Orleans, LA, Jul 12, 2017
3. Research motivation
Earthquake hazard estimation is an important problem.
Requires long term past records of earthquakes at all scales (magnitude,
space, time)
Problem: We neither have, nor could gather such data from the field
Solution: Generate synthetic records by method based on first principals
to bridge this critical gap of missing data
RSQSim visualization : Amit Chourasia, SDSC
4. RSQSim visualization : Amit Chourasia, SDSC
RSQsim
Generates synthetic quakes at all scales i.e. magnitude, space and time
Computation
• Code runs on O(1K) cores with O(100K) elements
• Future needs: O(1M) elements
• Memory usage scales as Nelem
2
Scaling and optimization work underway by Dmitry Pekurovsky (SDSC)
5. RSQSim visualization : Amit Chourasia, SDSC
Output data
Event (12 variables): csv file
Fault geometry (11 variables): csv file
Event action (8 variables + index) : binary files
The data is spread over 11 files (2 ASCII, 9 binary)
Sample data catalog
Size 2.4 GB
Files 2 ascii, 9 binary files
Catalog time duration From 50k to 90k years (40k years)
Number of events 5,970,621
Event variables 12
Number of fault patches 260,051
Fault patch variables 11
Number of event actions 19,127,461
Event action variables 8
6. RSQSim visualization : Amit Chourasia, SDSC
Early data exploration
• Number of events per year (frequency)
• Max magnitude for each year (maxM)
~ 40,000 records with year, frequency and maxM
8. Project motivation
• Scientist has a custom 3D interactive vis application
Lacks geographic context
Lacks multiuser capability
• Develop an interactive visualization in 2D with richer geographic
context, ideally usable from a web browser
RSQSim visualization : Amit Chourasia, SDSC
9. Challenges
Data wrangling
• Normalize index to 1 based
• Normalize variable names
• Translate 3D data to 2D
(triangle patches stored with vertex + 3D rotations)
(rectangle patches stored with center + length, width + 3D rotations)
• Add geographic projections from UTM to EPSG 4236
• Consider using raw data vs database
Visualization
• What to display?
• How to display? Which visualization idioms to use?
• Can it work in a web browser?
RSQSim visualization : Amit Chourasia, SDSC
10. Visualization goals
• Browse and show event of interest
• Show event chronology
• Show fault patches affected by an event
• Interactive
RSQSim visualization : Amit Chourasia, SDSC
11. Implementation
Developed a web app using the following
• Python
Flask framework : server/client brokering
Folium : map plotting
• JavaScript
Leaflet : mapping
C3JS : time series plots
noUISliderJS : mobile friendly sliders
• HTML
RSQSim visualization : Amit Chourasia, SDSC
12. Implementation Phase 1
• Build an app with raw data
• Data wrangling
Format switched from ASCII to binary
Need to also support rectangular patch elements stored with different scheme
Problems
Slow start up 20 min (data pickling brought it to 2min)
Considerable memory use
Tedious development/debugging
RSQSim visualization : Amit Chourasia, SDSC
13. Implementation Phase 2
• Transform data to database
Which database to use?
Develop database schema?
Verify data translation to database
Verify and validate geographic projection transformations
• Retool application to use database
Switch search patterns to queries
Optimize query performance
RSQSim visualization : Amit Chourasia, SDSC
14. RSQSim visualization : Amit Chourasia, SDSC
Implementation Phase 3
• Develop user interface
Select event by ID ( ~ 6M)
Select event by time ( 40,000 years at second granularity)
Other filters
• Link map display with time series
• UI needs to be mobile friendly
16. User interaction
• Selection
• Event
• Time
• Filters
• Magnitude
• Number of patches
• Trail events
• Map
• Layers
• Markers pop-ups
• Link with time series plot
• Time series
• Linked with map plot
• Enable filter constraints
RSQSim visualization : Amit Chourasia, SDSC
17. UI must be mobile friendly
Visual exploration and analysis of time series earthquake data.
18. Visualization design + interaction
Visual exploration and analysis of time series earthquake data.
19. Show location of selected event with ID: 108632
Visual exploration and analysis of time series earthquake data.
5 ≥ M < 6 6 ≥ M < 7 M ≥ 74 ≥ M < 5M < 4
20. Popup info for selected event with ID: 108632
Visual exploration and analysis of time series earthquake data.
5 ≥ M < 6 6 ≥ M < 7 M ≥ 74 ≥ M < 5M < 4
21. Show patches affected by event with ID: 108632
Visual exploration and analysis of time series earthquake data.
5 ≥ M < 6 6 ≥ M < 7 M ≥ 74 ≥ M < 5M < 4
22. Show events previous to event with ID: 108632
Visual exploration and analysis of time series earthquake data.
5 ≥ M < 6 6 ≥ M < 7 M ≥ 74 ≥ M < 5M < 4
23. Show chronology of previous events to event with ID: 108632
Visual exploration and analysis of time series earthquake data.
5 ≥ M < 6 6 ≥ M < 7 M ≥ 74 ≥ M < 5M < 4
24. Show events next to event with ID: 108632
Visual exploration and analysis of time series earthquake data.
5 ≥ M < 6 6 ≥ M < 7 M ≥ 74 ≥ M < 5M < 4
25. Show chronology of next events to event with ID: 108632
Visual exploration and analysis of time series earthquake data.
5 ≥ M < 6 6 ≥ M < 7 M ≥ 74 ≥ M < 5M < 4
26. Visual exploration and analysis of time series earthquake data.
Final map with all layers
5 ≥ M < 6 6 ≥ M < 7 M ≥ 74 ≥ M < 5M < 4
27. Show time series adjacent to selected event (± 100 events)
Visual exploration and analysis of time series earthquake data.
29. MRI calculation
MRI = Mean recurrence interval
Average interval a quake occurs on a fault sub section
Faults are divided into named subsections
For all fault sub sections (~2600)
• With nucleation only: 1,785 sec
• With nucleation + participation: 12,613 sec
• Verify and validate calculations
Benchmark hardware
MacPro workstation
2x 2.26 Ghz Quad Core Intel Xeon processor
16 GB memory
Visual exploration and analysis of time series earthquake data.
31. Conclusions
• Transformed data to a database
Normalized data from mixed index in source data
Validated data transformations
Easier data management/sharing, only one file
Scalable, as data does not need to be loaded into memory
Fast and easy querying
• Developed interactive visualization web application
Enables quake exploration in rich geographic context for a large catalog
Can support multiple concurrent users
• Analysis
Calculated mean recurrence interval
With nucleation
With nucleation + participation
Streamlined data management, analysis and visualization process
Visual exploration and analysis of time series earthquake data.
32. Deliverables to research team
• Translation script to create SQLite database from raw data
• Web based visualization application
• Analysis script and other query examples
• Documentation
Visual exploration and analysis of time series earthquake data.
33. Potential for research team
• Use less code for analysis (no parsing needed)
• Maintain less error prone code for analysis (queries simplify manual filtering)
• Write output data to SQLite database from simulation instead of ascii+binary files
• Scientists will need to learn writing SQL queries
• Query optimization could be an issue
Visual exploration and analysis of time series earthquake data.
34. Ancillary findings
• Python modules and framework
Found a bug in Folium with geojson data
• JavaScript libraries
Found a bug in popular heatmapjs plugin for Leaflet
• Responsive user interface design for web is still cumbersome
Visual exploration and analysis of time series earthquake data.
35. Future work
RSQSim gateway
• Data service
Allow web based querying
Allow download of filtered data
• Light visualizations
Allow spatial filtering
Add heat map
Visual exploration and analysis of time series earthquake data.
36. Acknowledgments
This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is
supported by National Science Foundation grant number ACI-1053575. The National Science
Foundation grant number EAR-1135455 and Keck Foundation grant number 005590-00001 also
supported this work.
Demonstration
http://vis.sdsc.edu:5555
Visual exploration and analysis of time series earthquake data.
Editor's Notes
RSQSim incorporates rate-state constitutive properties. Rate State Quotient?
RSQSim has unique capabilities to deterministically model short-term clustering together with long-term statistical properties of earthquakes [1, 2];
and to represent the different modes of slip observed in nature.
Can perform repeated simulations of long earthquake catalogs (100K to 10M events) with outer scales of the dimensions of regional plate boundaries, and sufficiently resolved inner scales to permit detailed simulations of the evolution of system state through occurrence of frequent small earthquakes.
RSQSim incorporates rate-state constitutive properties. Rate State Quotient?
RSQSim has unique capabilities to deterministically model short-term clustering together with long-term statistical properties of earthquakes [1, 2];
and to represent the different modes of slip observed in nature.
Can perform repeated simulations of long earthquake catalogs (100K to 10M events) with outer scales of the dimensions of regional plate boundaries, and sufficiently resolved inner scales to permit detailed simulations of the evolution of system state through occurrence of frequent small earthquakes.
Data rarely will need to change (written once)
Will be read many times for analysis
Scientists don’t have DB experience, simpler solution is better
DB needs to be portable across platform
Map visual idiom is suitable - As it enables us to show geographic context.
Color coded Red polygon shows the event and its location on surface
Pop up reveals further information about the event
Map visual idiom is suitable - As it enables us to show geographic context.
Color coded Red polygon shows the event and its location on surface
Pop up reveals further information about the event
Map visual idiom is suitable - As it enables us to show geographic context.
Color coded Red polygon shows the event and its location on surface
Pop up reveals further information about the event
Map visual idiom is suitable - As it enables us to show geographic context.
Color coded Red polygon shows the event and its location on surface
Pop up reveals further information about the event
Map visual idiom is suitable - As it enables us to show geographic context.
Color coded Red polygon shows the event and its location on surface
Pop up reveals further information about the event
Map visual idiom is suitable - As it enables us to show geographic context.
Color coded Red polygon shows the event and its location on surface
Pop up reveals further information about the event
Map visual idiom is suitable - As it enables us to show geographic context.
Color coded Red polygon shows the event and its location on surface
Pop up reveals further information about the event
Nucleation - When quake originates on a sub section
Participation – When quake originates elsewhere, but affects a sub section