#ibmedge © 2016 IBM Corporation
SBD-1266
Introduction to IBM Spectrum Scale and Its Use in Life Science
Sven Oehme, IBM Research
Konstantin Arnold, University of Basel
Spectrum Scale Architecture Highlights: Scalability
Spectrum Scale Architecture Highlights: HA/Reliability
Spectrum Scale Software Local Read Only Cache (LROC)
• Many NAS workloads benefit from large read cache
• SPECsfs
• OpenStack, VMWare and other virtualization
• Database
• Augment the Spectrum Scale Node DRAM cache with SSD/NVMe
• Used to cache:
– Data
– Inodes
– Indirect blocks
• Cache consistency ensured by standard Spectrum Scale tokens
• Assumes the SSD device is unreliable; data is protected by checksum and verified on read
• Provides low-latency access to file system metadata and data
• Implement with consumer flash for maximum cache/$
• Enabled by FLEA’s LSA (data is written sequentially to the device to eliminate wear leveling)
• Reach small-file performance leadership compared to other NAS devices
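The "unreliable device, verified on read" idea in the bullets above can be illustrated with a small sketch. This is not LROC's implementation: the class name, the CRC32 checksum, the LRU eviction, and the `corrupt` helper are all assumptions chosen for illustration, and token-based cache consistency is not modeled.

```python
import zlib
from collections import OrderedDict


class ChecksummedReadCache:
    """Toy read-only cache that never trusts its own storage.

    Hypothetical illustration only: every cached block is stored with a
    CRC32 checksum and re-verified on each read. A mismatch is treated as
    a miss, so a bad cache device can cost performance but not correctness.
    """

    def __init__(self, backing_read, capacity=1024):
        self._backing_read = backing_read  # callable: block_id -> bytes (the "disk")
        self._capacity = capacity          # maximum number of cached blocks
        self._entries = OrderedDict()      # block_id -> (crc32, data), in LRU order
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        entry = self._entries.get(block_id)
        if entry is not None:
            crc, data = entry
            if zlib.crc32(data) == crc:
                self._entries.move_to_end(block_id)  # mark as most recently used
                self.hits += 1
                return data
            # Checksum mismatch: discard the entry and fall back to the disk.
            del self._entries[block_id]
        self.misses += 1
        data = self._backing_read(block_id)
        self._entries[block_id] = (zlib.crc32(data), data)
        if len(self._entries) > self._capacity:
            self._entries.popitem(last=False)  # evict the least recently used block
        return data

    def corrupt(self, block_id):
        """Simulate a flash device returning bad data (needs non-empty data)."""
        crc, data = self._entries[block_id]
        self._entries[block_id] = (crc, bytes([data[0] ^ 0xFF]) + data[1:])


if __name__ == "__main__":
    disk = {1: b"inode-1", 2: b"data-2"}
    cache = ChecksummedReadCache(disk.__getitem__, capacity=2)
    cache.read(1)      # miss: fetched from the backing store
    cache.read(1)      # hit: served from the cache after verification
    cache.corrupt(1)
    cache.read(1)      # corruption detected: re-read from the backing store
    print(cache.hits, cache.misses)
```

Because the cache holds only clean copies of data that also lives on the backing store, discarding a corrupted entry loses nothing, which is why cheap consumer flash is acceptable in this role.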
LROC Example Speed Up
• Two consumer-grade 200 GB SSDs cache a Spectrum Scale storage system of forty-eight 300 GB 10K SAS disks
• Initially, with all data coming from the disk storage system, the client reads data from the SAS disks at ~3,000 IOPS
• As more data is cached in flash, client performance increases to 33,000 IOPS while the load on the disk subsystem drops by more than 95%
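The figures in this example can be put side by side with a little arithmetic. The sketch below uses only the numbers quoted on the slide; it assumes the 95% load reduction is measured against the initial ~3,000 IOPS and that capacities are raw (before any RAID overhead). The function names are made up for illustration.

```python
def speedup(baseline_iops: float, cached_iops: float) -> float:
    """Ratio of client IOPS with a warm cache to client IOPS from disk only."""
    return cached_iops / baseline_iops


def remaining_disk_iops(baseline_iops: float, load_reduction: float) -> float:
    """Disk IOPS left after the cache absorbs a fraction of the load.

    Assumption: the reduction is relative to the initial disk-only rate.
    """
    return baseline_iops * (1.0 - load_reduction)


# Figures quoted on the slide.
BASELINE_IOPS = 3_000
CACHED_IOPS = 33_000
LOAD_REDUCTION = 0.95  # "more than 95%", so this gives an upper bound on disk load

# Raw capacities in GB (assumption: no RAID or formatting overhead considered).
cache_gb = 2 * 200   # two 200 GB SSDs
disk_gb = 48 * 300   # forty-eight 300 GB SAS disks
cache_fraction = cache_gb / disk_gb

if __name__ == "__main__":
    print(f"client speedup: {speedup(BASELINE_IOPS, CACHED_IOPS):.0f}x")
    print(f"disk load after warm-up: at most about "
          f"{remaining_disk_iops(BASELINE_IOPS, LOAD_REDUCTION):.0f} IOPS")
    print(f"cache size relative to raw disk capacity: {cache_fraction:.1%}")
```

Under these assumptions the cache is a small fraction of the raw disk capacity, so a gain of this size depends on the working set being much smaller than the full data set.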
Spectrum Scale RAID features
ESS (Spectrum Scale RAID Building Blocks)
• Elastic Storage Server (ESS) is a prepackaged solution based on Spectrum Scale RAID technology and commodity hardware components
• SSD/10k SAS Models
• GS1, GS2, GS4, GS6
• 2 x High Volume Servers
• 1/2/4/6 x JBOD disk enclosures
• NL-SAS Models
• GL2, GL4, GL6
• 2 x High Volume Servers
• 2/4/6 x JBOD disk enclosures
ESS : various models
University of Basel, Switzerland
1460: First and only university in Switzerland until the 19th century
7 faculties: Humanities, Science, Medicine, Law, Business and Economics, Psychology, Theology
7600 undergraduate students
3700 postgraduate and doctoral students
1300 academic staff
358 Professors
Scientific Computing @ University of Basel
• HPC Clusters – specialized for large IO (bioinformatics) and high-speed interconnects (molecular simulations)
• Central systems administration
• Up-to-date scientific databases
• Up-to-date software stack
• Back-up service
• User training
• User support
• Developer support (code versioning, issue tracking, wiki, etc.)
• Dedicated 24/7 production server environment for web services (SWISS-MODEL, Ismara, Mirz, etc.)
Key figures: 3.5 PB storage; 10'000 CPU cores; HPC compute clusters; scientific software; training & support
Supporting research in Northwest Switzerland
• Hosting reference bioinformatics services
• 500 registered users
• 110 research groups
• Acknowledged in 70 life-science publications in 2016
[Figure: research examples, from stellar astrophysics to brain genomics to structural biology to hosting reference services (SWISS-MODEL); major funding]
Scientific Storage and Computing Infrastructure
Once upon a time …
[Diagram: HPC Cluster, NFS Server]
Scientific Storage and Computing Infrastructure
Cluster and storage grew bigger ...
[Diagram: HPC Cluster, three NSD Servers]
Scientific Storage and Computing Infrastructure
[Diagram: Spectrum Scale Data Hub Layer (NSD Servers, SONAS, TSM-HSM, LTFS-EE) with HPC Cluster, Biomedical Research, Life Sciences Department, Physics Department, Chemistry Department, Psychology Department, Microscopy Facility, Economy Department, Genome Sequencing Facility, …]
Cluster Export Services
Highly available file and object export services
- export/share configuration is straightforward
- authentication against AD or LDAP
Important for planning:
- NFS and Apple OS X
- SMB1 not supported
- mixed workload and performance
- changes in authentication
[Diagram: NSD Servers and Protocol Nodes in the Spectrum Scale Data Hub Layer; Active Directory authentication; CIFS and NFS exports]
AFM for Data migration, Example: SONAS migration
Operational advantages:
- preparing and prefetching before switching clients
- migrate data while clients are working on the new share
- minimal downtime: 1 min (AFM) for a 30 TB share with 30M inodes, vs. several months (using a transfer host with robocopy)
Technical advantages:
- data transfer: observed up to 1 TB/h per gateway host
- ACL: transferred together with data
- Direct storage → storage migration, no transfer host or copy software needed (e.g. robocopy, rsync)
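A rough feel for the background transfer time follows from the figures above. The sketch below is back-of-the-envelope arithmetic, not a measurement: it treats the observed peak of 1 TB/h per gateway as a sustained rate (so the result is a best case), uses decimal units, and assumes throughput scales linearly with the number of gateway hosts. The function names are hypothetical.

```python
def migration_hours(share_tb: float, tb_per_hour_per_gateway: float,
                    gateways: int = 1) -> float:
    """Best-case transfer time, assuming the peak rate is sustained.

    Assumption: throughput scales linearly with the number of gateways.
    """
    if gateways < 1 or tb_per_hour_per_gateway <= 0:
        raise ValueError("need at least one gateway and a positive rate")
    return share_tb / (tb_per_hour_per_gateway * gateways)


def average_file_size_mb(share_tb: float, inodes: int) -> float:
    """Mean size per inode in MB (decimal units: 1 TB = 1,000,000 MB)."""
    return share_tb * 1_000_000 / inodes


def inodes_per_second(inodes: int, hours: float) -> float:
    """Inode rate needed to finish within the given number of hours."""
    return inodes / (hours * 3600)


if __name__ == "__main__":
    hours = migration_hours(30, 1.0)  # the 30 TB share, one gateway
    print(f"best-case transfer time, one gateway: {hours:.0f} h")
    print(f"with two gateways (linear scaling assumed): "
          f"{migration_hours(30, 1.0, gateways=2):.0f} h")
    print(f"average size per inode: {average_file_size_mb(30, 30_000_000):.1f} MB")
    print(f"required inode rate: {inodes_per_second(30_000_000, hours):.0f}/s")
```

Because the data moves in the background while clients keep working, this transfer time is not downtime; only the final switch of clients is.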
[Diagram: NSD Servers, SONAS, Gateway Nodes, Home Cluster, Spectrum Scale Data Hub Layer]
Example: Scientific web server
Protein sequences vs. protein structures
Example: Scientific web server
Protein annotation: humans vs. machines
Example: Scientific web server
Disaster recovery: AFM between two sites
- less work to develop data replication to DR site
Scientific pipeline speedup x8: big pagepool + LROC
- processing steps depend on bigger datasets, unchanged for 1 week
- update of datasets very simple, no data distribution required
[Diagram: NSD Servers, HPC Cluster, Internet; two sites 200 km apart; pagepool=128GB; LROC: 1TB SSD; AFM independent writer (replication not speed critical)]
Information Lifecycle Management - HSM
Use of tape to lower cost of storage
Spectrum Archive EE (LTFS-EE):
- easy to manage, direct control of tape
- use of policies for fine grained placement
- well suited for data export
- not a full-fledged backup system
Spectrum Protect for Space Management:
- integration with backup system
- requires TSM infrastructure
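The idea of policy-driven placement can be sketched in a few lines. The example below is not Spectrum Scale's policy language or API; it is a hypothetical stand-alone rule that selects tape-migration candidates by idle time and size. The `FileInfo` record, the thresholds, and the file paths are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical file record; a real system would read this from metadata."""
    path: str
    size_bytes: int
    last_access: datetime


def select_for_tape(files, now, min_idle_days=90, min_size_bytes=1_000_000):
    """Return files idle for at least min_idle_days and at least min_size_bytes.

    Both thresholds are illustrative assumptions. Candidates are returned
    coldest first, so the least recently used data would migrate first.
    """
    cutoff = now - timedelta(days=min_idle_days)
    candidates = [
        f for f in files
        if f.last_access < cutoff and f.size_bytes >= min_size_bytes
    ]
    return sorted(candidates, key=lambda f: f.last_access)


if __name__ == "__main__":
    now = datetime(2016, 9, 20)
    files = [
        FileInfo("/data/old_big.bam", 5_000_000_000, datetime(2016, 1, 1)),
        FileInfo("/data/new_big.bam", 5_000_000_000, datetime(2016, 9, 1)),
        FileInfo("/data/old_small.txt", 100, datetime(2015, 1, 1)),
        FileInfo("/data/older_big.fastq", 2_000_000_000, datetime(2015, 6, 1)),
    ]
    for f in select_for_tape(files, now):
        print(f.path)
```

Skipping small files reflects a common design choice for tape tiers, where many tiny recalls are expensive; whether that suits a given site is a local decision.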
[Diagram: Clients, Spectrum Scale Data Hub Layer (NSD Servers, Disk Pool), Spectrum Archive EE and Spectrum Protect for Space Management with TSM Server, two TS3500 tape libraries]
Secure environment for biomedical research
Encryption
- encryption of data at rest and on network
- defined via policies
- possibility of fine grained access groups
- encryption keys managed by key management software (IBM SKLM)
- integration with general research infrastructure
- suited for biomedical data and processing
[Diagram: secure research environment (Login, HPC Cluster, NSD Server, SKLM) alongside the general research environment (NSD Server, Clients)]
Summary
[Diagram: Spectrum Scale Data Hub Layer (NSD Servers, SONAS, TSM-HSM, LTFS-EE) with HPC Cluster, department and facility clients, secure research environment (SKLM, Login, HPC Cluster, NSD Server), Remote Site, and Remote Cluster; features labeled: AFM, CES (CIFS, NFS), Encryption, ILM/HSM, LROC]
Spectrum Scale User Group
• The Spectrum Scale User Group is free to join and open to all using, interested in using, or integrating Spectrum Scale.
• Join the User Group activities to meet your peers and get access to experts from partners and IBM.
• Next meetings:
- APAC: October 14, Melbourne
- Global at SC16 : November 13 1pm to 5pm, Salt Lake City
• Web page: http://www.spectrumscale.org/
• Presentations: http://www.spectrumscale.org/presentations/
• Mailing list: http://www.spectrumscale.org/join/
• Contact: http://www.spectrumscale.org/committee/
• Meet Bob Oesterlin (US Co-Principal) at Edge2016: Robert.Oesterlin@nuance.com
Session : Futures of IBM Spectrum Scale
NDA & Customers ONLY
• Who: IBM Spectrum Scale Offering Management
• Carl Zetie, Ron Riffe
• When: Tuesday, September 20, 2016
• 1pm to 2pm
• Where: MGM Grand, Signature Tower 3
• Meeting Room D
• Contact (if any questions)
• douglasof@us.ibm.com, cmukhya@us.ibm.com
Session : How to apply Flash benefits to big data analytics and unstructured data
NDA & Customers ONLY
• Who: IBM Elastic Storage Server Offering Management
• Alex Chen
• When: Thursday, September 22, 2016
• 1:15pm to 2:15pm
• Where: Grand Garden Arena, Lower Level, MGM, Studio 10
• Contact (if any questions)
• cmukhya@us.ibm.com, douglasof@us.ibm.co
Spectrum Scale Trial VM
• Download the IBM Spectrum Scale Trial VM from :
• http://www-03.ibm.com/systems/storage/spectrum/scale/trial.html
Spectrum Scale Edge – Technical Sessions
• Search for “Spectrum Scale” in the IBM Events mobile app. There are 15+ sessions on various topics, including lab sessions.
Lab Sessions:
• Spectrum Scale Problem Determination Lab
Date: Sept 20th 2:15 PM – 3:15 PM
Location: MGM Grand, Room 317 Lab Center F
• Spectrum Scale Trial VM Lab
Date: Sept 20th 3:45 PM – 4:45 PM
Location: MGM Grand, Room 317 Lab Center F
• Booth on ESS, Spectrum Scale + TCT and DeepFlash
Thank You

Introduction to IBM Spectrum Scale and Its Use in Life Science

  • 1. #ibmedge© 2016 IBM Corporation SBD-1266 Introduction to IBM Spectrum Scale and Its Use in Life Science Sven Oehme, IBM Research Konstantin Arnold, University of Basel
  • 4. #ibmedge Spectrum Scale Architecture Highlights: Scalability 3
  • 5. #ibmedge Spectrum Scale Architecture Highlights: HA/Reliability 4
  • 6. #ibmedge Spectrum Scale Software Local Read Only Cache (LROC) 5 • Many NAS workloads benefit from large read cache • SPECsfs • OpenStack, VMWare and other virtualization • Database • Augment the Spectrum Scale Node DRAM cache with SSD/NVMe • Used to cache: – Data – Inodes – Indirect blocks • Cache consistency insured by standard Spectrum Scale tokens • Assumes SSD device is unreliable, data is protected by checksum and verified on read • Provide low-latency access to file system metadata and data • Implement with consumer flash for maximum Cache/$ • Enabled by FLEA’s LSA (Data is written Sequential to Device, to eliminate wear leveling) • Reach small File performance leadership compared to other NAS Devices
  • 7. #ibmedge LROC Example Speed Up 6 • Two consumer grade 200 GB SSDs cache a forty-eight 300 GB 10K SAS disk Spectrum Scale storage system • Initially, with all data coming from the disk storage system, the client reads data from the SAS disks at ~ 3,000 IOPS • As more data is cached in Flash, client performance increases to 33,000 IOPS while reducing the load on the disk subsystem by more than 95%
  • 9. #ibmedge ESS (Spectrum Scale Raid Building Blocks) • Elastic Storage Server (ESS) is a prepacked solution using on the Spectrum Scale Raid technology and Commodity HW components • SSD/10k SAS Models • GS1, GS2, GS4,GS6 • 2 x High Volume Servers • 1/2/4/6 x JBOD disk enclosures • NL-SAS Models • GL2, GL4,GL6 • 2 x High Volume Servers • 2/4/6 x JBOD disk enclosures 8
  • 11. #ibmedge University of Basel, Switzerland 10 1460: First and only University in Switzerland until 19th century 7 faculties: Humanities, Science, Medicine, Law, Business and Economics, Psychology, Theology 7600 undergraduate students 3700 postgraduate and doctoral students 1300 academic staff 358 Professors
  • 12. #ibmedge Scientific Computing @ University of Basel • HPC Clusters – specialized for large IO (bioinformatics) and high-speed interconnects (molecular simulations) • Central systems administration • Up-to-date scientific databases • Up-to-date software stack • Back-up service • User training • User support • Developer support (code version, issue tracking, wiki, etc.) • Dedicated 24/7 production server environment for web services (SWISS-MODEL, Ismara, Mirz, etc.) 11 3.5 PB storage 10'000 CPU cores HPC compute clusters scientific software training & support
  • 13. #ibmedge Supporting research in Northwest Switzerland 12 • Hosting reference bioinformatics services • 500 registered users • 110 research groups • Acknowledged in 70 life-science publications in 2016 From stellar astrophysics… … to brain genomics… … to structural biology … … to hosting reference services… SWISS-MODEL Major funding
  • 14. #ibmedge Scientific Storage and Computing Infrastructure Once upon time … 13 HPC Cluster NFS Server
  • 15. #ibmedge Scientific Storage and Computing Infrastructure Cluster and storage grew bigger ... 14 HPC Cluster NSD Server NSD ServerNSD Server
  • 16. #ibmedge Scientific Storage and Computing Infrastructure 15 SONAS NSD Server Spectrum Scale Data Hub Layer NSD Server NSD Server TSM-HSMLTFS-EE HPC Cluster Biomedical Research Life Sciences Department Physics Department Chemistry Department Psychology Department Microscopy Facility Economy Department … Genome Sequencing Facility
  • 17. #ibmedge Cluster Export Services High available file and object export services - export/share configuration straight forward - authentication against AD or LDAP Important for planning: - NFS and Apple OS X - SMB1 not supported - mixed workload and performance - changes in authentication 16 NSD ServerNSD Server NSD Server Protocol Nodes Spectrum Scale Data Hub Layer Active Directory Authentication CIFS NFS
  • 18. AFM for data migration. Example: SONAS migration • Operational advantages: - prepare and prefetch data before switching clients - migrate data while clients are already working on the new share - minimal downtime: 1 min (AFM) for a share with 30 TB and 30M inodes, vs. several months using a transfer host with robocopy • Technical advantages: - data transfer: observed up to 1 TB/h per gateway host - ACLs are transferred together with the data - direct storage → storage migration, no transfer host or copy software needed (e.g. robocopy, rsync) (Diagram labels: three NSD Servers, SONAS, Gateway Nodes, Home Cluster, Spectrum Scale Data Hub Layer)
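The figures on this slide allow a rough back-of-the-envelope estimate of how long the background prefetch takes. The sketch below is only that: it takes the "up to 1 TB/h per gateway host" observation as a best case and assumes, hypothetically, that throughput scales linearly with the number of gateway hosts. The function name `migration_hours` is made up for illustration and is not part of any Spectrum Scale tooling.

```python
def migration_hours(data_tb: float, tb_per_hour_per_gateway: float, gateways: int = 1) -> float:
    """Best-case bulk transfer time in hours.

    Assumption (not from the slides): throughput scales linearly with
    the number of gateway hosts and the peak rate is sustained.
    """
    if data_tb < 0 or tb_per_hour_per_gateway <= 0 or gateways < 1:
        raise ValueError("sizes and rates must be positive, gateways >= 1")
    return data_tb / (tb_per_hour_per_gateway * gateways)


if __name__ == "__main__":
    # The 30 TB share from the slide, at the observed peak of 1 TB/h.
    for n in (1, 2, 4):
        print(f"{n} gateway(s): {migration_hours(30, 1.0, n):.1f} h")
```

Because the prefetch runs while clients keep working, this transfer time is not downtime; only the final client switch-over (the 1 min quoted above) interrupts service.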
  • 19. Example: Scientific web server. Protein sequences vs. protein structures
  • 20. Example: Scientific web server. Protein annotation: humans vs. machines
  • 21. Example: Scientific web server • Disaster recovery: AFM between two sites - less work to develop data replication to the DR site • Scientific pipeline speedup x8: big pagepool + LROC - processing steps depend on larger datasets that stay unchanged for 1 week - update of datasets very simple, no data distribution required (Diagram labels: three NSD Servers, HPC Cluster, 200 km, pagepool=128GB, LROC: 1TB SSD, AFM independent writer (replication not speed critical), Internet)
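Why a large pagepool combined with LROC helps a pipeline that rereads the same datasets can be illustrated with a toy two-tier read cache. This is a simplified model, not Spectrum Scale code: the class `TwoTierReadCache`, its LRU eviction, and the rule that blocks evicted from RAM drop into the SSD tier are all assumptions chosen to keep the example small.

```python
from collections import OrderedDict


class TwoTierReadCache:
    """Toy model: a small RAM tier (pagepool-like) in front of a larger
    SSD tier (LROC-like), both LRU. Purely illustrative."""

    def __init__(self, ram_blocks: int, ssd_blocks: int):
        self.ram_blocks = ram_blocks
        self.ssd_blocks = ssd_blocks
        self.ram = OrderedDict()
        self.ssd = OrderedDict()
        self.hits = {"ram": 0, "ssd": 0, "disk": 0}

    def read(self, block: int) -> str:
        """Return the tier that served the read: 'ram', 'ssd' or 'disk'."""
        if block in self.ram:
            self.ram.move_to_end(block)  # refresh LRU position
            source = "ram"
        else:
            if block in self.ssd:
                del self.ssd[block]      # promote the block back into RAM
                source = "ssd"
            else:
                source = "disk"
            self._insert_ram(block)
        self.hits[source] += 1
        return source

    def _insert_ram(self, block: int) -> None:
        self.ram[block] = True
        if len(self.ram) > self.ram_blocks:
            victim, _ = self.ram.popitem(last=False)  # evict oldest RAM block
            self.ssd[victim] = True                   # demote it to the SSD tier
            if len(self.ssd) > self.ssd_blocks:
                self.ssd.popitem(last=False)          # oldest SSD block is dropped


if __name__ == "__main__":
    cache = TwoTierReadCache(ram_blocks=2, ssd_blocks=4)
    for _ in range(2):            # the pipeline reads the same dataset twice
        for block in range(6):    # dataset is larger than the RAM tier
            cache.read(block)
    print(cache.hits)
```

In this model a dataset that does not fit in RAM but fits in RAM plus SSD is read from disk only on the first pass; later passes are served from the cache tiers, which matches the slide's point about datasets that stay unchanged for a week.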
  • 22. Information Lifecycle Management - HSM: use of tape to lower the cost of storage • Spectrum Archive EE (LTFS-EE): - easy to manage, direct control of tape - use of policies for fine-grained placement - well suited for data export - not a full-fledged backup system • Spectrum Protect for Space Management: - integration with backup system - requires TSM infrastructure (Diagram labels: Disk Pool, two TS3500 tape libraries, NSD Servers, Spectrum Protect for Space Management, Spectrum Archive EE, TSM Server, Spectrum Scale Data Hub Layer, Clients)
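The idea of policy-driven placement can be sketched in a few lines. The example below is not the Spectrum Scale policy language and does not call any IBM tool; `FileInfo`, `select_for_tape` and the thresholds (90 idle days, 1 MB minimum size) are hypothetical names and values used only to show what a "migrate cold, large files to tape" rule selects.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileInfo:
    path: str
    size_bytes: int
    days_since_access: int


def select_for_tape(files, min_idle_days=90, min_size_bytes=1_000_000):
    """Pick files that look cold and large enough to be worth moving to tape.

    Both thresholds are illustrative assumptions. Candidates are returned
    coldest first, so a migration run could stop early once enough disk
    space has been freed.
    """
    candidates = [
        f for f in files
        if f.days_since_access >= min_idle_days and f.size_bytes >= min_size_bytes
    ]
    return sorted(candidates, key=lambda f: f.days_since_access, reverse=True)


if __name__ == "__main__":
    inventory = [
        FileInfo("/data/run1.bam", 2_000_000, 120),
        FileInfo("/data/notes.txt", 500, 400),       # cold but too small
        FileInfo("/data/run3.bam", 5_000_000, 10),   # large but recently used
        FileInfo("/data/run0.bam", 3_000_000, 200),
    ]
    for f in select_for_tape(inventory):
        print(f.path)
```

Skipping small files in such a rule reflects a common concern with tape, where recalling many tiny files is slow; whether that trade-off applies to a given installation is a site-specific decision.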
  • 23. Secure environment for biomedical research • Encryption: - encryption of data at rest and on the network - defined via policies - possibility of fine-grained access groups - encryption keys managed by key management software (IBM SKLM) - integration with general research infrastructure - suited for biomedical data and processing (Diagram labels: SKLM, Secure research environment, Login, HPC Cluster, NSD Server, General research environment, NSD Server, Clients)
  • 24. Summary (Diagram labels: SONAS, three NSD Servers, Spectrum Scale Data Hub Layer, TSM-HSM, LTFS-EE, HPC Cluster, Biomedical Research, Life Sciences Department, Physics Department, Chemistry Department, Psychology Department, Microscopy Facility, Economy Department, …, Genome Sequencing Facility, SKLM, Secure research environment, Login, HPC Cluster, NSD Server, Remote Site, Remote Cluster. Features shown: AFM; CES: CIFS, NFS; Encryption; ILM, HSM; LROC)
  • 25. Spectrum Scale User Group • The Spectrum Scale User Group is free to join and open to anyone using, interested in using, or integrating Spectrum Scale. • Join the User Group activities to meet your peers and get access to experts from partners and IBM. • Next meetings: - APAC: October 14, Melbourne - Global at SC16: November 13, 1pm to 5pm, Salt Lake City • Web page: http://www.spectrumscale.org/ • Presentations: http://www.spectrumscale.org/presentations/ • Mailing list: http://www.spectrumscale.org/join/ • Contact: http://www.spectrumscale.org/committee/ • Meet Bob Oesterlin (US Co-Principal) at Edge2016: Robert.Oesterlin@nuance.com
  • 26. Session: Futures of IBM Spectrum Scale (NDA & Customers ONLY) • Who: IBM Spectrum Scale Offering Management (Carl Zetie, Ron Riffe) • When: Tuesday, September 20, 2016, 1pm to 2pm • Where: MGM Grand, Signature Tower 3, Meeting Room D • Contact (if any questions): douglasof@us.ibm.com, cmukhya@us.ibm.com
  • 27. Session: How to apply Flash benefits to big data analytics and unstructured data (NDA & Customers ONLY) • Who: IBM Elastic Storage Server Offering Management (Alex Chen) • When: Thursday, September 22, 2016, 1:15pm to 2:15pm • Where: Grand Garden Arena, Lower Level, MGM, Studio 10 • Contact (if any questions): cmukhya@us.ibm.com, douglasof@us.ibm.com
  • 28. Spectrum Scale Trial VM • Download the IBM Spectrum Scale Trial VM from: http://www-03.ibm.com/systems/storage/spectrum/scale/trial.html
  • 29. Spectrum Scale at Edge: Technical Sessions • Search for “Spectrum Scale” in the IBM Events mobile app. There are 15+ sessions on various topics, including lab sessions. • Lab sessions: - Spectrum Scale Problem Determination Lab. Date: Sept 20th, 2:15 PM to 3:15 PM. Location: MGM Grand, Room 317, Lab Center F - Spectrum Scale Trial VM Lab. Date: Sept 20th, 3:45 PM to 4:45 PM. Location: MGM Grand, Room 317, Lab Center F • Booth on ESS, Spectrum Scale + TCT and DeepFlash
  • 30. Thank You