IBM Spectrum Scale is a scalable, highly available file system that can be used to support life science research. Its software read cache, Local Read Only Cache (LROC), uses SSDs to improve read performance. The University of Basel uses Spectrum Scale in its scientific computing and storage infrastructure to support research areas including bioinformatics and structural biology, and to host reference services. The deployment relies on Spectrum Scale features such as cluster file systems, data migration, hierarchical storage management, encryption, and disaster recovery between two sites using Active File Management (AFM).
6. #ibmedge
Spectrum Scale Software Local Read Only Cache (LROC)
• Many NAS workloads benefit from a large read cache
– SPECsfs
– OpenStack, VMware and other virtualization
– Database
• Augment the Spectrum Scale node DRAM cache with SSD/NVMe
• Used to cache:
– Data
– Inodes
– Indirect blocks
• Cache consistency ensured by standard Spectrum Scale tokens
• Assumes the SSD device is unreliable; data is protected by checksum and verified on read
• Provides low-latency access to file system metadata and data
• Implement with consumer flash for maximum cache/$
• Enabled by FLEA’s LSA (data is written sequentially to the device, to eliminate wear leveling)
• Reach small-file performance leadership compared to other NAS devices
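The "unreliable device, verified on read" idea from the bullets above can be sketched in a few lines of Python. This is a toy model, not how LROC is implemented: the class name `ChecksummedReadCache`, the `corrupt` test hook and the use of CRC32 are all assumptions made for illustration. It shows only why a read-only cache can tolerate a faulty flash device: every cached block carries a checksum, and a mismatch just turns the read into a cache miss served by the authoritative storage.

```python
import zlib


class ChecksummedReadCache:
    """Toy read-only cache that does not trust its own storage medium."""

    def __init__(self, backing_read):
        # backing_read(block_id) -> bytes is the authoritative source
        # (stands in for the disk-based file system).
        self._backing_read = backing_read
        self._store = {}  # block_id -> (payload, crc32); stands in for the SSD
        self.hits = 0
        self.misses = 0

    def read(self, block_id):
        entry = self._store.get(block_id)
        if entry is not None:
            payload, checksum = entry
            if zlib.crc32(payload) == checksum:
                self.hits += 1
                return payload
            # Corrupt cache entry: discard it and fall back to the backing store.
            del self._store[block_id]
        self.misses += 1
        payload = self._backing_read(block_id)
        self._store[block_id] = (payload, zlib.crc32(payload))
        return payload

    def corrupt(self, block_id):
        """Test hook: simulate a bad flash cell by flipping the first cached byte."""
        payload, checksum = self._store[block_id]
        damaged = bytes([payload[0] ^ 0xFF]) + payload[1:]
        self._store[block_id] = (damaged, checksum)


disk = {1: b"inode-data", 2: b"indirect-block"}
cache = ChecksummedReadCache(disk.__getitem__)
first = cache.read(1)    # miss: fetched from the backing store
second = cache.read(1)   # hit: served from the cache
cache.corrupt(1)
third = cache.read(1)    # checksum mismatch: re-fetched from the backing store
```

Because the cache never holds the only copy of a block, losing or corrupting an entry costs one extra disk read and nothing else, which is what makes inexpensive consumer flash acceptable here.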
7. #ibmedge
LROC Example Speed Up
• Two consumer-grade 200 GB SSDs cache a Spectrum Scale storage system built from forty-eight 300 GB 10K SAS disks
• Initially, with all data coming from the disk storage system, the client reads data from the SAS disks at ~3,000 IOPS
• As more data is cached in flash, client performance increases to 33,000 IOPS while the load on the disk subsystem drops by more than 95%
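The figures above can be turned into a short back-of-the-envelope calculation. The slide does not say what the 95% reduction is measured against, so the sketch below assumes it is relative to the initial disk load of 3,000 IOPS; under a different reading the residual load and hit ratio would change, while the speed-up would not.

```python
baseline_iops = 3_000        # all reads served from the SAS disks (from the slide)
cached_iops = 33_000         # client rate once the SSD cache is warm (from the slide)
disk_load_reduction = 0.95   # "more than 95%", taken here as a lower bound

speedup = cached_iops / baseline_iops

# Assumption: the reduction is measured against the baseline disk load.
residual_disk_iops = baseline_iops * (1 - disk_load_reduction)

# Share of client reads that the cache must then absorb.
implied_hit_ratio = 1 - residual_disk_iops / cached_iops

print(f"speed-up: {speedup:.0f}x")
print(f"residual disk load: at most {residual_disk_iops:.0f} IOPS")
print(f"implied cache hit ratio: at least {implied_hit_ratio:.1%}")
```

Under that assumption the client sees an 11x speed-up while the disks serve at most about 150 IOPS, which implies a cache hit ratio of roughly 99.5% or better.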
9. #ibmedge
ESS (Spectrum Scale RAID Building Blocks)
• Elastic Storage Server (ESS) is a prepackaged solution based on Spectrum Scale RAID technology and commodity hardware components
• SSD/10k SAS models
– GS1, GS2, GS4, GS6
– 2 x high-volume servers
– 1/2/4/6 x JBOD disk enclosures
• NL-SAS models
– GL2, GL4, GL6
– 2 x high-volume servers
– 2/4/6 x JBOD disk enclosures
11. #ibmedge
University of Basel, Switzerland
1460: First and only university in Switzerland until the 19th century
7 faculties: Humanities, Science, Medicine, Law, Business and Economics, Psychology, Theology
7600 undergraduate students
3700 postgraduate and doctoral students
1300 academic staff
358 professors
12. #ibmedge
Scientific Computing @ University of Basel
• HPC clusters specialized for large IO (bioinformatics) and high-speed interconnects (molecular simulations)
• Central systems administration
• Up-to-date scientific databases
• Up-to-date software stack
• Back-up service
• User training
• User support
• Developer support (code version, issue tracking, wiki, etc.)
• Dedicated 24/7 production server environment for web services (SWISS-MODEL, Ismara, Mirz, etc.)
Key figures: 3.5 PB storage; 10'000 CPU cores
Service areas: HPC compute clusters; scientific software; training & support
13. #ibmedge
Supporting research in Northwest Switzerland
• Hosting reference bioinformatics services
• 500 registered users
• 110 research groups
• Acknowledged in 70 life-science publications in 2016
From stellar astrophysics … to brain genomics … to structural biology … to hosting reference services (SWISS-MODEL)
Major funding
15. #ibmedge
Scientific Storage and Computing Infrastructure
Cluster and storage grew bigger ...
Diagram labels: HPC Cluster; NSD Server (x3)
Scientific Storage and Computing Infrastructure
Diagram labels: SONAS; NSD Server (x3); Spectrum Scale Data Hub Layer; TSM-HSM; LTFS-EE; HPC Cluster
Departments and facilities shown: Biomedical Research; Life Sciences Department; Physics Department; Chemistry Department; Psychology Department; Microscopy Facility; Economy Department; Genome Sequencing Facility; …
17. #ibmedge
Cluster Export Services
Highly available file and object export services
- export/share configuration is straightforward
- authentication against AD or LDAP
Important for planning:
- NFS and Apple OS X
- SMB1 not supported
- mixed workload and performance
- changes in authentication
Diagram labels: NSD Server (x3); Protocol Nodes; Spectrum Scale Data Hub Layer; Active Directory Authentication; CIFS; NFS
18. #ibmedge
AFM for data migration. Example: SONAS migration
Operational advantages:
- preparing and prefetching before switching clients
- data migrates while clients already work on the new share
- minimal downtime: 1 min (AFM) for a share of 30 TB and 30M inodes, vs. several months (using a transfer host with robocopy)
Technical advantages:
- data transfer: observed up to 1 TB/h per gateway host
- ACLs: transferred together with the data
- direct storage → storage migration, no transfer host or copy software needed (e.g. robocopy, rsync)
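As a rough planning aid, the observed rate can be turned into a transfer-time estimate. The helper below is a hypothetical sketch, not part of any Spectrum Scale tooling: the function name and the assumption that throughput scales linearly with the number of gateway hosts are illustrative only. Since 1 TB/h is quoted as an observed maximum, the results are best-case figures, and shares with many small files may be limited by inode count rather than bandwidth.

```python
def estimated_transfer_hours(share_tb, tb_per_hour_per_gateway=1.0, gateways=1):
    """Best-case bulk-transfer time, assuming linear scaling across gateways."""
    if share_tb < 0 or tb_per_hour_per_gateway <= 0 or gateways < 1:
        raise ValueError("size must be >= 0, rate > 0 and gateways >= 1")
    return share_tb / (tb_per_hour_per_gateway * gateways)


# The 30 TB share from the slide at the observed peak rate of 1 TB/h per gateway.
one_gateway = estimated_transfer_hours(30)
# Hypothetical: three gateway hosts with perfect scaling.
three_gateways = estimated_transfer_hours(30, gateways=3)

print(f"1 gateway:  {one_gateway:.0f} h")
print(f"3 gateways: {three_gateways:.0f} h")
```

This bulk transfer runs in the background; the 1 min figure on the slide refers to client downtime at switch-over, not to the time needed to move the data.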
Diagram labels: NSD Server (x3); SONAS; Gateway Nodes; Home Cluster; Spectrum Scale Data Hub Layer
21. #ibmedge
Example: Scientific web server
Disaster recovery: AFM between two sites
- less work to develop data replication to the DR site
Scientific pipeline speedup x8: big pagepool + LROC
- processing steps depend on bigger datasets, unchanged for 1 week
- update of datasets very simple, no data distribution required
Diagram labels: HPC Cluster; NSD Server (x3); 200 km; pagepool=128GB; LROC: 1TB SSD; AFM independent writer (replication not speed critical); Internet
22. #ibmedge
Information Lifecycle Management - HSM
Use of tape to lower the cost of storage
Spectrum Archive EE (LTFS-EE):
- easy to manage, direct control of tape
- use of policies for fine-grained placement
- well suited for data export
- not a full-fledged backup system
Spectrum Protect for Space Management:
- integration with backup system
- requires TSM infrastructure
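The idea behind policy-driven placement can be illustrated with a small selection rule. The sketch below is plain Python over an ordinary directory tree and is not the Spectrum Scale policy language; `migration_candidates`, `plan_migration` and the 90-day threshold are hypothetical names and values chosen for illustration. It picks files that have not been accessed for a given number of days as candidates for a cheaper tier such as tape.

```python
import os
import time


def migration_candidates(root, min_idle_days, now=None):
    """Yield (path, size_bytes) for regular files idle for at least min_idle_days."""
    now = time.time() if now is None else now
    cutoff = now - min_idle_days * 86_400  # seconds per day
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):  # skip broken links and special files
                continue
            st = os.stat(path)
            if st.st_atime < cutoff:
                yield path, st.st_size


def plan_migration(root, min_idle_days=90):
    """Return candidates largest first, so migration frees the most disk early."""
    return sorted(migration_candidates(root, min_idle_days),
                  key=lambda item: item[1], reverse=True)
```

Calling `plan_migration("/some/project/dir")` returns a list of `(path, size)` pairs; a real HSM setup would express the same condition as a policy rule and let the file system perform the move.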
Diagram labels: Disk Pool; TS3500 (x2); NSD Server (x2); Spectrum Protect for Space Management; Spectrum Archive EE; TSM Server; Spectrum Scale Data Hub Layer; Clients
23. #ibmedge
Secure environment for biomedical research
Encryption
- encryption of data at rest and on the network
- defined via policies
- possibility of fine-grained access groups
- encryption keys managed by key management software (IBM SKLM)
- integration with general research infrastructure
- suited for biomedical data and processing
Diagram labels: SKLM; Secure research environment; Login; HPC Cluster; NSD Server (x2); General research environment; Clients
24. #ibmedge
Summary
Diagram labels: SONAS; NSD Server; Spectrum Scale Data Hub Layer; TSM-HSM; LTFS-EE; HPC Cluster; SKLM; Secure research environment; Login; Remote Site; Remote Cluster
Departments and facilities shown: Biomedical Research; Life Sciences Department; Physics Department; Chemistry Department; Psychology Department; Microscopy Facility; Economy Department; Genome Sequencing Facility; …
Features shown: AFM; CES: CIFS, NFS; Encryption; ILM, HSM; LROC
25. #ibmedge
Spectrum Scale User Group
• The Spectrum Scale User Group is free to join and open to all using, interested in using or integrating Spectrum Scale.
• Join the User Group activities to meet your peers and get access to experts from partners and IBM.
• Next meetings:
- APAC: October 14, Melbourne
- Global at SC16 : November 13 1pm to 5pm, Salt Lake City
• Web page: http://www.spectrumscale.org/
• Presentations: http://www.spectrumscale.org/presentations/
• Mailing list: http://www.spectrumscale.org/join/
• Contact: http://www.spectrumscale.org/committee/
• Meet Bob Oesterlin (US Co-Principal) at Edge2016: Robert.Oesterlin@nuance.com
26. #ibmedge
Session: Futures of IBM Spectrum Scale
NDA & Customers ONLY
• Who: IBM Spectrum Scale Offering Management
– Carl Zetie, Ron Riffe
• When: Tuesday, September 20, 2016
– 1pm to 2pm
• Where: MGM Grand, Signature Tower 3
– Meeting Room D
• Contact (if any questions)
– douglasof@us.ibm.com, cmukhya@us.ibm.com
27. #ibmedge
Session: How to apply Flash benefits to big data analytics and unstructured data
NDA & Customers ONLY
• Who: IBM Elastic Storage Server Offering Management
– Alex Chen
• When: Thursday, September 22, 2016
– 1:15pm to 2:15pm
• Where: Grand Garden Arena, Lower Level, MGM, Studio 10
• Contact (if any questions)
– cmukhya@us.ibm.com, douglasof@us.ibm.co
28. #ibmedge
Spectrum Scale Trial VM
• Download the IBM Spectrum Scale Trial VM from :
• http://www-03.ibm.com/systems/storage/spectrum/scale/trial.html
29. #ibmedge
Spectrum Scale Edge – Technical Sessions
• Just search for “Spectrum Scale” in the IBM Events mobile app. There are 15+ sessions on various topics, including lab sessions.
Lab Sessions:
• Spectrum Scale Problem Determination Lab
Date: Sept 20th 2:15 PM – 3:15 PM
Location: MGM Grand, Room 317 Lab Center F
• Spectrum Scale Trial VM Lab
Date: Sept 20th 3:45 PM – 4:45 PM
Location: MGM Grand, Room 317 Lab Center F
• Booth on ESS, Spectrum Scale + TCT and DeepFlash