This is a massive slide deck I used as the starting point for a 1.5 hour talk at the 2012 www.nerlscd.org conference. Mixture of old and (some) new slides from my usual stuff.
Talk slides from my annual address at the Bio-IT World Expo & Conference, where I cover trends, best practices and emerging pain points for life-science-focused HPC, scientific computing and "research IT".
Email "chris@bioteam.net" if you want a PDF copy of these slides. I've disabled the raw PowerPoint download option on SlideShare.
Taming Big Science Data Growth with Converged Infrastructure, by The BioTeam Inc.
2014 BioIT World Expo presentation
"Many of the largest NGS sites have identified IO bottlenecks as their number one concern in growing their infrastructure to support current and projected data growth rates. In this talk Aaron D. Gardner, Senior Scientific Consultant, BioTeam, Inc. will share real-world strategies and implementation details for building converged storage infrastructure to support the performance, scalability and collaborative requirements of today's NGS workflows."
For a copy of this presentation please email: chris@bioteam.net
2014 BioIT World - Trends from the trenches - Annual presentation, by Chris Dagdigian
Talk slides from the annual "trends from the trenches" address at BioITWorld Expo. 2014 Edition.
### Email chris@bioteam.net if you'd like a PDF copy of this deck ###
This is a custom "Bio IT trends/problems" deck that I did for a general but highly technical audience at the 2014 Internet2 Technology Exchange conference.
Download of the raw PPT is disabled; contact me at chris@bioteam.net if a direct copy or PDF of the presentation would be useful.
Bio-IT & Cloud Sobriety: 2013 Beyond The Genome Meeting, by Chris Dagdigian
October 2013 "Beyond the Genome" presentation slides. Talk is mostly focused on issues around IaaS cloud usage for "Bio-IT" and life science informatics & scientific computing.
PDF SLIDES AVAILABLE DIRECTLY - PLEASE EMAIL "CHRIS@BIOTEAM.NET" FOR SLIDES
BioIT World 2016 - HPC Trends from the Trenches, by Chris Dagdigian
As presented at BioIT World 2016. In one of the more popular presentations of the Expo, Chris delivers a candid assessment of the best, the worthwhile, and the most overhyped information technologies (IT) for life sciences. He’ll cover what has changed (or not) in the past year around infrastructure, storage, computing, and networks. This presentation will help you understand IT to build and support data intensive science.
Video link from the presentation: biote.am/bs
[Note: email chris@bioteam.net if you would like a PDF copy of this presentation]
Facilitating Collaborative Life Science Research in Commercial & Enterprise E..., by Chris Dagdigian
This is a talk I put together for a http://www.neren.org/ seminar called "Bridging the Gap: Research Facilitation". Tried to give a biotech/pharma view for a mostly academic audience.
This is a very short slide deck I did for a 10-minute slot on a http://pistoiaalliance.org/ webinar. The slides do not fully cover what I intend to talk about so if the webinar is recorded and available afterwards I'll update this description with the recording URL.
PDF copy of the slides available upon request ("chris@bioteam.net")
This was a 30 min talk intended as one of the opening/overview presentations before a full-day deep dive into ScienceDMZ design patterns and architectures.
Direct downloads are not enabled. Contact me directly (chris@bioteam.net) if you for some odd reason want a copy of this slide deck!
Annual address covering trends, emerging requirements, pain points and infrastructure issues in the "Bio-IT" aka life science informatics and HPC realm; Email me if you want a PDF of this talk - chris@bioteam.net
Big Data Meets HCI: How South African Insurance Provider King Price Gives Deve..., by Dana Gardner
Transcript of a discussion on how an insurance innovator built a modern hyperconverged infrastructure environment that rapidly replicates databases to accelerate developer agility.
BI isn't big data and big data isn't BI (updated), by Mark Madsen
Big data is hyped, but isn't hype. There are definite technical, process and business differences in the big data market when compared to BI and data warehousing, but they are often poorly understood or explained. BI isn't big data, and big data isn't BI. By distilling the technical and process realities of big data systems and projects we can separate fact from fiction. This session examines the underlying assumptions and abstractions we use in the BI and DW world, the abstractions that evolved in the big data world, and how they are different. Armed with this knowledge, you will be better able to make design and architecture decisions. The session is sometimes conceptual, sometimes detailed technical explorations of data, processing and technology, but promises to be entertaining regardless of the level.
Yes, it’s about the data normally called “big”, but it’s not Hadoop for the database crowd, despite the prominent role Hadoop plays. The session will be technical, but in a technology preview/overview fashion. I won’t be teaching you to write MapReduce jobs or anything of the sort.
The first part will be an overview of the types, formats and structures of data that aren’t normally in the data warehouse realm. The second part will cover some of the basic technology components, vendors and architecture.
The goal is to provide an overview of the extent of data available and some of the nuances or challenges in processing it, coupled with some examples of tools or vendors that may be a starting point if you are building in a particular area.
"Physiotherapy at home" considers a possible alternative to physiotherapy treatments at a healthcare facility.
Travelling to a healthcare facility can be expensive and time-consuming, and can create unnecessary exposure to other patients, among other drawbacks.
Our proposal is to use consumer technology to allow patients to receive some forms of physiotherapy treatment through the web. Team Albatross created a system where healthcare professionals create exercises targeting specific parts of the body and monitor patients performing these exercises in real time. We created this working prototype in 24 hours.
Team Albatross is George Goh and Liu Xiaohui.
IT Performance Management Handbook for CIOs, by Vikram Ramesh
Learn why measuring performance on individual devices and systems often leaves admins flying blind when it comes to SLA management and identifying performance bottlenecks. This in-depth e-Guide talks about how VirtualWisdom4 can give administrators a live, up-to-the-second view across the system-wide IT infrastructure.
Inria Tech Talk: boost the performance of your connected objects - Wednesday 2..., by FrenchTechCentral
Looking for a reliable, turnkey IoT solution for your applications? This Tech Talk is for you!
Inria, the national research institute dedicated to digital technology, connects entrepreneurs with the best of French public research at French Tech Central and invites you to a Tech Talk dedicated to IoT. It will be followed by a workshop on the subject at the TechShop at Station F.
The EVA team at the Inria Paris research center deploys more than 1000 sensors across 4 continents, in applications for connected agriculture (www.savethepeaches.com), smart cities (www.smartmarina.org) and environmental monitoring (www.snowhow.io).
Thomas Watteyne (a member of Inria's EVA team) will give you everything you need to understand this technology and reuse it in your own applications.
What are the benefits for your business? It is a reliable, turnkey IoT solution: it combines a proven mesh networking technology (99.999% end-to-end reliability, 10 years of battery life) with a cloud solution and a methodology for network deployment and monitoring, for unmatched performance. It can also be used for Industry 4.0. A must for your applications!
Top 5 Deep Learning and AI Stories - November 3, 2017, by NVIDIA
Read this week's top 5 news updates in deep learning and AI: Pentagon official says that AI and machine learning will revolutionize the US intelligence community; how AI could spot lung cancer faster; AI researchers can now access optimized deep learning framework containers through NVIDIA GPU Cloud; AI4ALL improves student access to AI resources by partnering with NVIDIA Deep Learning Institute; the Deep Learning Institute expands its courses to address the growing demand for AI talent.
Agents for Agility - The Just-in-Time Enterprise Has Arrived, by Inside Analysis
Hot Technologies with Krish Krishnan, Robin Bloor and EnterpriseWeb
Live Webcast Aug. 21, 2013
The demand for agility continues to motivate today's data-driven organizations. Competitors all over the globe are vying for faster time-to-insight, or even time-to-action. But there are other issues like governance and data quality that typically slow down key processes. Almost invariably, legacy systems that perform critical business processes are late to the party, resulting in enterprise inertia. However, a new wave of innovation is solving that problem by incorporating a late-binding approach for both analytics and operations.
Register for this episode of Hot Technologies to hear Analysts Krish Krishnan of Sixth Sense, and Dr. Robin Bloor of The Bloor Group, as they outline their competing visions for the architecture of a real-time enterprise. They'll be briefed by Dave Duggal of EnterpriseWeb, who will tout his company's platform for delivering robust enterprise functionality at the speed of the network. He'll discuss how EnterpriseWeb leverages the best ideas of service orientation, combined with intelligent agents that act as virtual hubs for the sharing of data, analytics, and mission-critical business processes.
Will the Cloud be your disaster, or will Cloud be your disaster recovery?, by Livingstone Advisory
Making real sense of enterprise Cloud computing in the context of your business is not always a trivial task. The volume, diversity and intensity of opinions on what cloud can do for your organization are relentless, as are the pressures to lower IT costs, speed up implementations, simplify enterprise IT and deliver more value in your own organizations.
Shifting your mission-critical systems to the cloud presents a formidable range of challenges for many organizations, not least of which is the potential loss of control over your disaster recovery capability. Conversely, keeping your enterprise IT systems where you can see them, and using the cloud to manage your backups and disaster recovery, may appear to run counter to the prevailing perception that the cloud is the ultimate destination for all IT systems.
In this presentation, Rob Livingstone will be covering off some of the key considerations of disaster recovery planning in the hybrid cloud environment and how, paradoxically, cloud could either be the cause of your disaster or has the potential to save you from one. He will be offering practical insights and tips on how you should approach the cloud when it comes to planning for the worst so that you come out looking your best.
The Big Data Scotland 2015 conference brought together business leaders and technologists from across the country to explore the value of Big Data & Analytics. The conference considered technological developments, market trends and business strategy; showcasing innovative examples of analytics being used effectively across a range of practical applications. The event offered a unique opportunity for technologists to come together for knowledge exchange, networking and debate.
Systematic Innovation in Software Using TRIZ, by Michael Kalika
Someone somewhere has already solved your problem or a very similar problem, and all we need to do is apply the same principle to the current problem and solve it similarly…
TRIZ is the Theory of the Resolution of Invention-related Tasks. It is a problem-solving, analysis and forecasting tool/framework derived from the study of patterns of invention in the global patent literature. It was developed in the USSR and “immigrated” to the West after the “perestrojka” period, in the 1990s. It is a well-structured inventive problem-solving approach which replaces the unsystematic trial-and-error method used in the search for solutions. This helps in overcoming psychological inertia and “stuckness” which can impede reaching the best possible design.
As leaders, we are often facilitating discussions as a part of designing new products, architectures, system design or problem solving.
In this lecture you will learn what TRIZ is and how to apply its fundamental principles in the software engineering and architecture world.
On 7th January 2016 Crafitti was invited to deliver a talk at Dr. Reddy's Labs Innovation Day on "ALVIS for Innovation and Decision making". The talk was well received and these slides were used.
Slides used to guide the discussion during MESA workshop at ARC Europe Industry Forum in Amsterdam, March 3rd, 2016.
Includes notes of the discussion. Subjects: challenges in MOM/MES: complexity related to supply chain, manufacturing and new product introduction. Organizational alignment and governance.
How Sweco creates operationally high performance buildings and reduce client..., by Carita Kottila
Sweco Architects and Sweco PM’s Key Note Speech - Architectural Best Practices, BIM and Construction Management in Demanding Hospital Projects – in the global Graphisoft Key Customer Conference KCC 2014, June 5-7, 2014.
The Expansive Hospital is a board game that helps to understand the challenges of collaboration among construction and healthcare experts. The elements of the game are based on a set of interviews conducted with construction and healthcare professionals in the Region of Twente, Netherlands.
1- Mobile ad hoc networks are formed dynamically by an autonomous system of mobile nodes that are connected via wireless links.
2- Multihop communication: nodes communicate from source to destination with the help of two or more other nodes.
3- No existing fixed infrastructure or centralized administration: there is no base station.
4- Mobile nodes are free to move randomly, so the network topology changes frequently.
5- May operate standalone or be connected to the larger Internet.
6- Each node works as a router.
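The characteristics above (multihop forwarding, every node acting as a router, a topology that changes as nodes move) can be illustrated with a minimal Python sketch. It is not a real MANET routing protocol: the node names, coordinates, radio range and the `neighbors` and `find_route` helpers are all hypothetical, and a plain breadth-first search stands in for route discovery.

```python
from collections import deque
from math import dist


def neighbors(positions, radio_range):
    """Build the current link map: two nodes are linked if within radio range."""
    return {
        a: [b for b in positions
            if b != a and dist(positions[a], positions[b]) <= radio_range]
        for a in positions
    }


def find_route(positions, radio_range, source, destination):
    """Breadth-first search for a fewest-hops route; None if unreachable."""
    links = neighbors(positions, radio_range)
    previous = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == destination:
            # Walk the predecessor chain back to the source.
            route = []
            while node is not None:
                route.append(node)
                node = previous[node]
            return route[::-1]
        for nxt in links[node]:
            if nxt not in previous:
                previous[nxt] = node
                queue.append(nxt)
    return None


# Hypothetical node positions (arbitrary units) and radio range.
positions = {"A": (0, 0), "B": (1, 0), "C": (2, 0), "D": (3, 0)}
print(find_route(positions, 1.5, "A", "D"))  # B and C relay for A

# Node C moves away: the topology changes and the old route disappears.
positions["C"] = (2, 5)
print(find_route(positions, 1.5, "A", "D"))
```

Because there is no base station, the route is recomputed from whatever links exist at that moment, so the same source and destination can be connected at one instant and unreachable the next.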
Primary Goals of Security in MANET
To assure reliable data transfer over the communication network and to protect system resources, security services are classified into five categories:
1- Authentication: the process of identifying an individual, usually based on a username and password.
2- Confidentiality: confidentiality aims at protecting data from disclosure to unauthorized persons.
Network attacks against confidentiality
* Packet capturing
* Password attack
* Port scanning
* Dumpster diving
* Wiretapping
* Phishing and pharming
3- Non-repudiation: non-repudiation ensures that the sender of a message cannot later deny having sent it.
4- Integrity: integrity guarantees that a message being transferred is never corrupted.
Network attacks against integrity
* Salami attack
* Trust relationship attacks
* Man-in-the-middle attack
* Session hijacking attacks
5- Availability: it ensures that data, network resources and network services are available to legitimate users when required.
Network attacks against availability
* Denial of service attacks
* Distributed denial of service attacks
* SYN flood attacks and ICMP flood attacks
* Electrical power attacks
* Server room environment attacks
Key management
Security in networking is in many cases dependent on proper key management. Key management consists of various services, each of which is vital for the security of the networking system.
* Trust model: it must be determined how much the different elements in the network can trust each other.
* Cryptosystem: public-key and symmetric-key mechanisms can be applied.
* Key creation: it must be determined which parties are allowed to generate keys for themselves.
* Key storage: in an ad hoc network any network element may have to store its own key and possibly the keys of other elements as well.
* Key distribution: the key management service must ensure that the generated keys are securely distributed to their owners.
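A minimal Python sketch of the key creation, key storage and key distribution services listed above, using a symmetric key and an HMAC tag to detect a corrupted message. The `Node` class and the `create_key`, `distribute_key`, `sign` and `verify` helpers are hypothetical names for illustration. Distribution is simulated by handing the same key to both nodes in one process, whereas a real ad hoc network would need a secure channel or a public-key exchange for that step.

```python
import hashlib
import hmac
import secrets


class Node:
    """A network element that stores the keys it shares with its peers."""

    def __init__(self, name):
        self.name = name
        self.keys = {}  # key storage: peer name -> shared symmetric key


def create_key(num_bytes=32):
    """Key creation: draw a random symmetric key from the OS entropy source."""
    return secrets.token_bytes(num_bytes)


def distribute_key(a, b):
    """Key distribution (simulated): both nodes end up holding the same key."""
    key = create_key()
    a.keys[b.name] = key
    b.keys[a.name] = key


def sign(key, message):
    """Compute an HMAC-SHA256 tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key, message, tag):
    """Constant-time check that the tag matches the message."""
    return hmac.compare_digest(sign(key, message), tag)


alice, bob = Node("alice"), Node("bob")
distribute_key(alice, bob)

message = b"route update: A -> B -> C"
tag = sign(alice.keys["bob"], message)

print(verify(bob.keys["alice"], message, tag))           # untouched message
print(verify(bob.keys["alice"], message + b"!", tag))    # tampered message
```

A shared-key HMAC covers integrity and tells the receiver that the message came from a holder of the key. It does not give non-repudiation, because either party could have produced the tag; that goal needs public-key signatures.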
2nd Qatar BIM User Day Document Control and Collaboration Technologies, by BIM User Day
Speaker: Jeremy Shulman, Aconex Qatar
- Which technologies are changing the way project data is managed and distributed?
- What challenges are faced in adopting them?
About the Qatar BIM User Day:
Qatar University, HOCHTIEF ViCon and Teesside University proudly take the initiative to facilitate modern and innovative methods in the Gulf construction industry. The focus is Building Information Modeling (BIM), and our aim is to establish a knowledge platform with government, research and industry experts. The User Day aims to help people to share knowledge, discuss new technologies, and identify new potentials for BIM. More information: www.bimuserday.com Follow BUD on Twitter @bimuserday
What does BIM mean to a maintenance technician? Beyond the hype, a practical experience
Mr. Simon Ng - Associate Director, WSP Hong Kong Limited
Mr. Bruno Lhopiteau – General Manager, Siveco China Limited
HKIBIM-CIC BIM Conference 2015
Date: 19-Nov-2015 (Thu)
Time: 9:00 a.m. – 5:00 p.m.
Venue: Theatre 2, Level 1, Hong Kong Convention and Exhibition Centre, Wan Chai, Hong Kong
Organizers:
The Hong Kong Institute of Building Information Modelling (HKIBIM)
http://www.hkibim.org
Co-organizer:
The Construction Industry Council
http://www.hkcic.org
The HKIBIM - CIC BIM Conference 2015 is the 6th Annual Conference organized by the Hong Kong Institute of Building Information Modelling (HKIBIM). It is the premier annual event for experienced AEC professionals to demonstrate the practical use of Building Information Modelling (BIM) processes using real cases. The speakers will illustrate lessons learned in practical projects so that others can improve their efficient use of BIM and advance practical knowledge.
Fujitsu & M2SYS Webinar - How Palm Vein Biometrics Can Strengthen PCI and Wor..., by M2SYS Technology
These slides are presented by Fujitsu and M2SYS Technology and cover how palm vein biometrics can help to strengthen PCI and Workforce Management (WFM) compliance. Topics covered include the cost of PCI and WFM non-compliance, how palm vein biometrics works and how this technology can help to strengthen compliance and lower costs for a business.
The Benefits of Using a Biometric Timeclock in Workforce Management, by M2SYS Technology
Biometric time clocks offer strong advantages to employers who are seeking to reduce labor costs, increase accountability, reduce buddy punching, and increase employee productivity.
Learn more about what advantages a biometric time clock can offer and why it will result in a strong return on investment (ROI).
User generated data is an old problem. Systems and network telemetry, page analytics and application state combine to form an ever growing mountain of data collected by today's tools. Collecting and storing this data requires more than just a single application. Having no single point where the user touches the system and gets an answer makes debugging a nightmare and reproducing the error intractable. Distributed systems require a clear perspective on production systems and access to data in real time to have any hope of solving complex problems related to state, all while not impacting user experience.
We will explain the problem, the pains and how we solved them. Develop in production; push code to development.
Softlayer Bluemix User Summit 2015 KeynoteJesse Proudman
Keynote presentation from Softlayer Bluemix User Summit 2015 hosted in Tokyo, Japan. A history of OpenStack distributions, an introduction to Private Cloud as a Service, and the Blue Box / IBM Cloud story.
In this session we’ll leave the need for performance a foregone conclusion and take a whirlwind tour through the complexity of modern Internet architectures. The complexities lead to evil optimization problems and significant challenges troubleshooting production issues to a speedy and successful end.
Starting with the simple facts that you can’t fix what you can’t see and you can’t improve what you can’t measure, we’ll discuss what needs monitoring and why. We’ll talk about unlikely allies in the fight for time and budget to instrument systems, applications and processes for observability.
You’ll leave the session with a better understanding of what it looks like to troubleshoot the storm of a malfunctioning large architecture and some tools and techniques you can use to not be swallowed by the Kraken.
Can we hack open source cloud platforms to help reduce emissions? cloudstack ...Tom Raftery
To overcome the lack of transparency around energy and emissions in the cloud space, we need to hack the open source cloud platforms, and write that transparency in.
Bio-IT Trends From The Trenches (digital edition)Chris Dagdigian
Note: Contact me directly dag@bioteam.net if you would like a PDF download of these slides
This is Chris Dagdigian’s 10th year delivering his no holds barred, candid state of the industry address at BioIT World, and we are not going to let a pandemic stop him.
Instead of his typical talk, five distinguished panelists will join Chris for a spirited discussion on Current Events and Scientific Computing and the impacts of the COVID-19 Pandemic:
Tiny slide deck from a 5-min lightning talk covering a recent project involving live replication of 2-petabytes of scientific data.
Please leave feedback if you'd like to see this as a long-form technical blog article or conference talk, thanks!
Cloud Sobriety for Life Science IT Leadership (2018 Edition)Chris Dagdigian
Candid/blunt AWS advice for research IT and life science IT leadership. Hard lessons learned from many years of AWS consulting. Contact dag@bioteam.net if you want a PDF copy of this presentation
BioITWorld 2013 presentation - Best practices for building multi-tenant HPC clusters for Pharma/BioTech
Essentially a mini case study of a recent deployment of a multi-petabyte, 1000+ CPU core Linux cluster in the Boston area.
Please email me at: chris@bioteam.net if you would like the actual PDF file itself.
20 slides for a 10 minute talk!
Short presentation for the 2012 Amazon Web Services re:Invent conference. This briefly covers some Computer Aided Engineering (CAE) simulation work done on Amazon using CST Studio software. Interesting Linux/Windows/Cloudbursting use case.
(PDF available upon request). This is an updated version of the 2012 BioITWorld Boston talk that I gave 6 weeks later at Bio IT World Asia in June 2012. Some slide content was updated and revised and I also deleted a number of slides in an attempt to shorten the talk since I'm known to speak fast. There was legit concern I'd be unintelligible to non-native English speakers!
Talk slides as delivered at the 2012 Bio-IT World Conference in Boston, MA
This is my annual "state of the state" address that has become somewhat popular.
A presentation given at the 2011 Amazon AWS Genomics meeting held in Seattle, WA.
This is a 30 minute talk I gave focusing mainly on practical tools, tips and methods for bootstrapping and orchestration on the cloud.
Covers examples of:
Ubuntu Cloud Init
AWS Cloud Formation
Opscode Chef
MIT StarCluster
Mapping Life Science Informatics to the CloudChris Dagdigian
Infrastructure cloud platforms such as those offered by Amazon Web Services are not designed and built with scientific research as the primary use case. These presentation slides cover the current state of mapping life science research and HPC technique onto “the cloud” and how to work around the common engineering, orchestration and data movement problems.
[Note: I've replaced the 2011 version of this talk deck with a slightly updated version as delivered at the AIRI Petabyte Challenge Meeting]
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Elevating Tactical DDD Patterns Through Object CalisthenicsDorra BARTAGUIZ
After immersing yourself in the blue book and its red counterpart, attending DDD-focused conferences, and applying tactical patterns, you're left with a crucial question: How do I ensure my design is effective? Tactical patterns within Domain-Driven Design (DDD) serve as guiding principles for creating clear and manageable domain models. However, achieving success with these patterns requires additional guidance. Interestingly, we've observed that a set of constraints initially designed for training purposes remarkably aligns with effective pattern implementation, offering a more ‘mechanical’ approach. Let's explore together how Object Calisthenics can elevate the design of your tactical DDD patterns, offering concrete help for those venturing into DDD for the first time!
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
UiPath Test Automation using UiPath Test Suite series, part 4DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 4. In this session, we will cover Test Manager overview along with SAP heatmap.
The UiPath Test Manager overview with SAP heatmap webinar offers a concise yet comprehensive exploration of the role of a Test Manager within SAP environments, coupled with the utilization of heatmaps for effective testing strategies.
Participants will gain insights into the responsibilities, challenges, and best practices associated with test management in SAP projects. Additionally, the webinar delves into the significance of heatmaps as a visual aid for identifying testing priorities, areas of risk, and resource allocation within SAP landscapes. Through this session, attendees can expect to enhance their understanding of test management principles while learning practical approaches to optimize testing processes in SAP environments using heatmap visualization techniques
What will you get from this session?
1. Insights into SAP testing best practices
2. Heatmap utilization for testing
3. Optimization of testing processes
4. Demo
Topics covered:
Execution from the test manager
Orchestrator execution result
Defect reporting
SAP heatmap example with demo
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties – USA
Expansion of bot farms – how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if something changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
Essentials of Automations: The Art of Triggers and Actions in FMESafe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionAggregage
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
PHP Frameworks: I want to break free (IPC Berlin 2024)Ralf Eggert
In this presentation, we examine the challenges and limitations of relying too heavily on PHP frameworks in web development. We discuss the history of PHP and its frameworks to understand how this dependence has evolved. The focus will be on providing concrete tips and strategies to reduce reliance on these frameworks, based on real-world examples and practical considerations. The goal is to equip developers with the skills and knowledge to create more flexible and future-proof web applications. We'll explore the importance of maintaining autonomy in a rapidly changing tech landscape and how to make informed decisions in PHP development.
This talk is aimed at encouraging a more independent approach to using PHP frameworks, moving towards a more flexible and future-proof approach to PHP development.
Bio-IT for Core Facility Managers
1. Bio-IT For Core Facility Leaders
Tips, Tricks & Trends
2012 NERLSCD Meeting - www.nerlscd.org
1
Wednesday, October 31, 12
2. Intro 1
Meta-Issues (The Big Picture) 2
Infrastructure Tour 3
Compute & HPC 4
Storage 5
Cloud & Big Data 6
3. I’m Chris.
I’m an infrastructure geek.
I work for the BioTeam.
@chris_dag
4. BioTeam
Who, what & why
‣ Independent consulting shop
‣ Staffed by scientists forced to
learn IT, SW & HPC to get our
own research done
‣ 12+ years bridging the “gap”
between science, IT & high
performance computing
‣ www.bioteam.net
5. Listen to me at your own risk
Seriously.
‣ Clever people find multiple
solutions to common issues
‣ I’m fairly blunt, burnt-out and
cynical in my advanced age
‣ Significant portion of my work
has been done in demanding
production Biotech & Pharma
environments
‣ Filter my words accordingly
6. Intro 1
Meta-Issues (The Big Picture) 2
Infrastructure Tour 3
Compute & HPC 4
Storage 5
Cloud & Big Data 6
7. Meta-Issues
Why you need to track this stuff ...
8. Big Picture
Why this stuff matters ...
‣ HUGE revolution in the rate at which lab instruments are
being redesigned, improved & refreshed
• Example: CCD sensor upgrade on that confocal
microscopy rig just doubled your storage requirements
• Example: That 2D ultrasound imager is now a 3D imager
• Example: Illumina HiSeq upgrade just doubled the rate at
which you can acquire genomes. Massive downstream
increase in storage, compute & data movement needs
9. The Central Problem Is ...
‣ Instrumentation & protocols are changing FAR FASTER
than we can refresh our Research-IT & Scientific
Computing infrastructure
• The science is changing month-to-month ...
• ... while our IT infrastructure only gets refreshed every 2-7
years
‣ We have to design systems TODAY that can support
unknown research requirements & workflows over many
years (gulp ...)
10. The Central Problem Is ...
‣ The easy period is over
‣ 5 years ago you could toss inexpensive storage and
servers at the problem; even in a nearby closet or under
a lab bench if necessary
‣ That does not work any more; IT needs are too extreme
‣ 1000 CPU Linux clusters and petascale storage are the
new normal; try fitting THAT in a closet!
11. The Take Home Lesson
What core facility leadership needs to understand
‣ The incredible rate of cost decreases & capability gains
seen in the lab instrumentation space is not mirrored
everywhere
‣ As gear gets cheaper/faster, scientists will simply do
more work and ask more questions. Nobody simply
banks the financial savings when an instrument gets
50% cheaper -- they just buy two of them!
‣ IT technology is not improving at the same rate; we also
can’t change our IT infrastructures all that rapidly
12. If you get it wrong ...
‣ Lost opportunity
‣ Frustrated & very vocal researchers
‣ Problems in recruiting
‣ Publication problems
13. Intro 1
Meta-Issues (The Big Picture) 2
Infrastructure Tour 3
Compute & HPC 4
Storage 5
Cloud & Big Data 6
14. Infrastructure Tour
What does this stuff look like?
39. Real world screenshot from earlier this month
16 monster compute nodes + 22 GPU nodes
Cost? 30 bucks an hour via AWS Spot Market
Yep. This counts.
46. Intro 1
Meta-Issues (The Big Picture) 2
Infrastructure Tour 3
Compute & HPC 4
Storage 5
Cloud & Big Data 6
47. Compute
Actually the easy bit ...
48. Compute Power
Not a big deal in 2012 ...
‣ Compute power is largely a solved problem
‣ It’s just a commodity
‣ Cheap, simple & very easy to acquire
‣ Let’s talk about what you need to know ...
49. Compute Trends
Things you should be tracking ...
‣ Facility Issues
‣ “Fat Nodes” replacing Linux Clusters
‣ Increasing presence of serious “lab-local” IT
50. Facility Stuff
‣ Compute & storage
requirements are getting
larger and larger
‣ We are packing more “stuff”
into smaller spaces
‣ This increases (radically)
electrical and cooling
requirements
51. Facility Stuff - Core issue
‣ Facility & power issues can
take many months or years to
address
‣ Sometimes it may be
impossible to address (new
building required ...)
‣ If research IT footprint is
growing fast; you must be well
versed in your facility
planning/upgrade process
52. Facility Stuff - One more thing
‣ Sometimes central IT will begin
facility upgrade efforts without
consulting with research users
• This was the reason behind one of
our more ‘interesting’ projects in
2012
‣ ... a client was weeks away from
signing off on a $MM datacenter
which would not have had enough
electricity to support current
research & faculty recruiting
commitments
54. Fat Nodes - 1 box replacing a cluster
‣ This server has 64 CPU Cores
‣ .. and up to 1TB of RAM
‣ Fantastic Genomics/Chemistry
system
• A 256GB RAM version only
costs $13,000
‣ These single systems are
replacing small clusters in
some environments
55. Fat Nodes - Clever Scale-out Packaging
‣ This 2U chassis contains 4
individual servers
‣ Systems like this get near
“blade” density without
the price premium seen
with proprietary blade
packaging
‣ These “shrink” clusters in
a major way or replace
small ones
57. “Serious” IT now in your wet lab ...
‣ Instruments used to ship with a
Windows PC “instrument
control workstation”
‣ As instruments get more
powerful the “companion”
hardware is starting to scale-up
‣ End result: very significant stuff
that used to live in your
datacenter is now being rolled
into lab environments
58. “Serious” IT now in your wet lab ...
‣ You may be surprised what
you find in your labs in ’12
‣ ... can be problematic for a
few reasons ...
1. IT support & backup
2. Power & cooling
3. Noise
4. Security
59. Networking
Also not particularly worrisome ...
60. Networking
‣ Networking is also not super complicated
‣ It’s also fairly cheap & commoditized in ’12
‣ There are three core uses for networks:
1. Communication between servers & services
2. Message passing within a single application
3. Sharing files and data between many clients
61. Networking 1 - Servers & Services
‣ Ethernet. Period. Enough said.
‣ Your only decision is between 10-Gig and 1-Gig ethernet
‣ 1-Gig Ethernet is pervasive and dirt cheap
‣ 10-Gig Ethernet is getting cheaper and on its way to
becoming pervasive
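A rough way to put numbers on the 1-Gig vs 10-Gig decision is to work out how long a dataset takes to cross the wire. The sketch below assumes ideal line rate with zero protocol overhead and a decimal terabyte; the function name and the 1 TB example size are illustrative and not from the talk, and real-world throughput will be lower.

```python
def transfer_seconds(size_bytes: float, link_gbit_per_s: float) -> float:
    """Ideal transfer time at full line rate (no protocol overhead)."""
    bits = size_bytes * 8
    return bits / (link_gbit_per_s * 1e9)


ONE_TB = 1e12  # decimal terabyte in bytes (assumed example size)

for gbit in (1, 10):
    secs = transfer_seconds(ONE_TB, gbit)
    print(f"{gbit:>2}-Gig: {secs:,.0f} s ({secs / 3600:.1f} h)")
```

At these idealized rates a terabyte needs a couple of hours on 1-Gig and a little over thirteen minutes on 10-Gig, which is why the faster links tend to go to file servers and other "special" machines first.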
62. Networking 1 - Ethernet
‣ Everything speaks ethernet
‣ 1-Gig is still the common interconnect for most things
‣ 10-Gig is the standard now for the “core”
‣ 10-Gig is the standard for top-of-rack and “aggregation”
‣ 10-Gig connections to “special” servers are the norm
63. Networking 2 - Message Passing
‣ Parallel applications can span many servers at once
‣ Communicate/coordinate via “message passing”
‣ Ethernet is fine for this but has a somewhat high latency
between message packets
‣ Many apps can tolerate Ethernet-level latency; some
applications clearly benefit from a message passing
network with lower latency
‣ There used to be many competing alternatives
‣ Clear 2012 winner is “Infiniband”
64. Networking 2 - Message Passing
‣ The only things you need to know ...
‣ Infiniband is an expensive networking alternative that
offers much lower latency than Ethernet
‣ You would only pay for and deploy an IB fabric if you had
an application or use case that requires it.
‣ No big deal. It’s just “another” network.
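Whether an application "requires it" mostly comes down to message size and count. A minimal cost model, assuming each message pays a fixed latency plus its wire time, shows why chatty parallel codes care and bulk transfers do not. The latency and bandwidth figures below are hypothetical placeholders chosen for illustration, not measurements of any real fabric.

```python
def exchange_seconds(n_messages, message_bytes, latency_s, gbit_per_s):
    """Time for n back-to-back messages: per-message latency plus wire time."""
    wire = message_bytes * 8 / (gbit_per_s * 1e9)
    return n_messages * (latency_s + wire)


# Hypothetical figures for illustration only; same bandwidth on both
# so that the only difference is per-message latency.
ETHERNET = {"latency_s": 50e-6, "gbit_per_s": 10}
LOW_LATENCY = {"latency_s": 2e-6, "gbit_per_s": 10}

workloads = [
    ("chatty: 1M x 1 KB", 1_000_000, 1_000),
    ("bulk: 100 x 1 GB", 100, 1_000_000_000),
]

for label, n, size in workloads:
    eth = exchange_seconds(n, size, **ETHERNET)
    low = exchange_seconds(n, size, **LOW_LATENCY)
    print(f"{label}: ethernet-like {eth:.2f} s, low-latency {low:.2f} s")
```

Under these assumptions the chatty workload is dominated by latency while the bulk workload barely notices it. Plugging in measured numbers from your own applications is the honest way to decide.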
65. Networking 3 - File Sharing
‣ For ‘Omics this is the primary focus area
‣ Overwhelming need for shared read/write access to files
and data between instruments, HPC environment and
researcher desktops
‣ In HPC environments you will often have a separate
network just for file sharing traffic
66. Networking 3 - File Sharing
‣ Generic file sharing uses familiar NFS or Windows fileshare
protocols. No big deal
‣ Always implemented over Ethernet although often a mixture
of 10-Gig and 1-Gig connections
• 10-Gig connections to the file servers, storage and edge switches;
1-gig connections to cluster nodes and user desktops
‣ Infiniband also has a presence here
• Many “parallel” or “cluster” filesystems may talk to the clients
via NFS-over-ethernet but internally the distributed components
may use a private Infiniband network for metadata and
coordination.
67. Storage.
(the hard bit ...)
68. Storage
Setting the stage ...
‣ Life science is generating torrents of data
‣ Size and volume often dwarf all other research areas -
particularly with Bioinformatics & Genomics work
‣ Big/Fast storage is not cheap and is not commodity
‣ There are many vendors and many ways to spectacularly
waste tons of money
‣ And we still have an overwhelming need for storage that
can be shared concurrently between many different
users, systems and clients
69. Life Science “Data Deluge”
‣ Scare stories and shocking graphs getting tiresome
‣ We’ve been dealing with terabyte-scale lab instruments
& data movement issues since 2004
• And somehow we’ve managed to survive ...
‣ Next few slides
• Try to explain why storage does not stress me out all that
much in 2012 ...
70. The sky is not falling.
1. You are not the Broad Institute or Sanger Center
‣ Overwhelming majority of us do not operate at Broad/
Sanger levels
• These folks add 200+ TB a week in primary storage
‣ We still face challenges but the scale/scope is well
within the bounds of what traditional IT technologies can
handle
‣ We’ve been doing this for years
• Many vendors, best practices, “war stories”, proven methods
and just plain “people to talk to…”
71. The sky is not falling.
2. Instrument Sanity Beckons
‣ Yesteryear: Terascale .TIFF Tsunami
‣ Yesterday: RTA, in-instrument data reduction
‣ Today: Basecalls, BAMs & Outsourcing
‣ Tomorrow: Write directly to the cloud
72. The sky is not falling.
3. Peta-scale storage is not really exotic or unusual any more.
‣ Peta-scale storage has not been a risky exotic technology
gamble for years now
• A few years ago you’d be betting your career
‣ Today it’s just an engineering & budget exercise
• Multiple vendors don’t find petascale requirements particularly
troublesome and can deliver proven systems within weeks
• $1M (or less in ’12) will get you 1PB from several top vendors
‣ However, still HARD to do BIG, FAST & SAFE
• Hard but solvable; many resources & solutions out there
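To make the "engineering & budget exercise" concrete, here is a back-of-the-envelope sketch using the slide's rough figure of $1M per petabyte as a ceiling. It assumes decimal units (1 PB = 1000 TB) and raw acquisition cost only (no power, cooling, backup or staff); the 200 TB/week input is the Broad/Sanger-scale growth rate quoted earlier in this deck.

```python
PRICE_PER_PB_USD = 1_000_000  # the slide's rough 2012 ceiling ("$1M or less")
TB_PER_PB = 1_000             # decimal units assumed


def cost_usd(terabytes: float) -> float:
    """Acquisition cost at the flat per-petabyte price above."""
    return terabytes * PRICE_PER_PB_USD / TB_PER_PB


print(f"1 TB: ${cost_usd(1):,.0f}")
print(f"One year at 200 TB/week: ${cost_usd(200 * 52):,.0f}")
```

At that price a terabyte is about a thousand dollars, and a year of growth at 200 TB a week (roughly 10.4 PB) lands in the ten-million-dollar range, which is a good reminder of why most of us should be glad not to operate at that scale.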
73. On the other hand ...
74. OMG! The Sky Is Falling!
Maybe a little panic is appropriate ...
75. The sky IS falling!
1. Those @!*#&^@ Scientists ...
‣ As instrument output declines …
‣ Downstream storage consumption by
end-user researchers is increasing
rapidly
‣ Each new genome generates new
data mashups, experiments, data
interchange conversions, etc.
‣ MUCH harder to do capacity planning
against human beings vs.
instruments
76. The sky IS falling!
2. @!*#&^@ Scientific Leadership ...
‣ Sequencing is already a
commodity
‣ NOBODY simply banks the
savings
‣ EVERYBODY buys or does
more
77. The sky IS falling!
Gigabases vs. Moore’s Law
OMG!! BIG SCARY GRAPH
[Graph; x-axis: 2007 to 2012]
78. The sky IS falling!
3. Uncomfortable truths
‣ Cost of acquiring data (genomes)
falling faster than rate at which
industry is increasing drive capacity
‣ Human researchers downstream of
these datasets are also consuming
more storage (and less predictably)
‣ High-scale labs must react or
potentially have catastrophic issues
in 2012-2013
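The mismatch between the two growth curves can be sketched as a simple capacity-planning check: if data doubles faster than the storage you can afford, when do the lines cross? Everything numeric below is a hypothetical input for illustration (the starting sizes and both doubling times are assumptions, not figures from the talk), and the function name is made up for this sketch.

```python
def months_until_full(data_tb, capacity_tb, data_doubling_months,
                      capacity_doubling_months, horizon_months=120):
    """First whole month in which data exceeds capacity, or None if never
    within the horizon. Both quantities grow exponentially."""
    for month in range(1, horizon_months + 1):
        data = data_tb * 2 ** (month / data_doubling_months)
        capacity = capacity_tb * 2 ** (month / capacity_doubling_months)
        if data > capacity:
            return month
    return None


# Hypothetical lab: 100 TB of data on 500 TB of storage, data doubling
# every 6 months, affordable capacity doubling every 18 months.
print(months_until_full(100, 500, 6, 18))
```

The useful part is less the exact answer than how sensitive it is to the data doubling time, which is the number driven by human researchers and therefore the hardest one to predict.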
79. The sky IS falling!
5. Something will have to break ...
‣ This is not sustainable
• Downstream consumption
exceeding instrument data
reduction
• Commoditization yielding
more platforms
• Chemistry moving faster
than IT infrastructure
• What the heck are we
doing with all this
sequence?
81. The sky IS falling!
CRAM it in 2012 ...
‣ Minor improvements are useless; order-of-magnitude needed
‣ Some people are talking about radical new methods –
compressing against reference sequences and only storing the
diffs
• With a variable compression “quality budget” to spend on
lossless techniques in the areas you care about
‣ http://biote.am/5v - Ewan Birney on “Compressing DNA”
‣ http://biote.am/5w - The actual CRAM paper
‣ If CRAM takes off, storage landscape will change
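A toy sketch of the "store only the diffs" idea follows. It is not the CRAM format: it handles substitutions only (no indels, no quality scores), and the function names and the sample sequences are made up for illustration.

```python
def diff_against_reference(reference: str, read: str, position: int) -> dict:
    """Encode an aligned read as position, length and mismatching bases only."""
    window = reference[position:position + len(read)]
    if len(window) != len(read):
        raise ValueError("read runs past the end of the reference")
    mismatches = [
        (offset, base)
        for offset, (ref_base, base) in enumerate(zip(window, read))
        if ref_base != base
    ]
    return {"pos": position, "len": len(read), "mismatches": mismatches}


def restore_read(reference: str, record: dict) -> str:
    """Rebuild the original read from the reference plus the stored diffs."""
    start = record["pos"]
    bases = list(reference[start:start + record["len"]])
    for offset, base in record["mismatches"]:
        bases[offset] = base
    return "".join(bases)


# Hypothetical data: a 12-base read that differs from the reference at one base.
reference = "ACGTACGTTAGCCGATAGGCTTAACGGA"
read = "TAGCCGATCGGC"
record = diff_against_reference(reference, read, 8)
print(record)
```

The stored record holds one position, one length and one substitution instead of twelve bases; the saving grows with read length and with how closely the sample matches the reference.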
82. What comes next?
Next 18 months will be really fun...
83. What comes next.
The same rules apply for 2012 and beyond ...
‣ Accept that science changes faster than IT infrastructure
‣ Be glad you are not Broad/Sanger
‣ Flexibility, scalability and agility become the key
requirements of research informatics platforms
• Tiered storage is in your future ...
‣ Shared/concurrent access is still the overwhelming
storage use case
• We’ll still continue to use clustered, parallel and scale-out
NAS solutions
84. What comes next.
In the following year ...
‣ Many peta-scale capable systems deployed
• Most will operate in the hundreds-of-TBs range
‣ Far more aggressive “data triage”
• “.BAM only!”
‣ Genome compression via CRAM
‣ Even more data will sit untouched & unloved
‣ Growing need for tiers, HSM & even tape
85. What comes next.
In the following year ...
‣ Broad, Sanger and others will pave the way with respect
to metadata-aware & policy driven storage frameworks
• And we’ll shamelessly copy a year or two later
‣ I’m still on my cloud storage kick
• Economics are inescapable; Will be built into storage
platforms, gateways & VMs
• Amazon S3 is only an HTTP RESTful call away
• Cloud will become “just another tier”
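One way to picture cloud as "just another tier" is a placement policy keyed on how long a file has sat untouched. The sketch below is an assumption-laden illustration: the tier names and the 30 and 180 day thresholds are hypothetical, not recommendations from the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TierRule:
    name: str
    max_idle_days: float  # files idle no longer than this stay on the tier


# Hypothetical tiers, ordered from fastest to cheapest.
RULES = (
    TierRule("scale-out NAS", 30),
    TierRule("nearline disk", 180),
    TierRule("cloud object store", float("inf")),
)


def choose_tier(idle_days: float, rules=RULES) -> str:
    """Return the first tier whose idle-time limit covers the file."""
    for rule in rules:
        if idle_days <= rule.max_idle_days:
            return rule.name
    raise ValueError("rules must end with a catch-all tier")


for days in (3, 90, 1000):
    print(days, "days idle ->", choose_tier(days))
```

A real policy engine would also weigh file size, project, and retrieval cost; idle time alone is the simplest useful signal.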
86. What comes next.
Expect your storage to be smarter & more capable ...
‣ What do DDN, Panasas, Isilon,
BlueArc, etc. have in common?
• Under the hood they all run
Unix or Unix-like OS’s on
x86_64 architectures
‣ Some storage arrays can
already run applications natively
• More will follow
• Likely a big trend for 2012
88. Still trying to avoid this.
(100TB scientific data, no RAID, unsecured on lab benchtops)
89. Flops, Failures & Freakouts
Common storage mistakes ...
90. Flops, Failures & Freakouts
#1 - Unchecked Enterprise Storage Architects
‣ Scientist: “My work is priceless,
I must be able to access it at all times”
‣ Corporate/Enterprise Storage Guru:
“Hmmm …you want high availability, huh?”
‣ System delivered:
• 40TB Enterprise SAN
• Asynchronous replication to remote site
• Can’t scale, can’t do NFS easily
• ~$500K per year in operational & maintenance costs
91. Flops, Failures & Freakouts
#2 - Unchecked User Requirements
‣ Scientist:
“I do bioinformatics, I am rate-limited by the speed of file
IO operations. Faster disk means faster science.”
‣ System delivered:
• Budget blown on top tier fastest-possible ‘Cadillac’ system
‣ Outcome:
• System fills to capacity in 9 months; zero budget left.
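A back-of-the-envelope projection makes this failure mode visible before the purchase. The sketch below is illustrative only: the 200 TB capacity, 20 TB starting usage and growth figures are hypothetical numbers chosen to reproduce a nine-month fill, not data from the deployment described.

```python
import math


def months_until_full_linear(capacity_tb, used_tb, growth_tb_per_month):
    """Months left when usage grows by a fixed number of TB each month."""
    if growth_tb_per_month <= 0:
        raise ValueError("growth must be positive")
    free_tb = capacity_tb - used_tb
    return max(0, math.ceil(free_tb / growth_tb_per_month))


def months_until_full_compound(capacity_tb, used_tb, monthly_growth_rate):
    """Months left when usage grows by a fixed percentage each month."""
    if used_tb <= 0 or monthly_growth_rate <= 0:
        raise ValueError("usage and growth rate must be positive")
    if used_tb >= capacity_tb:
        return 0
    ratio = capacity_tb / used_tb
    return math.ceil(math.log(ratio) / math.log(1 + monthly_growth_rate))


# Hypothetical system: 200 TB usable, 20 TB already consumed.
print(months_until_full_linear(200, 20, 20))      # 20 TB/month -> 9
print(months_until_full_compound(200, 20, 0.30))  # 30% per month -> 9
```

Running both models against measured growth is a cheap check on whether a performance-heavy purchase leaves enough capacity headroom.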
92. Flops, Failures & Freakouts
#3 - D.I.Y. Cluster & Parallel Filesystems
‣ Common source of storage unhappiness
‣ Root cause:
• Not enough pre-sales time spent on design and engineering
• Choosing Open Source over Common Sense
‣ System as built:
• Not enough metadata controllers
• Issues with interconnect fabric
• Poor selection & configuration of key components
‣ End result:
• Poor performance or availability
• High administrative/operational burden
94. Flops, Failures & Freakouts
Hard Lessons Learned
‣ End-users are not precise with storage terms
• “Extremely reliable” means no data loss;
Not millions spent on 99.99999% high availability
‣ When true costs are explained:
• Many research users will trade a small amount of uptime or
availability for more capacity or capabilities
• … will also often trade some level of performance in
exchange for a huge win in capacity or capability
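To make the trade concrete, it helps to translate an availability percentage into allowed downtime per year. This is plain arithmetic, assuming a 365-day year; the function name is my own.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 seconds


def downtime_seconds_per_year(availability_percent: float) -> float:
    """Seconds per year a system may be down at the given availability."""
    return SECONDS_PER_YEAR * (1 - availability_percent / 100)


for percent in (99.0, 99.9, 99.99, 99.99999):
    seconds = downtime_seconds_per_year(percent)
    print(f"{percent}% -> {seconds / 3600:.2f} hours ({seconds:.2f} seconds) per year")
```

Three nines allows roughly 8.76 hours of downtime a year, while seven nines allows about three seconds; most research users asking for "extremely reliable" storage are nowhere near needing the latter.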
95. Flops, Failures & Freakouts
Hard Lessons Learned
‣ End-users demand the world but are willing to
compromise
• Necessary for IT staff to really talk to them and understand
work, needs and priorities
• Also essential to explain true costs involved
‣ People demanding the “fastest” storage often don’t have
actual metrics to back their assertions
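Getting a first rough metric takes only a few lines. The sketch below times a sequential write and read of a scratch file in a chosen directory; the function name and default sizes are my own, the read pass will usually be served from the page cache, and a purpose-built tool such as fio or IOzone is the better choice for real measurements.

```python
import os
import tempfile
import time


def measure_write_read_mb_per_s(directory, size_mb=64, block_mb=1):
    """Return (write, read) sequential throughput in MB/s for a scratch file."""
    block = os.urandom(block_mb * 1024 * 1024)
    blocks = size_mb // block_mb
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as handle:
            for _ in range(blocks):
                handle.write(block)
            handle.flush()
            os.fsync(handle.fileno())  # make sure the data reached the device
        write_s = max(time.perf_counter() - start, 1e-9)

        start = time.perf_counter()
        with open(path, "rb") as handle:
            while handle.read(len(block)):
                pass
        read_s = max(time.perf_counter() - start, 1e-9)
    finally:
        os.remove(path)
    total_mb = blocks * block_mb
    return total_mb / write_s, total_mb / read_s


write_rate, read_rate = measure_write_read_mb_per_s(tempfile.gettempdir(), size_mb=8)
print(f"write {write_rate:.0f} MB/s, read {read_rate:.0f} MB/s")
```

Point `directory` at the filesystem under discussion and compare the numbers with what the application actually sustains before accepting that storage is the bottleneck.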
96. Flops, Failures & Freakouts
Hard Lessons Learned
‣ Software-based parallel or clustered file systems are
non-trivial to correctly implement
• Essential to involve experts in the initial design phase
• Even if using an ‘open source’ version …
‣ Commercial support is essential
• And I say this as an open source zealot …
97. The road ahead
My $.02 for 2012...
98. The Road Ahead
Storage Trends & Tips for 2012
‣ Peta-capable platforms required
‣ Scale-out NAS still the best fit
‣ Customers will no longer build one
big scale-out NAS tier
‣ My ‘hack’ of using nearline spec
storage as primary science tier is
probably obsolete in ’12
‣ Not everything is worth backing up
‣ Expect disruptive stuff
99. The Road Ahead
Trends & Tips for 2012
‣ Monolithic tiers no longer cut it
• Changing science & instrument
output patterns are to blame
• We can’t get away with biasing
towards capacity over
performance any more
‣ pNFS should go mainstream in ’12
• { fantastic news }
‣ Tiered storage IS in your future
• Multiple vendors & types
100. The Road Ahead
Trends & Tips for 2012
‣ Your storage will be able to run apps
• Dedupe, cloud gateways &
replication
• ‘CRAM’ or similar compression
• Storage Resource Brokers
(iRODS) & metadata servers
• HDFS/Hadoop hooks?
• Lab, Data management & LIMS
applications
[Image: Drobo appliance running BioTeam MiniLIMS internally]
101. The Road Ahead
Trends & Tips for 2012
‣ Hadoop / MapReduce / BigData
• Just like GRID and CLOUD back
in the day you’ll need a gas mask
to survive the smog of hype and
vendor press releases.
• You still need to think about it
• ... and have a roadmap for doing it
• Deep, deep ties to your storage
• Your users want/need it
• My $.02? Fantastic cloud use case
105. Outline
1. Intro
2. Meta-Issues (The Big Picture)
3. Infrastructure Tour
4. Compute & HPC
5. Storage
6. Cloud & Big Data
106. The ‘C’ word
Does a Bio-IT talk exist if it does not mention “the cloud”?
107. Defining the “C-word”
‣ Just like “Grid Computing” the “cloud” word has been
diluted to almost uselessness thanks to hype, vendor
FUD and lunatic marketing minions
‣ Helpful to define terms before talking seriously
‣ There are three types of cloud
‣ “IAAS”, “SAAS” & “PAAS”
108. Cloud Stuff
‣ Before I get nasty ...
‣ I am not an Amazon shill
‣ I am a jaded, cynical, zero-loyalty consumer of IT
services and products that let me get #%$^ done
‣ Because I only get paid when my #%$^ works, I am
picky about what tools I keep in my toolkit
‣ Amazon AWS is an infinitely cool tool
109. Cloud Stuff - SAAS
‣ SAAS = “Software as a Service”
‣ Think:
‣ gmail.com
110. Cloud Stuff - PAAS
‣ PAAS = “Platform as a Service”
‣ Think:
‣ https://basespace.illumina.com/
‣ salesforce.com
‣ MS office365.com, Apple iCloud, etc.
111. Cloud Stuff - IAAS
‣ IAAS = “Infrastructure as a Service”
‣ Think:
‣ Amazon Web Services
‣ Microsoft Azure
112. Cloud Stuff - IAAS
‣ When I talk “cloud” I mean IAAS
‣ And right now in 2012 Amazon IS the IAAS cloud
‣ ... everyone else is a pretender
113. Cloud Stuff - Why IAAS
‣ IAAS clouds are the focal point for life science
informatics
• Although some vendors are now offering PAAS and SAAS
options ...
‣ The “infrastructure” clouds give us the “building blocks”
we can assemble into useful stuff
‣ Right now Amazon has the best & most powerful
collection of “building blocks”
‣ The competition is years behind ...
114. A message for the
cloud pretenders…
115. No APIs?
Not a cloud.
117. Installing VMware
& excreting a press release?
Not a cloud.
118. I have to email a human?
Not a cloud.
119. ~50% failure rate when launching
new servers?
Stupid cloud.
120. Block storage
and virtual servers only?
(barely) a cloud.
121. Private Clouds
My $.02
122. Private Clouds in 2012:
‣ I’m no longer dismissing them as “utter crap”
‣ Usable & useful in certain situations
‣ Hype vs. Reality ratio still wacky
‣ Sensible only for certain shops
• Have you seen what you have to do
to your networks & gear?
‣ There are easier ways
123. Private Clouds: My Advice for ’12
‣ Remain cynical (test vendor claims)
‣ Due Diligence still essential
‣ I personally would not deploy/buy anything that does not
explicitly provide Amazon API compatibility
124. Private Clouds: My Advice for ’12
Most people are better off:
1. Adding VM platforms to existing HPC clusters &
environments
2. Extending enterprise VM platforms to allow user self-
service & server catalogs
125. Cloud Advice
My $.02
126. Cloud Advice
Don’t get left behind
‣ Research IT Organizations need a cloud strategy today
‣ Those that don’t will be bypassed by frustrated users
‣ IaaS cloud services are only a departmental credit card
away ... and some senior scientists are too big to be fired
for violating IT policy :)
127. Cloud Advice
Design Patterns
‣ You actually need three tested cloud design patterns:
‣ (1) To handle ‘legacy’ scientific apps & workflows
‣ (2) The special stuff that is worth re-architecting
‣ (3) Hadoop & big data analytics
128. Cloud Advice
Legacy HPC on the Cloud
‣ MIT StarCluster
• http://web.mit.edu/star/cluster/
‣ This is your baseline
‣ Extend as needed
129. Cloud Advice
“Cloudy” HPC
‣ Some of our research workflows are important enough to
be rewritten for “the cloud” and the advantages that a
truly elastic & API-driven infrastructure can deliver
‣ This is where you have the most freedom
‣ Many published best practices you can borrow
‣ Amazon Simple Workflow Service (SWF) looks sweet
‣ Good commercial options: Cycle Computing, etc.
130. Hadoop & “Big Data”
‣ Hadoop and “big data” need to be on your radar
‣ Be careful though, you’ll need a gas mask to avoid the
smog of marketing and vapid hype
‣ The utility is real and this does represent the “future
path” for analysis of large data sets
131. Cloud Advice - Hadoop & Big Data
Big Data HPC
‣ It’s gonna be a MapReduce world, get used to it
‣ Little need to roll your own Hadoop in 2012
‣ ISV & commercial ecosystem already healthy
‣ Multiple providers today; both onsite & cloud-based
‣ Often a slam-dunk cloud use case
132. Hadoop & “Big Data”
What you need to know
‣ “Hadoop” and “Big Data” are now general terms
‣ You need to drill down to find out what people actually
mean
‣ We are still in the period where senior mgmt. may
demand “hadoop” or “big data” capability without any
actual business or scientific need
133. Hadoop & “Big Data”
What you need to know
‣ In broad terms you can break “Big Data” down into two very
basic use cases:
1. Compute: Hadoop can be used as a very powerful platform for
the analysis of very large data sets. The Google search term
here is “map reduce”
2. Data Stores: Hadoop is driving the development of very
sophisticated “NoSQL”, non-relational databases and data
query engines. The Google search terms include “nosql”,
“couchdb”, “hive”, “pig”, “mongodb”, etc.
‣ Your job is to figure out which type applies for the groups
requesting “hadoop” or “big data” capability
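For groups in the "compute" camp, the map/shuffle/reduce pattern itself is small enough to show in plain Python. The sketch below counts k-mers in a couple of made-up reads on a single machine; it only illustrates the programming model, not Hadoop's API or its distributed execution, and every function name here is my own.

```python
from collections import defaultdict


def map_phase(read, k=3):
    """Emit a (k-mer, 1) pair for every k-length window in one read."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k], 1


def shuffle(pairs):
    """Group emitted values by key, as the framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def count_kmers(reads, k=3):
    pairs = (pair for read in reads for pair in map_phase(read, k))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical reads; each 3-mer below occurs once in each read.
counts = count_kmers(["ACGTAC", "GTACGT"])
print(counts)
```

Because each map call sees one read and each reduce call sees one key, both phases can be spread across many nodes, which is the property that ties Hadoop so closely to where the data lives.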
134. High Throughput Science
Hadoop vs traditional Linux Clusters
‣ Hadoop is a very complex beast
‣ It’s also the way of the future so you can’t ignore it
‣ Very tight dependency on moving the ‘compute’ as close
as possible to the ‘data’
‣ Hadoop clusters are just different enough that they do
not integrate cleanly with traditional Linux HPC systems
‣ Often treated as separate silo or punted to the cloud
135. Hadoop & “Big Data”
What you need to know
‣ Hadoop demand is being driven by a small group of academics
writing and releasing open source life science Hadoop
applications
‣ Your people will want to run these codes
‣ In some academic environments you may find people
wanting to develop on this platform
137. Cloud Data Movement
‣ We’ve slung a ton of data in and out of the cloud
‣ We used to be big fans of physical media movement
‣ Remember these pictures?
‣ ...
144. Cloud Data Movement
Wow!
‣ With a 1GbE internet connection ...
‣ and using Aspera software ....
‣ We sustained 700 Mb/sec for more than 7 hours
freighting genomes into Amazon Web Services
‣ This is fast enough for many use cases, including
genome sequencing core facilities*
‣ Chris Dwan’s webinar on this topic:
http://biote.am/7e
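A quick sanity check on what that rate means in volume, assuming the 700 figure is megabits per second (a 1GbE link tops out at 1,000 Mb/sec, or 125 MB/sec) and using decimal units. The helper names are my own.

```python
def megabytes_per_second(megabits_per_second: float) -> float:
    """Convert a link rate from megabits to megabytes per second."""
    return megabits_per_second / 8


def terabytes_moved(megabits_per_second: float, hours: float) -> float:
    """Decimal terabytes transferred at a sustained rate over a duration."""
    seconds = hours * 3600
    return megabytes_per_second(megabits_per_second) * seconds / 1_000_000


rate_mbps = 700
print(f"{megabytes_per_second(rate_mbps)} MB/sec")           # 87.5 MB/sec
print(f"{terabytes_moved(rate_mbps, 7):.3f} TB in 7 hours")  # 2.205 TB
print(f"{rate_mbps / 1000:.0%} of a 1GbE link")              # 70%
```

That is about 2.2 TB in a working day over a single gigabit connection, at 70% of line rate.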
145. Cloud Data Movement
Wow!
‣ Results like this mean we now favor network-based data
movement over physical media movement
‣ Large-scale physical data movement carries a high
operational burden and consumes non-trivial staff time &
resources
146. Cloud Data Movement
There are three ways to do network data movement ...
‣ Buy software from Aspera and be done with it
‣ Attend the annual SuperComputing conference & see
which student group wins the bandwidth challenge
contest; use their code
‣ Get GridFTP from the Globus folks
• Trend: At every single “data movement” talk I’ve been to in
2011 it seemed that any speaker who was NOT using Aspera
was a very happy user of GridFTP. #notCoincidence
148. Wrapping up
IT may just be a means to an end but you need to get
your head wrapped around it
‣ (1) So you use/buy/request the correct ‘stuff’
‣ (2) So you don’t get cheated by a vendor
‣ (3) Because you need to understand your tools
‣ (4) Because trends in automation and orchestration
are blurring the line between scientist & sysadmin
149. Wrapping up - Compute & Servers
‣ Servers and compute power are pretty straightforward
‣ You just need to know roughly what your preferred
compute building blocks look like
‣ ... and what special purpose resources you require (GPUs,
Large Memory, High Core Count, etc.)
‣ Some of you may also have to deal with sizing, cost and
facility (power, cooling, space) issues as well
150. Wrapping up - Networking
‣ Networking is also not a hugely painful thing
‣ Ethernet rules the land; you might have to pick and choose
between 1-Gig and 10-Gig Ethernet
‣ Understand that special networking technologies like
Infiniband offer advantages but they are expensive and need
to be applied carefully (if at all)
‣ Knowing if your MPI apps are latency sensitive will help
‣ And remember that networking is used for multiple things
(server communication, application message passing & file
and data sharing)
151. Wrapping up - Storage
‣ If you are going to focus on one IT area, this is it
‣ It’s incredibly important for genomics and also incredibly
complicated. Many ways to waste money or buy the ‘wrong’ stuff
‣ You may only have one chance to get it correct and may have to
live with your decision for years
‣ Budget is finite. You have to balance “speed” vs “size” vs
“expansion capacity” vs “high availability” and more ...
‣ “Petabyte-capable Scale-out NAS” is usually the best starting
point. You deviate away from NAS when scientific or technical
requirements demand “something else”.
152. Wrapping up - Hadoop / Big Data
‣ Probably the way of the future for big-data analytics. It’s
worth spending time to study; especially if you intend to
develop software in the future
‣ Popular target for current and emerging high-scale
genomics tools. If you want to use those tools you need to
deploy Hadoop
‣ It’s complicated and still changing rapidly. It can be
difficult to integrate into existing setups
‣ Be cynical about hype & test vendor claims
153. Wrapping up - Cloud
‣ Cloud is the future. The economics are inescapable and the
advantages are compelling.
‣ The main obstacle holding back genomics is terabyte-scale
data movement. The cloud is horrible if you have to
move 2TB of data before you can run 2 hours of compute!
‣ Your future core facility may involve a comp bio lab
without a datacenter at all. Some organizations are
already 100% virtual and 100% cloud-based
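The 2TB-before-2-hours complaint is easy to quantify. The sketch below estimates how much of the total wall-clock time goes to the upload at a few sustained link rates; the rates are hypothetical examples, decimal units are assumed, and the function names are my own.

```python
def transfer_hours(terabytes: float, megabits_per_second: float) -> float:
    """Hours needed to move a data set at a sustained link rate."""
    megabits = terabytes * 1_000_000 * 8
    return megabits / megabits_per_second / 3600


def transfer_share(terabytes: float, megabits_per_second: float, compute_hours: float) -> float:
    """Fraction of total wall-clock time spent moving data rather than computing."""
    moving = transfer_hours(terabytes, megabits_per_second)
    return moving / (moving + compute_hours)


# Hypothetical sustained rates, in Mb/sec.
for rate in (100, 700, 10_000):
    hours = transfer_hours(2, rate)
    share = transfer_share(2, rate, compute_hours=2)
    print(f"{rate:>6} Mb/sec: {hours:6.2f} h to upload, {share:.0%} of the job")
```

Even at a sustained 700 Mb/sec the upload takes over six hours, so the transfer still outweighs the two hours of compute by roughly three to one.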
154. The NGS cloud clincher.
700 Mb/sec sustained for ~7 hours
West Coast to East Coast USA
155. Wrapping up - Cloud, continued
‣ Understand that for the foreseeable future there are THREE distinct
cloud architectures and design patterns.
‣ Vendors who push “100% hadoop” or “legacy free” solutions are
idiots and should be shoved out the door. We will be running legacy
codes and workflows for many years to come
‣ Your three design patterns on the cloud:
• Legacy HPC systems
(replicate traditional clusters in the cloud)
• Hadoop
• Cloudy
(when you rewrite something to fully leverage cloud capability)