Accidents Caused by Software Errors
Software Kills!
15th Annual American Industrial Hygiene Association - Rocky Mountain Section (AIHA-RMS) and American Society of Safety Engineers (ASSE) Colorado Chapter Fall Technical Conference
September 16th & 17th, 2009, Arvada Center
“Environmental Health & Safety: Broadening Our Alliances”
Don Shafer, CSDP, Chief Technology and HSE Officer, Athens Group, Inc.
A Safety Minute – 17 September 2009
Safety - Dropped object fatality in the Keppel FELS shipyard
Since arriving in Singapore, I’ve been mildly shocked at the nonchalance shown around lifting operations. It’s quite common to see cranes lifting loads over active walkways that are not taped off, and the workers using those walkways often take little notice of the lifting operations above them. This resulted in tragedy yesterday. A basket of scrap cable was being transferred from a rig to the dock. Reportedly, the crane was blowing its horn as per protocol for transfer operations. The load shifted and a chunk of cable dropped, landing on the head of a dockworker underneath. I don’t believe the dockworker was participating in the lifting operation. Even more disturbing: no one on the dock came to his aid. Rig personnel ran over and attempted CPR. The ambulance arrived without paramedics; rig personnel accompanied the dockworker to the hospital, where he was pronounced dead.
Be careful out there. Your PPE will protect you from many things, but your awareness will save your life.
Presentation Outline
Examples of Software Related Incidents
  Software you can “see”
  Software you cannot “see”
Proven Practices to Reduce Software Risk
  Life Cycle Recognition
  Configuration Management
  FMECA
Those old Software Safety Chestnuts!
And, some not so old!
Air France - What is known about the crash of an Air France Airbus on 1 June bears similarities to the little-noticed loss, much earlier, of two computer-controlled passenger jets. Those two crashes raised questions of whether the pilots or the systems were really in control. Airbus said this data showed that the pilots might have received conflicting information about their speed. There was a “divergence in airspeed measurement” by the onboard systems of the Air France aircraft. This is one of the matters being investigated, said Airbus. Airspeed data reached the onboard computers from sensors called pitot tubes, at least one of which was due for replacement. French authorities have suggested that inconsistent airspeed readings are not dangerous.
Software you can “see”
Safety Incident: Injured Rig Hands
Incident: The elevators and bales of an older-model top drive reacted erratically to a rapid and erroneous user command. The vendor had released a software patch for that model to prevent this erratic behavior, but somehow it had not been communicated to or installed on that drilling unit. There was little or no initial design and testing of the control software, and the software interlock issue was not discovered. Little or no system requirements gathering was done on the control system, and no FMECA was done on the control portion of the top drive. There was no consistent management of change treating software as an asset on the MODU between the supplying vendor and the operator.
Solution: Initial requirements definition, FMECA of the control system, and software change control protocols would have avoided this incident.
Result: The bales swung around and injured two of the rig hands, resulting in reportable lost-time injuries (LTIs).
Estimated Lost Time: 5 days. Day Rate: $310,000. Minimum Cost: $1,550,000.
Safety Incident: Potentially Deadly Mishap
Incident: A driller was performing a test with a riser joint suspended 70 feet (21 meters) above the drill floor. Prior to leaving the drill cabin for a Job Risk Analysis meeting with the roughnecks, the driller selected “standby” mode on the drilling chair. While doing so, he inadvertently pressed the keypad button that activates Pipe Handling mode. In this mode, the drill control system sends a pressure monitoring command to the pipe elevator every three minutes. The driller stepped out onto the drill floor, and three minutes later the pressure monitoring command was sent to the riser handling equipment, which mistook it for an unlock command.
Solution: An FMECA of the equipment covering operational states and message flow could have prevented this incident.
Result: The riser tool released the joint, which fell through the well center into the ocean. The joint fell perfectly through the slips. Neither personal injury nor collateral equipment damage was experienced.
Estimated Lost Time: 1.5 days. Day Rate: $310,000. Minimum Cost: $465,000.
[Figure labels: 3 people; 4 people]
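The message-flow failure in this incident (a routine monitoring message acted on as an unlock) can be illustrated with a minimal sketch. It assumes a receiver that accepts only explicitly typed commands and refuses to unlock while a load is suspended. The class, the command names, and the interlock rules are hypothetical and are not taken from the rig's actual control system.

```python
from enum import Enum


class Command(Enum):
    """Hypothetical command set; the real system's messages are not documented here."""
    PRESSURE_MONITOR = "pressure_monitor"
    UNLOCK = "unlock"


class RiserTool:
    """Illustrative receiver that never treats one command as another."""

    def __init__(self) -> None:
        self.locked = True
        self.load_suspended = True

    def handle(self, command, operator_confirmed: bool = False) -> str:
        # Anything that is not an explicitly typed command is rejected outright.
        if not isinstance(command, Command):
            return "rejected: unknown command"
        if command is Command.PRESSURE_MONITOR:
            # Monitoring never changes the lock state.
            return "pressure reported"
        # Remaining case is UNLOCK: interlock on load, then on operator confirmation.
        if self.load_suspended:
            return "rejected: load suspended"
        if not operator_confirmed:
            return "rejected: no operator confirmation"
        self.locked = False
        return "unlocked"
```

An FMECA covering operational states and message flow would ask, for each state of the tool, what every possible incoming message does; the interlock above is one possible answer for the "load suspended" state.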
Safety Incident: Dropped Blocks
Incident: The semisubmersible MODU was in the final stages of pulling the BOP. The BOP was being lifted the last meter to gain clearance for access to the BOP transporter in the moonpool. With the traveling block at the uppermost limit, the Kinetic Energy Management System was ‘tripped’, and the resulting action was not as expected. The anti-bird nesting components were incorrectly installed, thus limiting the 1200 psi used to function the service brake to 200 psi. There was no operator error, and the incident was a result of a disc brake system failure.
Solution: An FMECA of the equipment covering operational states and message flow could have prevented this incident. Regression testing of software upgrades and formal change control should have taken place.
Result: Traveling blocks, complete with riser and suspended BOP, descended approximately 50 meters in an uncontrolled manner, until the Top Drive impacted against the riser gimbal at the rig floor level.
Estimated Lost Time: 5 days. Day Rate: $477,000. Minimum Cost: $2,385,000.
Safety Incident: Top Drive Out of Control
Incident: During the voyage to location, a technician was ‘tweaking’ the zone management parameters on a newbuild. A few minutes later the top drive started rotating by itself. The technician, in his zeal to fix one thing, had broken another, thereby introducing a regression into the system. He was also unable to recover quickly to a previous known state because he wasn’t following software change control protocols.
Solution: Following software change control and testing protocols would have prevented this.
Result: The technician and the team had to scramble to correct the issue. Fortunately, no equipment was damaged and no personnel were injured.
Estimated Lost Time: 2 days. Day Rate: $380,000. Minimum Cost: $760,000.
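Recovering to a previous known state, which the technician could not do, is the part of change control that is easiest to show in code. The following is a minimal sketch, assuming control parameters live in a simple key-value store; `ParameterStore`, the parameter names, and the change identifier are hypothetical.

```python
import copy


class ParameterStore:
    """Illustrative store that snapshots parameters before every change."""

    def __init__(self, params: dict) -> None:
        self._params = dict(params)
        self._history = []  # list of (change_id, snapshot taken before the change)

    def get(self, name: str):
        return self._params[name]

    def apply_change(self, change_id: str, updates: dict) -> None:
        # Record the known-good state first so the change can always be undone.
        self._history.append((change_id, copy.deepcopy(self._params)))
        self._params.update(updates)

    def rollback(self) -> str:
        """Restore the state before the most recent change and return its id."""
        if not self._history:
            raise RuntimeError("no recorded change to roll back")
        change_id, snapshot = self._history.pop()
        self._params = snapshot
        return change_id
```

Usage: a call such as `store.apply_change("CR-001", {"zone_upper_limit_m": 28.5})` followed by `store.rollback()` returns the parameters to their values before the change. A real protocol would add review, testing, and an audit trail around this.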
Safety Incident: Generator Trip
Incident: A vendor arrived onboard a rig after having been officially requested to make changes to the rig’s automation system. While onboard, an unofficial request was made by a system operator regarding the numbering of main engine cooling system valves. The vendor either hadn’t completely understood the request or had been distracted, and inadvertently made the change to the wrong valve. Some time later a different operator attempted to give a close command to the valve in preparation for maintenance of the system.
Solution: If formal control procedures had been adopted, no unofficial change requests would have been carried out.
Result: Closing of the incorrect valve caused a generator trip.
Estimated Lost Time: 0.5 days. Day Rate: $310,000. Minimum Cost: $155,000.
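One way to picture the formal control procedure called for here is a gate that refuses any change lacking an approved request for exactly that target. This is a minimal sketch under that assumption; `ChangeRequest`, `authorize_change`, and the valve identifiers are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeRequest:
    """Hypothetical record of a formally raised change request."""
    request_id: str
    target: str       # the item the change is approved for
    approved: bool


def authorize_change(request: Optional[ChangeRequest], target: str) -> bool:
    """Allow a change only if an approved request names exactly this target."""
    if request is None:
        # A verbal or unofficial request has no record, so nothing is authorized.
        return False
    return request.approved and request.target == target
```

With a gate like this, the unofficial renumbering request in the incident has no `ChangeRequest` at all, and a change made to the wrong valve fails the target check.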
Safety Incident: Control System Reset Kills 4
Incident: A control system failure occurred on a large offshore construction vessel. Two control units were restarted twice, unsuccessfully. A blinking red lamp on the PLC indicated that a memory reset was required, even though a memory reset had NEVER been requested by control system diagnostics during equipment operations. As soon as the hydraulic power packs started, a loud bang was heard. A quadruple joint of pipe dropped approximately one meter to the welding deck below. A second quadruple joint of pipe in the pipe elevator was released (all clamps opened and the hydraulic safety stop swung away) and fell the full length of the tower, smashing through a crowded access platform to the deck below. The initialization instruction was pre-loaded in PLC EPROM memory, and the initialization included instructions to OPEN ALL CLAMPS.
Solution: An FMECA of the equipment covering operational states and message flow could have prevented this incident. Document the impact of resetting control systems during operations.
Result: Eight personnel were injured, four fatally. All were located on the access platform, and several were thrown overboard by the impact.
Estimated Lost Time: 20 days. Day Rate: $510,000. Minimum Cost: $10,200,000.
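The Minimum Cost figures on the incident slides are consistent with estimated lost time multiplied by day rate. The sketch below reproduces that arithmetic using only the figures quoted on the slides; the function and variable names are illustrative.

```python
# (incident, estimated lost time in days, day rate in USD), as quoted on the slides
INCIDENTS = [
    ("Injured Rig Hands", 5, 310_000),
    ("Potentially Deadly Mishap", 1.5, 310_000),
    ("Dropped Blocks", 5, 477_000),
    ("Top Drive Out of Control", 2, 380_000),
    ("Generator Trip", 0.5, 310_000),
    ("Control System Reset Kills 4", 20, 510_000),
]


def minimum_cost(lost_days: float, day_rate: int) -> int:
    """Minimum cost in USD: lost time multiplied by the rig's day rate."""
    return round(lost_days * day_rate)


def total_minimum_cost(incidents) -> int:
    """Sum of the minimum costs across all listed incidents."""
    return sum(minimum_cost(days, rate) for _, days, rate in incidents)


if __name__ == "__main__":
    for name, days, rate in INCIDENTS:
        print(f"{name}: {days} days x ${rate:,} = ${minimum_cost(days, rate):,}")
    print(f"Total: ${total_minimum_cost(INCIDENTS):,}")
```

These are minimums because they count only lost rig time, not injuries, repairs, or lost equipment.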
What Did We Learn?
Understand the impact of resetting control systems during operations:
Predefined, Fail-set or Fail-safe state?
Loss of communications caused a revert to an unanticipated configuration: pipe rams opened unexpectedly and the string was lost in-hole.
There are known instances where systems were reset as a matter of procedure, at established intervals, to prevent incidents.
Statistically, most reset/reboot operations are completed without incident.
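The question of a predefined versus fail-safe state can be made concrete with a small sketch of what a controller might apply after a reset or a loss of communications. The rule used here (close every clamp whenever a load is detected, otherwise keep the last known state) is an assumption chosen for illustration, not the design of any system described in the incidents.

```python
from enum import Enum


class ClampState(Enum):
    OPEN = "open"
    CLOSED = "closed"


def state_after_reset(last_known: dict, load_detected: bool) -> dict:
    """Return the clamp states to apply after a reset or loss of communications.

    Hypothetical rule: with a load detected, every clamp is commanded closed;
    otherwise the last known state is kept. Nothing is opened by default.
    """
    if load_detected:
        return {name: ClampState.CLOSED for name in last_known}
    return dict(last_known)
```

The contrast with the fatal incident is that an initialization routine containing an unconditional "open all clamps" step ignores both the last known state and the presence of a load.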
Your IT Network is Safe?
“IT contractor indicted for sabotaging offshore rig management system: Company had refused to offer him a permanent job, feds say,” March 18, 2009.
Mario Azar, 28, of Upland, Calif., was charged with illegally accessing and compromising a computer system used by Pacific Energy Resources Ltd. (PER) to monitor offshore platforms in California and Anchorage and to detect oil leaks. The indictment papers allege that Azar's actions affected the "integrity and availability" of the system and resulted in it becoming temporarily unavailable. Though no oil spill or environmental hazard occurred while the system was compromised, Azar's actions caused thousands of dollars in damage, the indictment said.
Cyber criminals targeting energy – 15 March 2009
Based on an analysis of more than 240 billion requests submitted by corporate users, there was nearly 600% malware growth between like quarters in 2007 and 2008, and a 300% volume ratio increase from January 2008 through December 2008. A vertical industry analysis of malware growth found the energy and oil sector to rank in the top five targets in all threat categories. But energy and oil leads the pack by a long shot when it comes to one important category: encounters with unique new variants of data theft Trojans. With advances in the technology and sophistication of cyber attacks, malware delivered through the web can be remotely customized and configured once in place, based on the victim’s identity.
What do the Authorities Say? How to implement these activities and processes is not prescribed by the recommended practice. The recommended practice is primarily focused on the ‘what to do’. DNV RP D-201 Recommended Practice for Integrated Control Systems
How can Software become Safer?
Awareness of Development Life Cycle
Software Configuration Management (SCM)
Failure Mode and Effects Criticality Analysis (FMECA)
Athens Group Deliverables Life Cycle Model
[Diagram: life cycle model with a hardware path and a software path. Phases: Design, Acceptance, Construction, Operation. Labels: Concept Definition, System Requirements, Hardware Requirements, Software Requirements, Preliminary Design, Detailed Design, Module Development, Coding, Unit Test, Integration and Test, Integrated System Testing, Factory Acceptance Testing, Deployment, Operations, Maintenance, Contractual Software Standards, Vendor Software Process Assessment, Vendor Management, Requirements Validation, Design Verification, FMECA, Software Change Management, Alarm Management, Acceptance Planning, Commissioning Planning, Controls & Network Commissioning, Startup Support, Troubleshooting and Remediation.]
Software Change Management
Failure Mode and Effects Criticality Analysis (FMECA)
In Conclusion
You can – and MUST – make Software Safer
Awareness of Development Life Cycle
Software Configuration Management (SCM)
Failure Mode and Effects Criticality Analysis (FMECA)
Don Shafer, CSDP Chief Technology Officer 5608 Parkcrest Drive, Suite 200 Austin, Tx 78731 don.shafer@athensgroup.com www.athensgroup.com 512.345.0600 x117
References
NORA Symposium 2008: Public Market for Ideas and Partnerships - http://www.cdc.gov/niosh/nora/symp08/posters/035.html
Fatalities Among Oil and Gas Extraction Workers --- United States, 2003 - 2006 - http://www.cdc.gov/mmwr/preview/mmwrhtml/mm5716a3.htm
Therac 25 - http://sunnyday.mit.edu/papers/therac.pdf
Air France 2009 - http://www.computerweekly.com/Articles/2009/06/01/236245/air-france-crash-thought-to-be-caused-by-system-failure.htm
Air France 2009 - http://www.computerweekly.com/Articles/2009/06/16/236447/air-france-airbus-pitot-sensor-linked-to-two-fatal-crashes.htm
8 Software Related Death Incidents - http://www.baselinemag.com/c/a/Projects-Processes/Eight-Fatal-SoftwareRelated-Accidents/
Speaker Bio Don Shafer, CSDP, developed Athens Group's oil and gas practice and leads Athens Group engineers in delivering superior rig software services and oil and gas exploration as well as production and pipeline monitoring systems for clients such as BP, Noble, Transocean, Maersk, ExxonMobil, Conoco Phillips and Shell. Prior to co-founding Athens Group, Don led groups developing and marketing hardware and software products for Motorola, AMD and Crystal Semiconductor. He was responsible for managing a $129 million-a-year PC product group that produced the award-winning audio components for Apple. From the development of low-level software drivers in yet-to-be-released Microsoft operating systems to the selection and monitoring of Taiwan semiconductor fabrication facilities, Don has led key product and process efforts.  Don earned a BS degree from the USAF Academy and an MBA from the University of Denver. Treasurer of the IEEE Computer Society Board of Governors, Past Editor-in-Chief of the IEEE Computer Society Press, IEEE Senior Member and software engineering book series author, Shafer is an adjunct professor in the Cockrell School of Engineering at the University of Texas at Austin.  An avid writer, Don has contributed to three books, written over 20 published articles, and is co-author of Quality Software Project Management, recently released by Prentice-Hall. He is a contributor to the 2010 edition of the multi-volume Encyclopedia of Software Engineering.  His latest patents are in state-based machine control.
Who We Are
Offices in Houston and Austin, TX

Rocky Mtn Safety090917

  • 1. Accidents Caused by Software Errors Software Kills!  15th Annual  American Industrial Hygiene Association -  Rocky Mountain Section (AIHA-RMS), and the  American Society of Safety Engineers  (ASSE) Colorado Chapter  FALL TECHNICAL CONFERENCE September 16th & 17th, 2009 Arvada Center    “Environmental Health & Safety  Broadening Our Alliances” Don Shafer, CSDP Chief Technology and HSE Officer Athens Group, Inc.
  • 2. A Safety Minute – 17 September 2009 Safety - Dropped object fatality in the Keppel FELS shipyard Since arriving in Singapore, I’ve be mildly shocked at the nonchalance shown around lifting operations. It’s quite common to see crane operations lifting loads over active walkways; walkways that are not taped off and often there’s little notice by the workers using the walkways of the lifting operations occurring above them. This resulted in tragedy yesterday. A basket of scrap cable was being transferred from a rig to the dock. Reportedly, the crane was blowing its horn as per protocol for transfer operations. The load shifted and a chunk of cable dropped, landing on the head of a dockworker underneath. I don’t believe the dockworker was participating in the lifting operation. Even more disturbing – no one on the dock came to his aid, rig personnel ran over an attempted CPR. The ambulance arrived without paramedics, rig personnel accompanied the dockworker to the hospital where he was pronounced dead. Be careful out there, your PPE will protect you from many things, but your awareness will save your life.
  • 3. Presentation Outline Examples of Software Related Incidents Software you can “see” Software you cannot “see” Proven Practices to Reduce Software Risk Life Cycle Recognition Configuration Management FMECA
  • 4. Those old Software Safety Chestnuts!
  • 5. And, some not so old! Air France - What is known about the crash of an Air France airbus on 1 June bears similarities with the little-noticed loss much earlier of two computer-controlled passenger jets. Those two crashes raised questions of whether the pilots or systems were really in control. Airbus said this data showed that the pilots might have received conflicting information about their speed. There was a “divergence in airspeed measurement” by the onboard systems of the Air France aircraft. This is one of the matters being investigated, said Airbus. Data to the onboard computers about air speed came from sensors called pitot tubes, at least one of which was due for replacement. French authorities have suggested that inconsistent air speed readings are not dangerous.
  • 6. Software you can “see” 6
  • 7. Safety Incident: Injured Rig Hands Incident The elevators and bales of an older-model top drive reacted erratically to a rapid and erroneous user command. The vendor had released a software patch to that model to prevent this erratic behavior, but somehow it had not been communicated or installed on that drilling unit. There was little or no initial design and testing of the control software and the software interlock issue was not discovered. Little or no system requirements gathering were done on the control system and no FMECA was done on the control portion of the top drive. There was no consistent management of change treating software as an asset on the MODU between the supplying vendor and the operator. Solution Result Initial requirements definition; FMECA of the control system and software change control protocols would have avoided this incident. The bales swung around and injured two of the rig hands, resulting in reportable LTIs. Estimated Lost Time: 5 days Day Rate: $310,000. Minimum Cost: $1,550,000.
  • 8. Safety Incident: Potentially Deadly Mishap Incident A driller was performing a test with a riser joint suspended 70 feet (21 meters) above the drill floor.  Prior to leaving the drill cabin for a Job Risk Analysis meeting with the roughnecks, the driller selected “standby” mode on the drilling chair. While doing so, he inadvertently pressed the keypad button that activates Pipe Handling mode. In this mode, the drill control system sends a pressure monitoring command to the pipe elevator every three minutes. The driller stepped out onto the drill floor and three minutes later the pressure monitoring command was sent to the riser handling equipment which mistook it for an unlock command.  Solution Result The riser tool released the joint which fell through the well center into the ocean. The joint fell perfectly through the slips. Neither personal injury nor collateral equipment damage was experienced. Estimated Lost Time: 1.5 days Day Rate: $310,000 Minimum Cost: $465,000 An FMECA of the equipment covering operational states and message flow could have prevented this incident. 3 people 4 people
  • 9. Safety Incident: Dropped Blocks Incident The semisubmersible MODU was in the final stages of pulling the BOP. The BOP was being lifted the last meter to gain clearance for access to the BOP transporter in the moonpool. With the travelling block at the uppermost limit, the Kinetic Energy Management System was ‘tripped’, and the resulting action was not as expected. The anti-bird nesting components were incorrectly installed thus limiting the 1200 psi used to function the service brake to 200psi. There was no operator error and the incident was a result of a disc brake system failure. Solution Result Traveling blocks, complete with riser and suspended BOP, descended approximately 50 meters in an uncontrolled manner, until the Top Drive impacted against the riser gimbal at the rig floor level. Estimated Lost Time: 5 days Day Rate: $477,000 Minimum Cost: $2,385,000 An FMECA of the equipment covering operational states and message flow could have prevented this incident. Regression testing of software upgrades and formal change control should have taken place.
  • 10. Safety Incident: Top Drive Out of Control Incident During the voyage to location, a technician was ‘tweaking’ the zone management parameters on a newbuild. A few minutes later the top drive started rotating by itself. The technician in his zeal to fix one thing had broken another – thereby introducing regression into the system. He was also unable to quickly recover to a previous known state as he wasn’t following software change control protocols. Solution Result Following software change control and testing protocols would have prevented this. The technician and the team had to scramble to correct the issue. Fortunately there was no equipment damaged or personnel injured. Estimated Lost Time: 2 days Day Rate: $380,000 Minimum Cost: $760,000
  • 11. Safety Incident: Generator Trip. Incident: A vendor arrived onboard a rig after having been officially requested to make changes to the rig’s automation system. While the vendor was onboard, a system operator made an unofficial request regarding the numbering of main engine cooling system valves. The vendor either hadn’t completely understood the request or had been distracted, and inadvertently made the change to the wrong valve. Some time later a different operator attempted to give a close command to the valve in preparation for maintenance of the system. Result: Closing the incorrect valve caused a generator trip. Estimated Lost Time: 0.5 days. Day Rate: $310,000. Minimum Cost: $155,000. Solution: Had formal change control procedures been adopted, no unofficial change request would have been carried out.
  • 12. Safety Incident: Control System Reset Kills 4. Incident: A control system failure occurred on a large offshore construction vessel. Two control units were restarted twice, unsuccessfully. A blinking red lamp on the PLC indicated that a memory reset was required, even though a memory reset had NEVER been requested by control system diagnostics during equipment operations. As soon as the hydraulic power packs started, a loud bang was heard. A quadruple joint of pipe dropped approximately one meter to the welding deck below. A second quadruple joint of pipe in the pipe elevator was released (all clamps opened and the hydraulic safety stop swung away) and fell the full length of the tower, smashing through a crowded access platform to the deck below. The initialization instruction was pre-loaded in PLC EPROM memory, and the initialization included instructions to OPEN ALL CLAMPS. Result: Eight personnel were injured, four fatally. All were on the access platform, and several were thrown overboard by the impact. Estimated Lost Time: 20 days. Day Rate: $510,000. Minimum Cost: $10,200,000. Solution: An FMECA of the equipment covering operational states and message flow could have prevented this incident. Document the impact of resetting control systems during operations.
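In slide 12 the initialization routine commanded a fixed, predefined state (open all clamps) regardless of what the equipment was holding. The Python sketch below illustrates the alternative of choosing the post-reset state from the sensed condition. It assumes, purely for illustration, that a load sensor exists and that "closed" is the safe state for a clamp that may be holding pipe; which state is actually safe is exactly what an FMECA of the real equipment would have to establish.

```python
from typing import Optional


def clamp_state_after_reset(load_detected: Optional[bool]) -> str:
    """Pick the clamp state to command after a controller reset.

    load_detected is True or False from a (hypothetical) load sensor, or
    None when no reading is available. Anything other than a confirmed
    "no load" keeps the clamps closed.
    """
    if load_detected is False:
        return "OPEN"
    return "CLOSED"


for reading in (True, False, None):
    print(reading, clamp_state_after_reset(reading))
```

Treating a missing sensor reading the same as a detected load is the conservative choice in this sketch: the routine opens clamps only on positive confirmation that nothing is being held.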
  • 14. Predefined, Fail-set or Fail-safe state?
  • 16. Your IT Network is Safe? “IT contractor indicted for sabotaging offshore rig management system; company had refused to offer him a permanent job, feds say” (March 18, 2009): Mario Azar, 28, of Upland, Calif., was charged with illegally accessing and compromising a computer system used by Pacific Energy Resources Ltd. (PER) to monitor offshore platforms in California and Anchorage and to detect oil leaks. The indictment papers allege that Azar's actions affected the "integrity and availability" of the system and resulted in it becoming temporarily unavailable. Though no oil spill or environmental hazard occurred while the system was compromised, Azar's actions caused thousands of dollars in damage, the indictment said.
  • 17. Cyber criminals targeting energy (15 March 2009). Based on an analysis of more than 240 billion requests submitted for analysis by its corporate customers, the ScanSafe Annual Global Threat Report 2008 found nearly 600% malware growth between like quarters in 2007 and 2008, and a 300% volume ratio increase from January 2008 through December 2008. A vertical industry analysis of malware growth found the energy and oil sector to rank in the top five targets in all threat categories. But energy and oil leads the pack by a long shot when it comes to one important category: encounters with unique new variants of data theft Trojans. With advances in the technology and sophistication of cyber attacks, malware delivered through the web can be remotely customized and configured once in place, based on the victim’s identity.
  • 18. What do the Authorities Say? How to implement these activities and processes is not prescribed by the recommended practice. The recommended practice is primarily focused on the ‘what to do’. DNV RP D-201 Recommended Practice for Integrated Control Systems
  • 19. How can Software become Safer? (1) Awareness of Development Life Cycle; (2) Software Configuration Management (SCM); (3) Failure Mode and Effects Criticality Analysis (FMECA)
  • 20. Athens Group Deliverables Life Cycle Model [Figure: life cycle diagram. Phase labels: Design, Construction, Acceptance, Operation. Activity labels: Concept Definition, System Requirements, Requirements Validation, Design Verification, Acceptance Planning, Commissioning Planning, Deployment, Operations, Maintenance. Hardware Path: Hardware Requirements, Preliminary Design, Detailed Design, Module Development, Unit Test, Integration and Test. Software Path: Software Requirements, Preliminary Design, Detailed Design, Coding, Unit Test, Integration and Test. Other labels shown on the diagram: Contractual Software Standards, Vendor Software Process Assessment, Vendor Management, Software Change Management, Alarm Management, FMECA (three times), Factory Acceptance Testing, Controls & Network Commissioning, Integrated System Testing, Startup Support, Troubleshooting and Remediation.]
  • 22. Failure Mode and Effects Criticality Analysis (FMECA)
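An FMECA lists failure modes, their effects, and a criticality ranking so that the worst combinations are addressed first. The Python sketch below shows one simple way such a ranking can be computed, using the product of a severity score and a likelihood score. The scoring scales and every score assigned here are assumptions made for illustration only; they are not from the slides, and formal FMECA standards define their own criticality measures. The failure-mode descriptions paraphrase incidents from slides 8, 10, and 12.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    description: str
    severity: int    # 1 (negligible) to 5 (catastrophic), illustrative scale
    likelihood: int  # 1 (remote) to 5 (frequent), illustrative scale

    @property
    def criticality(self) -> int:
        """Simple severity x likelihood score used only for ranking."""
        return self.severity * self.likelihood


def rank(modes):
    """Return failure modes ordered from most to least critical."""
    return sorted(modes, key=lambda m: m.criticality, reverse=True)


# All scores below are invented for the example.
modes = [
    FailureMode("Monitoring message interpreted as unlock command", 5, 2),
    FailureMode("Controller reset opens all clamps", 5, 3),
    FailureMode("Untracked parameter change starts top drive", 4, 3),
]
for mode in rank(modes):
    print(mode.criticality, mode.description)
```

The value of the exercise lies less in the arithmetic than in being forced to enumerate operational states and messages, which is where each of the incidents above went unexamined.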
  • 23. In Conclusion: You can, and MUST, make Software Safer. (1) Awareness of Development Life Cycle; (2) Software Configuration Management (SCM); (3) Failure Mode and Effects Criticality Analysis (FMECA)
  • 24. Don Shafer, CSDP Chief Technology Officer 5608 Parkcrest Drive, Suite 200 Austin, Tx 78731 don.shafer@athensgroup.com www.athensgroup.com 512.345.0600 x117
  • 25. References
      ◦ NORA Symposium 2008: Public Market for Ideas and Partnerships - http://www.cdc.gov/niosh/nora/symp08/posters/035.html
      ◦ Fatalities Among Oil and Gas Extraction Workers --- United States, 2003 - 2006 - http://www.cdc.gov/mmwr/preview/mmwrhtml/mm5716a3.htm
      ◦ Therac 25 - http://sunnyday.mit.edu/papers/therac.pdf
      ◦ Air France 2009 - http://www.computerweekly.com/Articles/2009/06/01/236245/air-france-crash-thought-to-be-caused-by-system-failure.htm
      ◦ Air France 2009 - http://www.computerweekly.com/Articles/2009/06/16/236447/air-france-airbus-pitot-sensor-linked-to-two-fatal-crashes.htm
      ◦ 8 Software Related Death Incidents - http://www.baselinemag.com/c/a/Projects-Processes/Eight-Fatal-SoftwareRelated-Accidents/
  • 26. Speaker Bio: Don Shafer, CSDP, developed Athens Group's oil and gas practice and leads Athens Group engineers in delivering superior rig software services, as well as oil and gas exploration, production and pipeline monitoring systems, for clients such as BP, Noble, Transocean, Maersk, ExxonMobil, ConocoPhillips and Shell. Prior to co-founding Athens Group, Don led groups developing and marketing hardware and software products for Motorola, AMD and Crystal Semiconductor. He was responsible for managing a $129 million-a-year PC product group that produced the award-winning audio components for Apple. From the development of low-level software drivers in yet-to-be-released Microsoft operating systems to the selection and monitoring of Taiwan semiconductor fabrication facilities, Don has led key product and process efforts. Don earned a BS degree from the USAF Academy and an MBA from the University of Denver. Treasurer of the IEEE Computer Society Board of Governors, Past Editor-in-Chief of the IEEE Computer Society Press, IEEE Senior Member and software engineering book series author, Shafer is an adjunct professor in the Cockrell School of Engineering at the University of Texas at Austin. An avid writer, Don has contributed to three books, written over 20 published articles, and is co-author of Quality Software Project Management, recently released by Prentice-Hall. He is a contributor to the 2010 edition of the multi-volume Encyclopedia of Software Engineering. His latest patents are in state-based machine control.
  • 28. Offices in Houston and Austin, TX
  • 29. Pioneered Drilling Technology Assurance℠ (DTA) Services
  • 31. Over 70% have completed more than one project with us

Editor's Notes

  1. The fourth-generation semisubmersible had been in commissioning for six months when an unforeseen amount of network latency appeared. Neither network simulation nor FMEA had been done prior to build. Of the five rig networks, none had compatible timings or packet sizes.
  2. http://blogs.epmag.com/kevin/2009/03/15/why-are-cyber-criminals-targeting-energy-and-oil/ “Why are cyber criminals targeting energy and oil?”, March 15th, 2009. Recent evidence indicates that the nature of cyber crime is changing, even as it increases, and may pose a particular threat to intellectual property and confidential data in the energy and oil sectors. Less clear is what the intent of these malware attacks actually is. The ScanSafe Annual Global Threat Report 2008 finds, based on an analysis of more than 240 billion requests for analysis by the company’s corporate customers, that there was near 600% malware growth between like quarters in 2007 and 2008, and a 300% volume ratio increase from January 2008 through December 2008. Moreover, a vertical industry analysis of malware growth found the energy and oil sector to rank in the top five targets in all threat categories. But energy and oil leads the pack by a long shot when it comes to one important category: encounters with unique new variants of data theft Trojans. On average, companies included in the analysis, said ScanSafe, encountered 57 unique new variants of data theft Trojans in the first three quarters of 2008. In the energy and oil sector, however, that number was 213, an elevated exposure of nearly 400%. Most malware gains entrance to a corporate network through user visits to compromised sites, which are increasingly hard to detect. It is ScanSafe’s thesis that as the global economy trends downward, cyber crime is trending sharply upward.
     And while providing little evidence other than these alarming statistics, it further asserts that “today’s malware can only be described as a massive criminal data harvesting operation, designed to steal intellectual property or confidential data and sell it to the highest bidder.” While the most obvious targets of these types of attacks are confidential employee or customer information, the vertical industry distribution noted by ScanSafe (with other leading industry targets including engineering & construction, manufacturing, and IT & telecommunications) might suggest otherwise. Criminals have gained another powerful advantage. With advances in the technology and sophistication of cyber attacks, malware delivered through the web can be remotely customized and configured once in place, based on the victim’s identity. “For the enterprise,” says ScanSafe, “such an infection will likely be configured to steal intellectual property and potentially to eavesdrop on all network transmissions via ARP poisoning or other man-in-the-middle attacks.” Unfortunately, while in some countries, including the US, companies are legally obligated to disclose evidence of cyber theft related to employee or customer information, they are not obliged to do so as it relates to intellectual property and other types of sensitive business information. That makes it difficult to know how much and what type of business information is being stolen across the energy and oil industry. In general, the specifics of criminally constructed malware, its nature and intent, have become the realm of highly trained specialists, with a vocabulary impenetrable to the educated generalist. To take just one example, categories of malware include exploit & Iframe; backdoor & PWS; download dropper; rogue scanner; Trojan-general; redirector; and virus & worm.