Achieving Capital Asset Reliability, Availability and Maintainability
Achieving Reliability, Availability and Maintainability (RAM) requires a deep engagement by the owner and its engineering teams from concept through completion. This article describes some of the systems and processes necessary to help assure an owner of the highest possible performance and return on its capital investment.
1. INTRODUCTION
According to independent studies, more than half of large-scale projects, including oil and gas facilities, refineries, mines, drilling platforms, chemical plants, large dams, bridges and other civil works, have poor results: billions of dollars in cost overruns, significant delays in design and construction, and poor operability once they are finally completed.1 Reliability is not only a measure of the uptime or downtime of an operating asset. It is also a measure of the predictability of the financial performance of a capital asset investment, which is directly affected by the time and cost required to place that asset into service and to ramp up its production.
Adherence to budget and adherence to schedule are two main project performance indicators. However, the reliability and maintainability of the completed project are also intertwined with the project’s overall performance. The end goal is to replace reactive failure management with proactive failure analysis.
A focused approach to achieving RAM during the design and construction phases can help prevent costly changes and delays during the project. It also helps achieve a seamless integration between engineering, construction, commissioning and ramp-up.
Reliability is the ability of a system or component to function under stated conditions for a specified period. Quality, reliability, and safety are not measured and evaluated by mathematics and statistics alone. Quantitative methods cannot always predict or assess the magnitude of a failure.
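As a minimal illustration of the definition above, reliability over a stated period can be sketched under the common simplifying assumption of a constant failure rate (the exponential model). The function name and the example figures are hypothetical and are not drawn from any project.

```python
import math


def reliability(mission_hours: float, mtbf_hours: float) -> float:
    """Probability of running for mission_hours without failure,
    assuming a constant failure rate (exponential model)."""
    if mtbf_hours <= 0:
        raise ValueError("mtbf_hours must be positive")
    if mission_hours < 0:
        raise ValueError("mission_hours must not be negative")
    return math.exp(-mission_hours / mtbf_hours)


# Hypothetical pump: 10,000-hour MTBF, 1,000-hour operating campaign
print(f"{reliability(1_000, 10_000):.3f}")
```

A figure of this kind is only one input to the assessment. As noted above, quantitative methods cannot always predict or assess the magnitude of a failure.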
Designing for reliability is a function of requirement specifications, hardware and software design, functional failure analysis, testing and analyzing, manufacturing, maintenance, transport, storage, spare parts stocking, operations research, human factors and technical documentation.
Traditional reliability engineering focuses on the cost of failure: system downtime, spares, repair equipment, personnel, and warranty claims. Safety engineering normally focuses not on cost but on preserving life and the natural environment, and therefore deals only with particularly dangerous system failure modes. High reliability levels are the result of good engineering and attention to detail, and almost never the result of reactive failure management alone.
2. PRE-FEASIBILITY AND FEASIBILITY PHASE
The owner’s operational group should engage with its internal and external engineering teams to perform conceptual level pre-feasibility and feasibility phase analyses as required to support the corporate objectives to be met by the project. These analyses may include:
- Creating a managerial and business platform for the project;
- Developing a project charter, project objectives, project goals and/or project values statement that will serve as a guide to all subsequent activities on the project;
- Evaluating and utilizing the most effective project contracting methods for engaging engineering, procurement, construction and commissioning services, taking into consideration the type and size of the project, market conditions for these services, appropriate risk allocation between the project parties, and capabilities of the owner’s organization. This includes evaluating alternative approaches such as design-bid-build; engineering, procurement and construction (EPC); engineering, procurement and construction management (EPCM); and mixed approaches, and considering risk allocation practices and methods such as lump sum, cost plus, and guaranteed maximum price, as well as the inclusion of an operating performance guarantee; and
- Establishing best practices to use for various project processes, governing structures, stage-gated process execution, effective project organization structures, and effective use of a project governing board or steering committee, to the extent these processes and systems are not already in place.
3. PRE-DESIGN PHASE
During the pre-design phase, a reliability program plan should be established, and a design intent document should be developed.
3.1 ESTABLISH A PLAN FOR A RELIABILITY PROGRAM
A reliability program plan should be established to document the best practices that will be used to evaluate various systems during design of the project and clarify the owner’s requirements for reliability assessments related to facility uptime. The program should identify the various inputs to the reliability assessment during project design and development, such as: quantification of downtime costs in terms of lost income and increased expenses; determination of the value proposition for quantifying spare parts levels and redundant systems and components; and the provision of “fail safe” components and systems within the project. The reliability program plan should also include definitions of the quantitative system reliability parameters, such as the values and assumptions for mean time between failures (MTBF), mean time to failure (MTTF), and probability of failure on demand (PFD), as well as reliability parameters for categories of components to be later included in the reliability block diagrams.
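The quantitative inputs named above can be recorded in a simple, reviewable form. The sketch below is illustrative only: the class and function names, the mean time to repair (MTTR) input, and all numbers are assumptions introduced for the example. The PFD expression is the usual simplified formula for a single channel and holds only when the failure rate multiplied by the proof test interval is much less than one.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 8760


@dataclass(frozen=True)
class ComponentParameters:
    """Reliability inputs for one category of component (hypothetical values)."""
    name: str
    mtbf_hours: float  # mean time between failures
    mttr_hours: float  # mean time to repair (assumed input, not named in the text)

    def availability(self) -> float:
        """Steady-state availability = MTBF / (MTBF + MTTR)."""
        return self.mtbf_hours / (self.mtbf_hours + self.mttr_hours)


def pfd_avg_single_channel(lambda_du_per_hour: float,
                           proof_test_interval_hours: float) -> float:
    """Simplified average PFD of one channel: lambda_DU * TI / 2."""
    return lambda_du_per_hour * proof_test_interval_hours / 2


def annual_downtime_cost(availability: float, lost_margin_per_hour: float) -> float:
    """Expected yearly lost income caused by unavailability."""
    return (1 - availability) * HOURS_PER_YEAR * lost_margin_per_hour


crusher = ComponentParameters("primary crusher", mtbf_hours=950, mttr_hours=50)
print(f"availability: {crusher.availability():.3f}")
print(f"downtime cost per year: {annual_downtime_cost(crusher.availability(), 20_000):,.0f}")
print(f"PFDavg: {pfd_avg_single_channel(2e-6, HOURS_PER_YEAR):.2e}")
```

Keeping the assumptions in one structure makes it easier to apply the same values later in the reliability block diagrams.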
Reliability tasks should be scheduled so that they do not cause major delays in the progress of project design. The reliability program plan should identify which tasks will be performed continuously during the design and which will be performed at intervals. Interval tasks are often performed after certain engineering milestones, such as upon:
- Definition of base data for the project inputs (ore, crude, fuel or other raw material);
- Completion of laboratory or bench tests on material samples;
- Completion of process flow diagrams;
- Completion of piping and instrumentation diagrams (P&IDs);
- Completion of general site arrangement drawings;
- Completion of equipment specifications and receipt of vendor data;
- Completion of electrical one-line diagrams;
- Completion of design development; and
- Completion of final design.
The reliability program plan should include an estimate of the duration and labor hours required for each reliability review. The plan should also define what steps will be taken during the design and development of the project to help assure the reliability of the project execution program to deliver the desired financial performance to the owner, including a sensitivity analysis which relates project time and cost to their impact on the desired project return on investment (ROI). The elements of this plan should outline steps and processes to assure that the project itself will be completed on time and within budget. Capital cost reliability, schedule reliability, and performance reliability (the ability of the project to deliver output at the design level) are interrelated with the entire life cycle of the project, and its expected ROI.
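A sensitivity analysis of the kind described above can start from a very small model. The sketch below assumes an undiscounted ROI over a fixed horizon measured from the planned start-up date, so that a delay shortens the producing period. The function name and all figures are hypothetical; a real analysis would normally use discounted cash flows and a ramp-up curve.

```python
def simple_roi(capex: float, annual_net_cash: float, horizon_years: float,
               cost_overrun: float = 0.0, delay_years: float = 0.0) -> float:
    """Undiscounted ROI over a fixed horizon.

    cost_overrun is a fraction of the budget (0.25 means 25% over budget);
    delay_years is start-up delay, which removes producing time from the horizon.
    """
    actual_capex = capex * (1 + cost_overrun)
    producing_years = max(horizon_years - delay_years, 0.0)
    return (annual_net_cash * producing_years - actual_capex) / actual_capex


# Hypothetical project: 1,000 capital cost, 200 net cash per year, 10-year horizon
for overrun in (0.0, 0.10, 0.25):
    for delay in (0.0, 0.5, 1.0):
        roi = simple_roi(1_000, 200, 10, cost_overrun=overrun, delay_years=delay)
        print(f"overrun {overrun:>4.0%}  delay {delay:.1f} y  ROI {roi:6.1%}")
```

Tabulating the result across overrun and delay cases shows how quickly capital cost and schedule slippage erode the expected return.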
3.2 DESIGN INTENT DOCUMENT
The owner and its internal and external engineering teams should develop a design intent document (DID) which identifies the inputs, outputs, and performance expectations of the project, including a quantitative definition of its expected reliability or availability. This narrative should explain how the proposed designs will respond to the owner’s project requirements and how the systems will operate. The DID typically includes quantifiable systems analysis. Design values should be updated as modifications are made throughout the planning and project delivery processes.
4. DESIGN PHASE
Reliability planning during the design phase may include project investment readiness evaluation, reliability modeling, software reliability testing, project integration, and risk assessments.
4.1 PROJECT INVESTMENT READINESS EVALUATIONS
At appropriate intervals during the feasibility and design phases, project evaluations should be conducted against predefined owner criteria to determine project readiness for the next design phase, or full funding and authorization. The measures and milestones of project readiness should be identified in the approved reliability program plan and should be used by the owner and its supporting team to focus the resources where and when needed to accomplish the project goals, and to complete the pre-authorization activities in the most efficient and effective manner.
4.2 RELIABILITY MODELING
During the design phase, reliability modeling is used to assess complete system availability as early in the design process as the necessary information becomes available. At a minimum, this task utilizes fault tree analysis and reliability block diagram processes. The software platform and quantitative methods used for this analysis should be identified in the reliability program plan. Using the owner-defined parameters and requirements, the owner’s engineer or consultant recommends design changes where redundancy, in-line surge capacity, compartmentalization, design safety factors, or equipment selection can be modified to achieve the optimum balance between capital cost, uptime, and ROI. The results of these analyses should be submitted, reviewed, and approved before the engineering progresses to the next stage in the sequence. Performing the reliability modeling during the design phase requires access to, and collaboration among, the owner, engineers, equipment vendors and other providers involved in the project.
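The arithmetic behind a reliability block diagram can be sketched in a few lines. The example assumes independent failures and active (parallel) redundancy; the block availabilities and function names are hypothetical and are not a substitute for the modeling platform named in the reliability program plan.

```python
from math import prod


def series(*availabilities: float) -> float:
    """All blocks must be up: availabilities multiply."""
    return prod(availabilities)


def parallel(*availabilities: float) -> float:
    """At least one block must be up: unavailabilities multiply."""
    return 1 - prod(1 - a for a in availabilities)


# Hypothetical train: crusher -> pump -> mill
single_pump = series(0.98, 0.95, 0.97)
# Same train with a second, redundant pump
redundant_pumps = series(0.98, parallel(0.95, 0.95), 0.97)

print(f"single pump:     {single_pump:.4f}")
print(f"redundant pumps: {redundant_pumps:.4f}")
```

Comparing the two results against the capital cost of the additional pump is one simple way to frame the balance between capital cost, uptime, and ROI.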
4.3 SOFTWARE RELIABILITY TESTING
When a process or system involves software and instrumentation, the software should undergo reliability testing as a distinct task within the commissioning umbrella. This testing is usually performed by specialized engineering personnel.
4.4 PROJECT INTEGRATION
Project integration concerns the interface between the owner’s operating group and the project management group, as well as communication between the engineering disciplines and geographic locations of the project. It involves managing communication and mutual expectations within and between the owner and engineer organizations, and it affects the ability of the engineering and construction organization(s) to meet the overall goals and objectives of the project. During project development, it is necessary to identify proactive steps to enhance integration and to monitor the quality of project integration to assure that reliability risks are mitigated.
4.5 RISK ASSESSMENTS, RISK AND HAZARD REDUCTION PLANS
Establishing safety integrity levels (SILs) for various elements and systems during design is a key project element. The owner’s operating group, and its internal and external engineers, should identify safety hazards in the design through a hazard and operability study (HAZOP), and assist in the review of hazard mitigation options including: (i) elimination; (ii) substitution; (iii) engineering controls; (iv) administrative controls; and (v) personal protective equipment (PPE).
The parties should utilize applicable standards such as ANSI/ISA 84.00.01-2004 Parts 1-3 (IEC 61511 Mod) and OSHA’s Process Safety Management Standard, 29 CFR 1910.119, to review and comment on the design of safety instrumented systems (SIS), engineered hazard mitigations, management plans, substitutions, and PPE options. A Risk Management Plan should also be prepared in accordance with applicable environmental standards such as EPA 40 CFR 68 (Chemical Accident Prevention Provisions) and integrated into the design and asset operational plan.
Based upon the HAZOP, the owner and its engineering team should audit, assess, identify and recommend where safety critical elements (SCEs) are needed to prevent, control, mitigate or respond to a major accident event, in accordance with the SIL guidelines for the project. Once the SCEs are identified, the engineering team should review, prepare and implement detailed performance standards, critical function tests, maintenance plans, documentation, training, compliance strategies, and other tasks to integrate the SCEs and SISs with other maintenance and operations tasks.
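For safety functions operating in low-demand mode, the SIL is conventionally read from the average probability of failure on demand (PFDavg) using the bands published in IEC 61508 and IEC 61511. The lookup below is a minimal sketch of that mapping; the function name is hypothetical, and the project's own SIL guidelines govern how a target SIL is actually assigned.

```python
from typing import Optional

# Low-demand mode bands: (SIL, inclusive lower bound, exclusive upper bound) on PFDavg
_SIL_BANDS = (
    (4, 1e-5, 1e-4),
    (3, 1e-4, 1e-3),
    (2, 1e-3, 1e-2),
    (1, 1e-2, 1e-1),
)


def sil_for_pfd_avg(pfd_avg: float) -> Optional[int]:
    """Return the SIL band containing pfd_avg, or None if it lies outside the bands."""
    for sil, lower, upper in _SIL_BANDS:
        if lower <= pfd_avg < upper:
            return sil
    return None


for value in (0.05, 8.76e-3, 5e-4, 0.5):
    print(value, sil_for_pfd_avg(value))
```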
4.6 SUMMARY OF RELIABILITY STRATEGIES
During the above design phase tasks, options to improve reliability and mitigate risks in the design include, but are not limited to, the following:
- Spares and spare parts inventory based on lead time availability;
- Design safety factors;
- Safety instrumented systems (SIS);
- Block and bypass valves;
- Corrosion resistance standards; and
- Surge capacity / storage.
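The first item in the list above, spares stocked according to lead time, lends itself to a short worked example. The sketch assumes that failures during the replenishment lead time follow a Poisson distribution and finds the smallest stock that meets a target service level. The function name, the upper limit on expected demand, and the pump figures are hypothetical.

```python
import math


def min_spares(expected_demand: float, service_level: float) -> int:
    """Smallest stock s such that Poisson P(demand <= s) >= service_level.

    expected_demand is the mean number of failures during the lead time.
    """
    if not 0 <= expected_demand <= 500:
        raise ValueError("expected_demand must be between 0 and 500")
    if not 0 < service_level < 1:
        raise ValueError("service_level must be between 0 and 1 (exclusive)")
    term = math.exp(-expected_demand)  # P(demand = 0)
    cumulative = term
    stock = 0
    # Stop when the target is met, or if the terms underflow to zero
    # (only possible for service levels within rounding error of 1).
    while cumulative < service_level and term > 0.0:
        stock += 1
        term *= expected_demand / stock  # P(demand = stock)
        cumulative += term
    return stock


# Hypothetical: 4 identical pumps, 16,000-hour MTBF each, 2,000-hour lead time
expected = 4 * 2_000 / 16_000
print(min_spares(expected, 0.95))
```

Longer lead times raise the expected demand and therefore the stock required for the same service level, which is why lead time availability drives the spares inventory.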
5. CONSTRUCTION AND COMMISSIONING PHASE
The owner should make a careful assessment of its internal resources and select a strategy for Pre‑Operational Verification, Commissioning, and Start-up, considering options for a third-party commissioning service provider. By integrating commissioning with reliability engineering in a third-party arrangement, an owner that may lack an experienced internal project team can obtain the highest assurance of performance according to project objectives. The commissioning work commences at the beginning of the project, and typically includes the following tasks, goals, and deliverables:2
- Review the construction documents at the various stages of development for consistency with the approved guidelines, schematics, process flow diagrams, P&IDs, and general arrangement drawings. Flag all areas where there are critical inconsistencies that have not been approved through the change process. The relevant documents typically include the project manual (specifications), plans (drawings), and general terms and conditions of the design contract.
- Prepare a commissioning plan that outlines the organization, schedule, allocation of resources, and documentation requirements of the commissioning process. The plan should demonstrate a set of strategies and tactics to verify and document that the facility and all its systems and assemblies are planned, designed, installed, tested, operated, and maintained to meet the owner’s project requirements.
- Issue formal acceptance opinions and actions to declare that the predefined aspects of the project meet defined requirements, thus permitting subsequent activities to proceed.
- Develop and record the concepts, calculations, decisions, and product selections to meet the owner’s project requirements and satisfy applicable regulatory requirements. The written record should include both narrative descriptions and lists of individual items that support the design process.
- Develop and monitor verification checklists during all phases of the commissioning process to verify that the owner’s project requirements are being achieved. This includes checklists for general verification, testing, training, and other specific requirements.
- Prepare periodic commissioning progress reports which detail the activities completed as part of the commissioning process and significant findings from those activities. The commissioning status should be continuously updated during the project.
- Prepare a final commissioning report, which includes records of the activities and results of the commissioning process in accordance with the commissioning plan.
- Verify that the DID is being followed, so that the proposed designs respond to the owner’s project requirements and reflect how the project is to operate. The DID includes quantifiable systems analysis and design values, and should be updated as design modifications are made throughout the project delivery process.
- Verify that specific documents, components, equipment, assemblies, systems, and interfaces among systems are confirmed to comply with the criteria described in the owner’s project requirements.
- Create an ongoing record of any problems or concerns that have been raised by members of the commissioning team, as well as problem resolutions.
- Develop and implement a process for sampling and evaluating the efficacy of the contractor’s or engineer’s quality tests and measurements in accordance with the project’s quality assurance/quality control program. Review the program for consistency with the reliability goals of the project. The sample size, location and frequency are based upon: a known or estimated probability distribution of expected values; an assumed statistical distribution based upon data from a similar product, assembly, or system; or a random sampling that has a scientific statistical basis.
- Review the systems manuals supplied by the equipment vendors, and integrate them into a system-focused composite document that includes the operation manual, maintenance manual, and additional information of use to the owner during occupancy and operations.
- Prepare and oversee the execution of test procedures, which are written protocols that define methods, personnel, and expectations for tests conducted on components, equipment, assemblies, systems, and interfaces among systems.
- Prepare and deliver training programs, including written documents that detail the expectations, schedule, budget, and deliverables of the commissioning process related to training of the operating and maintenance personnel.
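The list above calls for sampling of the contractor’s or engineer’s quality tests on a statistical basis. One simple option, shown here as an assumption rather than a method prescribed by the article, is a zero-failure (success-run) plan: if every item in a random sample of size n passes, the true conformance rate is at least the stated minimum with the stated confidence. The function name is hypothetical.

```python
import math


def zero_failure_sample_size(min_conformance: float, confidence: float) -> int:
    """Sample size n such that n passes out of n demonstrates min_conformance
    at the given confidence: n = ln(1 - confidence) / ln(min_conformance)."""
    if not (0 < min_conformance < 1 and 0 < confidence < 1):
        raise ValueError("both arguments must be between 0 and 1 (exclusive)")
    return math.ceil(math.log(1 - confidence) / math.log(min_conformance))


# Demonstrate at least 95% conforming welds with 95% confidence
print(zero_failure_sample_size(0.95, 0.95))
```

If any sampled item fails, the plan no longer supports the claim, and a larger sample or a plan that tolerates failures is needed.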
6. POST-COMMISSIONING AND PRODUCTION RAMP-UP
Reliability processes should continue throughout operations to verify that the project continues to meet current and evolving requirements. In best practice, commissioning process activities are ongoing over the life of the facility.
6.1 PERFORMANCE OPTIMIZATION – LEAN SIX SIGMA
During production ramp-up, the owner and engineer should implement performance optimization management strategies including, but not limited to, the following tasks:
- Provide the owner’s maintenance and operations team with operator training, in conjunction with the major equipment and process vendors.
- Review and refine the goals related to operational readiness, operational production ramp-up, and operational reliability, and implement programs to assure that the production goals are met.
- Conduct performance optimization studies and develop processes to achieve the optimum production rate and throughput, by eliminating non-value-added activities, streamlining operational processes, and implementing human resource strategies to align accountabilities with operational processes.
- As needed, perform failure modes and effects analysis (FMEA) and failure modes, effects and criticality analysis (FMECA) to identify and eliminate unanticipated bottlenecks and process issues which may arise during ramp up.
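For the FMEA task in the list above, a common way to prioritize failure modes is the risk priority number (RPN), the product of severity, occurrence, and detection scores. The sketch below assumes the conventional 1 to 10 scales; the class name, the scores, and the failure modes are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class FailureMode:
    description: str
    severity: int    # 1 (negligible) to 10 (catastrophic)
    occurrence: int  # 1 (remote) to 10 (frequent)
    detection: int   # 1 (almost certain to be detected) to 10 (undetectable)

    def __post_init__(self) -> None:
        for score in (self.severity, self.occurrence, self.detection):
            if not 1 <= score <= 10:
                raise ValueError("scores must be between 1 and 10")

    @property
    def rpn(self) -> int:
        """Risk priority number = severity * occurrence * detection."""
        return self.severity * self.occurrence * self.detection


def rank_by_rpn(modes: Iterable[FailureMode]) -> List[FailureMode]:
    """Highest-risk failure modes first."""
    return sorted(modes, key=lambda mode: mode.rpn, reverse=True)


modes = [
    FailureMode("conveyor belt misalignment", severity=4, occurrence=6, detection=3),
    FailureMode("slurry pump seal leak", severity=6, occurrence=5, detection=4),
    FailureMode("mill liner premature wear", severity=7, occurrence=3, detection=6),
]
for mode in rank_by_rpn(modes):
    print(mode.rpn, mode.description)
```

The ranking is a screening aid. High-severity modes usually warrant review regardless of their RPN.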
6.2 ISO STANDARDS
During the design and construction phase of petroleum, natural gas and petrochemical projects, reliability and maintenance data should be collected and catalogued in accordance with the International Organization for Standardization (ISO) Standard 14224. This data can be used to support the development of operations and maintenance protocols during operation.
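Once reliability and maintenance data have been catalogued, simple point estimates can feed the operations and maintenance protocols. The sketch below uses a deliberately minimal, hypothetical record structure; it does not reproduce the ISO 14224 data taxonomy, and the figures are illustrative.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class FailureRecord:
    equipment_tag: str   # hypothetical identifier
    repair_hours: float  # active repair time for this failure


def estimate_mtbf_mttr(records: Sequence[FailureRecord],
                       operating_hours: float) -> Tuple[float, float]:
    """Point estimates: MTBF = operating hours / number of failures,
    MTTR = mean repair time."""
    if not records:
        raise ValueError("at least one failure record is required")
    failures = len(records)
    mtbf = operating_hours / failures
    mttr = sum(record.repair_hours for record in records) / failures
    return mtbf, mttr


records = [FailureRecord("P-101A", hours) for hours in (10, 20, 30, 20)]
mtbf, mttr = estimate_mtbf_mttr(records, operating_hours=8_000)
print(f"MTBF {mtbf:.0f} h, MTTR {mttr:.0f} h")
```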
The owner should also perform its asset management functions in accordance with ISO Standard 55000, first published in 2014.
7. CONCLUSION
Achieving capital asset Reliability, Availability and Maintainability (RAM) requires significant effort by the owner and its engineering teams from project concept through completion. Utilizing the systems and processes described in this article can help the owner to achieve higher project performance and return on its capital investment by replacing reactive failure management with proactive failure analysis.
About the Author
Michael J. Vallez, P.E., M.B.A., is a Senior Principal with Long International and has over 40 years of hands-on and leadership experience in project management, engineering/construction management, cost and schedule control, change management, claims, dispute resolution, and mine and process engineering. He has served in executive management roles in industry, including both the owner and contractor sides with companies and contractors working on world-class projects for oil and gas companies, power companies, international mining companies, and other institutions. He has a proven ability to organize and integrate the work of multi-disciplined technical specialists and project construction teams to achieve corporate financial goals and objectives of ROI, safety, operational performance, cost, and time. In all, he has provided leadership on several billion dollars’ worth of projects in the mining, power, oil and gas, industrial, heavy civil and commercial sectors. Mr. Vallez has written several books on the subjects of construction management, safety, and effective project leadership. Mr. Vallez is based in the Salt Lake City, Utah area and can be contacted at firstname.lastname@example.org or (801) 502-0951.
1 See, e.g., Correcting the Course of Capital Projects: Plan Ahead to Avoid Time and Cost Overruns Down the Road, PricewaterhouseCoopers, 2013.
2 These steps have been verified against the International Code Council (ICC) 1000 Standards for Commissioning.
Copyright © 2022 Long International, Inc.