Role of IT in Event Data Collection for Reliability Analysis
Copyright Meridium, Inc. 2004 All rights reserved. Printed in the U.S.A.
All rights including reproduction by photographic or electronic process and translation into other languages of this material are fully reserved under copyright laws. Reproduction or use of this material in whole or in part in any manner without written permission from Meridium, Inc. is strictly prohibited.
Meridium is a registered trademark of Meridium, Inc. All trade names referenced are the service mark, trademark, or registered trademark of the respective manufacturer. MERIDIUM | Role of IT in Event Data Collection for Reliability Analysis 1
Introduction
Many companies today are striving to become more competitive by optimizing costs and increasing productivity through the improvement of work processes. This corporate streamlining is changing the way that people work, how they communicate, how decisions are made and how the daily activities of workers are scheduled. Improvements over the years in information technology have revolutionized both worker productivity and machinery up-time. The role of information technology is critical for this maintenance optimization revolution because it gives decision makers a common framework for fact-based analysis aligned to corporate goals and strategies.
Another key aspect of maintenance optimization is uniformity in the content and methodology of decision making. This uniformity can only occur when information technologies are used efficiently and effectively for condition monitoring, reliability analysis, and prognostic evaluation of data. With increased software capability and information availability, new and more efficient ways of querying data for analysis become increasingly important. Lack of predictability negatively affects company performance because equipment failures directly affect production costs. Production costs are, in many industries, the dominant cost category that controls a company's competitive position in the marketplace. Management's ability to control these costs relies on accurate data that is properly interpreted and acted upon.
Many companies have found that the barrier to this analysis is accessing the data already collected, and combining it with data collected from other parts of the organization. Capacity utilization improvements come from a deeper understanding of the causes and impact of unreliability. Maintenance optimization is an evaluation process that examines current functions, tasks, and activities to achieve the proper investment balance between production goals, cost optimization and safety/risk. In doing so, manufacturers must become experts in predicting and preventing failures. This can only be achieved through a fundamental understanding of the predominant failure modes of the equipment. By understanding the failure mode and looking at the probability of failure for a particular sub-component, the best judgment can be made regarding the appropriate long-term and short-term corrective actions. The term optimization implies a single point or goal of maximum plant production capacity at minimum cost. The goal of maintenance optimization would be to achieve the highest level of reliability for the least investment in parts and labor. By leveraging the investment in information technology, visibility into the bad actors is increased and predictable production begins to take shape.
To attain continuous improvement status, information systems need to provide sufficient analysis capability. However, before this analysis can take place, there has to be sufficient access to the data necessary to understand equipment reliability and develop initial asset management strategies.
Business Challenge
Collecting event data is a minimum requirement to measure the effectiveness of your current asset strategies. Without this information, those on the front line tend to allocate resources that focus on the problem of the day and as a result, do not have a systematic approach to removing failures from the production lines of their manufacturing environments. Having a complete set of event data for all equipment allows a much better defined view of how operators allocate human and capital resources to help operators break out of the reactive work cycle. What emerges is a systematic approach to eliminating failure that, over time, dramatically increases productivity and profitability.
Event data can come from a variety of sources such as a maintenance management system, predictive and inspection systems, as well as production systems. For this discussion, we will focus our attention primarily on collecting event data related to equipment that resides in a maintenance management system.
It is important to understand the reasoning behind the data collection effort before getting into the details of how it is actually accomplished. The collection of event data has a double benefit. The primary benefit of comprehensive event data is to alert process owners as to whether their asset strategies are effective. Once we identify ineffective strategies, we can use the same event data to drill down and determine what might be the cause(s) of the ineffective strategies.
Collecting Reliability Event Data to Predict and Prevent Failures
When an event occurs on a piece of equipment, it is critical to record what type of event actually occurred. For instance, was it a failure, a repair or a PM? What was the condition of the equipment at the time of the event? Once the work is completed, we need to record the technical findings such as the failed item, failure mode, cause and several other data elements that will be discussed in further detail. Some of the most critical information in the record of any event is the date and time stamps related to the event and the costs associated with that event (e.g. labor, material, contractor, production losses).
Different types of reliability event data:
- Work events that occur on equipment
- Type of work performed
- Conditions found at the time of work
- Technical findings after work is completed
- Dates/time associated with the work
- Cost associated with performing the work
Below is a list of the data, with supplemental descriptions, that we recommend collecting for a given event. This data will be used as the basis for compiling a balanced scorecard as well as the information required to find the underlying causes. A company's balanced scorecard is comprised of standardized, enterprise-wide performance measurements related to production assets. A balanced scorecard provides a holistic view of key performance indicators (KPIs) spanning multiple plants and allows management to make strategic, fact-based decisions with greater confidence.
Identification
- Event ID
- Event Type
- CMMS ID
- Functional Location
- Functional Location Hierarchy: Level 1 (Site), Level 2 (Area), Level 3 (Unit), Level ..., Level n (System)
- Equipment ID
- Equipment Name
- Equipment Category (Rotating)
- Equipment Class (Pump)
- Equipment Type (Centrifugal)
History
- Functional Loss
- Functional Failure (ISO Failure Mode)
- Effect
- Maintainable Item
- Condition
- Cause
- Maintenance Action
- Narrative
Dates
- Event Date
- Mechanically Unavailable Date/Time
- Mechanically Available Date/Time
- Mechanical Downtime
- Maintenance Start Date/Time
- Maintenance End Date/Time
- Time to Repair
Consequence
- Maintenance Cost
- Production Cost
Event ID - This is the unique identifier for each failure event.
CMMS ID - This is useful if you are using a computerized maintenance management system (CMMS) as the base data collection system for failure events.
Functional Location - The functional location is typically a "smart" ID that represents what function takes place at a given location (e.g. pump 01-G-0001 must move liquid X from point A to point B).
Functional Location Hierarchy - Functional hierarchy used to roll up metrics at various levels: Level 1 (Site), Level 2 (Area), Level 3 (Unit), Level ..., Level n (System).
Equipment ID - The Equipment ID is usually a randomly generated ID that reflects the asset that is in service at the functional location. The reason for a separate Equipment ID and Functional Location is that assets can move from place to place, while functional locations stay fixed.
Equipment Name - Name or description of the equipment, for identification purposes.
Equipment Category (e.g. Rotating) - Indicates the category of equipment the work was performed on, generally by discipline (Rotating, Fixed, Electrical, Instrument).
Equipment Class (e.g. Pump) - Indicates the class of equipment the work was performed on. Failure codes can be dependent on this value.
Equipment Type (e.g. Centrifugal) - Indicates the type of equipment the work was performed on. Failure codes can be dependent on this value.
Functional Loss - This indicates whether the equipment experienced a functional loss as part of this event. A functional loss can be defined as any of the following three types: (1) Complete Loss of Function, (2) Partial Loss of Function, (3) Potential Loss of Function
Functional Failure (ISO Failure Mode) - The symptoms of a failure, if one has occurred. Any physical asset is installed to fulfill a number of functions. The functional failure describes which function the asset is no longer able to fulfill.
Effect - The effect of the event on production, safety, the environment, or quality.
Maintainable Item - This is the actual component that was identified as causing the asset to lose its ability to serve (e.g. a bearing).
Condition - This indicates the type of damage found to the maintainable item. In some cases it also indicates the failure mechanism.
Cause - The general cause of the condition. This is not the root cause; it is recommended to use root cause failure analysis (RCFA) to assess root causes.
Maintenance Action - Corrective action performed to mitigate the damaged item.
Narrative - Long text description of work and suggestions for improvements.
Event Date - This is the date that the event was first observed and documented.
Mechanically Unavailable Date/Time - This is the date/time that the equipment was actually taken out of service either due to a failure or to the repair work.
Mechanically Available Date/Time - This is the date/time that the equipment was available for service after the repair work had been completed.
Mechanical Downtime - Difference between the Mechanically Unavailable Date/Time and the Mechanically Available Date/Time (in hours).
Maintenance Start Date/Time - This is the date/time that maintenance actually started working on the equipment.
Maintenance End Date/Time - This is the date/time that maintenance actually finished working on the equipment.
Time to Repair - This is the total maintenance time to repair the equipment.
Maintenance Cost - This is the total maintenance expenditure to rectify the failure. This could be company or contractor cost. This cost could be broken out into categories such as Material, Labor, Contractor, etc.
Production Cost - This is the amount of business loss associated with not having the assets in service. This cost includes Lost Opportunity, when an asset fails to perform its intended function and there is no spare asset or capability to make up the loss.
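As a rough illustration of how the date and cost fields above relate to one another, the following Python sketch models a single event record and derives Mechanical Downtime and Time to Repair from the time stamps. The class name, field names and sample values are hypothetical and not taken from any particular CMMS. The sketch also assumes that Time to Repair is the span between Maintenance Start and Maintenance End, and the combined total cost is only an illustrative convenience.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class EventRecord:
    """One equipment event, using a subset of the fields described above."""
    event_id: str
    equipment_id: str
    functional_location: str
    mech_unavailable: datetime   # Mechanically Unavailable Date/Time
    mech_available: datetime     # Mechanically Available Date/Time
    maint_start: datetime        # Maintenance Start Date/Time
    maint_end: datetime          # Maintenance End Date/Time
    maintenance_cost: float
    production_cost: float

    @staticmethod
    def _hours(start: datetime, end: datetime) -> float:
        # Reject records whose time stamps are out of order.
        if end < start:
            raise ValueError("end timestamp precedes start timestamp")
        return (end - start).total_seconds() / 3600.0

    @property
    def mechanical_downtime_hours(self) -> float:
        return self._hours(self.mech_unavailable, self.mech_available)

    @property
    def time_to_repair_hours(self) -> float:
        # Assumption: repair time is measured from maintenance start to end.
        return self._hours(self.maint_start, self.maint_end)

    @property
    def total_cost(self) -> float:
        # Illustrative only: maintenance cost plus production cost.
        return self.maintenance_cost + self.production_cost


# Hypothetical sample event.
event = EventRecord(
    event_id="EV-0001",
    equipment_id="EQ-12345",
    functional_location="01-G-0001",
    mech_unavailable=datetime(2004, 3, 1, 8, 0),
    mech_available=datetime(2004, 3, 2, 14, 0),
    maint_start=datetime(2004, 3, 1, 10, 0),
    maint_end=datetime(2004, 3, 2, 12, 0),
    maintenance_cost=4200.0,
    production_cost=15000.0,
)
print(event.mechanical_downtime_hours, event.time_to_repair_hours, event.total_cost)
# prints: 30.0 26.0 19200.0
```

Deriving the durations from the time stamps, rather than storing them as separately entered values, keeps the record internally consistent and surfaces out-of-order time stamps at entry time.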
Utilizing Effective Event Recording Codes
Having a work process to collect event information is only the first step in gathering accurate event history. Without a standardized list of codes to use in your event recording, the recorded data will be almost impossible to use for analysis. There are various resources for event recording codes that range from company-specific codes to international industry standards, including the one provided by ISO 14224. This is a standard that was developed for the oil and gas industry and was based on work done by the Offshore Reliability Data (OREDA) group.
This standard focuses on equipment as well as failure and maintenance data. It describes details related to equipment classes, types and boundaries. With respect to event recording, this standard defines codes, time stamps and remarks.
ISO 14224 covers a subset of equipment classes within the oil and gas industry, which are listed below.
- Combustion Engine
- Compressor
- Control Logic Unit
- Electric Generator
- Electric Motor
- Fire and Gas Detector
- Gas Turbine
- Heat Exchanger
- Process Sensor
- Pump
- Turboexpander
- Valve
- Vessel
Within these classes of equipment, there are specific codes that can be utilized to record equipment events:
- Method of detection
- Functional loss
- Failure mode
- Maintainable item
- Failure cause
- Maintenance activity
While these codes and equipment classes are an excellent start, there are additional equipment classes and code categories that are useful in fully documenting an equipment event. Therefore, additional equipment classes are offered below as supplements to the ISO 14224 standard.
- Agitator
- Boiler
- Fan-Blower
- Fired Heater
- Gas Turbine
- General Equipment
- Meter
- Instrumentation
- Piping
- NPV (Tank)
- Relief Device
- Steam Turbine
- Power Distribution
Similarly, additional code categories are offered to supplement the code categories within ISO 14224:
- Condition
- Effect
Below are example Activity codes derived from ISO 14224:
Code  Description
ADJ   Adjust
CHK   Check
CMB   Combination
INS   Inspection
MOD   Modify
OTH   Other
OVH   Overhaul
REP   Repair
RFT   Refit
RPL   Replace
SVC   Service
TST   Test
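One way to enforce a standardized code list at data entry is to validate each recorded code against the list before it is stored. The Python sketch below does this for the activity codes shown above. The function names are hypothetical, and a real system would presumably hold the lists in configuration tables, possibly per equipment class, rather than in source code.

```python
# Activity codes as listed above.
ACTIVITY_CODES = {
    "ADJ": "Adjust", "CHK": "Check", "CMB": "Combination", "INS": "Inspection",
    "MOD": "Modify", "OTH": "Other", "OVH": "Overhaul", "REP": "Repair",
    "RFT": "Refit", "RPL": "Replace", "SVC": "Service", "TST": "Test",
}


def validate_activity_code(raw: str) -> str:
    """Return the normalized code, or raise ValueError if it is not in the list."""
    code = raw.strip().upper()
    if code not in ACTIVITY_CODES:
        raise ValueError(f"unknown activity code: {raw!r}")
    return code


def count_by_activity(codes):
    """Tally validated activity codes, e.g. for a scorecard breakdown."""
    counts = {}
    for raw in codes:
        code = validate_activity_code(raw)
        counts[code] = counts.get(code, 0) + 1
    return counts


print(count_by_activity(["rpl", "REP ", "RPL", "ins"]))
# prints: {'RPL': 2, 'REP': 1, 'INS': 1}
```

Rejecting unknown codes outright, instead of silently mapping them to OTH, keeps free-text variants from diluting the history that later analysis depends on.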
Uses of Reliability Analysis for Maintenance Optimization
Reliability analysis conducted on the equipment failure data results in calculated values that are used to characterize plant equipment reliability. These values are used in many ways to improve and optimize asset performance:
Mean Time Between Failures (MTBF) - MTBF analysis gives users information about the typical life of the machinery in the population, which can be compared with manufacturers' expected values, other plants or even benchmark values from other companies. Reliability analysis allows users to model changes in MTBF through the calculation of reliability growth. Growth modeling also allows for the prediction of future failures, thus allowing users to set an interval for failure prevention and intervention.
Weibull Analysis - Weibull parameters give clues as to the type of failure (wear-in, wear-out, end-of-life, random failures) and also give an indication of mixed-mode populations so that analysts can isolate different causes of failure. By isolating the individual causes, individual solutions can be implemented.
Root Cause Analysis - Reliability analysis of individual failure modes gives evidence to support the identification of root cause. Parts that have distinctive wear-out failure modes have a different root cause than parts that exhibit infant mortality. As the root causes of failure are identified using reliability methods and as problems are corrected, the value of MTBF increases over time.
Identification of Vibration Related Failures - Failures caused by excessive vibration are identified through the use of lognormal distribution analysis. Lognormal analysis is a good fit for stress-induced failures where the fault mechanism increases as the severity of vibration increases.
Identification of Machine Design Problems - Queries of failure modes by equipment types lead to the identification of commonly failed components among a population of similar equipment.
Identification of System Design Problems - Sometimes the wrong piece of equipment is used in the design of a plant system and frequent failures of this equipment occur as a result. Failures of similar systems can be subjected to the same analysis procedures that are conducted at the asset level. Problem systems are identified by low values for MTBF and compared with other similar systems.
Identification of Equipment Material Problems - In some cases, the reliability analysis points to a deficiency in materials or in material selection. These problems often appear as an early wear-out failure mode, which is easily identified with a Weibull analysis.
Identification of Construction Problems - Problems sometimes occur during a start-up (after a repair period, turnaround or outage) and are often related to the repair activities. These problems occur as a result of inadequate or improper construction techniques and material failures. An example of this type of problem is an improperly poured foundation that prevents proper operation of a machine or system. These problems sometimes show up in the Weibull analysis as infant mortality failures, with low values for MTBF.
Identification of Unsatisfactory Maintenance Procedures - Like construction problems, inadequate or unsatisfactory maintenance procedures are identified and separated by comparing similar components between systems maintained by different crews. The level of training, adherence to standard procedure and attention to detail all play a role in the quality of repairs provided by the maintenance crew.
Identification of Improper Operating Procedures - Wide temperature swings and inadequate level control lead to reduced equipment life. Failures caused by inadequate operating procedures manifest themselves as premature wear-out modes, which are easily identified through a Weibull analysis.
Inadequate Preventive Maintenance Activities - Maintenance-preventable failures are identified through sorting of work order backlogs and analysis of spare parts usage. While usage of spare parts does not ensure their correct installation, inadequate PM activities show up in a reliability analysis as uncharacteristically low values for MTBF for equipment of this type, as compared with manufacturers' or industry standards.
Inadequate Inspection Routines - Unexpected equipment failures cause serious environmental and safety issues. Ruptured pressure vessels, leakage and fugitive emissions caused by cracks, weld failures and seal failures cause components to fail unexpectedly. Understanding the reliability behavior of equipment prone to these kinds of faults allows users to schedule inspections at appropriate intervals.
PM Optimization - Weibull analysis is used to estimate the optimum time for preventive maintenance procedures based upon a ratio of cost functions associated with planned repairs and unplanned failures. Future failure probability can also be estimated from the reliability data.
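To make the MTBF and Weibull items above more concrete, here is a minimal Python sketch of both calculations. It is not the method used by any particular product. It assumes complete data (every item ran to failure, with no suspensions) and estimates the Weibull shape (beta) and scale (eta) with median rank regression using Bernard's approximation, which is one common textbook technique. The function names, the sample hours and the tolerance used to label a shape value as "random" are all illustrative assumptions.

```python
import math


def mtbf(times_between_failures):
    """Arithmetic mean of the observed times between failures."""
    if not times_between_failures:
        raise ValueError("need at least one interval")
    return sum(times_between_failures) / len(times_between_failures)


def weibull_fit(times_to_failure):
    """Estimate (beta, eta) by least squares on median ranks.

    Assumes complete data: every item ran to failure, no suspensions.
    """
    t = sorted(times_to_failure)
    n = len(t)
    if n < 2 or t[0] <= 0 or t[0] == t[-1]:
        raise ValueError("need at least two distinct positive failure times")
    # Bernard's approximation of the median rank of the i-th failure.
    ranks = [(i - 0.3) / (n + 0.4) for i in range(1, n + 1)]
    xs = [math.log(v) for v in t]
    ys = [math.log(-math.log(1.0 - f)) for f in ranks]
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    beta = sxy / sxx                        # slope of the Weibull plot
    eta = math.exp(x_mean - y_mean / beta)  # from intercept = -beta * ln(eta)
    return beta, eta


def failure_pattern(beta, tolerance=0.1):
    """Rough reading of the shape parameter; the tolerance is an assumption."""
    if beta < 1.0 - tolerance:
        return "infant mortality (wear-in)"
    if beta > 1.0 + tolerance:
        return "wear-out"
    return "random"


# Hypothetical operating hours to failure for a population of similar pumps.
hours = [410.0, 620.0, 750.0, 880.0, 1020.0, 1190.0, 1400.0]
beta, eta = weibull_fit(hours)
print(f"MTBF = {mtbf(hours):.0f} h, beta = {beta:.2f}, eta = {eta:.0f} h")
print("pattern:", failure_pattern(beta))
```

A shape value below 1 points toward early-life problems such as the construction or maintenance issues described above, while a value above 1 points toward wear-out. Real data sets usually include items still in service, which this sketch does not handle.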
By combining design, construction, engineering, operation, maintenance and inspection data into a single asset management system and applying statistical analysis tools to failure data, problems that relate to technical as well as procedural issues are addressed. The reliability of individual plant components can only be improved once current levels of reliability are identified and tracked. An Asset Performance Management system, like Meridium, makes this task manageable.
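The PM optimization use described in this section can be sketched with the classic age-replacement model, in which the expected cost per operating hour of replacing at age T is the expected cycle cost divided by the expected cycle length. The source does not say which cost model is used, so the model choice, the function names and the example costs below are assumptions made for illustration only.

```python
import math


def reliability(t, beta, eta):
    """Weibull survival probability at age t."""
    return math.exp(-((t / eta) ** beta))


def cost_rate(T, beta, eta, planned_cost, failure_cost, steps=1000):
    """Expected cost per unit time when preventive work is done at age T."""
    if T <= 0:
        raise ValueError("T must be positive")
    # Trapezoidal integration of R(t) from 0 to T gives the expected cycle length.
    h = T / steps
    area = 0.5 * (reliability(0.0, beta, eta) + reliability(T, beta, eta))
    for k in range(1, steps):
        area += reliability(k * h, beta, eta)
    area *= h
    r = reliability(T, beta, eta)
    # A cycle ends either with a planned repair (prob. r) or a failure (1 - r).
    return (planned_cost * r + failure_cost * (1.0 - r)) / area


def best_pm_interval(beta, eta, planned_cost, failure_cost, candidates):
    """Pick the candidate interval with the lowest expected cost rate."""
    return min(
        candidates,
        key=lambda T: cost_rate(T, beta, eta, planned_cost, failure_cost),
    )


# Hypothetical wear-out item: beta = 3, eta = 1000 h, failure costs 10x a planned repair.
interval = best_pm_interval(3.0, 1000.0, 1.0, 10.0, list(range(50, 2001, 50)))
print("lowest-cost PM interval among candidates:", interval, "h")
```

With a shape parameter of 1 (random failures) the cost rate keeps falling as T grows, so the model recommends no preventive replacement at all. That agrees with the idea that scheduled PM pays off mainly for wear-out failure modes.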
Conclusion
A key element of a successful asset performance management process is the collection of event data required for analysis. This is especially true if you consider that without event data it is impossible to determine where your problems reside, what strategies are effective or ineffective and where you need to focus your resources for the largest improvements. Beyond the ability to measure performance, event data provides the baseline needed to perform detailed reliability analysis. These techniques are very powerful when coupled with accurate and complete event data and can drive proactive behavior within the organization. The combination of quality event data, comprehensive analysis and disciplined follow-through can be the catalyst for meeting your corporation's strategic goals.
Corporate Offices 10 South Jefferson Street Roanoke, Virginia 24011 540.344.9205 540.345.7083 fax