Integrating Reliability into an Asset Integrity Dashboard

By Troy Schwartz, PE, CRE

Balancing Asset Integrity Risk

Rig operation, maintenance, and management involve fast-turnaround, high-pressure work, with critical tasks performed by diversely skilled personnel on highly complex, specialized systems in variable weather and at geographically dispersed locations. Combine that with massive amounts of inconsistent, often unusable or inaccessible data, and it is very easy for priorities to be misdirected and critical information to fall through the cracks, leading to a reactive culture rather than one of value-based, risk-informed decisions. A key component of achieving operational excellence is to establish a company-wide culture of continuous improvement and innovation, from roughneck to CEO, built upon a strong foundation of asset integrity.

Drilling Operations and Maintenance (O&M) departments have a direct influence on the culture, always striving to establish the highest standards of performance and reliability in their equipment and personnel. To support this, companies often collect and store information on hardware performance and drilling environments from rig sites through both automated and manual logging. Although extremely valuable, this information often has limited impact on equipment operations, the maintenance program, or business decisions because it is hard to access, analyst expertise is lacking, and it is rarely communicated to those who could use it.

The Foundation: Inherent Reliability

Once a rig has been commissioned and deployed to the field, the reliability of that rig is essentially fixed from a design and engineering perspective. This is referred to as the inherent reliability of the rig. Inherent rig reliability represents the maximum reliability performance expected of a rig based on the overall design and selection of equipment. However, in order to achieve this level of engineered robustness, rig operation and maintenance functions must be fully engaged such that they can effectively manage both the reliability and availability of the rig.

For critical assets where upfront expenses are a driver, or where an asset cannot be made more reliable, one can choose to designate a part as “life limited” and develop preventive maintenance (PM) plans for specified operation times and/or number of cycles for repairs or replacement, rather than waiting for an untimely failure to occur or running to failure. Unplanned failures lead to corrective maintenance (CM) actions, which result in downtime and potentially higher costs due to delays in drilling or major incidents. It is important to understand the relationship between inherent reliability and the maintenance attributes of an asset, because it directly affects performance, cost, schedule, and overall life cycle cost.

By combining reliability and maintainability factors, the availability of critical assets can be determined. Availability is the time that the asset is capable of performing its function (operating time plus standby time) divided by the total period (whether in service or not), expressed as a percentage. In other words, availability is the probability that the asset has not failed and is not undergoing maintenance or repair when it needs to be used. Availability is typically calculated from uptime and downtime, i.e. Uptime/(Uptime + Downtime) x 100, or MTBF/(MTBF + MTTR) x 100, where MTBF is mean time between failures and MTTR is mean time to repair. There are other variations, but these are the common formulas. For example, a smoke detector that requires only 50 hours of preventive maintenance downtime each year (out of 8,760 hours) has 99.4% availability for that year.
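The two formulas above can be checked with a few lines of Python. This is a minimal sketch; the function names and the 8,760-hour year (leap years ignored) are assumptions made for illustration.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years ignored (assumption)


def availability_from_times(uptime_h: float, downtime_h: float) -> float:
    """Availability (%) as Uptime / (Uptime + Downtime) x 100."""
    total = uptime_h + downtime_h
    if total <= 0:
        raise ValueError("total period must be positive")
    return uptime_h / total * 100.0


def availability_from_mtbf(mtbf_h: float, mttr_h: float) -> float:
    """Availability (%) as MTBF / (MTBF + MTTR) x 100."""
    return availability_from_times(mtbf_h, mttr_h)


# Smoke detector example from the text: 50 hours of PM downtime in one year.
downtime = 50.0
smoke_detector = availability_from_times(HOURS_PER_YEAR - downtime, downtime)
print(f"{smoke_detector:.1f}%")  # 99.4%
```

Both functions share one implementation because the MTBF/MTTR form is the same ratio expressed per failure cycle rather than over a calendar period.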

Performance standards for drilling equipment typically use these definitions to determine availability. The ultimate availability goal is not only to minimize failures, but also to minimize the downtime needed to restore system functionality, whether by leveraging redundancies and critical spares or by repairing or replacing failed items. For regulatory compliance, the availability of critical assets is evaluated against the proportion of time they are actually required to operate. Availability is strictly a measure of the operational readiness of an item: will it work when needed?

Rig Asset Integrity Dashboard

Due to the high intensity of drilling operations, rigs are seeing higher levels of stress, requiring additional assurance that they are being adequately maintained and inspected. Most in-service mechanical equipment requires some form of periodic inspection and maintenance to monitor its integrity and health condition in order to prolong its life cycle or prevent catastrophic failures. Without adequate insight into the fundamental failure mechanisms of equipment, it becomes nearly impossible to determine accurately which tasks require the most scrutiny and when they should be performed. Combining the results of a Failure Modes and Effects Analysis (FMEA) with real-time supporting field data provides the framework for an early warning system for potential failures. This information aids the development of mitigation plans to reduce or eliminate the probable effect of each failure on other system components and on overall operational success. By leveraging FMEA results, criticality assignments, and data collected under the guidelines of ISO 14224 (Petroleum, petrochemical and natural gas industries — Collection and exchange of reliability and maintenance data for equipment), teams can develop reliability, maintainability, and availability metrics, thus beginning the transformation of raw data into process intelligence.
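As a rough illustration of turning collected failure records into the metrics described above, the sketch below computes MTBF, MTTR, and availability for one asset. The record layout, field names, and sample values are hypothetical; they are not the ISO 14224 taxonomy, and MTBF is taken simply as operating hours divided by the number of failures.

```python
from dataclasses import dataclass


@dataclass
class FailureEvent:
    asset: str           # hypothetical equipment tag, e.g. "top_drive"
    repair_hours: float  # active repair time for this event


def reliability_metrics(events, asset, operating_hours):
    """Return (MTBF, MTTR, availability %) for one asset, or None if no failures."""
    repairs = [e.repair_hours for e in events if e.asset == asset]
    if not repairs:
        return None  # no failures recorded, so MTBF cannot be estimated this way
    n = len(repairs)
    mtbf = operating_hours / n
    mttr = sum(repairs) / n
    return mtbf, mttr, mtbf / (mtbf + mttr) * 100.0


# Illustrative records only; not real field data.
events = [
    FailureEvent("top_drive", 6.0),
    FailureEvent("top_drive", 10.0),
    FailureEvent("drawworks", 4.0),
    FailureEvent("top_drive", 8.0),
]
mtbf, mttr, avail = reliability_metrics(events, "top_drive", operating_hours=2400.0)
print(f"MTBF={mtbf:.0f} h, MTTR={mttr:.0f} h, availability={avail:.2f}%")
```

In practice the records would also carry failure mode and criticality so that results can be grouped by the failure modes identified in the FMEA.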

Once developed, this valuable process intelligence can be automated and communicated in a portable, easily accessible format. A well-designed automated diagnostic system, or “Asset Integrity Dashboard,” allows operations personnel and maintainers to make strategic decisions at a glance based not only on their real-life experience, but also on actual supporting operational and cost data, so that existing problems can be managed effectively or eliminated. This transparent dashboard is capable of monitoring drilling parameters, predicting critical asset failures, and providing insight into the overall health of the rig, with immediate feedback to O&M personnel and management. Such a dashboard can determine equipment health by monitoring the key failure modes identified for critical assets, triggering action only when maintenance is actually necessary or the business value justifies increased priority. With the capability to delineate rig and area performance, a top-level dashboard can provide increased visibility at various levels (geographic, rig, subsystem, asset, etc.) to prioritize and tailor maintenance policies, minimize costs and human error, and recover lost revenue.

Basic Steps to Create an Asset Integrity Dashboard

  1. Identify Critical Assets (top drive, drawworks, engines, etc.)
  2. Identify Key O&M Parameters of Interest to Monitor (asset attributes, etc.)
  3. Identify Key Business Cost Factors (uptime, rig dayrate, labor rates, etc.)
  4. Collect Data (gather, assess data integrity, etc.)
  5. Analyze Data (transform into viable intelligence)
  6. Visualize and Communicate (web-based interface)
  7. Refine and Improve

An asset integrity dashboard provides visibility into rig operations and permits timely analysis for critical, informed and profitable maintenance actions.

A dashboard will encompass a variety of capabilities, from viewing key real-time data collected by sensing hardware all the way to graphical presentation of asset condition with recommended maintenance actions within a given timeframe. Additionally, a dashboard developed specifically for drilling equipment and used from commissioning through decommissioning will aid in business decisions. As technologies are developed to reliably instrument and monitor damage in critical assets such as top drives or blowout preventers, diagnosis will become more and more routine.

Making Asset Integrity Routine

Dashboards for real-time data are nothing new, especially for operators. However, in the maintenance world, access to this data is often limited and does not always factor into strategies for maintaining asset integrity. Maintainers default to manufacturer-predetermined hourly limits (often arbitrarily established) or run-to-failure mentalities. Incorporating dashboards for use by O&M personnel as part of the daily routine allows problems to be identified faster or preempted entirely.

The single most important characteristic of any rig or piece of equipment in determining overall availability performance is inherent reliability. The inherent reliability of a system or device is determined by its configuration and component selection. For instance, if a rig has redundant engines or blowout preventers, this will greatly improve the inherent reliability and increase availability. If a design fails to consider the extreme environmental conditions in which the rig will be operated, the rig may not perform as expected. Since the inherent reliability of a rig is typically fixed once a design is complete, business focus must then shift to the maintenance department to regain lost revenue.
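The effect of redundancy on inherent reliability can be sketched with the standard parallel-system formula, R_system = 1 - (1 - R)^n. The sketch assumes n identical units that fail independently, with any single unit sufficient to do the job; the 0.90 unit reliability is an illustrative value, not data for any real engine or blowout preventer.

```python
def parallel_reliability(unit_reliability: float, n_units: int) -> float:
    """Reliability of n independent, identical units in active parallel.

    The system fails only if every unit fails: R_sys = 1 - (1 - R)**n.
    """
    if not 0.0 <= unit_reliability <= 1.0:
        raise ValueError("unit_reliability must be between 0 and 1")
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    return 1.0 - (1.0 - unit_reliability) ** n_units


single = parallel_reliability(0.90, 1)
redundant = parallel_reliability(0.90, 2)
print(f"single unit: {single:.2f}, redundant pair: {redundant:.2f}")
```

With these assumed numbers, a second unit cuts the probability of system failure from 10% to 1%. Real installations rarely meet the independence assumption fully, because common-cause failures (shared fuel, controls, or environment) reduce the benefit.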

Based on asset health attributes, analysis techniques can be applied to estimate the useful or safe life remaining for an asset in its current condition. A simple example is the monthly megger check of a top drive drilling motor by an electrician. Although the insulation resistance of the motor is typically measured manually, the values can still be recorded and trended. Without trending the recorded readings, the electrician would not be able to identify impending insulation breakdown, and the motor would fail unexpectedly.
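One simple way to trend such readings is to fit a straight line to the logarithm of insulation resistance over time and extrapolate to an alarm threshold. The sketch below does this with ordinary least squares in plain Python. The readings, the 5 megohm threshold, and the log-linear decay model are all assumptions for illustration, not values from any standard or manufacturer.

```python
import math


def months_until_threshold(months, megohms, threshold_megohms):
    """Fit log10(resistance) vs. time by least squares and extrapolate.

    Returns the estimated months after the last reading at which the fitted
    trend reaches the threshold, or None if the trend is not falling.
    """
    ys = [math.log10(r) for r in megohms]
    n = len(months)
    mean_x = sum(months) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in months)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(months, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        return None  # resistance is steady or improving; nothing to predict
    crossing = (math.log10(threshold_megohms) - intercept) / slope
    return max(0.0, crossing - months[-1])


# Hypothetical monthly megger readings (megohms) that halve every month.
months = [0, 1, 2, 3, 4]
readings = [1000.0, 500.0, 250.0, 125.0, 62.5]
remaining = months_until_threshold(months, readings, threshold_megohms=5.0)
print(f"Estimated months until threshold: {remaining:.1f}")
```

A dashboard could recompute this estimate after each reading and raise the maintenance priority as the remaining time shrinks. Insulation readings also vary with temperature and humidity, so real trending would normally correct for those before fitting.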

Monitoring and trending critical operating parameters allows for early diagnosis and initiation of proactive actions to avoid unexpected failure.

This transformation of raw data into process intelligence is essential because it adds the real insight obtained from reliability trend analysis: the number of failures expected in an interval and investigative information about the failure modes of the assets. This additional information can then be used to direct maintenance to where and when it is needed, providing the agility and foresight needed to meet customer expectations.
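The expected number of failures in an interval can be sketched under the simplest common assumption, a constant failure rate (failures arriving as a Poisson process), in which case the expected count is the interval length divided by MTBF. The 800-hour MTBF and 200-hour interval are illustrative; assets in wear-out would need a different model, such as a Weibull fit.

```python
import math


def expected_failures(interval_h: float, mtbf_h: float) -> float:
    """Expected failure count in an interval, assuming a constant failure rate."""
    if mtbf_h <= 0:
        raise ValueError("mtbf_h must be positive")
    return interval_h / mtbf_h


def prob_at_least_one_failure(interval_h: float, mtbf_h: float) -> float:
    """P(one or more failures) = 1 - exp(-t / MTBF) under the same assumption."""
    return 1.0 - math.exp(-expected_failures(interval_h, mtbf_h))


count = expected_failures(200.0, 800.0)
risk = prob_at_least_one_failure(200.0, 800.0)
print(f"expected failures: {count:.2f}, P(at least one): {risk:.1%}")
```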

Bringing the Team Together

As defined by the International Standard for Asset Management ISO 55000, an asset is an item, thing or entity that has potential or actual value to an organization. An asset has value that is tangible or intangible, financial or non-financial, and includes consideration of risks and liabilities. Physical assets usually refer to equipment, inventory and properties owned by the organization. Asset management is a coordinated activity of an organization to realize value from assets. Realization of actual value will normally involve a balancing of costs, risks, opportunities and performance benefits. This requires all stakeholders to work cohesively as a team to minimize risk and maximize cost avoidance.

Inherent reliability is a measure of the overall “robustness” of a rig or asset. It provides an upper limit to the reliability and availability that can be achieved. In other words, no matter how many inspections or maintenance tasks are performed, the rig will never exceed its inherent reliability. If operated, maintained, and inspected as well as possible, a rig will realize all of its inherent reliability. If there are gaps in the operating, maintenance, or inspection practices, only some of that inherent reliability will be realized. Working independently, operators, maintainers, and management each have separate goals and targets, which can negatively impact the value stream as a whole. A cooperative partnership among operations, maintenance, and management that identifies and prioritizes high-risk items promotes the transparency and collaboration necessary to manage critical assets in the most cost-effective way.

Maximizing Limited Resources

In managing any complex drilling system, we can maximize use of available maintenance resources if we identify reliability concerns and mitigate them proactively. After applying reliability engineering techniques, our next step is to intelligently interpret the results and recommend maintenance actions to improve overall fleet health. Such improvements might include increasing (or reducing) the frequency of existing inspections, developing new tests (and possibly test points), modifying maintenance procedures to record specific data, implementing crosschecks between different rigs, and improving the observability of some failures to minimize the effects of root causes or false failures.

The dashboard can provide a comprehensive picture of the real impacts to the value stream by taking the technical-focused analysis results and factoring in additional key parameters such as:

  • Cost (e.g., maximizing maintenance effectiveness)
  • Safety (e.g., avoiding injury, fatality, or damage of key assets)
  • Schedule (e.g., avoiding delays resulting in lost opportunities)

This gives decision-makers real insight into where to deploy maintenance resources to make the biggest impact on the value stream.
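One hedged way to fold cost, safety, and schedule into a single ranking is a weighted score per asset. The weights, the 1 to 10 scoring scale, and the asset scores below are purely hypothetical; in practice they would come from the FMEA, criticality assignments, and business cost factors.

```python
# Assumed weights for illustration; they sum to 1.0.
WEIGHTS = {"cost": 0.4, "safety": 0.4, "schedule": 0.2}


def priority_score(scores, weights=WEIGHTS):
    """Weighted sum of impact scores (higher means more urgent)."""
    return sum(weights[factor] * scores[factor] for factor in weights)


# Hypothetical impact scores on a 1 (low) to 10 (high) scale.
assets = {
    "top_drive": {"cost": 8, "safety": 6, "schedule": 9},
    "drawworks": {"cost": 5, "safety": 4, "schedule": 6},
    "blowout_preventer": {"cost": 7, "safety": 10, "schedule": 7},
}

ranked = sorted(assets, key=lambda name: priority_score(assets[name]), reverse=True)
for name in ranked:
    print(f"{name}: {priority_score(assets[name]):.1f}")
```

A weighted sum is only one option. Some organizations treat safety as a hard constraint rather than a tradable score, which would call for a different ranking rule.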

Limited maintenance resources can be maximized by using dedicated rig dashboards that focus on maintenance-relevant aspects to provide a comprehensive, reliable, real-time, integrated assessment of rig and critical asset health. An asset integrity dashboard that summarizes all key information in one graphical display creates better situational awareness, so teams can manage assets and the business more effectively and make better-informed decisions. Access to real-time information and diagnostic data trending allows a level of performance, innovation, and maintenance resource optimization that cannot be achieved with segregated silos of inaccessible, static historical data. The significant value of integrating and analyzing this information is apparent when it is applied to critical assets for risk identification, rig prioritization, health monitoring, predictive maintenance, and other purposes.

Continuous Improvement Investment

Reliability engineering provides the structured foundation for identifying and prioritizing critical assets. Since nearly all rigs require some form of inspection and maintenance procedures to monitor their integrity and health condition to prolong life or to prevent catastrophic failures, the potential applications of an automated diagnostic system are very broad. Taken one rig at a time, an asset integrity dashboard can be methodically expanded and used to monitor all types of rig configurations in various geographic locations. Combining this approach with financial data, businesses can make risk-informed decisions to provide the most value through:

  • Reduced non-productive time (NPT)
  • Improved efficiency
  • Improved performance
  • Increased drilling safety

A dashboard eliminates blind spots and creates transparency between operator, contractor, and customer by giving instant visibility into the health of each rig.

By making strategic investments to develop the critical knowledge, technologies, and capabilities needed to extend rig operations more efficiently and safely, companies can use asset integrity dashboards based on sound reliability engineering best practices to increase overall asset value. Standardization across a fleet of rigs, combined with data-management systems that allow electronic collection and transfer of reliability data for analysis, improves the quality of the resulting intelligence, enhances operations, and optimizes maintenance activities. A cost-effective way to optimize the interpretation of the data is through cooperation and open communication among all stakeholders, e.g. contractors, operators, owners, maintainers, and management, across all rig locations globally.

Reliability and asset integrity are always hot topics. Who doesn’t want reliable equipment? However, operators and maintainers often have little to no control over the inherent (or built-in) reliability of the equipment they are charged with maintaining. Proper preventive maintenance (in just the right amount) is the key to achieving the inherent reliability of the equipment. Without adequate maintenance and prioritization of resources, the equipment is doomed to fail at the least opportune time. Balancing high reliability, proper operation, and maintenance efficiency, supported by an automated diagnostic system displayed in an asset integrity dashboard, will provide affordable and effective management of a rig fleet of any size. The end result is the ability to deliver safety excellence, continuously improve performance, and maintain rigs to the highest standards.


Troy Schwartz has extensive experience providing unique and specialized services in the aviation, aerospace, oil and energy industries to minimize corporate exposure to risk during all phases of program and product development. Prior to joining LCE, Troy worked for Halliburton Energy Services (HES) where he served as their Principal Reliability Engineer. He has also worked with organizations such as L-3 Communications, SAIC, NASA and Boeing. Troy has a BS in Industrial Engineering from Texas A&M University and a Master of Aeronautical Science (MAS) from Embry-Riddle Aeronautical University (ERAU). In addition to being a Licensed Professional Engineer (PE) in the State of Texas, Troy is a Certified Reliability Engineer (CRE), Certified Safety Professional (CSP), Certified Quality Engineer (CQE), Certified Six Sigma Black Belt (CSSBB), Certified Change Management Professional (CMP), and Certified Project Management Professional (PMP).

© Life Cycle Engineering, Inc.


For More Information

843.744.7110

