ESP run life reliability analysis

Abstract

Although reliability analysis has been used for many years in different industries, its more rigorous forms are relatively new to oil field applications. Reliability analysis uses statistical methods to analyze the expected time until one or more events happen. The technique is best applied where a single variable or changing parameter, such as corrosion rate, governs the outcome. ESPs are engineered electromechanical systems with multiple components and materials. Because an ESP system combines multiple components of differing quality, parametric analysis may often not be applicable. This is because the efficiency of an electric submersible pump system is a function of the multiplied efficiencies of its components (motor, pump, seal, cable, etc.), and the matter is still an area of debate in the oil industry. Hence, censoring and data scrutiny, informed by operational knowledge, make the collected data more meaningful. Reliability analysis has been only sparsely applied in the oil field to estimate the run life of electrical submersible pumps (ESPs) and to predict the availability of a mechanical system such as an ESP. Such estimates are of paramount importance for predicting failures and planning the availability of workover rigs to replace ESPs. This paper presents the results of a comparative statistical analysis applied to evaluate the run lives of electric submersible pumps in some Saudi fields using different reliability methods. This was carried out after data scrutiny and screening, to help analyze pump performance and optimize future installations. An ESP design system was used to screen out improperly designed applications. The objective of this study was to investigate the use of statistical analyses in ESP run life calculation, identify which distribution functions best model ESP failure, estimate ESP run lives, identify potential improvement in ESP run life (inherent reliability vs. practical), estimate ESP replacement requirements, and compare results with ESP failure reports to explain changes in ESP run lives.

Introduction

Electrical submersible pumping (ESP) is predominantly used in oil fields to lift fluid to surface, or for flow assurance where a production network with multiple wells exists. A submersible pumping unit consists of an electric motor, a seal section, an intake section, a multistage centrifugal pump in which each stage consists of a rotating impeller and a stationary diffuser, an electric cable, a surface-installed switchboard, a junction box, and transformers.

Estimating the run life of an ESP is of great importance for economic justification and maintenance planning. It also gives insight into system failures, which helps optimize system design and selection, prolong run life, and thereby avoid maintenance cost and unnecessary production loss. Statistical techniques can provide additional insight into failure modes and root causes. Different models have been used to estimate ESP run life, each with different assumptions, and these assumptions sometimes make the models unsuitable because of the compounded effects of failure and operation.

Field engineers are keen to have an easy yet reliable and representative means of calculating ESP run life.

Definition of a Failure

Technically speaking, failure is the termination of the ability of a system to perform a required function, or the condition in which the system does not meet its desired objectives. ESPs are failure-prone systems that normally do not wear out; failures are sometimes caused by sudden catastrophic events. Such failures may not depend on the length of service and hence cannot be modeled as a function of run time. Run life models also assume that the units considered have the same probability of failure, or mortality rate, which is not the case here because of different manufacturers, wellbore conditions, and reservoir responses. The ESP and the reservoir interact, and ESP performance is considerably affected by reservoir performance; hence ESP operation optimization is a function of reservoir performance. Additionally, the surface production network may force the pump to operate outside its recommended range and make it susceptible to failure.

It is crucially important to realize that no two ESPs will ever experience the same conditions and exposure, as each pump is subject to different reservoir behavior, production scenarios, and hence wellbore conditions.

Because these pumps comprise multiple interdependent components operating under different conditions, the possible failures are complex and the system reliability is not very intuitive. An ESP is a system of multiple components, and hence its reliability is the product of the component reliabilities. Statistics is always a supporting science that can never substitute for technical competency and knowledge of the operating conditions.

Failure Metrics

Metrics such as mean time between failures (MTBF), mean time to failure (MTTF), and mean time to recover (MTTR), among others, are commonly used in failure rate calculation and risk analysis. These metrics are important to understand when evaluating the failure rate of ESPs.

Mean time to failure (MTTF) is the average time that a system is expected to operate before its first failure occurs. It is a simple indicator of system reliability. Mean time between failures (MTBF) is the average time between successive failures of a system, including the time lost while repairs are undertaken. It is an indicator of the combined reliability and maintenance effectiveness. The industry typically uses the MTBF approach to evaluate components for frequency of failure, and a good number of ESP users and suppliers also use it to determine expected system life. The benefit of using MTBF is that it statistically considers both running and failed ESP systems. Simply stated, the MTBF calculation sums the operating time of all ESP installations in a target group of wells and divides this total cumulative operating time by the number of ESP systems that failed during the same period. The mean time to recover (MTTR) is the average (arithmetic mean) time it takes to restore a failed system. MTTR is sometimes read as the mean time to repair; both mean essentially the same thing.
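
The population MTBF calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the operator's actual procedure: the function name `mtbf_days` and the five run times are hypothetical, and the sketch assumes that every unit's operating time within the evaluation window is known.

```python
def mtbf_days(run_days, failed):
    """Cumulative operating time of all units divided by the number of failures.

    run_days: operating time of each installation in the window (days)
    failed:   True if that installation failed in the window, False if still running
    """
    if len(run_days) != len(failed):
        raise ValueError("run_days and failed must have the same length")
    n_failures = sum(1 for f in failed if f)
    if n_failures == 0:
        raise ValueError("no failures observed; MTBF is undefined for this window")
    # Running units contribute operating time but no failure count.
    return sum(run_days) / n_failures


# Hypothetical group of five installations: three failed, two still running.
run_days = [400, 650, 820, 300, 150]
failed = [True, True, True, False, False]
print(f"MTBF = {mtbf_days(run_days, failed):.1f} days")
```

Because running units add operating time without adding failures, the result depends on the chosen window, which is one reason to re-evaluate it over successive intervals.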

1. Mean time between failures (MTBF): a measure of a system's reliability that identifies the average time between failures. It is often used to predict potential outages of critical systems.

2. Mean time to failure (MTTF): the length of time a device can be expected to remain in operation before it fails. MTTF implies that the failure is permanent, while MTBF implies that the system can be repaired.

3. Mean time to repair (MTTR): the average time it takes to restore a failed system.

4. MTBF = MTTF + MTTR

5. Availability of the system = MTTF / (MTTF + MTTR)
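
As a small worked example of relations 4 and 5, the sketch below computes MTBF and availability from MTTF and MTTR. The function names and the numbers (700 days of run life, 30 days to restore a failed unit) are hypothetical and chosen only for illustration.

```python
def mtbf(mttf, mttr):
    """MTBF = MTTF + MTTR (all in the same time unit)."""
    return mttf + mttr


def availability(mttf, mttr):
    """Availability = MTTF / (MTTF + MTTR), a fraction between 0 and 1."""
    if mttf < 0 or mttr < 0 or mttf + mttr == 0:
        raise ValueError("MTTF and MTTR must be non-negative and not both zero")
    return mttf / (mttf + mttr)


# Hypothetical values: 700 days average run life, 30 days average restoration time.
mttf_days, mttr_days = 700.0, 30.0
print(f"MTBF = {mtbf(mttf_days, mttr_days):.0f} days")
print(f"Availability = {availability(mttf_days, mttr_days):.3f}")
```

The example shows why restoration time matters as much as run life: with the same MTTF, a longer wait for a workover rig lowers availability directly.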

ESP failure calculations

Different operators have developed their own formulae to calculate the run lives of ESPs. Calculating the mean failure rate, with some field knowledge and scrutiny, can sometimes give a quick and good estimate. The MTBF is the sum of the operational periods divided by the number of observed failures.

The MTTR is the sum of the failure (downtime) periods divided by the number of observed failures.

Mean time to failure (MTTF) is the length of time a device can be expected to remain in operation before it fails.

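The MTBF can be defined in terms of the expected value of the density function f(t):

$$\mathrm{MTBF} = \int_0^\infty t\, f(t)\, dt$$

where f(t) is the density function of the time until failure, satisfying the standard requirement of a density function:

$$\int_0^\infty f(t)\, dt = 1$$

In the reliability context, the density function f(t) is closely tied to the reliability function R(t), the probability of surviving beyond time t: f(t) = -dR(t)/dt.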

Modeling process

Estimates of an ESP's expected run life are typically based on a combination of information provided by the vendors and the operator. There are two types of test data and, consequently, two types of analysis. Parametric analysis assumes that the data follow a specific underlying distribution, such as the normal distribution; because the shape of the distribution can be described mathematically, more conclusions can be drawn. Anything else is nonparametric: nonparametric tests do not assume that the data follow a specific distribution, which is deemed more applicable to the ESP case. It is important to point out, however, that this method is best used as an instantaneous calculation, evaluated over successive time intervals as a directional indicator. The reason is that the results are negatively and unfairly biased by wells that are new ESP applications, which have never failed but have not yet had the chance to achieve their true run life potential.

Although less frequently used, there is a growing trend toward more sophisticated approaches to predicting ESP run life, which take historical run life data for a particular ESP population and fit the data to one of several mathematical distributions, most commonly the Weibull and exponential distributions. This method allows different classes of failures to be defined from the data and makes it possible to forecast failure rates. Its main benefit is the ability both to quantify and to forecast future failures by class. On the downside, it requires diligence in acquiring and maintaining the data, as well as sophistication in developing and using the calculation and simulation program.

The key to understanding the reliability of a system is the statistical distribution of its time to failure. Reliability is the probability that a device or system will operate without failure, in a satisfactory manner, for a given period and under given operating conditions. For a system with failure rate λ(t), the probability of survival from time 0 to t, or reliability R(t), is given by

$$R(t) = \exp\left(-\int_0^t \lambda(u)\, du\right)$$

If the failure rate λ(t) is constant and equal to λ, the expression reduces to:

$$R(t) = e^{-\lambda t}$$
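
A short numerical sketch of the reliability function for a general and for a constant failure rate is given here. The function names, the trapezoidal integration, and the example rate of one failure per 730 days are assumptions made for illustration; they are not taken from the study data.

```python
import math


def reliability_constant(t, lam):
    """R(t) = exp(-lam * t) for a constant failure rate lam."""
    return math.exp(-lam * t)


def reliability_from_hazard(t, hazard, steps=10000):
    """R(t) = exp(-integral from 0 to t of hazard(u) du), by the trapezoidal rule."""
    if t == 0:
        return 1.0
    h = t / steps
    total = 0.5 * (hazard(0.0) + hazard(t))
    for i in range(1, steps):
        total += hazard(i * h)
    return math.exp(-total * h)


# Hypothetical constant failure rate: one failure per 730 days on average.
lam = 1.0 / 730.0
print(f"R(365 days), closed form: {reliability_constant(365.0, lam):.4f}")
print(f"R(365 days), numerical:   {reliability_from_hazard(365.0, lambda u: lam):.4f}")
```

The numerical version accepts any failure rate function, so the same routine can be reused when the rate is not constant.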

ESP System Reliability

Reliability of overall system = f(the reliability of individual subsystems)
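
One common form of this function is sketched below, assuming the subsystems act in series (the ESP fails if any one of them fails) and fail independently. Under those assumptions the system reliability is the product of the subsystem reliabilities. The component values are hypothetical.

```python
import math


def series_reliability(component_reliabilities):
    """System reliability for independent components in series: the product of the parts."""
    for r in component_reliabilities:
        if not 0.0 <= r <= 1.0:
            raise ValueError("each reliability must lie between 0 and 1")
    return math.prod(component_reliabilities)


# Hypothetical one-year reliabilities, for illustration only.
components = {"motor": 0.95, "seal": 0.97, "pump": 0.96, "cable": 0.98}
system = series_reliability(components.values())
print(f"System reliability = {system:.4f}")
```

The system value is always lower than that of the weakest component, which is why the reliability of a multi-component ESP is less intuitive than that of any single part.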

If the observed failures are random samples from a population whose failure times follow a distribution with probability density function f(t), then the population MTTF can be calculated as:

$$\mathrm{MTTF} = \int_0^\infty t\, f(t)\, dt$$

MTTF is a common measure of a system's reliability.

It is desirable to have a single mathematical model that represents the failure rate of a device over its entire lifetime.

1. Non-repairable (replaceable) systems
   - Interest is in the probability of the first and only failure (hazard rate)
   - Analyze MTTF (mean time to failure)

2. Repairable systems
   - Interest is in the probability that a failure will occur over some period of time
   - Analyze MTBF (mean time between failures)

To describe failure patterns in order to predict future issues, statistical distributions are needed:

1. Common distributions for non-repairable devices
   - Weibull (the exponential distribution is the special case in which the failure rate is constant)
   - Lognormal
   - Normal

2. Common distributions for repairable devices
   - Poisson
   - Nonhomogeneous Poisson process (NHPP)

Exponential Distributions

Patterson used the exponential distribution to model ESP run life. He presented failure data covering a wide range of equipment and operating conditions as a series of plots of run life versus the logarithm of the number of installations remaining. He showed that at lower run times the plots are approximately linear, indicating a near-constant failure rate; in this interval the data appear to be a good fit to the exponential model.
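
The idea behind such a plot can be illustrated with a least-squares fit of the logarithm of the number of units remaining against run life: for an exponential population, the slope of that line is the negative of the failure rate. The sketch below is not Patterson's procedure. It assumes complete (uncensored) failure data, and the function name and run lives are hypothetical.

```python
import math


def exponential_rate_from_survivors(failure_times):
    """Estimate a constant failure rate from complete (uncensored) run lives.

    Fits a least-squares line to ln(units remaining) versus run life;
    for an exponential population the slope of that line is -lambda.
    """
    times = sorted(failure_times)
    n = len(times)
    if n < 2:
        raise ValueError("need at least two failure times")
    # Just before the i-th failure (0-based), n - i units are still running.
    log_remaining = [math.log(n - i) for i in range(n)]
    mean_t = sum(times) / n
    mean_y = sum(log_remaining) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("failure times must not all be identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, log_remaining))
    return -sxy / sxx


# Hypothetical run lives in days, for illustration only.
run_lives = [60, 150, 240, 400, 520, 700, 950, 1300]
rate = exponential_rate_from_survivors(run_lives)
print(f"failure rate ~ {rate:.5f} per day, MTTF ~ {1 / rate:.0f} days")
```

If the points bend away from a straight line at longer run times, the failure rate is not constant there and the exponential model should be restricted to the linear interval.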

Weibull Distribution

Oliviera et al. modeled ESP failures using the two-parameter Weibull distribution. Sawaryn et al. added a term to the Weibull distribution to represent ESPs that failed to start. Depending on the magnitude of the shape factor, the Weibull distribution can model a monotonically increasing, constant, or monotonically decreasing failure rate.
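
The effect of the shape factor can be seen in a minimal sketch of the standard two-parameter Weibull functions. The symbols `beta` (shape) and `eta` (scale) and the example values are assumptions made for illustration; the additional failed-to-start term of Sawaryn et al. is not reproduced here.

```python
import math


def weibull_reliability(t, beta, eta):
    """R(t) = exp(-(t / eta) ** beta) for shape beta and scale eta."""
    return math.exp(-((t / eta) ** beta))


def weibull_hazard(t, beta, eta):
    """Failure rate h(t) = (beta / eta) * (t / eta) ** (beta - 1), for t > 0."""
    return (beta / eta) * (t / eta) ** (beta - 1)


def weibull_mttf(beta, eta):
    """Mean time to failure: eta * Gamma(1 + 1 / beta)."""
    return eta * math.gamma(1.0 + 1.0 / beta)


# Hypothetical scale of 730 days; three shape factors for comparison.
for beta in (0.5, 1.0, 2.0):
    early = weibull_hazard(100.0, beta, 730.0)
    late = weibull_hazard(500.0, beta, 730.0)
    trend = "decreasing" if late < early else "increasing" if late > early else "constant"
    print(f"beta = {beta}: failure rate is {trend}, MTTF = {weibull_mttf(beta, 730.0):.0f} days")
```

A shape factor of 1 reproduces the exponential model, so a fitted value far from 1 is itself evidence against a constant failure rate.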

Field Case

In the field of study, 130 oil wells and water supply wells were equipped with electrical submersible pumps. The wells produce from an unconsolidated sand reservoir with relatively low productivity and low pressure. Reservoir pressure changes with the variation of production by location and over time. The ESP population consists of eight sizes manufactured by three different suppliers, with the major portion supplied by one vendor. The original installations did not include variable speed drives: based on the anticipated operating conditions, the power costs and flow ranges at most of the wells could not justify the additional cost of installing them.

Statistical Analysis Results

The comparative study was conducted on wells that were gradually equipped with ESPs, in which failures occurred and the units were replaced. Only normal failures were considered, meaning that premature failures due to catastrophic events, such as operational events, were excluded. Findings from teardown reports were also considered. The study compared the results of three methods: survival analysis, the Weibull distribution, and mean time between failures. It covered 8 years of historical data and showed that the reliability of these ESPs can be reasonably calculated from the screened failure data, even if the data do not follow any distribution model.
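
The paper does not state which survival estimator was used. As one plausible illustration of a nonparametric survival calculation with censored units, the sketch below implements the Kaplan-Meier product-limit estimator, in which units still running (or excluded as non-normal failures) are treated as censored. The function names and the six run lives are hypothetical.

```python
def kaplan_meier(durations, failed):
    """Kaplan-Meier survival estimate.

    durations: run time of each unit
    failed:    True for an observed failure, False for a censored unit
    Returns a list of (time, survival probability) at each distinct failure time.
    """
    events = sorted(zip(durations, failed))
    at_risk = len(events)
    survival = 1.0
    curve = []
    i = 0
    while i < len(events):
        t = events[i][0]
        deaths = 0
        removed = 0
        # Group all units that leave observation at the same time t.
        while i < len(events) and events[i][0] == t:
            deaths += 1 if events[i][1] else 0
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


def median_run_life(curve):
    """First time at which the estimated survival drops to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical run lives in days; False marks a unit that was censored.
durations = [200, 350, 350, 500, 620, 700]
failed = [True, False, True, True, False, True]
curve = kaplan_meier(durations, failed)
for t, s in curve:
    print(f"t = {t} days, S(t) = {s:.3f}")
print("median run life:", median_run_life(curve), "days")
```

Censored units stay in the at-risk count until they leave observation, so new installations that have not yet failed neither inflate nor deflate the estimate the way they can in a simple average.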

This was carried out after data scrutiny and screening, to help analyze pump performance and optimize future installations. An ESP design system was used to screen out improperly designed applications. The paper concludes that reliability analysis, suitably applied to properly censored data, is essentially a reliable method of evaluating ESP system performance. It also reflects the quality of operation and the improvement in ESP management in the field. Special data censoring and scrutiny were applied to enable an unbiased and conclusive appraisal of ESP performance. In addition, the analysis accounts for the partitioning of the levels of each factor, as well as for type- and vendor-related specifics.

The ESP population in the field of study is shown in Fig. 1. The population grows as more ESPs are run in wells and added to the tally. The fluctuations in the plot at later stages represent failing units and the delay in replacing them, which depends on the availability of the rig. Fig. 2 shows the breakdown on a yearly basis. The different plots exhibit different mortality rates and a change of failure rate at later times. Fig. 3 shows the results of the survival function and the Weibull distribution function for the data in the study; both indicate a run life of roughly two years. Fig. 4 shows the yearly comparison of run life estimates from the three methods. Despite the variation with time, the reliability values calculated by the survival and Weibull functions are very comparable, and sometimes close to the values from the MTBF. The yearly variation in failure rate reflects the improvement made in operation on the one hand and the aging of some pumps on the other. Fig. 5 shows the yearly total of installed and failed units to the date of study. The results show that the percentage of surviving units may have a dwindling trend, but some years do not follow that trend, which requires data scrutiny to be considered in the study.



Conclusions

1. Statistical analysis helps estimate ESP run life. Unlike the arithmetic mean, the other statistical analyses indicated consistent ESP reliability.

2. Improvements in ESP design and handling resulted in longer ESP run lives. Meticulous record keeping of the reliability data is necessary. Field knowledge and experience gained in operating ESPs can provide supplemental qualitative information.

3. The analysis indicated the effect of changing flow parameters. Operating an artificial lift system with ESPs requires constant learning, as well as real-time monitoring by telemetry and communication systems. Effective downhole monitoring also enhanced the operation.

4. Teardown inspections are essential to gain thorough knowledge of equipment design, configuration, and component deficiencies, in order to correct the systems and improve their operation. Failure analysis can be cross-checked against ESP teardown reports to correct the results.

5. Because of the uncertainties inherent in the continuous replacement of ESPs and in the estimation of their run lives, estimates of the number of failures across an entire field carry considerable uncertainty. The difference between the consistent reliability values and the mean run life values represents the potential improvement in ESP handling.

6. Parameter estimates of the adopted models need to be updated frequently. Formal reliability analysis methods should be used to differentiate between alternative technologies and manufacturers.

7. The causes of failure can be analyzed using Fault Tree Analysis (FTA) as future work. Combining FTA with fuzzy logic would help make more sense of the data by artificially generating the unavailable data.