Enhancing Pipe Failure Predictions with Monte Carlo Simulation Using Python

Comprehensive Exploration of Monte Carlo Simulation for Accurate Pipe Failure Projections


In the realm of data analysis, estimation, forecasting, and prediction play crucial roles in driving informed decision-making. However, these projections often carry inherent uncertainties, influenced by assumptions, biases, past results, unforeseen events, and other factors. To mitigate the risks and uncertainties associated with such projections, is there a way to introduce variability and randomness that align with real-world complexities? Can we minimize ambiguity and enhance confidence in decision-making? The answer lies in Monte Carlo Methods/Simulations, and this blog delves into their application through a practical industry use case implemented in Python.

Monte Carlo Simulation offers a powerful technique for incorporating risk into quantitative analysis and decision-making processes. By introducing a wide range of values that follow a similar distribution to actual data, it amplifies variability and generates multiple possible outcomes. Through thousands of iterations, each producing measurable outcomes, Monte Carlo Simulation empowers decision-makers to rely on data-driven insights that surpass mere intuition or gut feelings. With the ability to mimic diverse business scenarios, simulation outcomes, backed by domain expertise, provide valuable support for making well-informed decisions.

Overview of Hoop Stress and pipe failure

The failure of pipes can happen due to various reasons and the mode of failure depends on the kind of stresses applied to the pipe.

There are three types of stress as shown below

1) Diameter — the internal pressure

2) Longitudinal stress– which increases the length of the pipe

3) Hoop stress- which increases the diameter of the pipe. The Hoop stress is also known as circumferential stress.

Source: trenchlesspedia

The pipe will fail if the Hoop stress becomes greater than its yield strength. In this blog, we will simulate this scenario hundreds or maybe a thousand times, and depending on the result, the decision can be made to either continue to the current design or to review and redesign the pipe. Here is the mathematical way to define the hoop stress.

Source: Author

Project environment

We will create a virtual environment for our project and work on it for the rest of the blog. Here are the steps

Step 1: Install the virtual environment if not already done

pip install virtualenv

Step 2: Create a virtual environment by the name eg: hoop_stress

virtualenv hoop_stress

Step 3: We can switch to hoop_stress environment and activate it using the below command.


Once, the environment is ready, we will load the libraries.

Loading the libraries

Please refer to the list of libraries from requirements.txt

pip install -r requirements.txt

Define configs: We will define a YAML file that will have default values for all the variables. It will be easy to modify the values in configuration and trigger the simulation process later. Refer config.YAML for complete code.

Generation of random values for a given distribution

We have three variables with respective mean and standard deviations, which we have defined in the config file. There are two possible scenarios.

Scenario 1: If the values of all three variables are fixed, then we will always have the same outcome. This is called the deterministic model/approach, where there is no uncertainty in the factors at play.

Deterministic Model | | Source: Author

Scenario 2: If the values of one or all three variables are uncertain, meaning values are not fixed, unlike in scenario 1 then the problem at hand becomes more complex. We will not be as confident of the pipe’s strength or failure as in scenario 1. The approach taken to address the uncertainty part is called the probabilistic approach/model.

Probabilistic Model | Source: Author

In this blog, we will focus on scenario 2 where randomness is at play and the Monte Carlo simulation will fit in perfectly. Let us start with generating a random value for diameter with a mean of 150 and a standard deviation of 5.

prob_value = random.uniform(0,1)
print(f'The random probability value :{round(prob_value,4)}')
sim_value = NormalDist(500, 5).inv_cdf(prob_value)
print(f'The simulated value :{round(sim_value,4)}')Output:
The random probability value :0.3291
The simulated value :497.7881

First, we generate a random variable between 0 and 1. We then use the inverse cumulative distribution function to fetch a value that belongs to a normal distribution with a mean of 500 and a standard deviation of 5. Let us do the same for the other two variables by looping through multiple iterations and the most efficient way to do that will be to wrap it under a function.

def simulate_values(mu, sigma, iterations, print_output = True):
mu (int): Define the mean of the diameter, thickness and yield
sigma (int): Define the standard deviation of the diameter, thickness and yield
iterations (int): Define number of iterations the simulations to be run
print_output (bool, optional): Set value to True to view the print statement at various stages and set False to skip the prints. Defaults to True.Returns:
List: The function returns the list which has the simulation values for diameter, thickess and yield
result = []
for i in range(iterations):
prob_value = round(random.uniform(.01, 0.99),3)
sim_value = round(NormalDist(mu=mu, sigma= sigma).inv_cdf(prob_value),3)

if print_output:
print(f"The prob value is {prob_value} and the simulated value is {sim_value}")
return result
except Exception as e:
print(f'An exception occurred while generating simulation values: {e}')

Please refer to the function simulate_values() from simulation.py

Defining the simulation

We have so far simulated random values for all three variables. Now it’s time to calculate Hoop stress and Objective variable. If the Objective variable has a negative value, it means that the hoop stress was greater than yield strength and pipe failure occurred.

df_final['Hoop_Stress'] = simulation_data.internal_pressure * df_final['Diameter'] / 
(2 * df_final['Thickness'])
df_final['Objective'] = df_final['Yield'] - df_final["Hoop_Stress"]

We will create a data frame that will have all the simulated values along with the hoop stress and objective. The most efficient way is to do it with a function, it returns the number of negative cases for each cycle of iteration. eg: If we ran 1,000 iterations and there were 10 negative instances, the probability of failure will be 0.01.

def initiate_simulation(iter_start = simulation_data.iter_start, iter_end = simulation_data.iter_end, iter_step = simulation_data.iter_step):
iter_start (int, optional): define starting point for generating simulation values. Defaults to simulation_data.iter_start.
iter_end (int, optional): define end point for generating simulation values. Defaults to simulation_data.iter_end.
iter_step (int, optional): define the increment. Defaults to simulation_data.iter_step.
print("---------------------------------------Initiating Simulations ---------------------------------------")
runs = list(range(iter_start, iter_end, iter_step))
sim_res = []
hoopStress = []
for run in runs:
sim_results, hs = run_simulation(run)
print(f'Simulation run for {run}: {sim_results}')
plot_linechart(runs, sim_res)
# plot_histogram(hoopStress)

except Exception as e:
print(f'An exception occurred while trying to initiate the simulation: {e}')

Please refer to the function initiate_simulation() from simulation.py

Setting up the simulation run

Now that we have defined the function to generate the random values, we can calculate the hoop stress. It’s time to iterate it multiple times and capture the probability of failure. As we did in the previous sections, we will wrap this logic into function.

def run_simulation(iterations):    
float: The function returns the number of negative values found in the simulation - negative value indicates that pipe failedHoop Stress:
int: Hoop stress =(Internal Pressure * Diameter)/(2 * Thickness)

int: Failure = Yield - Hoop stress

lst_diameter = simulate_values(simulation_data.diameter_mean, simulation_data.diameter_std,iterations, False)lst_thickness = simulate_values(simulation_data.thickness_mean, simulation_data.thickness_std,iterations, False)

lst_yield = simulate_values(simulation_data.yield_mean, simulation_data.yield_std, iterations, False)

df_final = pd.DataFrame(list(zip(lst_diameter, lst_thickness, lst_yield)), columns = ['Diameter', 'Thickness', 'Yield'])df_final['Hoop_Stress'] = simulation_data.internal_pressure * df_final['Diameter'] / (2 * df_final['Thickness'])df_final['Objective'] = df_final['Yield'] - df_final["Hoop_Stress"]

# min_stress = df_final['Hoop_Stress'].min()
# max_stress = df_final['Hoop_Stress'].max()

return df_final[df_final['Objective'] < 0].shape[0] / iterations, df_final['Hoop_Stress'].tolist()

except Exception as e:
print(f'An exception occurred with running the simulation: {e}')

Please refer to the function run_simulation() from simulation.py

Measuring the probability of pipe failure

The final phase of the process is to measure the probability of failure across iterations and also to visualize the result in a simple line chart.

def plot_linechart(runs, sim_res):    
print("---------------------------------------Initiating Simulation plot ---------------------------------------")

y_mean = [np.mean(sim_res)]*len(runs)
fig,ax = plt.subplots()
ax.set_ylabel('Probability of Pipe Failure')
ax.set_xlabel('Number of Iterations')
ax.set_title('Montecarlo Simulation for Pipe Failure')
# plt.plot(runs, sim_res, marker = '*')
plt.plot(runs, sim_res, 'bo-', label = "Simulation Probabilities")
for x,y in zip(runs, sim_res):
label = "{:.6f}".format(y)
textcoords="offset points",
ax.plot(runs, y_mean, color='red', lw=2, ls='dashdot', label=f'Average Probability: {round(y_mean[0],6)}')
except Exception as e:
print(f'An error occurred while trying to generate line chart {e}')

Please refer to the function plot_linechart() from simulation.py

We will define the main function via which the execution will be triggered. Let us define the main function in the file pipe.py

import simulation
def main():
simulation.initiate_simulation(1000, 100000, 5000)
# simulation.initiate_simulation()
if __name__ == '__main__':

There are two ways to trigger the simulation

  • Input the number of iterations needed from the main function. Eg: start from 1,000 iterations and go till 1,00,000 in increments of 5000.
simulation.initiate_simulation(1000, 100000, 5000)
  • If we don’t feed the iteration value, the function will fetch values from configuration files where we have defined default values.

As the number of simulations increases, we see that the probability of failure hovers around 0.007xxxxx. The line chart clearly shows the pattern and also the average value in the Red horizontal line. It is also to be noted that we don’t need 1,00,000 simulations. Instead, we could have stopped simulation around the 40K range as we start to see the consistent values.

---------Initiating Simulations -----------
Simulation run for 1000: 0.01
Simulation run for 6000: 0.006833333333333334
Simulation run for 11000: 0.008272727272727272
Simulation run for 16000: 0.0076875
Simulation run for 21000: 0.0074761904761904766
Simulation run for 26000: 0.009115384615384615
Simulation run for 31000: 0.007741935483870968
Simulation run for 36000: 0.007694444444444445
Simulation run for 41000: 0.007829268292682927
Simulation run for 46000: 0.007869565217391305
Simulation run for 51000: 0.0072549019607843135
Simulation run for 56000: 0.007928571428571429
Simulation run for 61000: 0.007327868852459016
Simulation run for 66000: 0.007378787878787879
Simulation run for 71000: 0.007577464788732394
Simulation run for 76000: 0.007289473684210526
Simulation run for 81000: 0.007691358024691358
Simulation run for 86000: 0.007848837209302326
Simulation run for 91000: 0.007736263736263736
Simulation run for 96000: 0.008177083333333333

Closing thoughts

The Monte Carlo simulation method is neither an alternative to traditional modeling nor does it eliminates any associated risk. Instead, it runs thousands of iterations with various values defined within the distribution and calculates the outcomes each time. This, when plotted, gives a probability distribution that gives us enough confidence to make our decision. In our use case, we see that the probability of failure is roughly around 0.7863%

Hope the blog was useful to you !!!!

You can connect with me on Linkedin

You can find the code on Github


Hoop Stress


Leave a Reply

Your email address will not be published. Required fields are marked *