## Comprehensive Exploration of Monte Carlo Simulation for Accurate Pipe Failure Projections

# Introduction

In the realm of data analysis, estimation, forecasting, and prediction play crucial roles in driving informed decision-making. However, these projections often carry inherent uncertainties, influenced by assumptions, biases, past results, unforeseen events, and other factors. To mitigate the risks and uncertainties associated with such projections, is there a way to introduce variability and randomness that align with real-world complexities? Can we minimize ambiguity and enhance confidence in decision-making? The answer lies in Monte Carlo Methods/Simulations, and this blog delves into their application through a practical industry use case implemented in Python.

Monte Carlo Simulation offers a powerful technique for incorporating risk into quantitative analysis and decision-making processes. By introducing a wide range of values that follow a similar distribution to actual data, it amplifies variability and generates multiple possible outcomes. Through thousands of iterations, each producing measurable outcomes, Monte Carlo Simulation empowers decision-makers to rely on data-driven insights that surpass mere intuition or gut feelings. With the ability to mimic diverse business scenarios, simulation outcomes, backed by domain expertise, provide valuable support for making well-informed decisions.

# Overview of Hoop Stress and pipe failure

The failure of pipes can happen due to various reasons and the mode of failure depends on the kind of stresses applied to the pipe.

**There are three types of stress as shown below**

**1) Diameter **— the internal pressure

**2) Longitudinal stress**– which increases the length of the pipe

**3) Hoop stress-** which increases the diameter of the pipe. The Hoop stress is also known as circumferential stress.

The pipe will fail if the Hoop stress becomes greater than its yield strength. In this blog, we will simulate this scenario hundreds or maybe a thousand times, and depending on the result, the decision can be made to either continue to the current design or to review and redesign the pipe. Here is the mathematical way to define the hoop stress.

**Project environment**

We will create a virtual environment for our project and work on it for the rest of the blog. Here are the steps

**Step 1**: Install the virtual environment if not already done

pip install virtualenv

**Step 2**: Create a virtual environment by the name eg: **hoop_stress**

virtualenv hoop_stress

**Step 3**: We can switch to **hoop_stress** environment and activate it using the below command.

hoop_stress\scripts\activate

Once, the environment is ready, we will load the libraries.

**Loading the libraries**

Please refer to the list of libraries from requirements.txt

pip install -r requirements.txt

**Define configs: **We will define a YAML file that will have default values for all the variables. It will be easy to modify the values in configuration and trigger the simulation process later. Refer config.YAML for complete code.

**Generation of random values for a given distribution**

We have three variables with respective mean and standard deviations, which we have defined in the config file. There are two possible scenarios.

**Scenario 1**: If the values of all three variables are fixed, then we will always have the same outcome. This is called the deterministic model/approach, where there is no uncertainty in the factors at play.

**Scenario 2**: If the values of one or all three variables are uncertain, meaning values are not fixed, unlike in scenario 1 then the problem at hand becomes more complex. We will not be as confident of the pipe’s strength or failure as in scenario 1. The approach taken to address the uncertainty part is called the probabilistic approach/model.

In this blog, we will focus on scenario 2 where randomness is at play and the Monte Carlo simulation will fit in perfectly. Let us start with generating a random value for diameter with a mean of 150 and a standard deviation of 5.

prob_value = random.uniform(0,1)

print(f'The random probability value :{round(prob_value,4)}')

sim_value = NormalDist(500, 5).inv_cdf(prob_value)

print(f'The simulated value :{round(sim_value,4)}')Output:

The random probability value :0.3291

The simulated value :497.7881

First, we generate a random variable between 0 and 1. We then use the inverse cumulative distribution function to fetch a value that belongs to a normal distribution with a mean of 500 and a standard deviation of 5. Let us do the same for the other two variables by looping through multiple iterations and the most efficient way to do that will be to wrap it under a function.

def simulate_values(mu, sigma, iterations, print_output = True):

"""summaryArgs:

mu (int): Define the mean of the diameter, thickness and yield

sigma (int): Define the standard deviation of the diameter, thickness and yield

iterations (int): Define number of iterations the simulations to be run

print_output (bool, optional): Set value to True to view the print statement at various stages and set False to skip the prints. Defaults to True.Returns:

List: The function returns the list which has the simulation values for diameter, thickess and yield

"""

try:

result = []

for i in range(iterations):

prob_value = round(random.uniform(.01, 0.99),3)

sim_value = round(NormalDist(mu=mu, sigma= sigma).inv_cdf(prob_value),3)

if print_output:

print(f"The prob value is {prob_value} and the simulated value is {sim_value}")

result.append(sim_value)

return result

except Exception as e:

print(f'An exception occurred while generating simulation values: {e}')

Please refer to the function ** simulate_values()** from simulation.py

# Defining the simulation

We have so far simulated random values for all three variables. Now it’s time to calculate Hoop stress and Objective variable. If the Objective variable has a negative value, it means that the hoop stress was greater than yield strength and pipe failure occurred.

df_final['Hoop_Stress'] = simulation_data.internal_pressure * df_final['Diameter'] /

(2 * df_final['Thickness'])

df_final['Objective'] = df_final['Yield'] - df_final["Hoop_Stress"]

We will create a data frame that will have all the simulated values along with the hoop stress and objective. The most efficient way is to do it with a function, it returns the number of negative cases for each cycle of iteration. eg: If we ran 1,000 iterations and there were 10 negative instances, the probability of failure will be 0.01.

def initiate_simulation(iter_start = simulation_data.iter_start, iter_end = simulation_data.iter_end, iter_step = simulation_data.iter_step):

"""summaryArgs:

iter_start (int, optional): define starting point for generating simulation values. Defaults to simulation_data.iter_start.

iter_end (int, optional): define end point for generating simulation values. Defaults to simulation_data.iter_end.

iter_step (int, optional): define the increment. Defaults to simulation_data.iter_step.

"""

try:

print("---------------------------------------Initiating Simulations ---------------------------------------")

runs = list(range(iter_start, iter_end, iter_step))

sim_res = []

hoopStress = []

for run in runs:

sim_results, hs = run_simulation(run)

sim_res.append(sim_results)

hoopStress.append(hs)

print(f'Simulation run for {run}: {sim_results}')

plot_linechart(runs, sim_res)

# plot_histogram(hoopStress)

except Exception as e:

print(f'An exception occurred while trying to initiate the simulation: {e}')

Please refer to the function ** initiate_simulation()** from simulation.py

# Setting up the simulation run

Now that we have defined the function to generate the random values, we can calculate the hoop stress. It’s time to iterate it multiple times and capture the probability of failure. As we did in the previous sections, we will wrap this logic into function.

def run_simulation(iterations):

"""summaryReturns:

float: The function returns the number of negative values found in the simulation - negative value indicates that pipe failedHoop Stress:

int: Hoop stress =(Internal Pressure * Diameter)/(2 * Thickness)

Objective:

int: Failure = Yield - Hoop stress

"""

try:

lst_diameter = simulate_values(simulation_data.diameter_mean, simulation_data.diameter_std,iterations, False)lst_thickness = simulate_values(simulation_data.thickness_mean, simulation_data.thickness_std,iterations, False)

lst_yield = simulate_values(simulation_data.yield_mean, simulation_data.yield_std, iterations, False)

df_final = pd.DataFrame(list(zip(lst_diameter, lst_thickness, lst_yield)), columns = ['Diameter', 'Thickness', 'Yield'])df_final['Hoop_Stress'] = simulation_data.internal_pressure * df_final['Diameter'] / (2 * df_final['Thickness'])df_final['Objective'] = df_final['Yield'] - df_final["Hoop_Stress"]

# min_stress = df_final['Hoop_Stress'].min()

# max_stress = df_final['Hoop_Stress'].max()

return df_final[df_final['Objective'] < 0].shape[0] / iterations, df_final['Hoop_Stress'].tolist()

except Exception as e:

print(f'An exception occurred with running the simulation: {e}')

Please refer to the function ** run_simulation()** from simulation.py

# Measuring the probability of pipe failure

The final phase of the process is to measure the probability of failure across iterations and also to visualize the result in a simple line chart.

def plot_linechart(runs, sim_res):

try:

print("---------------------------------------Initiating Simulation plot ---------------------------------------")

y_mean = [np.mean(sim_res)]*len(runs)

fig,ax = plt.subplots()

ax.set_ylabel('Probability of Pipe Failure')

ax.set_xlabel('Number of Iterations')

ax.set_title('Montecarlo Simulation for Pipe Failure')

# plt.plot(runs, sim_res, marker = '*')

plt.plot(runs, sim_res, 'bo-', label = "Simulation Probabilities")

for x,y in zip(runs, sim_res):

label = "{:.6f}".format(y)

plt.annotate(label,

(x,y),

textcoords="offset points",

xytext=(0,10),

ha='center')

ax.plot(runs, y_mean, color='red', lw=2, ls='dashdot', label=f'Average Probability: {round(y_mean[0],6)}')

plt.legend(loc=0)

plt.show()

except Exception as e:

print(f'An error occurred while trying to generate line chart {e}')

Please refer to the function ** plot_linechart()** from simulation.py

We will define the main function via which the execution will be triggered. Let us define the main function in the file **pipe.py**

import simulation

def main():

simulation.initiate_simulation(1000, 100000, 5000)

# simulation.initiate_simulation()

if __name__ == '__main__':

main()

There are two ways to trigger the simulation

- Input the number of iterations needed from the main function. Eg: start from 1,000 iterations and go till 1,00,000 in increments of 5000.

simulation.initiate_simulation(1000, 100000, 5000)

- If we don’t feed the iteration value, the function will fetch values from configuration files where we have defined default values.

As the number of simulations increases, we see that the probability of failure hovers around 0.007xxxxx. The line chart clearly shows the pattern and also the average value in the Red horizontal line. It is also to be noted that we don’t need 1,00,000 simulations. Instead, we could have stopped simulation around the 40K range as we start to see the consistent values.

---------Initiating Simulations -----------

Simulation run for 1000: 0.01

Simulation run for 6000: 0.006833333333333334

Simulation run for 11000: 0.008272727272727272

Simulation run for 16000: 0.0076875

Simulation run for 21000: 0.0074761904761904766

Simulation run for 26000: 0.009115384615384615

Simulation run for 31000: 0.007741935483870968

Simulation run for 36000: 0.007694444444444445

Simulation run for 41000: 0.007829268292682927

Simulation run for 46000: 0.007869565217391305

Simulation run for 51000: 0.0072549019607843135

Simulation run for 56000: 0.007928571428571429

Simulation run for 61000: 0.007327868852459016

Simulation run for 66000: 0.007378787878787879

Simulation run for 71000: 0.007577464788732394

Simulation run for 76000: 0.007289473684210526

Simulation run for 81000: 0.007691358024691358

Simulation run for 86000: 0.007848837209302326

Simulation run for 91000: 0.007736263736263736

Simulation run for 96000: 0.008177083333333333

# Closing thoughts

The Monte Carlo simulation method is neither an alternative to traditional modeling nor does it eliminates any associated risk. Instead, it runs thousands of iterations with various values defined within the distribution and calculates the outcomes each time. This, when plotted, gives a probability distribution that gives us enough confidence to make our decision. In our use case, we see that the probability of failure is roughly around 0.7863%

Hope the blog was useful to you !!!!

You can connect with me on Linkedin

You can find the code on Github