Adding random numbers: Tolerance analysis

  • #1
Juanda
TL;DR Summary
I want to find the probability of adding random numbers with the result being within a certain bracket.
I am not a fan of probability and statistics. I know the field is extremely useful, and probably the branch of mathematics most applicable to understanding the world around us, but I am a calculus and vectors guy. This problem, though, I find interesting. I would like to find a generalized solution for the following situation. It is related to the study of tolerances in manufacturing. I am certain it has more applications, but that is how it occurred to me. To be honest, I don't think I will be able to apply this in real life because I don't have access to the probability distributions (although I could make some educated guesses), but the problem is still interesting.

Let's say we are adding two numbers. The result is simple.
$$2+3=5$$
However, let's now think of a scenario where 2 and 3 are not set. Instead, they follow a probability distribution. To start with something simple, let's assume they are constant distributions as shown.
[Figure: constant (uniform) probability distributions for 2 ± 0.5 and 3 ± 1]


In this case, I believe the following is right, so I know how to do it.
$$(2\pm 0.5)+(3\pm 1)=5\pm 1.5$$
But if the probability distributions get more complex I do not know how I could try to solve it. For example:
[Figure: normal-distribution example for the two numbers]


I used a normal distribution but I am certain there must be a generalized way to calculate it for any given probability distribution. I am interested in being able to answer the following:
Given the number ##a## with some probability distribution (whichever it may be) and the number ##b## with some probability distribution (whichever it may be), what is the probability that the sum ##c## falls between the values ##e## and ##d##?
Similarly, the following question would be interesting to answer:
Given the requirement that the number ##c## be within the limits ##e## and ##d## with 95% certainty, how should ##a## and ##b## be chosen, given that they follow some probability distribution (whichever it may be)?
This second question would look a lot like what a designer would need to face when choosing the tolerances of machined parts. The first one allows us to check if what was chosen is right. The second one is like trying to find the best answer directly.
I am aware that the second question has infinitely many solutions because the problem is not sufficiently defined. I don't know how to constrain it a little more so that it is still realistic and will spit out the best possible solution directly. I would guess an additional constraint would be to force the tolerance brackets to be as big as possible for both numbers so the parts are easier to manufacture. "Big" is relative, so I would define it as a percentage of the average.

As a last bonus question, I mentioned at the top of the post that the applicability of this is limited by the actual knowledge of the probability distributions present in machined parts. Let's imagine I ask for 100 cylinders to be machined to 50±0.5 mm in diameter. I believe the machinist will keep removing material with his lathe until he's within the tolerance bracket. He has no particular interest in giving me a part as close to the nominal value as possible, since that risks scrapping the part by removing more material than necessary. Therefore, my current guess is that the distributions must look somewhat like this:
[Figure: guessed probability distribution for the machined diameter]

That is only an educated guess in which even psychology is involved. The machinist might be somewhat of a perfectionist who wants to get close to the average value. The only way to know would be to take all the manufactured parts and measure the probability distribution, but that implies the job is already done, so you can no longer modify the tolerance bracket in the design to nail the target you initially had in mind. Is there a better way to approach this? What probability distribution would you apply?

So far, all my work is in low volumes, so even if I knew the details of this problem, there is little chance they would be representative enough to apply. But I still think it's an interesting matter to think about, and it might be more useful in the future. Knowledge is power.

Let me know your thoughts.
Thanks in advance.
 
  • #2
The general equation for the variance of a linear combination of two random variables, ##X,Y##, is
##Var(aX+bY) = a^2\, Var(X) + b^2\, Var(Y) + 2ab\, Cov(X,Y)##, where ##a, b \in \mathbb{R}##.
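A minimal numpy sketch to sanity-check the formula (the two uniform distributions and the coefficients are hypothetical example values; for independent variables the covariance term is essentially zero):

import numpy as np

rng = np.random.default_rng(0)
n = 1_000_000

#Hypothetical example: X ~ Uniform(1.5, 2.5), Y ~ Uniform(2, 4), independent.
X = rng.uniform(1.5, 2.5, n)
Y = rng.uniform(2.0, 4.0, n)
a, b = 1.0, 1.0

#Sampled variance of the linear combination.
var_sampled = np.var(a*X + b*Y)

#Formula: a^2 Var(X) + b^2 Var(Y) + 2ab Cov(X, Y).
cov_XY = np.cov(X, Y)[0, 1]  #~0 here because X and Y are independent
var_formula = a**2 * np.var(X) + b**2 * np.var(Y) + 2*a*b*cov_XY

print(var_sampled, var_formula)  #the two numbers should agree closely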
 
  • Like
Likes WWGD, Vanadium 50 and PeroK
  • #4
I am sorry, but I don't understand your replies. They are either too general for me to connect to the problem at hand, or not closely related to it.

I kept researching the matter on the internet and thankfully found a video that covers the point in detail. It is from the wonderful channel 3Blue1Brown and it is about convolution.


It actually bothers me a little that I watched this video when it was published but didn't really understand it that well back then. Now that I needed it, my brain didn't put 2 and 2 together to take me back to the video, so I had to rediscover it. At least the video is so good that rediscovering it doesn't feel bad at all.

First of all, I want to point out an error I committed in the original post.
Juanda said:
Let's say we are adding two numbers. The result is simple.
$$2+3=5$$
However, let's now think of a scenario where 2 and 3 are not set. Instead, they follow a probability distribution. To start with something simple, let's assume they are constant distributions as shown.
View attachment 341883

In this case, I believe that's right so I know how to do it.
$$(2\pm 0.5)+(3\pm 1)=5\pm 1.5$$

The result I posted is not the correct probability density function. The range of possible values is indeed right: the result will vary between ##3.5## and ##6.5##. However, the probabilities assigned in the drawing are not correct.
To find the correct probability distribution it is necessary to apply the convolution to the two original probability density functions.
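Concretely, if ##f_a## and ##f_b## are the probability density functions of the two numbers, the density of the sum ##c = a+b## is their convolution:
$$f_c(s) = (f_a * f_b)(s) = \int_{-\infty}^{\infty} f_a(x)\, f_b(s-x)\, dx$$
The probability of the sum landing between ##e## and ##d## is then ##\int_e^d f_c(s)\, ds##.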

NOTE: The functions for the dice are weird because he's talking about weighted dice in that case.
[Figure: convolution formulas from the video, applied to weighted dice]


I initially proposed those constant probability functions because they seemed like the simplest possible option, but it turns out that, since the functions are defined piecewise and there is no variable inside those pieces, the convolution formula isn't as straightforward to apply.
I generalized it using two step functions so that each probability distribution is defined by a single expression on all of ##\mathbb{R}##.
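Written out, a constant density on ##[a, b]## expressed with the Heaviside step function ##H## is
$$f(x) = \frac{H(x-a) - H(x-b)}{b-a},$$
which equals ##\tfrac{1}{b-a}## inside the bracket and ##0## everywhere else, so its total area is 1.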
[Figure: step-function definition of the constant probability distributions]


Once that's done, it is easy to apply the formulas as long as it is a computer crunching the numbers.
I created a little demo in Desmos in which you can play around with the numbers to see how things change. Just be careful to keep ##a<b## and ##c<d##, and don't make ##k## too big because I think it overflows the calculator.
https://www.desmos.com/calculator/4swxqgoh6d
NOTE: I checked all areas under the probability density functions and the result is 1. Although the resultant probability density function from the convolution is not plotted, I checked it as well.
I couldn't plot the resulting probability density function in Desmos, but you can see how the function evaluates at each ##s## by using the slider for that variable. That also moves the graphs around, analogous to what the video shows at around 15:32. The final step is to integrate that probability density function between the two values you are interested in, which gives the probability of the sum being within that interval.

So, after all this, I feel capable of solving the problem in the forward direction as mentioned in the OP.
Juanda said:
Given the number ##a## with some probability distribution (whichever it may be) and the number ##b## with some probability distribution (whichever it may be), what is the probability that the sum ##c## falls between the values ##e## and ##d##?

The inverse problem is yet to be solved.
Juanda said:
Similarly, the following question would be interesting to answer:

Given the requirement that the number ##c## be within the limits ##e## and ##d## with 95% certainty, how should ##a## and ##b## be chosen, given that they follow some probability distribution (whichever it may be)?
This second question would look a lot like what a designer would need to face when choosing the tolerances of machined parts. The first one allows us to check if what was chosen is right. The second one is like trying to find the best answer directly.
I am aware that the second question has infinitely many solutions because the problem is not sufficiently defined. I don't know how to constrain it a little more so that it is still realistic and will spit out the best possible solution directly. I would guess an additional constraint would be to force the tolerance brackets to be as big as possible for both numbers so the parts are easier to manufacture. "Big" is relative, so I would define it as a percentage of the average.
I have already gone numerical, so I am not expecting a pretty analytical solution, but I guess I'm OK with that. I'm thinking of calculating a bunch of combinations defining the boundaries of the two initial numbers and then choosing the combination that results in the biggest intervals while still fulfilling the condition. It would be like solving the forward problem several times, changing things, and then choosing the best option from them all.
I am aware this is not a very efficient calculation method but, at this level, I will choose clarity and computing power over other considerations.
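As a rough sketch of that brute-force idea (all numbers, the 95% target, and the symmetric-tolerance grid are assumptions for the example), one could solve the forward problem by Monte Carlo on a grid of tolerance sizes and keep the largest one that still meets the target:

import numpy as np

rng = np.random.default_rng(1)
n = 200_000

#Hypothetical example: nominal lengths and target window for the sum.
L1, L2 = 5.0, 10.0     #[mm]
e, d = 14.2, 15.8      #required limits on the sum [mm]
target = 0.95          #required certainty

best_t = None
for t in np.arange(0.05, 1.0, 0.05):  #candidate symmetric tolerance [mm]
    #Assume constant (uniform) PDFs on the tolerance brackets.
    s = rng.uniform(L1 - t, L1 + t, n) + rng.uniform(L2 - t, L2 + t, n)
    prob = np.mean((s >= e) & (s <= d))
    if prob >= target:
        best_t = t  #grid is increasing, so keep the last one that passes
print("Largest tolerance meeting the target:", best_t)

A finer grid, asymmetric brackets, or different PDFs would slot into the same loop.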

Lastly, any input regarding the probability density functions to be expected in parts from workshops?
Juanda said:
As a last bonus question, I mentioned at the top of the post that the applicability of this is limited by the actual knowledge of the probability distributions present in machined parts. Let's imagine I ask for 100 cylinders to be machined to 50±0.5 mm in diameter. I believe the machinist will keep removing material with his lathe until he's within the tolerance bracket. He has no particular interest in giving me a part as close to the nominal value as possible, since that risks scrapping the part by removing more material than necessary. Therefore, my current guess is that the distributions must look somewhat like this:
View attachment 341886
That is only an educated guess in which even psychology is involved. The machinist might be somewhat of a perfectionist who wants to get close to the average value. The only way to know would be to take all the manufactured parts and measure the probability distribution, but that implies the job is already done, so you can no longer modify the tolerance bracket in the design to nail the target you initially had in mind. Is there a better way to approach this? What probability distribution would you apply?

So far, all my work is in low volumes, so even if I knew the details of this problem, there is little chance they would be representative enough to apply. But I still think it's an interesting matter to think about, and it might be more useful in the future. Knowledge is power.

Let me know your thoughts.
Thanks in advance.
 
  • #5
The way to handle this is a little harder than your first idea of just adding the ##\pm## numbers, but much simpler than the approach in your latest post. You do not need to know the entire probability distribution or compute convolutions.

When you say that one variable, X, is ##2 \pm 0.5##, you are saying that you have some level of confidence that the value will be within ##\pm 0.5## of ##2##. Assume that you are using the same level of confidence for the second variable, Y, and can say with that confidence that it is within ##\pm 1## of ##3##. Also assume that the variation of the two variables, X and Y, are independent of each other.
Suppose that you want to keep the same level of confidence for the sum, X+Y, being within some distance of the mean of X+Y. The trick is that to find the deviation of the sum, you must add the squares of the individual deviations and then take the square root. So you should calculate ##\sqrt {0.5^2 + 1^2} = \sqrt {1.25} = 1.118##. The final answer is ##5 \pm 1.118## with the same confidence of the original two variables.

EDIT, ADDED JUSTIFICATION: Let ##\mu_X## and ##\mu_Y## denote the means of the variables and ##\sigma_X, \sigma_Y## their standard deviations. Saying that you want the same level of confidence that the variables ##X##, ##Y##, and ##X+Y## are within a certain distance of their means is to say that you want the ##\pm## numbers to be the same multiple, ##a##, of their respective standard deviations. That is, you want ##a\cdot\sigma_X, a\cdot\sigma_Y, a\cdot\sigma_{X+Y}## to be 0.5, 1, and [to be determined].
The equation for ##a\cdot\sigma_{X+Y}## (for independent variables) is ##a \sqrt{ \sigma_X^2 + \sigma_Y^2} = \sqrt{(a\cdot\sigma_X)^2 + (a\cdot\sigma_Y)^2} = \sqrt {0.5^2 + 1^2} = \sqrt {1.25} = 1.118##

EDIT, ADDED INTUITION ON WHY THE TOLERANCES SHOULD NOT JUST BE ADDED:
Consider an extreme case where one variable, X, has a relatively huge uncertainty (like 1000) and the other, Y, has a very small uncertainty (like 0.1). The question is whether the uncertainty of the sum X+Y should then be 1000.1. The answer is no. The small uncertainty of Y is not likely to push the sum out of the tolerance of X except for the small fraction of the time that X's deviation lands within [-1000.1, -999.9] or [+999.9, +1000.1]. So most of the time, the uncertainty of Y does not matter; it has very little effect on the X+Y region of the confidence level -- much smaller than 0.1.
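To put numbers on the confidence level, here is a minimal Monte Carlo sketch (numpy; it assumes the ##\pm## numbers describe the uniform distributions from the original example) comparing the coverage of the root-sum-square interval ##5 \pm 1.118## with the worst-case interval ##5 \pm 1.5##:

import numpy as np

rng = np.random.default_rng(2)
n = 1_000_000

#Assumed inputs: X ~ Uniform(1.5, 2.5) (i.e. 2 +/- 0.5), Y ~ Uniform(2, 4) (i.e. 3 +/- 1).
s = rng.uniform(1.5, 2.5, n) + rng.uniform(2.0, 4.0, n)

rss = np.hypot(0.5, 1.0)  #sqrt(0.5^2 + 1^2) = 1.118...
print("P(|S-5| <= 1.118):", np.mean(np.abs(s - 5) <= rss))
print("P(|S-5| <= 1.5):  ", np.mean(np.abs(s - 5) <= 1.5))

For these uniforms the root-sum-square interval captures roughly 93% of the sums, while the worst-case interval captures all of them; the two brackets simply correspond to different confidence levels.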
 
  • #6
FactChecker said:
When you say that one variable, X, is ##2 \pm 0.5##, you are saying that you have some level of confidence that the value will be within ##\pm 0.5## of ##2##. Assume that you are using the same level of confidence for the second variable, Y, and can say with that confidence that it is within ##\pm 1## of ##3##. Also assume that the variation of the two variables, X and Y, are independent of each other.
Suppose that you want to keep the same level of confidence for the sum, X+Y, being within some distance of the mean of X+Y. The trick is that to find the deviation of the sum, you must add the squares of the individual deviations and then take the square root. So you should calculate ##\sqrt {0.5^2 + 1^2} = \sqrt {1.25} = 1.118##. The final answer is ##5 \pm 1.118## with the same confidence of the original two variables.
What do you mean by "you have some level of confidence that the value will be within ##\pm 0.5## of ##2##"?
I believe we're misunderstanding each other because you're assuming a normal distribution for the pdf. I think you're assuming that because you're talking about standard deviation and confidence. I was talking about a general case and did this example with a constant distribution because I thought it'd be the simplest, although I was proven wrong, as explained in post #4.
Juanda said:
I initially proposed those constant probability functions because they seemed like the simplest possible option, but it turns out that, since the functions are defined piecewise and there is no variable inside those pieces, the convolution formula isn't as straightforward to apply.

If I try to apply your logic to this constant distribution, I get strange results. In that example, I assumed a constant pdf, so I have 100% confidence the number will be within those margins.
Following the logic you described, does that mean I can be 100% confident that the sum will be ##5 \pm 1.118##? That makes no sense to me.
I cannot draw the exact resulting pdf from the convolution yet. But looking at how the value the resulting pdf would take changes as I move the slider that defines ##s##, I can see it'd look like a trapezoid.
[Figure: trapezoid-shaped pdf of the sum, sketched from the Desmos demo]

By the way, that result coincides with the intuition you mentioned in your second edit. It's less likely to obtain a very small sum because it'd require both inputs to be very small, and the same applies to a big sum. The numbers in the middle are more likely because they can be produced by (a worked expression follows the list):
  1. Medium number + Medium number
  2. Big number + Small number
  3. Small number + Big number
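For the record, taking ##2\pm 0.5## and ##3\pm 1## as uniform densities on ##[1.5, 2.5]## and ##[2, 4]## (an assumption about what the ##\pm## notation means here), the convolution can be done by hand and is exactly that trapezoid:
$$f_c(s) = \begin{cases} \tfrac{1}{2}(s-3.5) & 3.5 \le s < 4.5 \\ \tfrac{1}{2} & 4.5 \le s \le 5.5 \\ \tfrac{1}{2}(6.5-s) & 5.5 < s \le 6.5 \\ 0 & \text{otherwise.} \end{cases}$$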
 
  • #7
I guess we are talking about different things. IMO, the logic in my posts applies to general random variables that have a finite mean and standard deviation. I don't see right now what breaks down in my logic for your 100% confidence example.
Regardless of the probability distribution, doesn't your logic for 100% confidence always work? I don't see anything else that needs to be done and I agree with your original post in that situation. The only complication would be if the two random variables are related in a way that might always force some cancellations of the extreme deviations.
 
  • #8
FactChecker said:
Regardless of the probability distribution, doesn't your logic for 100% confidence always work?
I am pretty sure the method I described works for any pdf.

FactChecker said:
I don't see anything else that needs to be done and I agree with your original post in that situation. The only complication would be if the two random variables are related in a way that might always force some cancellations of the extreme deviations.
Dependent variables are out of the question. Still, I disagree; I think there is still depth to explore. More concretely, applied to this case, there are the two points left open in the OP and reiterated in post #4:
  1. Solving the inverse problem.
  2. Finding the connection to the real world to assign the correct pdf to the two original values that will be summed.
To solve the inverse problem I proposed this method although I am curious to know if there is a better alternative.
Juanda said:
The inverse problem is yet to be solved.

I have already gone numerical, so I am not expecting a pretty analytical solution, but I guess I'm OK with that. I'm thinking of calculating a bunch of combinations defining the boundaries of the two initial numbers and then choosing the combination that results in the biggest intervals while still fulfilling the condition. It would be like solving the forward problem several times, changing things, and then choosing the best option from them all.
I am aware this is not a very efficient calculation method but, at this level, I will choose clarity and computing power over other considerations.

To find the actual pdf of the two variables, though... I don't really know if there is a way besides measuring it experimentally. Maybe someone with experience has already been in those shoes and can provide some insight.
Juanda said:
As a last bonus question, I mentioned at the top of the post that the applicability of this is limited by the actual knowledge of the probability distributions present in machined parts. Let's imagine I ask for 100 cylinders to be machined to 50±0.5 mm in diameter. I believe the machinist will keep removing material with his lathe until he's within the tolerance bracket. He has no particular interest in giving me a part as close to the nominal value as possible, since that risks scrapping the part by removing more material than necessary. Therefore, my current guess is that the distributions must look somewhat like this:
View attachment 341886
That is only an educated guess in which even psychology is involved. The machinist might be somewhat of a perfectionist who wants to get close to the average value. The only way to know would be to take all the manufactured parts and measure the probability distribution, but that implies the job is already done, so you can no longer modify the tolerance bracket in the design to nail the target you initially had in mind. Is there a better way to approach this? What probability distribution would you apply?

So far, all my work is in low volumes, so even if I knew the details of this problem, there is little chance they would be representative enough to apply. But I still think it's an interesting matter to think about, and it might be more useful in the future. Knowledge is power.

Let me know your thoughts.
Thanks in advance.
 
  • #9
Thanks to the arrival of ChatGPT, the barrier to coding is now much lower than it used to be. I was able to write some basic code reasonably quickly to find the convolution of two constant probability density functions.

With this input:
#Nominal values of each measurement.
L_1_nom = 5 #[mm]
L_2_nom = 10 #[mm]

#Tolerance bracket of each measurement.
tol_sup_1 = .5 #[mm]
tol_inf_1 = -.2 #[mm]
tol_sup_2 = .8 #[mm]
tol_inf_2 = -.8 #[mm]

# Limiting values.
lim_inf = 14.2
lim_sup = 15.8

Here is the output:
[Figures: plots of the two input PDFs, the convolved PDF of the sum, and the shaded probability between the limits]



This is still far from done. To be useful in any way, I need to keep crunching numbers so that I can make an informed decision about the tolerance brackets, choosing them so that the sum stays within certain values with an x% certainty. Maximizing the tolerance brackets should make the parts easier and cheaper to manufacture.

At least, with this code, I can now do the brute force approach where I keep changing the input parameters until the output looks acceptable.

Also, for now, the code is only compatible with constant PDFs. Some minor changes are needed to perform the calculation with other shapes (see the sketch after the code). However, the methodology for choosing either a constant PDF, a normal distribution, or any other shape is still unclear.

Here is the code in case anyone finds it useful in the future. I translated most of the code to English, but some bits of Spanish might remain.
Oh, and there are probably a lot of typos because I have Grammarly disabled in Python.

Calc_tolerancia_combinadas:
# -*- coding: utf-8 -*-
"""
Created on Tue May 28 21:56:41 2024

@author: Juanda
"""

#Probability calculations
#Addition of two random variables. Obtain the probability of the sum being within two values.
#For example, suppose it's the total length of two blocks next to each other.
#The random variables will have an associated probability density function (PDF).
#For now, the code works with constant PDFs.

import numpy as np
import matplotlib.pyplot as plt

#Nominal values of each measurement.
L_1_nom = 5 #[mm]
L_2_nom = 10 #[mm]

#Tolerance bracket of each measurement.
tol_sup_1 = .5 #[mm]
tol_inf_1 = -.2 #[mm]
tol_sup_2 = .8 #[mm]
tol_inf_2 = -.8 #[mm]


#Definition of the tolerance bracket.
def tol_bracket(tol_sup,tol_inf):
    tol_bra = abs(tol_sup - tol_inf)
    return tol_bra

tol_bra_1 = tol_bracket(tol_sup_1, tol_inf_1)
tol_bra_2 = tol_bracket(tol_sup_2, tol_inf_2)


#Definition of the max and min deviations.
def L_max_min(L_nom, tol_sup, tol_inf):
    L_max = L_nom + tol_sup
    L_min = L_nom + tol_inf
    return L_max, L_min
   
L_1_max = L_max_min(L_1_nom, tol_sup_1, tol_inf_1)[0]
L_1_min = L_max_min(L_1_nom, tol_sup_1, tol_inf_1)[1]
L_2_max = L_max_min(L_2_nom, tol_sup_2, tol_inf_2)[0]
L_2_min = L_max_min(L_2_nom, tol_sup_2, tol_inf_2)[1]

#Definition of the constant PDFs.
#It'll be 0 outside of the bracket and a constant value within the bracket.
#The vectors for the two variables must have the same step size due to how calculations are done.
k = 0.001 #Step size [mm].

#Python function to define the x coordinate of the PDFs.
def CPDF_X(x_min, x_max, k): #CPDF: Constant probability density function.
    x = np.arange(x_min, x_max, k)
    return x

L_1_x = CPDF_X(L_1_min, L_1_max, k)
L_2_x = CPDF_X(L_2_min, L_2_max, k)

#Python function to find the "y" value of the PDF.
#Area below the PDF must be 1. It's the area of a rectangle.
def CPDF_P(tol_bracket, vect): #CPDF: Constant probability density function.
    p_L = 1 / tol_bracket
    C = np.size(vect)
    p_L = np.linspace(p_L, p_L, C)
    return p_L

p_L_1 = CPDF_P(tol_bra_1, L_1_x)
p_L_2 = CPDF_P(tol_bra_2, L_2_x)



#Python function to draw each PDF separately.
def plot1PDF(x,P, title, PDF_label, x_label, P_label):
    plt.plot(x, P, label=PDF_label, color='red')
    plt.vlines(x=x[0], ymin=0, ymax=P[0], color='red', linestyle='-')
    plt.vlines(x=x[-1], ymin=0, ymax=P[-1], color='red', linestyle='-')
    # Labels and title
    plt.xlabel(x_label)
    plt.ylabel(P_label)
    plt.title(title)
    #Horizontal line added to more clearly represent 0 probability
    plt.axhline(y=0, color='black', linestyle='-')
    plt.legend()
    #Grid added
    plt.grid(True, which='both', linestyle='--', linewidth=0.5, color='grey')
    #Show the graph
    plt.show()
   
plot1PDF(L_1_x, p_L_1, 'Probability of L_1','L_1 PDF', 'L_1 length [mm]', 'Probability [-]')
plot1PDF(L_2_x, p_L_2, 'Probability of L_2','L_2 PDF', 'L_2 length [mm]', 'Probability [-]')

#Python function to draw two PDFs in the same graph.
def plot2PDF(x, Px, y, Py, title, FunctPxlabel, FunctPylabel,  x_axis_label, y_axis_label):
    plt.plot(x, Px, label=FunctPxlabel, color='red')
    plt.vlines(x=x[0], ymin=0, ymax=Px[0], color='red', linestyle='-')
    plt.vlines(x=x[-1], ymin=0, ymax=Px[-1], color='red', linestyle='-')
    plt.plot(y, Py, label=FunctPylabel, color='blue')
    plt.vlines(x=y[0], ymin=0, ymax=Py[0], color='blue', linestyle='-')
    plt.vlines(x=y[-1], ymin=0, ymax=Py[-1], color='blue', linestyle='-')
    # Labels and title
    plt.xlabel(x_axis_label)
    plt.ylabel(y_axis_label)
    plt.title(title)
    #Horizontal line added to more clearly represent 0 probability
    plt.axhline(y=0, color='black', linestyle='-')
    plt.legend()
    #Grid added
    plt.grid(True, which='both', linestyle='--', linewidth=0.5, color='grey')
    #Show the graph
    plt.show()

plot2PDF(L_1_x, p_L_1, L_2_x, p_L_2, 'Prob Dens Functions', 'Body 1', 'Body 2', 'Length [mm]', 'Probability [-]')


#Python function to define the convolution of the two variables.
#Although the functions are discretized, it's still a PDF so the "integral" must be done by using dx
#The result will be the PDF of the sum of the two variables.
def convolucion_pdfs(x1, y1, x2, y2):
    """
    Performs the convolution of two discretized probability density functions (PDFs).
   
    :param x1: Vector of x coordinates for the first PDF
    :param y1: Vector of values for the first PDF
    :param x2: Vector of x coordinates for the second PDF
    :param y2: Vector of values for the second PDF
    :return: (x_conv, y_conv) Vector of x coordinates and values of the convolved PDF
    """
    # Find the step size of the vectors (assuming it is constant)
    dx1 = x1[1] - x1[0]
    dx2 = x2[1] - x2[0]

    if not np.isclose(dx1, dx2):
        raise ValueError("Step sizes in x1 and x2 must be equal.")

    dx = dx1  # Use the same for both
   
    # Perform the convolution using numpy
    y_conv = np.convolve(y1, y2, mode='full') * dx
   
    # Calculate the new x vector. I had to use +dx/2 to obtain a vector of the right size.
    x_conv = np.arange(x1[0] + x2[0], x1[-1] + x2[-1] + dx/2, dx)
   
    return x_conv, y_conv

Conv_sol = convolucion_pdfs(L_1_x, p_L_1, L_2_x, p_L_2)
plot1PDF(Conv_sol[0], Conv_sol[1], 'PDF of the sum of the two variables','Sum PDF', 'Total length [mm]', 'Probability [-]')

#Find the area between two x values.
#I define a function to find the 2 values within the vector closest to the defined input values
def encontrar_indices_mas_cercanos(x, valor1, valor2):
    """
    Find the indices of the closest values in the x vector for the given values.
   
    :param x: Vector of x coordinates
    :param valor1: First value to find the closest index
    :param valor2: Second value to find the closest index
    :return: Closest indices in x for valor1 and valor2
    """
    idx1 = (np.abs(x - valor1)).argmin()
    idx2 = (np.abs(x - valor2)).argmin()
    return idx1, idx2

#Calculate the area between those two values
def calcular_area(x, y, valor1, valor2):
    """
    Calculate the area under the curve defined by x and y between valor1 and valor2.
   
    :param x: Vector of x coordinates
    :param y: Vector of y coordinates
    :param valor1: Initial value for area calculation
    :param valor2: Final value for area calculation
    :return: The area under the curve between valor1 and valor2
    """
    # Find the nearest indices in x for valor1 and valor2.
    idx1, idx2 = encontrar_indices_mas_cercanos(x, valor1, valor2)
   
    # Make sure idx1 is lower than idx2
    if idx1 > idx2:
        idx1, idx2 = idx2, idx1
   
    # Extract the relevant segments in x and y
    x_segment = x[idx1:idx2 + 1]
    y_segment = y[idx1:idx2 + 1]
   
    # Find the area below the relevant part of the curve using trapezoids
    area = np.trapz(y_segment, x_segment)
   
    return area

# It's now possible to find the probability of the sum being between two values.
lim_inf = 14.2
lim_sup = 15.8
Prob_result = calcular_area(Conv_sol[0], Conv_sol[1], lim_inf, lim_sup)
print('Probability of the sum being between', lim_inf, 'mm and', lim_sup, 'mm is', np.round(Prob_result*100, 2), '%.')


def plot_function_with_area(x, y, x_val1, x_val2, title, f_label, x_label, y_label, prob_value):
    """
    Draws the function defined by the vectors x and y, and fills the area between x_val1 and x_val2.
   
    :param x: Vector of x coordinates
    :param y: Vector of values corresponding to x
    :param x_val1: First value of x where a vertical line is drawn
    :param x_val2: Second value of x where a vertical line is drawn
    """
   
    # Find the closest positions to the input values.
    idx1 = np.argmin(np.abs(x - x_val1))
    idx2 = np.argmin(np.abs(x - x_val2))
    # Make sure idx1 is lower than idx2
    if idx1 > idx2:
        idx1, idx2 = idx2, idx1
   
    plt.plot(x, y, label=f_label, color='purple')
 
    plt.vlines(x=x[idx1], ymin=0, ymax=y[idx1], color='green', linestyle='--')
    plt.vlines(x=x[idx2], ymin=0, ymax=y[idx2], color='green', linestyle='--')
 
    plt.xlabel(x_label)
    plt.ylabel(y_label)
    plt.title(title)
    plt.axhline(y=0, color='black', linestyle='-')

    plt.grid(True, which='both', linestyle='--', linewidth=0.5, color='gray')  

    label=f'{x_val1} < x < {x_val2} = {round(prob_value*100,2)}%'
    plt.fill_between(x[idx1:idx2+1], y[idx1:idx2+1], color='green', alpha=0.5, label=label)

    plt.legend()

    plt.show()
   

plot_function_with_area(Conv_sol[0], Conv_sol[1], lim_inf, lim_sup, 'Probability of the sum', 'Prob density function', 'Length [mm]', 'Prob [-]', Prob_result)
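As noted before the code, extending it to other shapes mainly means replacing CPDF_P. A minimal sketch of what that swap could look like for a normal distribution truncated to the tolerance bracket (the function name NPDF_P and the choice of sigma are made up for the example):

#Hypothetical alternative to CPDF_P: a normal PDF truncated to the tolerance bracket.
def NPDF_P(L_nom, vect, sigma):
    p_L = np.exp(-0.5*((vect - L_nom)/sigma)**2)
    p_L = p_L / np.trapz(p_L, vect)  #renormalize so the area on the truncated support is 1
    return p_L

#Example usage on the existing grid (sigma chosen arbitrarily):
#p_L_1 = NPDF_P(L_1_nom, L_1_x, sigma=0.2)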
 
  • #10
Juanda said:
Thanks to the arrival of ChatGPT the barrier for coding is now much lower than it used to be. I could write some basic code reasonably quickly to find the convolution of two constant probability density functions.
Interesting. I have seen some simple examples of using ChatGPT to make code.
This is probably a subject for a separate thread, but I wonder how much work it took to get the result that you wanted from ChatGPT? And how much you had to modify the result?
 
  • #11
FactChecker said:
Interesting. I have seen some simple examples of using ChatGPT to make code.
This is probably a subject for a separate thread, but I wonder how much work it took to get the result that you wanted from ChatGPT? And how much you had to modify the result?


In my case, I try to ask for specific tasks and build the code point by point. For example:

Given two discretized functions created by the vectors x, y1, and y2, create a function in Python to multiply the y values and plot the result.

That's just an example, unrelated to anything in particular. The point is that if you know what the target for the code is and which steps you need to follow, then you can use ChatGPT as a translator, which is just what I need. It's also tremendously useful for learning to code. You learn by doing and, as long as you pay attention to the explanations from ChatGPT, the learning process is way less frustrating than spending hours on Google looking for someone solving a somewhat similar problem that you can adapt to your own project. I underlined that text because it also opens the door to coding blindly and not learning anything in the process, but I think that won't be the case in most situations, and by the time it becomes the norm, it'll be because programming is done in a completely different manner, similar to how programming in the 60s was totally different from programming now. Machine language is pretty rare nowadays.

Probably in the future it'll be possible to give it broader commands. That may already be a reality, but I'm not skilled enough to extract all the potential value from it. Still, the way it works now is already tremendously useful to me.

Another thing that works great is debugging. Computers are powerful but silly: if you are missing a comma or something like that, the program won't run correctly. There might be a command that you know should work but that keeps spitting out an error. If you copy-paste the code line where the error happens, it'll either tell you what's wrong on the first try or walk you to the solution after a few interactions.

For now it's free, so I'd recommend using it, and even incorporating it into your workflow if that's an option, because it can accelerate many tasks. I believe some open-source alternatives such as Llama 3 are coming around and working fine, but I'm not an expert.


By the way, I fed it the example prompt I mentioned at the beginning, and this is the answer it produced.
Given two discretized functions created by the vectors x, y1, and y2, create a function in Python to multiply the y values and plot the result.
Here's a Python function that takes three vectors, x, y1, and y2, multiplies the y1 and y2 values, and then plots the result using matplotlib.

First, make sure you have the necessary library installed. You can install it using:

pip install matplotlib

Now, here's the function:

import matplotlib.pyplot as plt

def multiply_and_plot(x, y1, y2):
    # Ensure the vectors have the same length
    if len(x) != len(y1) or len(y1) != len(y2):
        raise ValueError("All input vectors must have the same length.")

    # Multiply y1 and y2 element-wise
    y_product = [a * b for a, b in zip(y1, y2)]

    # Plot the result
    plt.figure(figsize=(10, 5))
    plt.plot(x, y_product, label='y1 * y2', color='b')
    plt.xlabel('x')
    plt.ylabel('y1 * y2')
    plt.title('Product of y1 and y2')
    plt.legend()
    plt.grid(True)
    plt.show()

# Example usage:
x = [0, 1, 2, 3, 4]
y1 = [1, 2, 3, 4, 5]
y2 = [2, 3, 4, 5, 6]

multiply_and_plot(x, y1, y2)

Explanation:

  1. Importing Matplotlib: The matplotlib.pyplot module is imported for plotting.
  2. Function Definition: The function multiply_and_plot takes three arguments: x, y1, and y2.
  3. Length Check: It checks if all input vectors have the same length. If not, it raises a ValueError.
  4. Multiplication: It multiplies y1 and y2 element-wise using a list comprehension.
  5. Plotting: It plots the result using plt.plot, with the x-values from x and the product values from y_product. Labels, title, legend, and grid are added for clarity.
  6. Example Usage: An example usage is provided to demonstrate how to use the function.
You can replace the example vectors with your own data to see the results.
Even if you already knew how to code such a simple task, asking it to do it for you is way faster. Also, the resulting code tends to be much cleaner than what a beginner would produce.
 
  • Like
Likes FactChecker and WWGD
  • #12
A word of caution: you assume these variations are independent.

Beware situations where they may 'compound' rather than 'add'; a loose marine analogy being 'Rogue Waves'....
 
