# Thread: Probability That Receipt Of Email 1 AND Email 2 Was Random

1. An issue I’ll call Issue 1 arose and prompted Event A to occur. A few days after Event A occurred, I received a virus email I’ll call Virus Email 1. The content of Virus Email 1 referred to something that is a direct outcome of Event A. Therefore, Virus Email 1 is directly related to Event A. Later Event B occurred. Event B was about Issue 1 and about receiving Virus Email 1. Therefore, Event B is directly related to Event A. A few days after Event B occurred, I received another virus email I’ll call Virus Email 2. The content of Virus Email 2 was identical to Virus Email 1. Therefore, Virus Email 2 is directly related to Virus Email 1.

I would like to calculate the probability that the receipt of Virus Email 1 AND Virus Email 2 was random.

Some thoughts about the problem (right or wrong ???) ......
* The probability of receiving an email any day of the year (ignoring a leap year) is 1 out of 365.
* However, the probability of receiving an email directly related to an event after the event occurs is not 1 out of 365.
* The probability that an email received after an event is random, even though its content is directly related to the event, increases as the number of days since the event increases. In other words, the probability of the email being random is much higher if 100 days have passed since the event than if 3 days have passed since the event.
* So my problem reduces to this: how is the probability of receiving an email directly related to an event, X days after the event occurs, calculated?
* Intuitively, it seems that receiving both Virus Email 1 and Virus Email 2 has a bearing on calculating the probability of receiving both emails being random. Generally, if the probability for receiving Virus Email 1 was 1/50 and receiving Virus Email 2 was 1/50, the probability for receiving both virus emails would be 1/50 x 1/50. Intuitively, it seems that the probability of receiving both emails would be somewhat less than that because Virus Email 1 and Virus Email 2 are related in that Event A and Event B are related.
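
To make the "X days after the event" idea concrete, here is a minimal sketch. It assumes, purely for illustration, that unrelated emails of this kind arrive as a Poisson process at a constant rate; neither that model nor the rate of one per 365 days is backed by any data in this thread. The function name `prob_random_within` is hypothetical.

```python
import math


def prob_random_within(days: float, rate_per_day: float) -> float:
    """Probability of at least one random arrival within `days`,
    assuming a Poisson process with a constant daily rate."""
    if days < 0 or rate_per_day < 0:
        raise ValueError("days and rate_per_day must be non-negative")
    return 1.0 - math.exp(-rate_per_day * days)


# Hypothetical rate: one such email per year on average (an assumption, not data).
RATE = 1 / 365

for window in (3, 4, 100):
    # The chance of a coincidental arrival grows with the length of the window.
    print(f"{window:>3} days: {prob_random_within(window, RATE):.4f}")
```

Under these assumptions the chance of a coincidental email grows with the window length, which matches the intuition in the bullets above, but the actual numbers depend entirely on the assumed rate.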

Thanks for any thoughts and/or suggestions. Hoping to get a solution!

Steve

3. Hi Steve,

Welcome to MHB!

It's a little hard to follow your situation, but I'll make some general comments. I'm assuming this is a real-life situation rather than an exercise from a textbook.

When we are calculating probabilities, we need to know the distribution of the random variable in question. Some events are equally likely to occur on each day (uniform) while others are not (Poisson, geometric, normal, etc.). It can be dangerous to assume that something automatically follows a certain distribution without empirical data to back that up.

Also, how do you know that the emails and events are perfectly correlated? Maybe they have a strong correlation but don't follow each other exactly...

Anyway, a nice formula to use when we have some prior knowledge of an event is: $\displaystyle P(A|B)=\frac{P(A \cap B)}{P(B)}=\frac{P(A) \times P(B|A)}{P(B)}$.

This requires knowing the various probabilities though. What I think you should do, if this is a real problem, is work with some data to estimate the correlations and probability. You won't be able to work on this purely theoretically. The idea that the probability of both events occurring is $\dfrac{1}{50} \cdot \dfrac{1}{50}$ is only true if the events are independent. Since you think the events are perfectly correlated, they are definitely not independent.
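
As a rough illustration of the difference between independent and dependent events, the sketch below applies the formula above to made-up numbers. The value `p_b_given_a = 0.5` is a hypothetical assumption chosen only to show the effect of dependence, and the function names are illustrative.

```python
def joint_if_independent(p_a: float, p_b: float) -> float:
    """P(A and B) when A and B are independent."""
    return p_a * p_b


def joint_general(p_a: float, p_b_given_a: float) -> float:
    """P(A and B) = P(A) * P(B|A); holds whether or not A and B are independent."""
    return p_a * p_b_given_a


def bayes(p_a: float, p_b_given_a: float, p_b: float) -> float:
    """P(A|B) = P(A) * P(B|A) / P(B)."""
    if p_b <= 0:
        raise ValueError("p_b must be positive")
    return p_a * p_b_given_a / p_b


p_a = 1 / 50        # from the 1/50 figure in the thread
p_b = 1 / 50        # from the 1/50 figure in the thread
p_b_given_a = 0.5   # hypothetical: B is far more likely once A has happened

print("independent joint:", joint_if_independent(p_a, p_b))
print("dependent joint:  ", joint_general(p_a, p_b_given_a))
print("P(A|B):           ", bayes(p_a, p_b_given_a, p_b))
```

With these invented inputs the dependent joint probability is much larger than $\dfrac{1}{50} \cdot \dfrac{1}{50}$, which is why the product rule alone cannot be used without first establishing independence.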

Hope this gives you something to think about.

4. Hello Jameson,

Thanks for your reply to my post!

I'm new as of today so I'm still just finding my way. Did I get your name right? I clicked on Reply To Thread to get here to reply back to you; was that correct?

Yes, this is a real-life situation. I appealed a decision by my insurance company to an independent review body (Event A), and three days later I received an email with a zip file attachment containing a virus (Virus Email 1). The email said that I was to appear in court and that my case would be heard by a judge in my absence if I did not appear. It said I needed to open the attachment for the details. I knew this was a fake because courts do not send out notices by email. I immediately suspected that either the insurance company employee or her superior, who made the decision in my case, had sent the email. Shortly after receiving Virus Email 1, I wrote a letter to the superior's boss (Event B) complaining about how the decision was made by the employee and her superior, and complaining about Virus Email 1. Four days after that letter was sent, I received Virus Email 2, a duplicate of Virus Email 1. Checking the headers of both emails, I found that both had the same source, but they were spoofed, so I could not identify the sender. So now I want to determine whether both emails were likely sent by the employee and/or her superior, or were sent randomly by a spammer with dates that only coincidentally followed Event A and Event B.

As you can see, I have no data to work with to estimate the correlations and probability.

Does this make it more clear?

5. Yes, you got my name right and you replied correctly.

I am sorry you are in this pickle right now, but to be straight with you, you don't have enough data to show anything, and the data you have is based on non-verifiable assumptions. I'm a master's student in statistics, and trying to leverage data into something meaningful is what I love to do, but you just don't have the right pieces.

In my opinion what you need is someone who specializes in internet security and could investigate further with what you have. Good luck.
