Statistics for M.L and D.S Vol-3:
In this article and the next, I am going to talk about probability and its distributions. In the last article, we covered some basic but important terms used in Data Science and Machine Learning: standard deviation, variance, Z-score, Pearson's r, and regression. We didn't go deep into regression because it will be covered in the Machine Learning algorithms article. Now we will discuss probability and its distributions, including randomness, sample space, tree diagrams, joint and marginal probability, conditional probability, Bayes' theorem, discrete probability distributions, and the Bernoulli, Binomial, Hypergeometric, Geometric, Poisson, and Negative Binomial distributions. So without wasting any time, let's get started.
Here are the links to the last two articles:
Randomness:
Randomness is everywhere. Imagine you are on a beach, watching the waves roll in, and you spot a beautiful shell that is quite different from the others. Fascinated, you start looking for more like it. The outcome is unpredictable: you may or may not find another one. If you had been to this beach before, you might have spotted the special shell already, and that could change your search strategy. One thing is for sure, though: randomness can be quantified. Our intuition about it is often poor, and the gambler's fallacy is a well-known example. That is why it is important to learn formal ways of quantifying randomness, reasoning about it, and generating realistic random patterns.
Probability:
As you now know, the human brain is not particularly well suited to the nuances of randomness. Fortunately, there is a fundamental mechanism that deals with them. Through it, randomness goes from unpredictable and variable in the short run to something stable and predictable in the long run. There is even a mathematical theorem for this: the Law of Large Numbers. It relies on independence, meaning that the current outcome does not depend on the previous outcomes.
Let me give you an example. Suppose you are on a beach and you somehow know that there are four types of shells there: Q, R, S, and T. The shells are distributed randomly and in equal quantities. Now you want to count the Q shells. You pick 20 shells at random and find that only 2 of the 20 are Q shells. The fraction, or relative frequency, is one-tenth, but such irregularities in small samples are the nature of randomness. So you carry on picking shells and see that the relative frequency of Q eventually settles near one-fourth. Each time you pick a new shell, it is one of Q, R, S, or T. The long-run relative frequency of an outcome is its probability. In probability, the result you are interested in (for example, picking a Q shell) is called an event, the act of picking a shell is called a trial (here the trials are independent), and the whole process is called an experiment. In short, you need a large enough number of samples for the law of large numbers to do its work and bring the relative frequency close to the actual probability.
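To see the law of large numbers at work on the shell example, here is a small simulation sketch. It assumes the four shell types are equally likely and that every pick is independent; the function name `relative_frequency_of_q` and the fixed seed are my own choices for illustration, not part of the original example.

```python
import random

def relative_frequency_of_q(num_picks, seed=0):
    """Pick `num_picks` shells uniformly at random and return the share of Q shells."""
    rng = random.Random(seed)           # fixed seed so the run is repeatable
    shell_types = ["Q", "R", "S", "T"]  # assumption: four equally common shell types
    picks = [rng.choice(shell_types) for _ in range(num_picks)]
    return picks.count("Q") / num_picks

# Small samples can be far from 0.25; large samples should settle close to it.
for n in (20, 200, 2_000, 200_000):
    freq = relative_frequency_of_q(n)
    print(f"{n:>7} picks -> relative frequency of Q = {freq:.4f}")
```

The exact numbers depend on the seed, but the relative frequency for the largest sample should sit much closer to one-fourth than the one for 20 picks usually does.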
Sample Space, Event, Probability of Event and Tree Diagram:
Imagine you are standing on a beach on a warm afternoon. You would like some refreshment, maybe a cold drink, and conveniently you see a stall that sells ice creams and cold drinks. You would like a cold drink. Unfortunately, the stall is almost sold out, and there are four people standing in the queue, with you last. There are three ice creams and two cold drink bottles left, so the owner decides to sell one item to each customer.
As you can see, there are four people in the queue and a limited number of items to sell, and you wish to buy a cold drink. You have no idea what the other people will go for, so their choices are random events. To calculate your chances, we will draw a tree diagram:
That's the tree diagram for our events. It may look tedious at first, so here is how to read it: H1->Human 1, H2->Human 2, H3->Human 3, C->Cold drink, I->Ice cream, Me->Myself. In this problem, there are 3 ice creams and 2 cold drinks left, and we would like a cold drink.
The last level of the tree shows all the possible combinations of cold drinks and ice creams bought by the group of three customers ahead of you. Together, these combinations form the sample space. Any outcome, or any set of outcomes, is called an event. The sample space contains both the outcomes where you are lucky and the ones where you are unlucky. In every case, the probability assigned to an event lies between 0 and 1.
Quantifying Probability with Tree Diagram:
Now we will see how to calculate probabilities using the tree diagram. We need the probability that a cold drink bottle is still available when it is your turn.
As you can see, in 4 of the 7 events you can still buy a cold drink bottle. The diagram assumes that each customer chooses an ice cream or a cold drink with probability 0.5 each. This rule applies to every node of the tree where both items are still available. For instance, at the first customer there are two branches, each with probability 0.5, summing to 1, and the same holds for the other such nodes. The one exception is the path where the first two customers both take a cold drink: the third customer can then only buy an ice cream, so that branch has probability 1. In a tree diagram, you multiply the probabilities along a path to find the probability of the combined event. So, at last, we get 0.125 for the first 6 events and 0.25 for the last, the one with two cold drinks followed by a forced ice cream. With 3 ice creams and 2 cold drinks, the probability of getting a cold drink is 0.125+0.125+0.125+0.125=0.5.
***Important note-> A tree diagram is practical only for a small number of events, and every node must take part in it. For instance, we cannot draw a tree diagram for the case where a person in the queue changes his/her mind about buying one of the items.
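The tree-diagram calculation can also be reproduced with a short enumeration. This is only a sketch under my own assumptions: the stock is 3 ice creams and 2 cold drinks, three customers are ahead of you, and each customer picks uniformly among the item types still in stock (so a forced choice has probability 1). The helper name `enumerate_paths` is hypothetical.

```python
from fractions import Fraction

def enumerate_paths(num_customers=3, ice_creams=3, cold_drinks=2):
    """Walk the tree and return {path: probability} for the customers ahead of you."""
    paths = {}

    def walk(path, prob, ice_left, cold_left):
        if len(path) == num_customers:
            paths[path] = prob
            return
        options = []
        if cold_left > 0:
            options.append(("C", ice_left, cold_left - 1))
        if ice_left > 0:
            options.append(("I", ice_left - 1, cold_left))
        # Assumption: uniform choice among the item types still available.
        for item, ice, cold in options:
            walk(path + item, prob / len(options), ice, cold)

    walk("", Fraction(1), ice_creams, cold_drinks)
    return paths

paths = enumerate_paths()

# A cold drink is left for you if fewer than two were bought before your turn.
p_cold_drink = sum(p for path, p in paths.items() if path.count("C") < 2)

for path, p in paths.items():
    print(path, float(p))
print("P(cold drink left for me) =", float(p_cold_drink))
```

Multiplying along each path and adding the four favourable paths is exactly what the diagram does by hand; the code just automates the bookkeeping.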
Probability and Sets:
Now we are going to talk about some concepts related to sets, that is, collections of items. They are important for understanding probability and for deriving the calculation rules for probabilities. As we already know, the sample space is the collection of all possible outcomes of a random phenomenon, and an event is a subset of the sample space.
Two events that do not share any outcomes are called disjoint or mutually exclusive. For instance, tossing a coin twice has four possible outcomes, and the events zero heads, one head, and two heads are disjoint. The event that a given event does not happen is called its complement. For example, the complement of zero heads is one or two heads.
You can also have multiple events that together fill up the complete sample space. These are called jointly or collectively exhaustive. If they also don't overlap, they are called disjoint collectively exhaustive. The probabilities of disjoint collectively exhaustive events sum to exactly one; if collectively exhaustive events overlap, the sum of their probabilities is at least one, while each individual probability still lies between zero and one. All of this can be visualized nicely with Venn diagrams, which are combinations of simple geometric shapes that represent sets or parts of sets.
The rectangle depicts the sample space. In the third diagram, the space is split into A and the complement of A. In the second diagram, B does not overlap with A, so the events are disjoint; for example, A could be getting exactly one head and B getting two heads. In the first diagram, the events intersect; A could be getting heads on the second toss and B getting exactly one head.
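The set ideas above (disjoint events, complements, collectively exhaustive events) can be checked directly with Python sets. This is a minimal sketch for two tosses of a fair coin, assuming all four outcomes are equally likely; the names `sample_space` and `prob` are mine.

```python
from itertools import product

# Sample space for two coin tosses: HH, HT, TH, TT (assumed equally likely).
sample_space = {"".join(t) for t in product("HT", repeat=2)}

def prob(event):
    """Probability of an event (a subset of the sample space)."""
    return len(event) / len(sample_space)

zero_heads = {o for o in sample_space if o.count("H") == 0}
one_head = {o for o in sample_space if o.count("H") == 1}
two_heads = {o for o in sample_space if o.count("H") == 2}

# Disjoint: the two events share no outcome.
print("zero_heads and two_heads disjoint:", zero_heads.isdisjoint(two_heads))

# Complement: every outcome in the sample space that is not in the event.
at_least_one_head = sample_space - zero_heads
print("P(zero heads) + P(complement) =", prob(zero_heads) + prob(at_least_one_head))

# Disjoint and collectively exhaustive: the events cover the whole sample space.
covers_everything = (zero_heads | one_head | two_heads) == sample_space
total = prob(zero_heads) + prob(one_head) + prob(two_heads)
print("cover everything:", covers_everything, "| sum of probabilities:", total)
```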
Let's find the probability of the intersection event. When A and B are independent, as they are in this coin example:
P(A intersection B)=P(A ∩ B)=P(A)*P(B).
Now we will talk about the union. The union combines events: it is the event that A occurs, or B occurs, or both. Its probability is the sum of the parts, with special care taken that nothing is counted twice:
P(A ∪ B)=P(A)+P(B)-P(A ∩ B).
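As a quick check of both formulas, here is a sketch using the two-toss example, with A as heads on the second toss and B as exactly one head. It assumes a fair coin with equally likely outcomes. Note that the product rule for the intersection works here only because these two events happen to be independent; the union formula holds in general.

```python
from itertools import product

# Sample space for two coin tosses (assumed equally likely outcomes).
sample_space = {"".join(t) for t in product("HT", repeat=2)}

def prob(event):
    """Probability of an event (a subset of the sample space)."""
    return len(event) / len(sample_space)

A = {o for o in sample_space if o[1] == "H"}        # heads on the second toss
B = {o for o in sample_space if o.count("H") == 1}  # exactly one head

p_intersection = prob(A & B)  # outcomes in both A and B
p_union = prob(A | B)         # outcomes in A, in B, or in both

print("P(A and B) =", p_intersection, "| P(A)*P(B) =", prob(A) * prob(B))
print("P(A or B)  =", p_union, "| P(A)+P(B)-P(A and B) =",
      prob(A) + prob(B) - p_intersection)
```

Counting outcomes directly and applying the formulas give the same values, which is a handy sanity check whenever the sample space is small enough to list.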
That's all for this article. The next article will be quite long; in it, I will discuss joint and marginal probability, conditional probability, and various probability distributions. So until then,
Happy Coding!