Statistics 101: Probability Distribution

Mayfest2023


Definition:

A probability distribution describes how likely each possible value of a random variable is. For events such as the roll of a die, or drawing a king from a pack of cards, it assigns a probability to every possible outcome. Its real significance is that it gives us a probability distribution function, which in turn is used in statistical methods such as hypothesis testing. Before we dive deeper, let’s look at some common examples, uses and types of probability distributions.

Types and properties of a probability distribution:

In statistics, a probability distribution is usually visualized as a plot. The X-axis shows the values the random variable can take, and the Y-axis shows the probability (or, for continuous variables, the probability density) of each value. A distribution is commonly written as X ~ N(µ, σ), where the population mean µ and the standard deviation σ appear inside the brackets; many texts give the variance instead and write N(µ, σ²). The letter N here stands for the Normal distribution, and other distributions have their own symbols and parameters. Broadly, there are two types of probability distributions - the Discrete Probability Distribution and the Continuous Probability Distribution.
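As a minimal sketch of the X ~ N(µ, σ) notation, the snippet below evaluates the density of a normal distribution at a few points using Python's standard library. The parameters (mean 100, standard deviation 15) and the variable names are assumptions chosen for illustration, not values from this article.

```python
from statistics import NormalDist

# Hypothetical parameters: X ~ N(mu=100, sigma=15).
# NormalDist expects the standard deviation (sigma), not the variance.
scores = NormalDist(mu=100, sigma=15)

# X-axis: values of the random variable; Y-axis: density at each value.
xs = [70, 85, 100, 115, 130]
densities = [scores.pdf(x) for x in xs]

for x, d in zip(xs, densities):
    print(f"x = {x}: density = {d:.4f}")
```

The density is highest at the mean and falls off symmetrically on both sides, which is the familiar bell shape.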

Discrete Probability Distribution:

A discrete probability distribution describes the probabilities of a discrete variable. A discrete variable takes separate, countable values, with no possible values in between. For example, a coin toss is a discrete random variable because every flip can only result in a head or a tail. Because the outcomes can be counted, we can list each value and its probability in a table. One thing to keep in mind is that the probabilities of all possible outcomes always add up to one.
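The idea of a probability table can be sketched in a few lines of Python. The example assumes a fair six-sided die (an illustrative choice) and uses exact fractions so the sum-to-one property is visible without rounding error.

```python
from fractions import Fraction

# Probability table for one roll of a die, assumed fair for illustration.
die = {face: Fraction(1, 6) for face in range(1, 7)}

# The probabilities of all possible outcomes add up to one.
total = sum(die.values())

# Probability of an event = sum of the probabilities of its outcomes.
p_even = sum(p for face, p in die.items() if face % 2 == 0)

print(total)   # 1
print(p_even)  # 1/2
```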

The Types of Discrete Probability Distributions:

Discrete probability distributions come in several forms, such as the Bernoulli, Binomial, Poisson and Uniform distributions. Each has a characteristic shape that depends on its parameters. For example, the difference between a Bernoulli and a Binomial distribution is the number of trials considered. While a Bernoulli distribution gives the probability of success or failure in a single trial, a Binomial distribution gives the probability of a given number of successes in a series of independent trials with the same success probability. A Poisson distribution gives the probability of a given number of events occurring in a fixed interval of time, assuming events occur independently at a constant average rate. A discrete Uniform distribution, on the other hand, assigns the same probability to every possible outcome.
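To make the differences concrete, here is a rough sketch of the four probability mass functions written from their textbook formulas. The function names are hypothetical helpers introduced for this example, not part of any library.

```python
from math import comb, exp, factorial

def bernoulli_pmf(k, p):
    """P(X = k) for one trial with success probability p (k is 0 or 1)."""
    return p if k == 1 else 1 - p

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(exactly k events in an interval with average rate lam)."""
    return exp(-lam) * lam**k / factorial(k)

def discrete_uniform_pmf(k, low, high):
    """Every integer from low to high is equally likely."""
    return 1 / (high - low + 1) if low <= k <= high else 0.0

# Probability of exactly 2 heads in 3 fair coin flips.
print(binomial_pmf(2, 3, 0.5))  # 0.375
```

With n = 1, the Binomial formula reduces to the Bernoulli case, which is one way to see that Bernoulli is the single-trial special case.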

Continuous Probability Distribution:

We saw earlier that discrete variables take values with nothing in between. In contrast, continuous variables can take any value within a range, usually because they are measured rather than counted. For example, take temperature, or the height or weight of a person. Since there are infinitely many possible values, the probability of any single exact value is zero, so a continuous probability distribution is described by a density curve and probabilities are assigned to ranges of values. The shape of that curve reflects the distribution's parameters, such as its mean and standard deviation.
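For a continuous variable, probabilities come from ranges rather than single points. The sketch below assumes, purely for illustration, that adult height follows a normal distribution with mean 170 cm and standard deviation 10 cm; `prob_between` is a hypothetical helper name.

```python
from statistics import NormalDist

# Hypothetical model of height in cm (parameters are assumptions).
height = NormalDist(mu=170, sigma=10)

def prob_between(dist, a, b):
    """P(a <= X <= b), computed from the cumulative distribution function."""
    return dist.cdf(b) - dist.cdf(a)

# Probability of falling within one standard deviation of the mean.
p_within_one_sd = prob_between(height, 160, 180)

# A single exact value has zero width, so its probability is zero.
p_exact = prob_between(height, 170, 170)

print(f"{p_within_one_sd:.4f}")  # 0.6827
print(p_exact)                   # 0.0
```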

Types of Continuous Probability Distributions:

Just like discrete probability distributions, continuous probability distributions come in a variety of shapes. The best known is the normal distribution, also called the Gaussian distribution or the “bell curve”, which is symmetric about its mean. Skewed curves are also common among continuous distributions, and they can be left- or right-skewed. Two examples of right-skewed distributions are the lognormal and the gamma distributions. Other continuous distributions include the Uniform, Exponential and Beta distributions.
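One common sign of right skew is a mean that sits above the median, because the long right tail pulls the mean upward. The following sketch checks this for a lognormal distribution, both from the known formulas and from a seeded random sample. The parameters, seed and sample size are arbitrary assumptions for illustration.

```python
import random
from math import exp
from statistics import mean, median

# Hypothetical parameters of the underlying normal distribution.
MU, SIGMA = 0.0, 1.0

# Known closed forms for the lognormal distribution.
theoretical_median = exp(MU)
theoretical_mean = exp(MU + SIGMA**2 / 2)

# Draw a reproducible sample and compare its mean and median.
rng = random.Random(42)
sample = [rng.lognormvariate(MU, SIGMA) for _ in range(10_000)]
sample_mean = mean(sample)
sample_median = median(sample)

print(f"theory: mean = {theoretical_mean:.3f}, median = {theoretical_median:.3f}")
print(f"sample: mean = {sample_mean:.3f}, median = {sample_median:.3f}")
```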

Uses of Probability Distribution: Hypothesis testing

Now that we’ve covered the different types of distributions and their parameters, we can look at how they are used in practical applications of statistics. Hypothesis testing, a cornerstone of drawing conclusions from observations, uses probability distributions to calculate p-values. Each kind of test computes a test statistic and compares it against a matching distribution: t-tests use t-values and the t-distribution, ANOVA uses F-values and the F-distribution, and chi-square tests use chi-square values and the chi-square distribution.
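To show how a distribution turns a test statistic into a p-value, here is a small sketch using a z-test. This is a deliberate simplification: the tests named above use the t, F and chi-square distributions, while a z-test uses the standard normal, which is available in Python's standard library. The function name `two_sided_p_value` is hypothetical.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Probability, under the standard normal, of a statistic at least
    as extreme as z in either direction."""
    return 2 * (1 - NormalDist().cdf(abs(z)))

# A z statistic of 1.96 sits right at the conventional 5% threshold.
p = two_sided_p_value(1.96)
print(f"{p:.3f}")  # 0.050
```

The same logic applies to the other tests: only the reference distribution changes.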

Conclusion:

Probability is an important part of statistical analysis and underpins forecasting and statistical modeling, whether done by hand or with software libraries. It also forms a key module of the Data Science Course and AI course. Skillslash offers a certified work experience at top MNCs right upon course completion. To know more about the program, contact us at www.skillslash.com
