Last Updated on 2024-08-05 by Clay
The probability density function (PDF) is a fundamental concept in probability theory and statistics, describing the probability distribution of a random variable across its range of values.
Definition
For a continuous random variable $X$, its probability density function $f(x)$ satisfies the following conditions:
- Non-negativity: The density is never negative, i.e. for all $x$, $f(x) \ge 0$
- Normalization: The integral over the entire domain is 1 (the total probability is 1), i.e.: $\int_{-\infty}^{\infty} f(x)\,dx = 1$
- Interval Probability: The probability that the random variable lies within the interval $[a, b]$ can also be obtained through integration: $P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$
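The three conditions can be checked numerically for a concrete density. The sketch below is only an illustration: the standard normal density, the helper names `normal_pdf` and `integrate`, and the finite range [-10, 10] standing in for the whole real line are all assumptions made for this example, not part of the definition.

```python
import math


def normal_pdf(x, mu=0.0, sigma=1.0):
    """Density of a normal distribution (chosen only as a familiar example)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def integrate(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with the trapezoidal rule."""
    h = (b - a) / n
    total = 0.5 * (f(a) + f(b))
    for i in range(1, n):
        total += f(a + i * h)
    return total * h


# Non-negativity: spot-check the density on a grid from -10 to 10.
non_negative = all(normal_pdf(-10 + 0.1 * i) >= 0 for i in range(201))

# Normalization: [-10, 10] approximates the whole real line here,
# because the tails beyond it are negligible for this density.
total_probability = integrate(normal_pdf, -10.0, 10.0)

# Interval probability: P(-1 <= X <= 1) as the integral of the density.
p_within_one_sigma = integrate(normal_pdf, -1.0, 1.0)

print(non_negative, round(total_probability, 4), round(p_within_one_sigma, 4))
```

For this particular density the interval probability comes out close to 0.68, the familiar "one standard deviation" figure. Note that the density itself is not a probability; only its integral over an interval is.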
Applications
1. Calculating the Expected Value of a Random Variable
The expected value of a random variable is the weighted average of its values, with the weights being the probabilities of each value. This is called the weighted average rather than just the average because the probabilities of the variable's values are not necessarily equal.
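Written as formulas, the weighted average takes the following standard forms. The symbols are notation introduced here for illustration: $x_i$ and $p_i$ are the values and probabilities of a discrete random variable, and $f$ is the density of a continuous one, which plays the role of the weights.

$$E[X] = \sum_{i} x_i \, p_i \quad \text{(discrete)}, \qquad E[X] = \int_{-\infty}^{\infty} x \, f(x) \, dx \quad \text{(continuous)}$$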
For example, if we consider a fair six-sided die, the probability of each side appearing is equal. In this case, the expected value is the arithmetic mean of the variable's values. Specifically: $E[X] = \frac{1 + 2 + 3 + 4 + 5 + 6}{6} = 3.5$
However, consider a pond stocked with different types of fish, each worth a different amount. The expected value of a single catch weights the value of each type of fish by the probability of catching that type.
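- Number of carp: $n_c$, each worth $v_c$ currency units
- Number of goldfish: $n_g$, each worth $v_g$ currency units
- Number of other fish: $n_o$, each worth $v_o$ currency units
The total number of fish is $n_c + n_g + n_o = 100$. We can then calculate the probability of catching each type of fish:
- Probability of catching a carp: $P(\text{carp}) = n_c / 100$
- Probability of catching a goldfish: $P(\text{goldfish}) = n_g / 100$
- Probability of catching other fish: $P(\text{other}) = n_o / 100$
Expected value: $E[V] = v_c \, P(\text{carp}) + v_g \, P(\text{goldfish}) + v_o \, P(\text{other}) = 42$
Thus, the expected value of catching a fish is 42 currency units.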
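Both expected-value examples can be reproduced with a few lines of Python. This is a minimal sketch: the helper `expected_value` is a name made up for this example, and the fish counts and values in `pond` are hypothetical figures, chosen only so that they total 100 fish and give an expected value of 42. They should not be read as the actual numbers of the example above.

```python
from fractions import Fraction


def expected_value(pairs):
    """Return the sum of value * probability over (value, probability) pairs."""
    pairs = list(pairs)
    assert sum(p for _, p in pairs) == 1, "probabilities must sum to 1"
    return sum(v * p for v, p in pairs)


# Fair die: every face has the same probability, 1/6,
# so the weighted average reduces to the arithmetic mean.
die = [(face, Fraction(1, 6)) for face in range(1, 7)]
die_mean = expected_value(die)

# HYPOTHETICAL pond: name -> (count, value in currency units).
# These figures are assumptions for illustration only.
pond = {"carp": (30, 50), "goldfish": (50, 30), "other": (20, 60)}
total_fish = sum(count for count, _ in pond.values())

# Probability of each type = its count divided by the total count.
catch = [(value, Fraction(count, total_fish)) for count, value in pond.values()]
catch_mean = expected_value(catch)

print(float(die_mean), total_fish, float(catch_mean))
```

Using `Fraction` keeps the probabilities exact, so the check that they sum to 1 does not fail because of floating-point rounding.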
2. Calculating the Variance of a Random Variable
The variance measures how widely the values of a random variable are dispersed around the mean. Simply averaging the deviations from the mean does not work, because positive and negative deviations cancel each other out. Taking absolute values avoids the cancellation but is less convenient for integration and differentiation. Squaring the deviations is more convenient and also amplifies the effect of larger deviations.
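In the usual notation, with $\mu = E[X]$, the variance is $\mathrm{Var}(X) = E[(X - \mu)^2]$, which for a continuous random variable with density $f$ becomes $\int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$. The short sketch below compares the three ways of measuring deviation discussed above, reusing the fair die as an assumed example; the variable names are illustrative only.

```python
from fractions import Fraction

faces = range(1, 7)
p = Fraction(1, 6)  # probability of each face of a fair die

# Mean of the die.
mean = sum(x * p for x in faces)

# Signed deviations cancel out, so their weighted average is always zero.
mean_signed_deviation = sum((x - mean) * p for x in faces)

# Absolute deviations do not cancel, but |x| is awkward to differentiate.
mean_absolute_deviation = sum(abs(x - mean) * p for x in faces)

# Squared deviations: this weighted average is the variance.
variance = sum((x - mean) ** 2 * p for x in faces)

print(mean, mean_signed_deviation, mean_absolute_deviation, variance)
```

The signed deviations average to exactly zero, which is why they cannot serve as a measure of spread, while the squared deviations give a variance of 35/12 for the die.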