## Types of probability density functions

There are many PDFs outlined in the statistical literature that often represent particular real situations. The choice of a particular type of PDF depends, at least in part, on the domain of the function (e.g., can it have both positive and negative values, or only non-negative values), the range of the function (e.g., is the range narrow or does it cover orders of magnitude), the shape (e.g., symmetry), and the processes that generated the data (e.g., additive, multiplicative). These considerations are elaborated below in a brief discussion of commonly used distributions of practical importance. Examples of such functions and the situations they represent are⁴:

⁴ Methods for developing distributions based upon statistical analysis of data are described and illustrated further by Cullen and Frey (1999). Other useful references include Hahn and Shapiro (1967), Ang and Tang (1975), D'Agostino and Stephens (1986), Morgan and Henrion (1990), and U.S. EPA (1996, 1997, 1999). Some examples of probabilistic analyses applied to emission inventories are given by Frey and Zheng (2002) and Frey and Zhao (2004).

• The normal distribution is most appropriate when the range of uncertainty is small and symmetric relative to the mean. The normal distribution arises in situations where many individual inputs contribute to an overall uncertainty, and in which none of the individual uncertainties dominates the total uncertainty. Similarly, if an inventory is the sum of many individual categories, none of which dominates the total uncertainty, then the overall uncertainty is likely to be normal. A normality assumption is often appropriate for categories for which the relative range of uncertainty is small, e.g., fossil fuel emission factors and activity data.

• The lognormal distribution may be appropriate when uncertainties are large for a non-negative variable and known to be positively skewed. The emission factor for nitrous oxide from fertiliser applied to soil provides a typical inventory example. If many uncertain variables are multiplied, the product asymptotically approaches lognormality. Because concentrations are the result of mixing processes, which are in turn multiplicative, concentration data tend to be approximately lognormally distributed. However, real-world data may not be as tail-heavy as a lognormal distribution. The Weibull and Gamma distributions have approximately similar properties to the lognormal but are less tail-heavy and, therefore, are sometimes a better fit to data than the lognormal.

• The uniform distribution describes an equal likelihood of obtaining any value within a range. Sometimes the uniform distribution is useful for representing physically bounded quantities (e.g., a fraction that must vary between 0 and 1) or for representing expert judgement when an expert is able to specify an upper and lower bound. The uniform distribution is a special case of the Beta distribution.

• The triangular distribution is appropriate where upper and lower limits and a preferred value are provided by experts but there is no other information about the PDF. The triangular distribution can be asymmetrical.

• The fractile distribution is a type of empirical distribution in which judgements are made regarding the relative likelihood of different ranges of values for a variable, as illustrated in Figure 3.5. This type of distribution is sometimes useful in representing expert judgement regarding uncertainty.
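The distributions listed above can be explored numerically. The following is a minimal sketch using only the Python standard library, not a prescribed method: all parameter values, the 30-factor experiment, and the helper names `sample_fractile` and `skewness` are illustrative assumptions. The fractile distribution is assumed here to be piecewise uniform within each judged range. The second part of the sketch checks the additive and multiplicative arguments given for the normal and lognormal distributions.

```python
import math
import random
import statistics


def sample_fractile(rng, edges, probs):
    """Draw one value from a fractile distribution (hypothetical helper).

    edges: increasing range boundaries, one more than len(probs).
    probs: judged probability of each range; must sum to 1.
    Values are assumed uniform within each range.
    """
    if len(edges) != len(probs) + 1:
        raise ValueError("need one more edge than probabilities")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("range probabilities must sum to 1")
    # Pick a range according to its probability, then a uniform value in it.
    i = rng.choices(range(len(probs)), weights=probs)[0]
    return rng.uniform(edges[i], edges[i + 1])


def skewness(xs):
    """Sample skewness (third standardised moment); 0 means symmetric."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return sum(((x - m) / s) ** 3 for x in xs) / len(xs)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
n = 20_000

# Illustrative parameter choices only; they do not come from any inventory.
samples = {
    "normal": [rng.gauss(1.0, 0.05) for _ in range(n)],
    "lognormal": [rng.lognormvariate(0.0, 0.8) for _ in range(n)],
    "uniform": [rng.uniform(0.0, 1.0) for _ in range(n)],
    # random.triangular takes (low, high, mode); the mode may be off-centre.
    "triangular": [rng.triangular(0.5, 2.0, 0.8) for _ in range(n)],
    "fractile": [
        sample_fractile(rng, [0, 1, 2, 4], [0.25, 0.5, 0.25]) for _ in range(n)
    ],
}

for name, xs in samples.items():
    print(f"{name:10s} mean={statistics.fmean(xs):.3f} skew={skewness(xs):+.2f}")


def draw_factors():
    """30 independent inputs, each uniform on [0.5, 1.5] (arbitrary choice)."""
    return [rng.uniform(0.5, 1.5) for _ in range(30)]


# Additive combination: the sum of many inputs tends towards a normal shape.
sums = [sum(draw_factors()) for _ in range(n)]
# Multiplicative combination: the product tends towards a lognormal shape,
# so its logarithm should look roughly symmetric.
products = [math.prod(draw_factors()) for _ in range(n)]
log_products = [math.log(p) for p in products]

print(f"skew of sums          = {skewness(sums):+.2f}")
print(f"skew of products      = {skewness(products):+.2f}")
print(f"skew of log(products) = {skewness(log_products):+.2f}")
```

With these assumed inputs, the sums and the logarithms of the products should show skewness near zero, while the products themselves should be strongly positively skewed, which is the pattern the normal and lognormal bullets describe.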

Figure 3.5 Examples of some commonly used probability density function models

[Figure: five panels, (a) Uniform, (b) Triangle, (c) Fractile, (d) Normal, (e) Lognormal; horizontal axis in each panel: Value of Variable.]
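For reference, the density functions behind the five shapes in Figure 3.5 can be written in their standard textbook forms. The symbols are introduced here for illustration and are not defined elsewhere in this section: $a$ and $b$ are lower and upper bounds, $c$ is the preferred (most likely) value with $a < c < b$, $\mu$ and $\sigma$ are the mean and standard deviation, and $\mu_{\ln}$ and $\sigma_{\ln}$ are the mean and standard deviation of $\ln x$. The fractile form assumes a constant density within each judged range $[x_{i-1}, x_i)$ that carries probability $p_i$.

$$
\begin{aligned}
\text{Uniform:}\quad & f(x) = \frac{1}{b-a}, && a \le x \le b \\
\text{Triangular:}\quad & f(x) =
\begin{cases}
\dfrac{2(x-a)}{(b-a)(c-a)}, & a \le x \le c \\
\dfrac{2(b-x)}{(b-a)(b-c)}, & c < x \le b
\end{cases} \\
\text{Fractile:}\quad & f(x) = \frac{p_i}{x_i - x_{i-1}}, && x_{i-1} \le x < x_i,\quad \sum_i p_i = 1 \\
\text{Normal:}\quad & f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right), && -\infty < x < \infty \\
\text{Lognormal:}\quad & f(x) = \frac{1}{x\,\sigma_{\ln}\sqrt{2\pi}} \exp\!\left(-\frac{(\ln x-\mu_{\ln})^2}{2\sigma_{\ln}^2}\right), && x > 0
\end{aligned}
$$

Each density is zero outside the stated interval and integrates to 1 over it.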