In probability theory, a probability space or a probability triple is a mathematical construct that models a realworld process (or "experiment") consisting of states that occur randomly. A probability space is constructed with a specific kind of situation or experiment in mind. One proposes that each time a situation of that kind arises, the set of possible outcomes is the same and the probabilities are also the same.
A probability space consists of three parts:^{[1]}^{[2]}

A sample space, Ω, which is the set of all possible outcomes.

A set of events \scriptstyle \mathcal{F}, where each event is a set containing zero or more outcomes.

The assignment of probabilities to the events; that is, a function P from events to probabilities.
An outcome is the result of a single execution of the model. Since individual outcomes might be of little practical use, more complex events are used to characterize groups of outcomes. The collection of all such events is a σalgebra \scriptstyle \mathcal{F}. Finally, there is a need to specify each event's likelihood of happening. This is done using the probability measure function, P.
Once the probability space is established, it is assumed that “nature” makes its move and selects a single outcome, ω, from the sample space Ω. All the events in \scriptstyle \mathcal{F} that contain the selected outcome ω (recall that each event is a subset of Ω) are said to “have occurred”. The selection performed by nature is done in such a way that if the experiment were to be repeated an infinite number of times, the relative frequencies of occurrence of each of the events would coincide with the probabilities prescribed by the function P.
The Russian mathematician Andrey Kolmogorov introduced the notion of probability space, together with other axioms of probability, in the 1930s. Nowadays alternative approaches for axiomatization of probability theory exist; see “Algebra of random variables”, for example.
This article is concerned with the mathematics of manipulating probabilities. The article probability interpretations outlines several alternative views of what "probability" means and how it should be interpreted. In addition, there have been attempts to construct theories for quantities that are notionally similar to probabilities but do not obey all their rules; see, for example, Free probability, Fuzzy logic, Possibility theory, Negative probability and Quantum probability.
Contents

Introduction 1

Definition 2

Discrete case 3

General case 4

Nonatomic case 5

Complete probability space 6

Examples 7

Discrete examples 7.1

Example 1 7.1.1

Example 2 7.1.2

Example 3 7.1.3

Nonatomic examples 7.2

Example 4 7.2.1

Example 5 7.2.2

Related concepts 8

Probability distribution 8.1

Random variables 8.2

Defining the events in terms of the sample space 8.3

Conditional probability 8.4

Independence 8.5

Mutual exclusivity 8.6

See also 9

References 10

Bibliography 11

External links 12
Introduction
A probability space is a mathematical triplet (Ω, \scriptstyle \mathcal{F}, P) that presents a model for a particular class of realworld situations. As with other models, its author ultimately defines which elements Ω, \scriptstyle \mathcal{F}, and P will contain.

The sample space Ω is a set of outcomes. An outcome is the result of a single execution of the model. Outcomes may be states of nature, possibilities, experimental results and the like. Every instance of the realworld situation (or run of the experiment) must produce exactly one outcome. If outcomes of different runs of an experiment differ in any way that matters, they are distinct outcomes. Which differences matter depends on the kind of analysis we want to do: This leads to different choices of sample space.

The σalgebra \scriptstyle \mathcal{F} is a collection of all the events (not necessarily elementary) we would like to consider. Here, an "event" is a set of zero or more outcomes, i.e., a subset of the sample space. An event is considered to have "happened" during an experiment when the outcome of the latter is an element of the event. Since the same outcome may be a member of many events, it is possible for many events to have happened given a single outcome. For example, when the trial consists of throwing two dice, the set of all outcomes with a sum of 7 pips may constitute an event, whereas outcomes with an odd number of pips may constitute another event. If the outcome is the element of the elementary event of two pips on the first die and five on the second, then both of the events, "7 pips" and "odd number of pips", are said to have happened.

The probability measure P is a function returning an event's probability. A probability is a real number between zero (impossible events have probability zero, though probabilityzero events are not necessarily impossible) and one (the event happens almost surely, with total certainty). Thus P is a function \scriptstyle P:\ \mathcal{F} \rightarrow [0,1]. The probability measure function must satisfy two simple requirements: First, the probability of a union of two (or countably many) disjoint events must be equal to the sum of probabilities of each of these events. For example, if two events are Heads and Tails, then the probability of HeadsorTails must be equal to the sum of probabilities for Heads and Tails). Second, the probability of the sample space Ω must be equal to 1 (which accounts for the fact that, given an execution of the model, some outcome must occur). In the previous example the probability of the set of outcomes P(Heads,Tails) must be equal to one, because it is entirely certain that the outcome will be either Heads or Tails (the model neglects any other possibility).
Not every subset of the sample space Ω must necessarily be considered an event: some of the subsets are simply not of interest, others cannot be "measured". This is not so obvious in a case like a coin toss. In a different example, one could consider javelin throw lengths, where the events typically are intervals like "between 60 and 65 meters" and unions of such intervals, but not sets like the "irrational numbers between 60 and 65 meters"
Definition
In short, a probability space is a measure space such that the measure of the whole space is equal to one.
The expanded definition is following: a probability space is a triple \scriptstyle (\Omega,\; \mathcal{F},\; P) consisting of:

the sample space Ω — an arbitrary nonempty set,

the σalgebra \scriptstyle \mathcal{F} ⊆ 2^{Ω} (also called σfield) — a set of subsets of Ω, called events, such that:

\scriptstyle \mathcal{F} contains the sample space: \scriptstyle \Omega \in \mathcal{F},

\scriptstyle \mathcal{F} is closed under complements: if A∈\scriptstyle \mathcal{F}, then also (Ω∖A)∈\scriptstyle \mathcal{F},

\scriptstyle \mathcal{F} is closed under countable unions: if A_{i}∈\scriptstyle \mathcal{F} for i=1,2,..., then also (∪_{i}A_{i})∈\scriptstyle \mathcal{F}

The corollary from the previous two properties and intersections: if A_{i}∈\scriptstyle \mathcal{F} for i=1,2,..., then also (∩_{i}A_{i})∈\scriptstyle \mathcal{F}

the probability measure P: \scriptstyle \mathcal{F}→[0,1] — a function on \scriptstyle \mathcal{F} such that:

P is countably additive: if {A_{i}}⊆\scriptstyle \mathcal{F} is a countable collection of pairwise disjoint sets, then P(⊔A_{i}) = ∑P(A_{i}), where “⊔” denotes the disjoint union,

the measure of entire sample space is equal to one: P(Ω) = 1.
Discrete case
Discrete probability theory needs only at most countable sample spaces Ω. Probabilities can be ascribed to points of Ω by the probability mass function p: Ω→[0,1] such that ∑_{ω∈Ω} p(ω) = 1. All subsets of Ω can be treated as events (thus, \scriptstyle \mathcal{F} = 2^{Ω} is the power set). The probability measure takes the simple form
(*) \qquad P(A) = \sum_{\omega\in A} p(\omega) \quad \text{for all } A \subseteq \Omega \, .
The greatest σalgebra \scriptstyle \mathcal{F} = 2^{Ω} describes the complete information. In general, a σalgebra \scriptstyle \mathcal{F} ⊆ 2^{Ω} corresponds to a finite or countable partition Ω = B_{1} ⊔ B_{2} ⊔ ..., the general form of an event A ∈ \scriptstyle \mathcal{F} being A = B_{k1} ⊔ B_{k2} ⊔ ... (here ⊔ means the disjoint union.) See also the examples.
The case p(ω) = 0 is permitted by the definition, but rarely used, since such ω can safely be excluded from the sample space.
General case
If Ω is uncountable, still, it may happen that p(ω) ≠ 0 for some ω; such ω are called atoms. They are an at most countable (maybe empty) set, whose probability is the sum of probabilities of all atoms. If this sum is equal to 1 then all other points can safely be excluded from the sample space, returning us to the discrete case. Otherwise, if the sum of probabilities of all atoms is less than 1 (maybe 0), then the probability space decomposes into a discrete (atomic) part (maybe empty) and a nonatomic part.
Nonatomic case
If p(ω) = 0 for all ω∈Ω (in this case, Ω must be uncountable, because otherwise P(Ω)=1 could not be satisfied), then equation (∗) fails: the probability of a set is not the sum over its elements, as summation is only defined for countable amount of elements. This makes the probability space theory much more technical. A formulation stronger than summation, measure theory is applicable. Initially the probabilities are ascribed to some “generator” sets (see the examples). Then a limiting procedure allows assigning probabilities to sets that are limits of sequences of generator sets, or limits of limits, and so on. All these sets are the σalgebra \scriptstyle \mathcal{F}. For technical details see Carathéodory's extension theorem. Sets belonging to \scriptstyle \mathcal{F} are called measurable. In general they are much more complicated than generator sets, but much better than nonmeasurable sets.
Complete probability space
A probability space \scriptstyle (\Omega,\; \mathcal{F},\; P) is said to be a complete probability space if for all \scriptstyle B\, \in \,\mathcal{F} with \scriptstyle P(B)\,=\;0 and all \scriptstyle A\; \subset \;B one has \scriptstyle A \;\in\; \mathcal{F}. Often, the study of probability spaces is restricted to complete probability spaces.
Examples
Discrete examples
Example 1
If the experiment consists of just one flip of a perfect coin, then the outcomes are either heads or tails: Ω = {H, T}. The σalgebra \scriptstyle \mathcal{F} = 2^{Ω} contains 2² = 4 events, namely: {H} – “heads”, {T} – “tails”, {} – “neither heads nor tails”, and {H,T} – “either heads or tails”. So, \scriptstyle \mathcal{F} = . There is a fifty percent chance of tossing heads, and fifty percent for tails. Thus the probability measure in this example is P({}) = 0, P({H}) = 0.5, P({T}) = 0.5, P({H,T}) = 1.
Example 2
The fair coin is tossed three times. There are 8 possible outcomes: Ω = {HHH, HHT, HTH, HTT, THH, THT, TTH, TTT} (here “HTH” for example means that first time the coin landed heads, the second time tails, and the last time heads again). The complete information is described by the σalgebra \scriptstyle \mathcal{F} = 2^{Ω} of 2^{8} = 256 events, where each of the events is a subset of Ω.
Alice knows the outcome of the second toss only. Thus her incomplete information is described by the partition Ω = A_{1} ⊔ A_{2} = {HHH, HHT, THH, THT} ⊔ {HTH, HTT, TTH, TTT}, and the corresponding σalgebra \scriptstyle \mathcal{F}_{Alice} = {{}, A_{1}, A_{2}, Ω}. Brian knows only the total number of tails. His partition contains four parts: Ω = B_{0} ⊔ B_{1} ⊔ B_{2} ⊔ B_{3} = {HHH} ⊔ {HHT, HTH, THH} ⊔ {TTH, THT, HTT} ⊔ {TTT}; accordingly, his σalgebra \scriptstyle \mathcal{F}_{Brian} contains 2^{4} = 16 events.
The two σalgebras are incomparable: neither \scriptstyle \mathcal{F}_{Alice} ⊆ \scriptstyle \mathcal{F}_{Brian} nor \scriptstyle \mathcal{F}_{Brian} ⊆ \scriptstyle \mathcal{F}_{Alice}; both are subσalgebras of 2^{Ω}.
Example 3
If 100 voters are to be drawn randomly from among all voters in California and asked whom they will vote for governor, then the set of all sequences of 100 Californian voters would be the sample space Ω. We assume that sampling without replacement is used: only sequences of 100 different voters are allowed. For simplicity an ordered sample is considered, that is a sequence {Alice, Brian} is different from {Brian, Alice}. We also take for granted that each potential voter knows exactly his future choice, that is he/she doesn’t choose randomly.
Alice knows only whether or not Arnold Schwarzenegger has received at least 60 votes. Her incomplete information is described by the σalgebra \scriptstyle \mathcal{F}_{Alice} that contains: (1) the set of all sequences in Ω where at least 60 people vote for Schwarzenegger; (2) the set of all sequences where fewer than 60 vote for Schwarzenegger; (3) the whole sample space Ω; and (4) the empty set ∅.
Brian knows the exact number of voters who are going to vote for Schwarzenegger. His incomplete information is described by the corresponding partition Ω = B_{0} ⊔ B_{1} ... ⊔ B_{100} (though some of these sets may be empty, depending on the Californian voters...) and the σalgebra \scriptstyle \mathcal{F}_{Brian} consists of 2^{101} events.
In this case Alice’s σalgebra is a subset of Brian’s: \scriptstyle \mathcal{F}_{Alice} ⊂ \scriptstyle \mathcal{F}_{Brian}. The Brian’s σalgebra is in turn the subset of the much larger “complete information” σalgebra 2^{Ω} consisting of 2^{n(n−1)...(n−99)} events, where n is the number of all potential voters in California.
Nonatomic examples
Example 4
A number between 0 and 1 is chosen at random, uniformly. Here Ω = [0,1], \scriptstyle \mathcal{F} is the σalgebra of Borel sets on Ω, and P is the Lebesgue measure on [0,1].
In this case the open intervals of the form (a,b), where 0 < a < b < 1, could be taken as the generator sets. Each such set can be ascribed the probability of P((a,b)) = (b − a), which generates the Lebesgue measure on [0,1], and the Borel σalgebra on Ω.
Example 5
A fair coin is tossed endlessly. Here one can take Ω = {0,1}^{∞}, the set of all infinite sequences of numbers 0 and 1. Cylinder sets {(x_{1}, x_{2}, ...) ∈ Ω : x_{1} = a_{1}, ..., x_{n} = a_{n}} may be used as the generator sets. Each such set describes an event in which the first n tosses have resulted in a fixed sequence (a_{1}, ..., a_{n}), and the rest of the sequence may be arbitrary. Each such event can be naturally given the probability of 2^{−n}.
These two nonatomic examples are closely related: a sequence (x_{1},x_{2},...) ∈ {0,1}^{∞} leads to the number 2^{−1}x_{1} + 2^{−2}x_{2} + ... ∈ [0,1]. This is not a onetoone correspondence between {0,1}^{∞} and [0,1] however: it is an isomorphism modulo zero, which allows for treating the two probability spaces as two forms of the same probability space. In fact, all nonpathologic nonatomic probability spaces are the same in this sense. They are socalled standard probability spaces. Basic applications of probability spaces are insensitive to standardness. However, nondiscrete conditioning is easy and natural on standard probability spaces, otherwise it becomes obscure.
Related concepts
Probability distribution
Any probability distribution defines a probability measure.
Random variables
A random variable X is a measurable function X: Ω→S from the sample space Ω to another measurable space S called the state space.
The notation Pr(X∈A) is a commonly used shorthand for P({ω∈Ω: X(ω)∈A}).
Defining the events in terms of the sample space
If Ω is countable we almost always define \scriptstyle \mathcal{F} as the power set of Ω, i.e. \scriptstyle \mathcal{F} = 2^{Ω} which is trivially a σalgebra and the biggest one we can create using Ω. We can therefore omit ℱ and just write (Ω,P) to define the probability space.
On the other hand, if Ω is uncountable and we use \scriptstyle \mathcal{F} = 2^{Ω} we get into trouble defining our probability measure P because \scriptstyle \mathcal{F} is too “large”, i.e. there will often be sets to which it will be impossible to assign a unique measure. In this case, we have to use a smaller σalgebra \scriptstyle \mathcal{F}, for example the Borel algebra of Ω, which is the smallest σalgebra that makes all open sets measurable.
Conditional probability
Kolmogorov’s definition of probability spaces gives rise to the natural concept of conditional probability. Every set A with nonzero probability (that is, P(A) > 0) defines another probability measure

P(B  A) = {P(B \cap A) \over P(A)}
on the space. This is usually pronounced as the “probability of B given A”.
For any event B such that P(B) > 0 the function Q defined by Q(A) = P(AB) for all events A is itself a probability measure.
Independence
Two events, A and B are said to be independent if P(A∩B)=P(A)P(B).
Two random variables, X and Y, are said to be independent if any event defined in terms of X is independent of any event defined in terms of Y. Formally, they generate independent σalgebras, where two σalgebras G and H, which are subsets of F are said to be independent if any element of G is independent of any element of H.
Mutual exclusivity
Two events, A and B are said to be mutually exclusive or disjoint if P(A∩B) = 0. (This is weaker than A∩B = ∅, which is the definition of disjoint for sets).
If A and B are disjoint events, then P(A∪B) = P(A) + P(B). This extends to a (finite or countably infinite) sequence of events. However, the probability of the union of an uncountable set of events is not the sum of their probabilities. For example, if Z is a normally distributed random variable, then P(Z=x) is 0 for any x, but P(Z∈R) = 1.
The event A∩B is referred to as “A and B”, and the event A∪B as “A or B”.
See also
References

^ Loève, Michel. Probability Theory, Vol 1. New York: D. Van Nostrand Company, 1955.

^ Stroock, D. W. (1999). Probability theory: an analytic view. Cambridge University Press.
Bibliography


The first major treatise blending calculus with probability theory, originally in French: Théorie Analytique des Probabilités.


The modern measuretheoretic foundation of probability theory; the original German version (Grundbegriffe der Wahrscheinlichkeitrechnung) appeared in 1933.


An empiricist, Bayesian approach to the foundations of probability theory.


Discrete foundations of probability theory, based on nonstandard analysis and internal set theory. downloadable. http://www.math.princeton.edu/~nelson/books.html

Patrick Billingsley: Probability and Measure, John Wiley and Sons, New York, Toronto, London, 1979.

Henk Tijms (2004) Understanding Probability


A lively introduction to probability theory for the beginner, Cambridge Univ. Press.

David Williams (1991) Probability with martingales


An undergraduate introduction to measuretheoretic probability, Cambridge Univ. Press.

Gut, Allan (2005). Probability: A Graduate Course. Springer.
External links

Sazonov, V.V. (2001), "Probability space", in Hazewinkel, Michiel,

Animation demonstrating probability space of dice

Virtual Laboratories in Probability and Statistics (principal author Kyle Siegrist), especially, Probability Spaces

Citizendium

Complete probability space

Weisstein, Eric W., "Probability space", MathWorld.
This article was sourced from Creative Commons AttributionShareAlike License; additional terms may apply. World Heritage Encyclopedia content is assembled from numerous content providers, Open Access Publishing, and in compliance with The Fair Access to Science and Technology Research Act (FASTR), Wikimedia Foundation, Inc., Public Library of Science, The Encyclopedia of Life, Open Book Publishers (OBP), PubMed, U.S. National Library of Medicine, National Center for Biotechnology Information, U.S. National Library of Medicine, National Institutes of Health (NIH), U.S. Department of Health & Human Services, and USA.gov, which sources content from all federal, state, local, tribal, and territorial government publication portals (.gov, .mil, .edu). Funding for USA.gov and content contributors is made possible from the U.S. Congress, EGovernment Act of 2002.
Crowd sourced content that is contributed to World Heritage Encyclopedia is peer reviewed and edited by our editorial staff to ensure quality scholarly research articles.
By using this site, you agree to the Terms of Use and Privacy Policy. World Heritage Encyclopedia™ is a registered trademark of the World Public Library Association, a nonprofit organization.