{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "dbe7f4b6",
   "metadata": {},
   "source": [
    "\n",
    "<a id='odu-v3'></a>\n",
    "<div id=\"qe-notebook-header\" align=\"right\" style=\"text-align:right;\">\n",
    "        <a href=\"https://quantecon.org/\" title=\"quantecon.org\">\n",
    "                <img style=\"width:250px;display:inline;\" width=\"250px\" src=\"https://assets.quantecon.org/img/qe-menubar-logo.svg\" alt=\"QuantEcon\">\n",
    "        </a>\n",
    "</div>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "38897c2f",
   "metadata": {},
   "source": [
    "# Exchangeability and Bayesian Updating"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "69c977ab",
   "metadata": {},
   "source": [
    "## Contents\n",
    "\n",
    "- [Exchangeability and Bayesian Updating](#Exchangeability-and-Bayesian-Updating)  \n",
    "  - [Overview](#Overview)  \n",
    "  - [Independently and Identically Distributed](#Independently-and-Identically-Distributed)  \n",
    "  - [A Setting in Which Past Observations Are Informative](#A-Setting-in-Which-Past-Observations-Are-Informative)  \n",
    "  - [Relationship Between IID and Exchangeable](#Relationship-Between-IID-and-Exchangeable)  \n",
    "  - [Exchangeability](#Exchangeability)  \n",
    "  - [Bayes’ Law](#Bayes’-Law)  \n",
    "  - [More Details about Bayesian Updating](#More-Details-about-Bayesian-Updating)  \n",
    "  - [Appendix](#Appendix)  \n",
    "  - [Sequels](#Sequels)  "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d6af650e",
   "metadata": {},
   "source": [
    "## Overview\n",
    "\n",
    "This lecture studies  learning\n",
    "via Bayes’ Law.\n",
    "\n",
    "We touch foundations of Bayesian statistical inference invented by Bruno DeFinetti [[de Finetti, 1937](https://python.quantecon.org/zreferences.html#id47)].\n",
    "\n",
    "The relevance of DeFinetti’s work for economists is presented forcefully  by David Kreps\n",
    "in chapter 11 of [[Kreps, 1988](https://python.quantecon.org/zreferences.html#id111)].\n",
    "\n",
    "An example  that we study in this lecture  is a key component of [this lecture](https://python.quantecon.org/odu.html) that augments the\n",
    "[classic](https://python.quantecon.org/mccall_model.html)  job search model of McCall\n",
    "[[McCall, 1970](https://python.quantecon.org/zreferences.html#id195)] by presenting an unemployed worker with a statistical inference problem.\n",
    "\n",
    "Here we create  graphs that illustrate the role that  a  likelihood ratio\n",
    "plays in  Bayes’ Law.\n",
    "\n",
    "We’ll use such graphs to provide insights into  mechanics driving outcomes in [this lecture](https://python.quantecon.org/odu.html) about learning in an augmented McCall job\n",
    "search model.\n",
    "\n",
    "Among other things, this lecture discusses  connections between the statistical concepts of sequences of random variables that are\n",
    "\n",
    "- independently and identically distributed  \n",
    "- exchangeable (also known as *conditionally* independently and identically distributed)  \n",
    "\n",
    "\n",
    "Understanding these concepts is essential for appreciating how Bayesian updating\n",
    "works.\n",
    "\n",
    "You can read about exchangeability [here](https://en.wikipedia.org/wiki/Exchangeable_random_variables).\n",
    "\n",
    "Because another term for **exchangeable** is **conditionally independent**,  we want   to convey an answer to the question *conditional on what?*\n",
    "\n",
    "We also tell why  an assumption of independence precludes  learning while\n",
    "an assumption of conditional independence makes learning possible.\n",
    "\n",
    "Below, we’ll often use\n",
    "\n",
    "- $ W $ to denote a random variable  \n",
    "- $ w $ to denote a particular realization of a random variable $ W $  \n",
    "\n",
    "\n",
    "Let’s start with some imports:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "48fede3e",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "import matplotlib.pyplot as plt\n",
    "plt.rcParams[\"figure.figsize\"] = (11, 5)  #set default figure size\n",
    "from numba import jit, vectorize\n",
    "from math import gamma\n",
    "import scipy.optimize as op\n",
    "from scipy.integrate import quad\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4e51fb62",
   "metadata": {},
   "source": [
    "## Independently and Identically Distributed\n",
    "\n",
    "We begin by looking at the notion of an  **independently and identically  distributed sequence** of random variables.\n",
    "\n",
    "An independently and identically distributed sequence is often abbreviated as IID.\n",
    "\n",
    "Two notions are involved\n",
    "\n",
    "- **independence**  \n",
    "- **identically distributed**  \n",
    "\n",
    "\n",
    "A sequence $ W_0, W_1, \\ldots $ is **independently distributed** if the joint probability density\n",
    "of the sequence is the **product** of the densities of the  components of the sequence.\n",
    "\n",
    "The sequence $ W_0, W_1, \\ldots $ is **independently and identically distributed** (IID) if in addition the marginal\n",
    "density of $ W_t $ is the same for all $ t =0, 1, \\ldots $.\n",
    "\n",
    "For example,  let $ p(W_0, W_1, \\ldots) $ be the **joint density** of the sequence and\n",
    "let $ p(W_t) $ be the **marginal density** for a particular $ W_t $ for all $ t =0, 1, \\ldots $.\n",
    "\n",
    "Then the joint density of the sequence $ W_0, W_1, \\ldots $ is IID if\n",
    "\n",
    "$$\n",
    "p(W_0, W_1, \\ldots) =  p(W_0) p(W_1) \\cdots\n",
    "$$\n",
    "\n",
    "so that the joint density is the product of a sequence of identical marginal densities."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d78458d6",
   "metadata": {},
   "source": [
    "### IID Means Past Observations Don’t Tell Us Anything About Future Observations\n",
    "\n",
    "If a sequence is random variables is IID, past information provides no information about future realizations.\n",
    "\n",
    "Therefore, there is **nothing to learn** from the past  about the future.\n",
    "\n",
    "To understand these statements, let the joint distribution of a sequence of random variables $ \\{W_t\\}_{t=0}^T $\n",
    "that is not necessarily IID be\n",
    "\n",
    "$$\n",
    "p(W_T, W_{T-1}, \\ldots, W_1, W_0)\n",
    "$$\n",
    "\n",
    "Using the laws of probability, we can always factor such a joint density into a product of conditional densities:\n",
    "\n",
    "$$\n",
    "\\begin{aligned}\n",
    "  p(W_T, W_{T-1}, \\ldots, W_1, W_0)    = & p(W_T | W_{T-1}, \\ldots, W_0) p(W_{T-1} | W_{T-2}, \\ldots, W_0) \\cdots  \\cr\n",
    "  & \\quad \\quad \\cdots p(W_1 | W_0) p(W_0)\n",
    "\\end{aligned}\n",
    "$$\n",
    "\n",
    "In general,\n",
    "\n",
    "$$\n",
    "p(W_t | W_{t-1}, \\ldots, W_0)   \\neq   p(W_t)\n",
    "$$\n",
    "\n",
    "which states that the **conditional density** on the left side does not equal the **marginal density** on the right side.\n",
    "\n",
    "But in the special IID case,\n",
    "\n",
    "$$\n",
    "p(W_t | W_{t-1}, \\ldots, W_0)   =  p(W_t) ,\n",
    "$$\n",
    "\n",
    "so that the  partial history $ W_{t-1}, \\ldots, W_0 $ contains no information about the probability of $ W_t $.\n",
    "\n",
    "So in the IID case, there is **nothing to learn** about the densities of future random variables from past random variables.\n",
    "\n",
    "But when the sequence is not IID, there is something to learn about the future from observations of past random variables.\n",
    "\n",
    "We turn next to an instance of the general case in which the sequence is not IID.\n",
    "\n",
    "Please watch for what can be  learned from the past and when."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6f6e4a6d",
   "metadata": {},
   "source": [
    "## A Setting in Which Past Observations Are Informative\n",
    "\n",
    "Let $ \\{W_t\\}_{t=0}^\\infty $ be a sequence of nonnegative\n",
    "scalar random variables with a joint probability distribution\n",
    "constructed as follows.\n",
    "\n",
    "There are two distinct cumulative distribution functions $ F $ and $ G $ that have  densities $ f $ and $ g $, respectively,  for a nonnegative scalar random\n",
    "variable $ W $.\n",
    "\n",
    "Before the start of time, say at time $ t= -1 $, “nature” once and for\n",
    "all selects **either** $ f $ **or** $ g $.\n",
    "\n",
    "Thereafter at each time\n",
    "$ t \\geq 0 $, nature  draws a random variable $ W_t $ from the selected\n",
    "distribution.\n",
    "\n",
    "So  the data are permanently generated as independently and identically distributed (IID) draws from **either** $ F $ **or**\n",
    "$ G $.\n",
    "\n",
    "We could say that *objectively*, meaning *after* nature has chosen either $ F $ or $ G $, the probability that the data are generated as draws from $ F $ is either $ 0 $\n",
    "or $ 1 $.\n",
    "\n",
    "We now drop into this setting a partially informed decision maker who\n",
    "\n",
    "- knows both $ F $ and $ G $, but  \n",
    "- does not know whether  at $ t = -1 $ nature had drawn   $ F $ or whether nature had drawn   $ G $ once-and-for-all  \n",
    "\n",
    "\n",
    "Thus, although our decision maker knows $ F $ and knows $ G $, he does not know which of these two known distributions nature had selected to draw from.\n",
    "\n",
    "The decision maker describes his ignorance with a **subjective probability**\n",
    "$ \\tilde \\pi $ and reasons as if  nature had selected $ F $ with probability\n",
    "$ \\tilde \\pi \\in (0,1) $ and\n",
    "$ G $ with probability $ 1 - \\tilde \\pi $.\n",
    "\n",
    "Thus, we  assume that the decision maker\n",
    "\n",
    "- **knows** both $ F $ and $ G $  \n",
    "- **doesn’t know** which of these two distributions that nature has drawn  \n",
    "- expresses  his ignorance by **acting as if** or **thinking that** nature chose distribution $ F $ with probability $ \\tilde \\pi \\in (0,1) $ and distribution\n",
    "  $ G $ with probability $ 1 - \\tilde \\pi $  \n",
    "- at date $ t \\geq 0 $ knows  the partial history $ w_t, w_{t-1}, \\ldots, w_0 $  \n",
    "\n",
    "\n",
    "To proceed, we want to know the decision maker’s belief about the joint distribution of the partial history.\n",
    "\n",
    "We’ll discuss that next and in the process describe the concept of **exchangeability**."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8d1a0d1b",
   "metadata": {},
   "source": [
    "## Relationship Between IID and Exchangeable\n",
    "\n",
    "Conditional on nature selecting $ F $, the joint density of the\n",
    "sequence $ W_0, W_1, \\ldots $ is\n",
    "\n",
    "$$\n",
    "f(W_0) f(W_1) \\cdots\n",
    "$$\n",
    "\n",
    "Conditional on nature selecting $ G $, the joint density of the\n",
    "sequence $ W_0, W_1, \\ldots $ is\n",
    "\n",
    "$$\n",
    "g(W_0) g(W_1) \\cdots\n",
    "$$\n",
    "\n",
    "Thus,  **conditional on nature having selected** $ F $, the\n",
    "sequence $ W_0, W_1, \\ldots $ is independently and\n",
    "identically distributed.\n",
    "\n",
    "Furthermore,  **conditional on nature having\n",
    "selected** $ G $, the sequence $ W_0, W_1, \\ldots $ is also\n",
    "independently and identically distributed.\n",
    "\n",
    "But what about the **unconditional distribution** of a partial history?\n",
    "\n",
    "The unconditional distribution of $ W_0, W_1, \\ldots $ is\n",
    "evidently\n",
    "\n",
    "\n",
    "<a id='equation-eq-definetti'></a>\n",
    "$$\n",
    "h(W_0, W_1, \\ldots ) \\equiv \\tilde \\pi [f(W_0) f(W_1) \\cdots \\ ] + ( 1- \\tilde \\pi) [g(W_0) g(W_1) \\cdots \\ ] \\tag{24.1}\n",
    "$$\n",
    "\n",
    "Under the unconditional distribution $ h(W_0, W_1, \\ldots ) $, the\n",
    "sequence $ W_0, W_1, \\ldots $ is **not** independently and\n",
    "identically distributed.\n",
    "\n",
    "To verify this claim, it is sufficient to notice, for example, that\n",
    "\n",
    "$$\n",
    "h(W_0, W_1) = \\tilde \\pi f(W_0)f (W_1) + (1 - \\tilde \\pi) g(W_0)g(W_1) \\neq\n",
    "              (\\tilde \\pi f(W_0) + (1-\\tilde \\pi) g(W_0))(\n",
    "               \\tilde \\pi f(W_1) + (1-\\tilde \\pi) g(W_1))\n",
    "$$\n",
    "\n",
    "Thus, the conditional distribution\n",
    "\n",
    "$$\n",
    "h(W_1 | W_0) \\equiv \\frac{h(W_0, W_1)}{(\\tilde \\pi f(W_0) + (1-\\tilde \\pi) g(W_0))}\n",
    " \\neq ( \\tilde \\pi f(W_1) + (1-\\tilde \\pi) g(W_1))\n",
    "$$\n",
    "\n",
    "This means that  random variable  $ W_0 $ contains information about random variable  $ W_1 $.\n",
    "\n",
    "So there is something to learn from the past about the future."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ab69a243",
   "metadata": {},
   "source": [
    "## Exchangeability\n",
    "\n",
    "While the sequence $ W_0, W_1, \\ldots $ is not IID, it can be verified that it is\n",
    "**exchangeable**, which means that the   joint distributions $ h(W_0, W_1) $ and $ h(W_1, W_0) $ of the ‘‘re-ordered’’ sequences\n",
    "satisfy\n",
    "\n",
    "$$\n",
    "h(W_0, W_1) = h(W_1, W_0)\n",
    "$$\n",
    "\n",
    "and so on.\n",
    "\n",
    "More generally, a sequence of random variables is said to be **exchangeable** if  the  joint probability distribution\n",
    "for a sequence does not change when the positions in the sequence in which finitely many  of random variables\n",
    "appear are altered.\n",
    "\n",
    "Equation [(24.1)](#equation-eq-definetti) represents our instance of an exchangeable joint density over a sequence of random\n",
    "variables  as a **mixture**  of  two IID joint densities over a sequence of random variables.\n",
    "\n",
    "A Bayesian statistician interprets the mixing parameter $ \\tilde \\pi \\in (0,1) $ as a decision maker’s subjective belief – the decision maker’s  **prior probability**  – that nature had  selected probability distribution $ F $.\n",
    "\n",
    ">**Note**\n",
    ">\n",
    ">DeFinetti [[de Finetti, 1937](https://python.quantecon.org/zreferences.html#id47)] established a related representation of an exchangeable process created by mixing\n",
    "sequences of IID Bernoulli random variables with parameter $ \\theta \\in (0,1) $ and mixing probability density $ \\pi(\\theta) $\n",
    "that a Bayesian statistician would interpret as a prior over the unknown\n",
    "Bernoulli parameter $ \\theta $."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "943bd4d8",
   "metadata": {},
   "source": [
    "## Bayes’ Law\n",
    "\n",
    "We noted above that in our example model there is something to learn about about the future from past data drawn\n",
    "from our particular instance of a process that is exchangeable but not IID.\n",
    "\n",
    "But how can we learn?\n",
    "\n",
    "And about what?\n",
    "\n",
    "The answer to the *about what* question is  $ \\tilde \\pi $.\n",
    "\n",
    "The answer to the *how* question is to use  Bayes’ Law.\n",
    "\n",
    "Another way to say *use Bayes’ Law* is to say *from a (subjective) joint distribution, compute an appropriate conditional distribution*.\n",
    "\n",
    "Let’s dive into Bayes’ Law in this context.\n",
    "\n",
    "Let $ q $ represent the distribution that nature actually draws $ w $\n",
    "from and let\n",
    "\n",
    "$$\n",
    "\\pi = \\mathbb{P}\\{q = f \\}\n",
    "$$\n",
    "\n",
    "where we regard $ \\pi $ as a decision maker’s **subjective probability**  (also called a **personal probability**).\n",
    "\n",
    "Suppose that at $ t \\geq 0 $, the decision maker has  observed a history\n",
    "$ w^t \\equiv [w_t, w_{t-1}, \\ldots, w_0] $.\n",
    "\n",
    "We let\n",
    "\n",
    "$$\n",
    "\\pi_t  = \\mathbb{P}\\{q = f  | w^t \\}\n",
    "$$\n",
    "\n",
    "where we adopt the convention\n",
    "\n",
    "$$\n",
    "\\pi_{-1}  = \\tilde \\pi\n",
    "$$\n",
    "\n",
    "The distribution of $ w_{t+1} $ conditional on $ w^t $ is then\n",
    "\n",
    "$$\n",
    "\\pi_t f + (1 - \\pi_t) g .\n",
    "$$\n",
    "\n",
    "Bayes’ rule for updating $ \\pi_{t+1} $ is\n",
    "\n",
    "\n",
    "<a id='equation-eq-bayes102'></a>\n",
    "$$\n",
    "\\pi_{t+1} = \\frac{\\pi_t f(w_{t+1})}{\\pi_t f(w_{t+1}) + (1 - \\pi_t) g(w_{t+1})} \\tag{24.2}\n",
    "$$\n",
    "\n",
    "Equation [(24.2)](#equation-eq-bayes102) follows from Bayes’ rule, which\n",
    "tells us that\n",
    "\n",
    "$$\n",
    "\\mathbb{P}\\{q = f \\,|\\, W = w\\}\n",
    "= \\frac{\\mathbb{P}\\{W = w \\,|\\, q = f\\}\\mathbb{P}\\{q = f\\}}\n",
    "{\\mathbb{P}\\{W = w\\}}\n",
    "$$\n",
    "\n",
    "where\n",
    "\n",
    "$$\n",
    "\\mathbb{P}\\{W = w\\} = \\sum_{a \\in \\{f, g\\}} \\mathbb{P}\\{W = w \\,|\\, q = a \\} \\mathbb{P}\\{q = a \\}\n",
    "$$"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b10ffebf",
   "metadata": {},
   "source": [
    "## More Details about Bayesian Updating\n",
    "\n",
    "Let’s stare at and rearrange Bayes’ Law as represented in equation [(24.2)](#equation-eq-bayes102) with the aim of understanding\n",
    "how the **posterior** probability $ \\pi_{t+1} $ is influenced by the **prior** probability $ \\pi_t $ and the **likelihood ratio**\n",
    "\n",
    "$$\n",
    "l(w) = \\frac{f(w)}{g(w)}\n",
    "$$\n",
    "\n",
    "It is convenient for us to rewrite the updating rule [(24.2)](#equation-eq-bayes102) as\n",
    "\n",
    "$$\n",
    "\\pi_{t+1}   =\\frac{\\pi_{t}f\\left(w_{t+1}\\right)}{\\pi_{t}f\\left(w_{t+1}\\right)+\\left(1-\\pi_{t}\\right)g\\left(w_{t+1}\\right)}\n",
    "    =\\frac{\\pi_{t}\\frac{f\\left(w_{t+1}\\right)}{g\\left(w_{t+1}\\right)}}{\\pi_{t}\\frac{f\\left(w_{t+1}\\right)}{g\\left(w_{t+1}\\right)}+\\left(1-\\pi_{t}\\right)}\n",
    "    =\\frac{\\pi_{t}l\\left(w_{t+1}\\right)}{\\pi_{t}l\\left(w_{t+1}\\right)+\\left(1-\\pi_{t}\\right)}\n",
    "$$\n",
    "\n",
    "This implies that\n",
    "\n",
    "\n",
    "<a id='equation-eq-bayes103'></a>\n",
    "$$\n",
    "\\frac{\\pi_{t+1}}{\\pi_{t}}=\\frac{l\\left(w_{t+1}\\right)}{\\pi_{t}l\\left(w_{t+1}\\right)+\\left(1-\\pi_{t}\\right)}\\begin{cases} >1 &\n",
    "\\text{if }l\\left(w_{t+1}\\right)>1\\\\\n",
    "\\leq1 & \\text{if }l\\left(w_{t+1}\\right)\\leq1\n",
    "\\end{cases} \\tag{24.3}\n",
    "$$\n",
    "\n",
    "Notice how the likelihood ratio and the prior interact to determine whether an observation $ w_{t+1} $ leads the decision maker\n",
    "to increase or decrease the subjective probability he/she attaches to distribution $ F $.\n",
    "\n",
    "When the likelihood ratio $ l(w_{t+1}) $ exceeds one, the observation $ w_{t+1} $ nudges the probability\n",
    "$ \\pi $ put on distribution $ F $ upward,\n",
    "and when the likelihood ratio $ l(w_{t+1}) $ is less that  one, the observation $ w_{t+1} $ nudges $ \\pi $ downward.\n",
    "\n",
    "Representation [(24.3)](#equation-eq-bayes103) is the foundation of some graphs that we’ll use to display the dynamics of\n",
    "$ \\{\\pi_t\\}_{t=0}^\\infty $ that are  induced by\n",
    "Bayes’ Law.\n",
    "\n",
    "We’ll plot $ l\\left(w\\right) $ as a way to enlighten us about how\n",
    "learning – i.e., Bayesian updating of the probability $ \\pi $ that\n",
    "nature has chosen distribution $ f $ – works.\n",
    "\n",
    "To create the Python infrastructure to do our work for us,  we construct a wrapper function that displays informative graphs\n",
    "given parameters of $ f $ and $ g $."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "ce0799a5",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "@vectorize\n",
    "def p(x, a, b):\n",
    "    \"The general beta distribution function.\"\n",
    "    r = gamma(a + b) / (gamma(a) * gamma(b))\n",
    "    return r * x ** (a-1) * (1 - x) ** (b-1)\n",
    "\n",
    "def learning_example(F_a=1, F_b=1, G_a=3, G_b=1.2):\n",
    "    \"\"\"\n",
    "    A wrapper function that displays the updating rule of belief π,\n",
    "    given the parameters which specify F and G distributions.\n",
    "    \"\"\"\n",
    "\n",
    "    f = jit(lambda x: p(x, F_a, F_b))\n",
    "    g = jit(lambda x: p(x, G_a, G_b))\n",
    "\n",
    "    # l(w) = f(w) / g(w)\n",
    "    l = lambda w: f(w) / g(w)\n",
    "    # objective function for solving l(w) = 1\n",
    "    obj = lambda w: l(w) - 1\n",
    "\n",
    "    x_grid = np.linspace(0, 1, 100)\n",
    "    π_grid = np.linspace(1e-3, 1-1e-3, 100)\n",
    "\n",
    "    w_max = 1\n",
    "    w_grid = np.linspace(1e-12, w_max-1e-12, 100)\n",
    "\n",
    "    # the mode of beta distribution\n",
    "    # use this to divide w into two intervals for root finding\n",
    "    G_mode = (G_a - 1) / (G_a + G_b - 2)\n",
    "    roots = np.empty(2)\n",
    "    roots[0] = op.root_scalar(obj, bracket=[1e-10, G_mode]).root\n",
    "    roots[1] = op.root_scalar(obj, bracket=[G_mode, 1-1e-10]).root\n",
    "\n",
    "    fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(18, 5))\n",
    "\n",
    "    ax1.plot(l(w_grid), w_grid, label='$l$', lw=2)\n",
    "    ax1.vlines(1., 0., 1., linestyle=\"--\")\n",
    "    ax1.hlines(roots, 0., 2., linestyle=\"--\")\n",
    "    ax1.set_xlim([0., 2.])\n",
    "    ax1.legend(loc=4)\n",
    "    ax1.set(xlabel='$l(w)=f(w)/g(w)$', ylabel='$w$')\n",
    "\n",
    "    ax2.plot(f(x_grid), x_grid, label='$f$', lw=2)\n",
    "    ax2.plot(g(x_grid), x_grid, label='$g$', lw=2)\n",
    "    ax2.vlines(1., 0., 1., linestyle=\"--\")\n",
    "    ax2.hlines(roots, 0., 2., linestyle=\"--\")\n",
    "    ax2.legend(loc=4)\n",
    "    ax2.set(xlabel='$f(w), g(w)$', ylabel='$w$')\n",
    "\n",
    "    area1 = quad(f, 0, roots[0])[0]\n",
    "    area2 = quad(g, roots[0], roots[1])[0]\n",
    "    area3 = quad(f, roots[1], 1)[0]\n",
    "\n",
    "    ax2.text((f(0) + f(roots[0])) / 4, roots[0] / 2, f\"{area1: .3g}\")\n",
    "    ax2.fill_between([0, 1], 0, roots[0], color='blue', alpha=0.15)\n",
    "    ax2.text(np.mean(g(roots)) / 2, np.mean(roots), f\"{area2: .3g}\")\n",
    "    w_roots = np.linspace(roots[0], roots[1], 20)\n",
    "    ax2.fill_betweenx(w_roots, 0, g(w_roots), color='orange', alpha=0.15)\n",
    "    ax2.text((f(roots[1]) + f(1)) / 4, (roots[1] + 1) / 2, f\"{area3: .3g}\")\n",
    "    ax2.fill_between([0, 1], roots[1], 1, color='blue', alpha=0.15)\n",
    "\n",
    "    W = np.arange(0.01, 0.99, 0.08)\n",
    "    Π = np.arange(0.01, 0.99, 0.08)\n",
    "\n",
    "    ΔW = np.zeros((len(W), len(Π)))\n",
    "    ΔΠ = np.empty((len(W), len(Π)))\n",
    "    for i, w in enumerate(W):\n",
    "        for j, π in enumerate(Π):\n",
    "            lw = l(w)\n",
    "            ΔΠ[i, j] = π * (lw / (π * lw + 1 - π) - 1)\n",
    "\n",
    "    q = ax3.quiver(Π, W, ΔΠ, ΔW, scale=2, color='r', alpha=0.8)\n",
    "\n",
    "    ax3.fill_between(π_grid, 0, roots[0], color='blue', alpha=0.15)\n",
    "    ax3.fill_between(π_grid, roots[0], roots[1], color='green', alpha=0.15)\n",
    "    ax3.fill_between(π_grid, roots[1], w_max, color='blue', alpha=0.15)\n",
    "    ax3.hlines(roots, 0., 1., linestyle=\"--\")\n",
    "    ax3.set(xlabel='$\\pi$', ylabel='$w$')\n",
    "    ax3.grid()\n",
    "\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d9456469",
   "metadata": {},
   "source": [
    "Now we’ll create a group of graphs that illustrate  dynamics induced by Bayes’ Law.\n",
    "\n",
    "We’ll begin with Python function default values of various objects, then change them in a subsequent example."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "81a24f01",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "learning_example()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8f20274a",
   "metadata": {},
   "source": [
    "Please look at the three graphs above created for an instance in which $ f $ is a uniform distribution on $ [0,1] $\n",
    "(i.e., a Beta distribution with parameters $ F_a=1, F_b=1 $), while  $ g $ is a Beta distribution with the default parameter values $ G_a=3, G_b=1.2 $.\n",
    "\n",
    "The graph on the left  plots the likelihood ratio $ l(w) $ as the absciassa  axis against $ w $ as the ordinate.\n",
    "\n",
    "The middle graph plots both $ f(w) $ and $ g(w) $  against $ w $, with the horizontal dotted lines showing values\n",
    "of $ w $ at which the likelihood ratio equals $ 1 $.\n",
    "\n",
    "The graph on the right plots arrows to the right that show when Bayes’ Law  makes $ \\pi $ increase and arrows\n",
    "to the left that show when Bayes’ Law make $ \\pi $ decrease.\n",
    "\n",
    "Lengths of the arrows  show  magnitudes of the force from Bayes’ Law impelling $ \\pi $ to change.\n",
    "\n",
    "These lengths depend on both the prior probability $ \\pi $ on the abscissa axis and the evidence in the form of the current draw of\n",
    "$ w $ on the ordinate axis.\n",
    "\n",
    "The fractions in the colored areas of the middle graphs are probabilities under $ F $ and $ G $, respectively,\n",
    "that  realizations of $ w $ fall\n",
    "into the interval that updates the belief $ \\pi $ in a correct direction (i.e., toward $ 0 $ when $ G $ is the true\n",
    "distribution, and toward $ 1 $ when $ F $ is the true distribution).\n",
    "\n",
    "For example,\n",
    "in the above  example, under true distribution $ F $,  $ \\pi $ will  be updated toward $ 0 $ if $ w $ falls into the interval\n",
    "$ [0.524, 0.999] $, which occurs with probability $ 1 - .524 = .476 $ under $ F $.\n",
    "\n",
    "But this\n",
    "would occur with probability\n",
    "$ 0.816 $ if $ G $ were the true distribution.\n",
    "\n",
    "The fraction $ 0.816 $\n",
    "in the orange region is the integral of $ g(w) $ over this interval.\n",
    "\n",
    "Next we use our code to create graphs for another instance of our model.\n",
    "\n",
    "We keep $ F $ the same as in the preceding instance, namely a uniform distribution, but now assume that $ G $\n",
    "is a Beta distribution with parameters $ G_a=2, G_b=1.6 $."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5101609e",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "learning_example(G_a=2, G_b=1.6)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3da5955b",
   "metadata": {},
   "source": [
    "Notice how the likelihood ratio, the middle graph, and the arrows compare with the previous instance of our example."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4ea1a4ee",
   "metadata": {},
   "source": [
    "## Appendix"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "43ce8bd8",
   "metadata": {},
   "source": [
    "### Sample Paths of $ \\pi_t $\n",
    "\n",
    "Now we’ll have some fun by plotting multiple realizations of sample paths of $ \\pi_t $ under two possible\n",
    "assumptions about nature’s choice of distribution, namely\n",
    "\n",
    "- that nature permanently draws from $ F $  \n",
    "- that nature permanently draws from $ G $  \n",
    "\n",
    "\n",
    "Outcomes depend on a peculiar property of likelihood ratio processes  discussed in\n",
    "[this lecture](https://python-advanced.quantecon.org/additive_functionals.html).\n",
    "\n",
    "To proceed, we create some Python code."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f0ebb0cf",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "def function_factory(F_a=1, F_b=1, G_a=3, G_b=1.2):\n",
    "\n",
    "    # define f and g\n",
    "    f = jit(lambda x: p(x, F_a, F_b))\n",
    "    g = jit(lambda x: p(x, G_a, G_b))\n",
    "\n",
    "    @jit\n",
    "    def update(a, b, π):\n",
    "        \"Update π by drawing from beta distribution with parameters a and b\"\n",
    "\n",
    "        # Draw\n",
    "        w = np.random.beta(a, b)\n",
    "\n",
    "        # Update belief\n",
    "        π = 1 / (1 + ((1 - π) * g(w)) / (π * f(w)))\n",
    "\n",
    "        return π\n",
    "\n",
    "    @jit\n",
    "    def simulate_path(a, b, T=50):\n",
    "        \"Simulates a path of beliefs π with length T\"\n",
    "\n",
    "        π = np.empty(T+1)\n",
    "\n",
    "        # initial condition\n",
    "        π[0] = 0.5\n",
    "\n",
    "        for t in range(1, T+1):\n",
    "            π[t] = update(a, b, π[t-1])\n",
    "\n",
    "        return π\n",
    "\n",
    "    def simulate(a=1, b=1, T=50, N=200, display=True):\n",
    "        \"Simulates N paths of beliefs π with length T\"\n",
    "\n",
    "        π_paths = np.empty((N, T+1))\n",
    "        if display:\n",
    "            fig = plt.figure()\n",
    "\n",
    "        for i in range(N):\n",
    "            π_paths[i] = simulate_path(a=a, b=b, T=T)\n",
    "            if display:\n",
    "                plt.plot(range(T+1), π_paths[i], color='b', lw=0.8, alpha=0.5)\n",
    "\n",
    "        if display:\n",
    "            plt.show()\n",
    "\n",
    "        return π_paths\n",
    "\n",
    "    return simulate"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7871a0cb",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "simulate = function_factory()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b7fbaa55",
   "metadata": {},
   "source": [
    "We begin by generating $ N $ simulated $ \\{\\pi_t\\} $ paths with $ T $\n",
    "periods when the sequence is truly IID draws from $ F $. We set an initial prior $ \\pi_{-1} = .5 $."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7e0b5d19",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "T = 50"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "cf3037ae",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "# when nature selects F\n",
    "π_paths_F = simulate(a=1, b=1, T=T, N=1000)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "095c1958",
   "metadata": {},
   "source": [
    "In the above example,  for most paths $ \\pi_t \\rightarrow 1 $.\n",
    "\n",
    "So Bayes’ Law evidently eventually\n",
    "discovers the truth for most of our paths.\n",
    "\n",
    "Next, we generate paths with $ T $\n",
    "periods when the sequence is truly IID draws from $ G $. Again, we set the initial prior $ \\pi_{-1} = .5 $."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "bd669fc7",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "# when nature selects G\n",
    "π_paths_G = simulate(a=3, b=1.2, T=T, N=1000)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d75732f2",
   "metadata": {},
   "source": [
    "In the above graph we observe that now  most paths $ \\pi_t \\rightarrow 0 $."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "584efee4",
   "metadata": {},
   "source": [
    "### Rates of convergence\n",
    "\n",
    "We study rates of  convergence of $ \\pi_t $ to $ 1 $ when nature generates the data as IID draws from $ F $\n",
    "and of convergence of $ \\pi_t $ to $ 0 $ when nature generates  IID draws from $ G $.\n",
    "\n",
    "We do this by averaging across simulated paths of $ \\{\\pi_t\\}_{t=0}^T $.\n",
    "\n",
    "Using   $ N $ simulated $ \\pi_t $ paths, we compute\n",
    "$ 1 - \\sum_{i=1}^{N}\\pi_{i,t} $ at each $ t $ when the data are generated as draws from  $ F $\n",
    "and compute $ \\sum_{i=1}^{N}\\pi_{i,t} $ when the data are generated as draws from $ G $."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "d1282448",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "plt.plot(range(T+1), 1 - np.mean(π_paths_F, 0), label='F generates')\n",
    "plt.plot(range(T+1), np.mean(π_paths_G, 0), label='G generates')\n",
    "plt.legend()\n",
    "plt.title(\"convergence\");"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ea8bce2c",
   "metadata": {},
   "source": [
    "From the above graph, rates of convergence appear not to depend on whether $ F $ or $ G $ generates the data."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d3826980",
   "metadata": {},
   "source": [
    "### Graph of Ensemble Dynamics of $ \\pi_t $\n",
    "\n",
    "More insights about the dynamics of $ \\{\\pi_t\\} $ can be gleaned by computing\n",
    "conditional expectations of $ \\frac{\\pi_{t+1}}{\\pi_{t}} $ as functions of $ \\pi_t $ via integration with respect\n",
    "to the pertinent probability distribution:\n",
    "\n",
    "$$\n",
    "\\begin{aligned}\n",
    "E\\left[\\frac{\\pi_{t+1}}{\\pi_{t}}\\biggm|q=a, \\pi_{t}\\right] &=E\\left[\\frac{l\\left(w_{t+1}\\right)}{\\pi_{t}l\\left(w_{t+1}\\right)+\\left(1-\\pi_{t}\\right)}\\biggm|q= a, \\pi_{t}\\right], \\\\\n",
    "    &=\\int_{0}^{1}\\frac{l\\left(w_{t+1}\\right)}{\\pi_{t}l\\left(w_{t+1}\\right)+\\left(1-\\pi_{t}\\right)} a\\left(w_{t+1}\\right)dw_{t+1}\n",
    "\\end{aligned}\n",
    "$$\n",
    "\n",
    "where $ a =f,g $.\n",
    "\n",
    "The following code approximates the integral above:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "0f9e42cc",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "def expected_ratio(F_a=1, F_b=1, G_a=3, G_b=1.2):\n",
    "\n",
    "    # define f and g\n",
    "    f = jit(lambda x: p(x, F_a, F_b))\n",
    "    g = jit(lambda x: p(x, G_a, G_b))\n",
    "\n",
    "    l = lambda w: f(w) / g(w)\n",
    "    integrand_f = lambda w, π: f(w) * l(w) / (π * l(w) + 1 - π)\n",
    "    integrand_g = lambda w, π: g(w) * l(w) / (π * l(w) + 1 - π)\n",
    "\n",
    "    π_grid = np.linspace(0.02, 0.98, 100)\n",
    "\n",
    "    expected_rario = np.empty(len(π_grid))\n",
    "    for q, inte in zip([\"f\", \"g\"], [integrand_f, integrand_g]):\n",
    "        for i, π in enumerate(π_grid):\n",
    "            expected_rario[i]= quad(inte, 0, 1, args=(π,))[0]\n",
    "        plt.plot(π_grid, expected_rario, label=f\"{q} generates\")\n",
    "\n",
    "    plt.hlines(1, 0, 1, linestyle=\"--\")\n",
    "    plt.xlabel(\"$π_t$\")\n",
    "    plt.ylabel(\"$E[\\pi_{t+1}/\\pi_t]$\")\n",
    "    plt.legend()\n",
    "\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f5f513ea",
   "metadata": {},
   "source": [
    "First, consider the case where $ F_a=F_b=1 $ and\n",
    "$ G_a=3, G_b=1.2 $."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "94dbc732",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "expected_ratio()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "40dc1198",
   "metadata": {},
   "source": [
    "The above graphs shows that when $ F $ generates the data, $ \\pi_t $ on average always heads north, while\n",
    "when $ G $ generates the data, $ \\pi_t $ heads south.\n",
    "\n",
    "Next, we’ll look at a degenerate case in whcih  $ f $ and $ g $ are identical beta\n",
    "distributions, and $ F_a=G_a=3, F_b=G_b=1.2 $.\n",
    "\n",
    "In a sense, here  there\n",
    "is nothing to learn."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e64bcd4a",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "expected_ratio(F_a=3, F_b=1.2)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "802548bc",
   "metadata": {},
   "source": [
    "The above graph says that $ \\pi_t $ is inert and  remains at its initial value.\n",
    "\n",
    "Finally, let’s look at a case in which  $ f $ and $ g $ are neither very\n",
    "different nor identical, in particular one in which  $ F_a=2, F_b=1 $ and\n",
    "$ G_a=3, G_b=1.2 $."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "059eec68",
   "metadata": {
    "hide-output": false
   },
   "outputs": [],
   "source": [
    "expected_ratio(F_a=2, F_b=1, G_a=3, G_b=1.2)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5400f3e0",
   "metadata": {},
   "source": [
    "## Sequels\n",
    "\n",
    "We’ll apply and dig deeper into some of the ideas presented in this lecture:\n",
    "\n",
    "- [this lecture](https://python.quantecon.org/likelihood_ratio_process.html) describes **likelihood ratio processes**\n",
    "  and their role in frequentist and Bayesian statistical theories  \n",
    "- [this lecture](https://python.quantecon.org/navy_captain.html) studies  whether a World War II US Navy Captain’s hunch that a (frequentist) decision rule that the Navy had told\n",
    "  him to use was  inferior to a sequential rule that Abraham\n",
    "  Wald had not yet designed.  "
   ]
  }
 ],
 "metadata": {
  "date": 1751452616.155862,
  "filename": "exchangeable.md",
  "kernelspec": {
   "display_name": "Python",
   "language": "python3",
   "name": "python3"
  },
  "title": "Exchangeability and Bayesian Updating"
 },
 "nbformat": 4,
 "nbformat_minor": 5
}