{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "\n", "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Bayesian versus Frequentist Decision Rules" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Contents\n", "\n", "- [Bayesian versus Frequentist Decision Rules](#Bayesian-versus-Frequentist-Decision-Rules) \n", " - [Overview](#Overview) \n", " - [Setup](#Setup) \n", " - [Frequentist Decision Rule](#Frequentist-Decision-Rule) \n", " - [Bayesian Decision Rule](#Bayesian-Decision-Rule) \n", " - [Was the Navy Captain’s hunch correct?](#Was-the-Navy-Captain’s-hunch-correct?) \n", " - [More details](#More-details) " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In addition to what’s in Anaconda, this lecture will need the following libraries:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "!conda install -y quantecon" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "%matplotlib inline\n", "import matplotlib.pyplot as plt\n", "plt.rcParams[\"figure.figsize\"] = (11, 5) #set default figure size\n", "import numpy as np\n", "from numba import njit, prange, float64, int64\n", "from numba.experimental import jitclass\n", "from interpolation import interp\n", "from math import gamma\n", "from scipy.optimize import minimize" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Overview\n", "\n", "This lecture follows up on ideas presented in the following lectures:\n", "\n", "- [A Problem that Stumped Milton Friedman](https://python.quantecon.org/wald_friedman.html) \n", "- [Exchangeability and Bayesian Updating](https://python.quantecon.org/exchangeable.html) \n", "- [Likelihood Ratio Processes](https://python.quantecon.org/likelihood_ratio_process.html) \n", "\n", "\n", "In [A Problem that Stumped Milton 
Friedman](https://python.quantecon.org/wald_friedman.html) we described a problem\n", "that a Navy Captain presented to Milton Friedman during World War II.\n", "\n", "The Navy had instructed the Captain to use a decision rule for quality control that the Captain suspected\n", "could be dominated by a better rule.\n", "\n", "(The Navy had ordered the Captain to use an instance of a **frequentist decision rule**.)\n", "\n", "Milton Friedman recognized the Captain’s conjecture as posing a challenging statistical problem that he and other\n", "members of the US Government’s Statistical Research Group at Columbia University proceeded to try to solve.\n", "\n", "One of the members of the group, the great mathematician Abraham Wald, soon solved the problem.\n", "\n", "A good way to formulate the problem is to use some ideas from Bayesian statistics that we describe in\n", "this lecture [Exchangeability and Bayesian Updating](https://python.quantecon.org/exchangeable.html) and in this lecture\n", "[Likelihood Ratio Processes](https://python.quantecon.org/likelihood_ratio_process.html), which describes the link between Bayesian\n", "updating and likelihood ratio processes.\n", "\n", "The present lecture uses Python to generate simulations that evaluate expected losses under **frequentist** and **Bayesian**\n", "decision rules for an instance of the Navy Captain’s decision problem.\n", "\n", "The simulations validate the Navy Captain’s hunch that there is a better rule than the one the Navy had ordered him\n", "to use." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Setup\n", "\n", "To formalize the problem of the Navy Captain whose questions posed the\n", "problem that Milton Friedman and Allen Wallis handed over to Abraham\n", "Wald, we consider a setting with the following parts.\n", "\n", "- Each period a decision maker draws a non-negative random variable\n", " $Z$ from a probability distribution that he does not completely\n", " understand. 
He knows that two probability distributions are possible,\n", " $f_{0}$ and $f_{1}$, and that whichever distribution it\n", " is remains fixed over time. The decision maker believes that before\n", " the beginning of time, nature once and for all selected either\n", " $f_{0}$ or $f_1$ and that the probability that it\n", " selected $f_0$ is $\\pi^{*}$. \n", "- The decision maker observes a sample\n", " $\\left\\{ z_{i}\\right\\} _{i=0}^{t}$ from the distribution\n", " chosen by nature. \n", "\n", "\n", "The decision maker wants to decide which distribution actually governs\n", "$Z$ and is worried by two types of errors and the losses that they\n", "impose on him.\n", "\n", "- a loss $\\bar L_{1}$ from a **type I error** that occurs when he decides that\n", " $f=f_{1}$ when actually $f=f_{0}$ \n", "- a loss $\\bar L_{0}$ from a **type II error** that occurs when he decides that\n", " $f=f_{0}$ when actually $f=f_{1}$ \n", "\n", "\n", "The decision maker pays a cost $c$ for drawing\n", "another $z$.\n", "\n", "We mainly borrow parameters from the quantecon lecture\n", "[A Problem that Stumped Milton Friedman](https://python.quantecon.org/wald_friedman.html) except that we increase both $\\bar L_{0}$\n", "and $\\bar L_{1}$ from $25$ to $100$ to encourage the\n", "frequentist Navy Captain to take more draws before deciding.\n", "\n", "We set the cost $c$ of taking one more draw at $1.25$.\n", "\n", "We set the probability distributions $f_{0}$ and $f_{1}$ to\n", "be beta distributions with $a_{0}=b_{0}=1$, $a_{1}=3$, and\n", "$b_{1}=1.2$, respectively.\n", "\n", "Below is some Python code that sets up these objects."
] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "@njit\n", "def p(x, a, b):\n", "    \"Beta distribution density.\"\n", "\n", "    r = gamma(a + b) / (gamma(a) * gamma(b))\n", "\n", "    return r * x**(a-1) * (1 - x)**(b-1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We start by defining a jitclass that stores parameters and\n", "functions we need to solve problems for both the Bayesian and\n", "frequentist Navy Captains." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "wf_data = [\n", "    ('c', float64),          # cost of drawing another z\n", "    ('a0', float64),         # parameters of beta distribution\n", "    ('b0', float64),\n", "    ('a1', float64),\n", "    ('b1', float64),\n", "    ('L0', float64),         # cost of selecting f0 when f1 is true\n", "    ('L1', float64),         # cost of selecting f1 when f0 is true\n", "    ('π_grid', float64[:]),  # grid of beliefs π\n", "    ('π_grid_size', int64),\n", "    ('mc_size', int64),      # size of Monte Carlo simulation\n", "    ('z0', float64[:]),      # sequence of random values\n", "    ('z1', float64[:])       # sequence of random values\n", "]" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "@jitclass(wf_data)\n", "class WaldFriedman:\n", "\n", "    def __init__(self,\n", "                 c=1.25,\n", "                 a0=1,\n", "                 b0=1,\n", "                 a1=3,\n", "                 b1=1.2,\n", "                 L0=100,\n", "                 L1=100,\n", "                 π_grid_size=200,\n", "                 mc_size=1000):\n", "\n", "        self.c, self.π_grid_size = c, π_grid_size\n", "        self.a0, self.b0, self.a1, self.b1 = a0, b0, a1, b1\n", "        self.L0, self.L1 = L0, L1\n", "        self.π_grid = np.linspace(0, 1, π_grid_size)\n", "        self.mc_size = mc_size\n", "\n", "        self.z0 = np.random.beta(a0, b0, mc_size)\n", "        self.z1 = np.random.beta(a1, b1, mc_size)\n", "\n", "    def f0(self, x):\n", "\n", "        return p(x, self.a0, self.b0)\n", "\n", "    def f1(self, x):\n", "\n", "        return p(x, self.a1, self.b1)\n", "\n", "    def 
κ(self, z, π):\n", "        \"\"\"\n", "        Updates π using Bayes' rule and the current observation z\n", "        \"\"\"\n", "\n", "        a0, b0, a1, b1 = self.a0, self.b0, self.a1, self.b1\n", "\n", "        π_f0, π_f1 = π * p(z, a0, b0), (1 - π) * p(z, a1, b1)\n", "        π_new = π_f0 / (π_f0 + π_f1)\n", "\n", "        return π_new" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "wf = WaldFriedman()\n", "\n", "grid = np.linspace(0, 1, 50)\n", "\n", "plt.figure()\n", "\n", "plt.title(\"Two Distributions\")\n", "plt.plot(grid, wf.f0(grid), lw=2, label=\"$f_0$\")\n", "plt.plot(grid, wf.f1(grid), lw=2, label=\"$f_1$\")\n", "\n", "plt.legend()\n", "plt.xlabel(\"$z$ values\")\n", "plt.ylabel(\"density of $z_k$\")\n", "\n", "plt.tight_layout()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Above, we plot the two possible probability densities $f_0$ and\n", "$f_1$." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Frequentist Decision Rule\n", "\n", "The Navy told the Captain to use a frequentist decision rule.\n", "\n", "In particular, it gave him a decision rule that the Navy had designed by using\n", "frequentist statistical theory to minimize an\n", "expected loss function.\n", "\n", "That decision rule is characterized by a sample size $t$ and a\n", "cutoff $d$ associated with a likelihood ratio.\n", "\n", "Let\n", "$L\\left(z^{t}\\right)=\\prod_{i=0}^{t}\\frac{f_{0}\\left(z_{i}\\right)}{f_{1}\\left(z_{i}\\right)}$\n", "be the likelihood ratio associated with observing the sequence\n", "$\\left\\{ z_{i}\\right\\} _{i=0}^{t}$.\n", "\n", "The decision rule associated with a sample size $t$ is:\n", "\n", "- decide that $f_0$ is the distribution if the likelihood ratio\n", "  is greater than $d$ \n", "\n", "\n", "To understand how that rule was engineered, let null and alternative\n", "hypotheses be\n", "\n", "- null: $H_{0}$: $f=f_{0}$, \n", "- alternative: $H_{1}$: $f=f_{1}$. 
\n", "\n", "Given sample size $t$ and cutoff $d$, under the model\n", "described above, the mathematical expectation of total loss is\n", "\n", "$$\n", "\\begin{aligned}\n", "\\bar{V}_{fre}\\left(t,d\\right)=ct+\\pi^{*}PFA\\times \\bar L_{1}+\\left(1-\\pi^{*}\\right)\\left(1-PD\\right)\\times \\bar L_{0}\n", "\\end{aligned} \\tag{1}\n", "$$\n", "\n", "$$\n", "\\begin{aligned}\n", "\\textrm{where} \\quad PFA & =\\Pr\\left\\{ L\\left(z^{t}\\right)<d\\mid q=f_{0}\\right\\} \\\\\n", "PD & =\\Pr\\left\\{ L\\left(z^{t}\\right)<d\\mid q=f_{1}\\right\\}\n", "\\end{aligned}\n", "$$\n", "\n", "Here $PFA$ is the probability of a **false alarm**, i.e., of deciding\n", "that $f=f_{1}$ when in fact $f=f_{0}$, and $PD$ is the probability of\n", "**detection**, i.e., of deciding that $f=f_{1}$ when that is true.\n", "\n", "The frequentist rule chooses the pair $\\left(t,d\\right)$ to minimize\n", "the expected loss (1).\n", "\n", "To evaluate that rule numerically, we approximate $PFA$ and $PD$ by\n", "simulating many sequences of likelihood ratios, first under $f_{0}$ and\n", "then under $f_{1}$." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "@njit(parallel=True)\n", "def simulate(wf, N=10000, T=200):\n", "    \"\"\"\n", "    Simulate N sequences of likelihood ratios of length T,\n", "    under f0 and under f1.\n", "    \"\"\"\n", "\n", "    L0_arr = np.empty((N, T))\n", "    L1_arr = np.empty((N, T))\n", "\n", "    for i in prange(N):\n", "        l0, l1 = 1., 1.\n", "        z0_seq = np.random.beta(wf.a0, wf.b0, T)\n", "        z1_seq = np.random.beta(wf.a1, wf.b1, T)\n", "        for t in range(T):\n", "            l0 *= wf.f0(z0_seq[t]) / wf.f1(z0_seq[t])\n", "            l1 *= wf.f0(z1_seq[t]) / wf.f1(z1_seq[t])\n", "            L0_arr[i, t] = l0\n", "            L1_arr[i, t] = l1\n", "\n", "    return L0_arr, L1_arr\n", "\n", "N, T = 10000, 100\n", "L0_arr, L1_arr = simulate(wf, N, T)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "@njit\n", "def V_fre_d_t(d, t, L0_arr, L1_arr, π_star, wf):\n", "    \"\"\"\n", "    Expected loss (1) for sample size t and cutoff d, with PFA and PD\n", "    estimated from the simulated likelihood ratio sequences.\n", "    \"\"\"\n", "\n", "    N = L0_arr.shape[0]\n", "\n", "    PFA = np.sum(L0_arr[:, t-1] < d) / N\n", "    PD = np.sum(L1_arr[:, t-1] < d) / N\n", "\n", "    V = wf.c * t + π_star * PFA * wf.L1 + (1 - π_star) * (1 - PD) * wf.L0\n", "\n", "    return V\n", "\n", "def V_fre_t(t, L0_arr, L1_arr, π_star, wf):\n", "    \"\"\"\n", "    Minimize expected loss over the cutoff d for a given sample size t;\n", "    also return the associated PFA and PD.\n", "    \"\"\"\n", "\n", "    res = minimize(lambda d: V_fre_d_t(d[0], t, L0_arr, L1_arr, π_star, wf),\n", "                   1, method='Nelder-Mead')\n", "    V = res.fun\n", "    d = res.x[0]\n", "\n", "    N = L0_arr.shape[0]\n", "    PFA = np.sum(L0_arr[:, t-1] < d) / N\n", "    PD = np.sum(L1_arr[:, t-1] < d) / N\n", "\n", "    return V, PFA, PD\n", "\n", "def compute_V_fre(L0_arr, L1_arr, π_star, wf):\n", "    \"\"\"\n", "    Minimized expected loss and associated PFA, PD for each sample size t.\n", "    \"\"\"\n", "\n", "    T = L0_arr.shape[1]\n", "\n", "    V_fre_arr = np.empty(T)\n", "    PFA_arr = np.empty(T)\n", "    PD_arr = np.empty(T)\n", "\n", "    for t in range(1, T+1):\n", "        V, PFA, PD = V_fre_t(t, L0_arr, L1_arr, π_star, wf)\n", "        V_fre_arr[t-1] = V\n", "        PFA_arr[t-1] = PFA\n", "        PD_arr[t-1] = PD\n", "\n", "    return V_fre_arr, PFA_arr, PD_arr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For $\\pi^{*}=0.5$, we compute the frequentist's optimal sample size\n", "t_optimal, and, for a grid of values of $\\pi^{*}$, we record the\n", "minimized expected loss $\\bar{V}_{fre}$ for later use." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "π_star = 0.5\n", "V_fre_arr, PFA_arr, PD_arr = compute_V_fre(L0_arr, L1_arr, π_star, wf)\n", "t_optimal = np.argmin(V_fre_arr) + 1\n", "\n", "n_π = 20\n", "π_star_arr = np.linspace(0.1, 0.9, n_π)\n", "V_fre_bar_arr = np.empty(n_π)\n", "\n", "for i, π_s in enumerate(π_star_arr):\n", "    V_arr, _, _ = compute_V_fre(L0_arr, L1_arr, π_s, wf)\n", "    V_fre_bar_arr[i] = V_arr.min()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Bayesian Decision Rule\n", "\n", "Unlike the frequentist, the Bayesian decision maker does not commit to a\n", "sample size in advance; instead, each period he compares the expected\n", "loss from stopping now with the expected loss from taking one more draw.\n", "\n", "Let $J\\left(\\pi\\right)$ be the Bayesian's minimized expected loss when\n", "his current belief is $\\pi$. It satisfies the Bellman equation\n", "\n", "$$\n", "J\\left(\\pi\\right)=\\min\\left\\{ \\left(1-\\pi\\right)\\bar L_{0},\\pi \\bar L_{1},h\\left(\\pi\\right)\\right\\}\n", "$$\n", "\n", "where $h\\left(\\pi\\right)=c+E\\left[J\\left(\\pi^{\\prime}\\right)\\right]$ is\n", "the continuation value, the expected loss from drawing one more $z$ and\n", "updating $\\pi$ to $\\pi^{\\prime}$ with the function $\\kappa$ defined\n", "above.\n", "\n", "The operator $Q$ below maps a guess of the continuation value function\n", "$h$ into an updated guess; iterating on $Q$ to convergence computes\n", "$h$." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "@njit(parallel=True)\n", "def Q(h, wf):\n", "\n", "    c, π_grid = wf.c, wf.π_grid\n", "    L0, L1 = wf.L0, wf.L1\n", "    z0, z1 = wf.z0, wf.z1\n", "    mc_size = wf.mc_size\n", "\n", "    κ = wf.κ\n", "\n", "    h_new = np.empty_like(π_grid)\n", "    h_func = lambda p: interp(π_grid, h, p)\n", "\n", "    for i in prange(len(π_grid)):\n", "        π = π_grid[i]\n", "\n", "        # Find the expected value of J by integrating over z\n", "        integral_f0, integral_f1 = 0., 0.\n", "\n", "        for m in range(mc_size):\n", "            π_0 = κ(z0[m], π)  # Draw z from f0 and update π\n", "            integral_f0 += min((1 - π_0) * L0, π_0 * L1, h_func(π_0))\n", "\n", "            π_1 = κ(z1[m], π)  # Draw z from f1 and update π\n", "            integral_f1 += min((1 - π_1) * L0, π_1 * L1, h_func(π_1))\n", "\n", "        integral = (π * integral_f0 + (1 - π) * integral_f1) / mc_size\n", "\n", "        h_new[i] = c + integral\n", "\n", "    return h_new" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "def solve_model(wf, tol=1e-4, max_iter=1000):\n", "    \"\"\"\n", "    Compute the continuation value function h by iterating on Q.\n", "\n", "    * wf is an instance of WaldFriedman\n", "    \"\"\"\n", "\n", "    # Set up loop\n", "    h = np.zeros(len(wf.π_grid))\n", "    i = 0\n", "    error = tol + 1\n", "\n", "    while i < max_iter and error > tol:\n", "        h_new = Q(h, wf)\n", "        error = np.max(np.abs(h - h_new))\n", "        i += 1\n", "        h = h_new\n", "\n", "    if i == max_iter:\n", "        print(\"Failed to converge!\")\n", "\n", "    return h_new" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "h_star = solve_model(wf)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "@njit\n", "def find_cutoff_rule(wf, h):\n", "\n", "    \"\"\"\n", "    This function takes a continuation value function and returns the\n", "    corresponding cutoffs of where you transition between continuing and\n", "    choosing a specific model\n", "    \"\"\"\n", "\n", "    π_grid = wf.π_grid\n", "    L0, L1 = wf.L0, wf.L1\n", "\n", "    # Evaluate cost at all points on grid for choosing a model\n", "    payoff_f0 = (1 - π_grid) * L0\n", "    payoff_f1 = π_grid * L1\n", "\n", "    # The cutoff points can be found by differencing these costs with\n", "    # the Bellman equation (J is always less than or equal to p_c_i)\n", "    β = π_grid[np.searchsorted(\n", "                   payoff_f1 - np.minimum(h, payoff_f0),\n", "                   1e-10)\n", "               - 1]\n", "    α = π_grid[np.searchsorted(\n", "                   np.minimum(h, payoff_f1) - payoff_f0,\n", "                   1e-10)\n", "               - 1]\n", "\n", "    return (β, α)\n", "\n", "β, α = find_cutoff_rule(wf, h_star)\n", "cost_L0 = (1 - wf.π_grid) * wf.L0\n", "cost_L1 = wf.π_grid * wf.L1\n", "\n", "fig, ax = plt.subplots(figsize=(10, 6))\n", "\n", "ax.plot(wf.π_grid, h_star, label='continuation value')\n", 
"ax.plot(wf.π_grid, cost_L1, label='choose f1')\n", "ax.plot(wf.π_grid, cost_L0, label='choose f0')\n", "ax.plot(wf.π_grid,\n", "        np.amin(np.column_stack([h_star, cost_L0, cost_L1]), axis=1),\n", "        lw=15, alpha=0.1, color='b', label='minimum cost')\n", "\n", "ax.annotate(r\"$\\beta$\", xy=(β + 0.01, 0.5), fontsize=14)\n", "ax.annotate(r\"$\\alpha$\", xy=(α + 0.01, 0.5), fontsize=14)\n", "\n", "plt.vlines(β, 0, β * wf.L0, linestyle=\"--\")\n", "plt.vlines(α, 0, (1 - α) * wf.L1, linestyle=\"--\")\n", "\n", "ax.set(xlim=(0, 1), ylim=(0, 0.5 * max(wf.L0, wf.L1)), ylabel=\"cost\",\n", "       xlabel=\"$\\\\pi$\", title=\"Value function\")\n", "\n", "plt.legend(borderpad=1.1)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The above figure portrays the value function plotted against the decision\n", "maker’s Bayesian posterior.\n", "\n", "It also shows the probabilities $\\alpha$ and $\\beta$.\n", "\n", "The Bayesian decision rule is:\n", "\n", "- accept $H_0$ if $\\pi \\geq \\alpha$ \n", "- accept $H_1$ if $\\pi \\leq \\beta$ \n", "- delay deciding and draw another $z$ if\n", "  $\\beta \\leq \\pi \\leq \\alpha$ \n", "\n", "\n", "We can calculate two “objective” loss functions under this situation\n", "conditioning on knowing for sure that nature has selected $f_{0}$,\n", "in the first case, or $f_{1}$, in the second case.\n", "\n", "1. under $f_{0}$, \n", "\n", "$$\n", "V^{0}\\left(\\pi\\right)=\\begin{cases}\n", "0 & \\text{if }\\alpha\\leq\\pi,\\\\\n", "c+EV^{0}\\left(\\pi^{\\prime}\\right) & \\text{if }\\beta\\leq\\pi<\\alpha,\\\\\n", "\\bar L_{1} & \\text{if }\\pi<\\beta.\n", "\\end{cases}\n", "$$\n", "\n", "1. under $f_{1}$, \n", "\n", "$$\n", "V^{1}\\left(\\pi\\right)=\\begin{cases}\n", "\\bar L_{0} & \\text{if }\\alpha\\leq\\pi,\\\\\n", "c+EV^{1}\\left(\\pi^{\\prime}\\right) & \\text{if }\\beta\\leq\\pi<\\alpha,\\\\\n", "0 & \\text{if }\\pi<\\beta.\n", "\\end{cases}\n", "$$\n", "\n", "where\n", "$\\pi^{\\prime}=\\frac{\\pi f_{0}\\left(z^{\\prime}\\right)}{\\pi f_{0}\\left(z^{\\prime}\\right)+\\left(1-\\pi\\right)f_{1}\\left(z^{\\prime}\\right)}$.\n", "\n", "Given a prior probability $\\pi_{0}$, the expected loss for the\n", "Bayesian is\n", "\n", "$$\n", "\\bar{V}_{Bayes}\\left(\\pi_{0}\\right)=\\pi^{*}V^{0}\\left(\\pi_{0}\\right)+\\left(1-\\pi^{*}\\right)V^{1}\\left(\\pi_{0}\\right).\n", "$$\n", "\n", "Below we write some Python code that computes\n", "$V^{0}\\left(\\pi\\right)$ and $V^{1}\\left(\\pi\\right)$\n", "numerically." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "@njit(parallel=True)\n", "def V_q(wf, flag):\n", "    V = np.zeros(wf.π_grid_size)\n", "    if flag == 0:\n", "        z_arr = wf.z0\n", "        V[wf.π_grid < β] = wf.L1\n", "    else:\n", "        z_arr = wf.z1\n", "        V[wf.π_grid >= α] = wf.L0\n", "\n", "    V_old = np.empty_like(V)\n", "\n", "    while True:\n", "        V_old[:] = V[:]\n", "        V[(β <= wf.π_grid) & (wf.π_grid < α)] = 0\n", "\n", "        for i in prange(len(wf.π_grid)):\n", "            π = wf.π_grid[i]\n", "\n", "            if π >= α or π < β:\n", "                continue\n", "\n", "            for j in prange(len(z_arr)):\n", "                π_next = wf.κ(z_arr[j], π)\n", "                V[i] += wf.c + interp(wf.π_grid, V_old, π_next)\n", "\n", "            V[i] /= wf.mc_size\n", "\n", "        if np.abs(V - V_old).max() < 1e-5:\n", "            break\n", "\n", "    return V" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "V0 = V_q(wf, 0)\n", "V1 = V_q(wf, 1)\n", "\n", "plt.plot(wf.π_grid, V0, label='$V^0$')\n", "plt.plot(wf.π_grid, V1, label='$V^1$')\n", "plt.vlines(β, 0, wf.L0, linestyle='--')\n", "plt.text(β+0.01, wf.L0/2, 'β')\n", 
"plt.vlines(α, 0, wf.L0, linestyle='--')\n", "plt.text(α+0.01, wf.L0/2, 'α')\n", "plt.xlabel('$\pi$')\n", "plt.title('Objective value function $V(\pi)$')\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Given an assumed value for\n", "$\\pi^{*}=\\Pr\\left\\{ \\text{nature selects }f_{0}\\right\\}$, we can\n", "then compute $\\bar{V}_{Bayes}\\left(\\pi_{0}\\right)$.\n", "\n", "We can then determine an initial Bayesian prior $\\pi_{0}^{*}$ that\n", "minimizes this objective concept of expected loss.\n", "\n", "The figure below plots four cases corresponding to\n", "$\\pi^{*}=0.25,0.3,0.5,0.7$.\n", "\n", "We observe that in each case $\\pi_{0}^{*}$ equals $\\pi^{*}$." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "def compute_V_baye_bar(π_star, V0, V1, wf):\n", "\n", "    V_baye = π_star * V0 + (1 - π_star) * V1\n", "    π_idx = np.argmin(V_baye)\n", "    π_optimal = wf.π_grid[π_idx]\n", "    V_baye_bar = V_baye[π_idx]\n", "\n", "    return V_baye, π_optimal, V_baye_bar" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "π_star_arr = [0.25, 0.3, 0.5, 0.7]\n", "\n", "fig, axs = plt.subplots(2, 2, figsize=(15, 10))\n", "\n", "for i, π_star in enumerate(π_star_arr):\n", "    row_i = i // 2\n", "    col_i = i % 2\n", "\n", "    V_baye, π_optimal, V_baye_bar = compute_V_baye_bar(π_star, V0, V1, wf)\n", "\n", "    axs[row_i, col_i].plot(wf.π_grid, V_baye)\n", "    axs[row_i, col_i].hlines(V_baye_bar, 0, 1, linestyle='--')\n", "    axs[row_i, col_i].vlines(π_optimal, V_baye_bar, V_baye.max(), linestyle='--')\n", "    axs[row_i, col_i].text(π_optimal+0.05, (V_baye_bar + V_baye.max()) / 2,\n", "                           '${\pi_0^*}=$'+f'{π_optimal:0.2f}')\n", "    axs[row_i, col_i].set_xlabel('$\pi$')\n", "    axs[row_i, col_i].set_ylabel('$\overline{V}_{baye}(\pi)$')\n", "    axs[row_i, col_i].set_title('$\pi^*=$' + f'{π_star}')\n", "\n", 
"fig.suptitle('$\\overline{V}_{baye}(\\pi)=\\pi^*V^0(\\pi) + (1-\\pi^*)V^1(\\pi)$', fontsize=16)\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This pattern of outcomes holds more generally.\n", "\n", "Thus, the following Python code generates the associated graph that\n", "verifies the equality of $\\pi_{0}^{*}$ to $\\pi^{*}$ holds\n", "for all $\\pi^{*}$." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "π_star_arr = np.linspace(0.1, 0.9, n_π)\n", "V_baye_bar_arr = np.empty_like(π_star_arr)\n", "π_optimal_arr = np.empty_like(π_star_arr)\n", "\n", "for i, π_star in enumerate(π_star_arr):\n", "\n", " V_baye, π_optimal, V_baye_bar = compute_V_baye_bar(π_star, V0, V1, wf)\n", "\n", " V_baye_bar_arr[i] = V_baye_bar\n", " π_optimal_arr[i] = π_optimal\n", "\n", "fig, axs = plt.subplots(1, 2, figsize=(14, 5))\n", "\n", "axs.plot(π_star_arr, V_baye_bar_arr)\n", "axs.set_xlabel('$\\pi^*$')\n", "axs.set_title('$\\overline{V}_{baye}$')\n", "\n", "axs.plot(π_star_arr, π_optimal_arr, label='optimal prior')\n", "axs.plot([π_star_arr.min(), π_star_arr.max()],\n", " [π_star_arr.min(), π_star_arr.max()],\n", " c='k', linestyle='--', label='45 degree line')\n", "axs.set_xlabel('$\\pi^*$')\n", "axs.set_title('optimal prior given $\\pi^*$')\n", "axs.legend()\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Was the Navy Captain’s hunch correct?\n", "\n", "We now compare average (i.e., frequentist) losses obtained by the\n", "frequentist and Bayesian decision rules.\n", "\n", "As a starting point, let’s compare average loss functions when\n", "$\\pi^{*}=0.5$." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "π_star = 0.5" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "# frequentist\n", "V_fre_arr, PFA_arr, PD_arr = compute_V_fre(L0_arr, L1_arr, π_star, wf)\n", "\n", "# bayesian\n", "V_baye = π_star * V0 + (1 - π_star) * V1\n", "V_baye_bar = V_baye.min()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "plt.plot(range(T), V_fre_arr, label='$\min_{d} \overline{V}_{fre}(t,d)$')\n", "plt.plot([0, T], [V_baye_bar, V_baye_bar], label='$\overline{V}_{baye}$')\n", "plt.xlabel('t')\n", "plt.title('$\pi^*=0.5$')\n", "plt.legend()\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Evidently, there is no sample size $t$ at which the frequentist\n", "decision rule attains a lower loss function than does the Bayesian rule.\n", "\n", "Furthermore, the following graph indicates that the Bayesian decision\n", "rule does better on average for all values of $\\pi^{*}$." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "fig, axs = plt.subplots(1, 2, figsize=(14, 5))\n", "\n", "axs[0].plot(π_star_arr, V_fre_bar_arr, label='$\overline{V}_{fre}$')\n", "axs[0].plot(π_star_arr, V_baye_bar_arr, label='$\overline{V}_{baye}$')\n", "axs[0].legend()\n", "axs[0].set_xlabel('$\pi^*$')\n", "\n", "axs[1].plot(π_star_arr, V_fre_bar_arr - V_baye_bar_arr, label='$diff$')\n", "axs[1].legend()\n", "axs[1].set_xlabel('$\pi^*$')\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The right panel of the above graph plots the difference\n", "$\\bar{V}_{fre}-\\bar{V}_{Bayes}$.\n", "\n", "It is always positive." 
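] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To put a number on the gap, the cell below is a minimal check that\n", "assumes the arrays `V_fre_bar_arr` and `V_baye_bar_arr` computed above:\n", "it prints the smallest and the average difference across the grid of\n", "values of $\\pi^{*}$." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "# difference between frequentist and Bayesian average losses\n", "diff = V_fre_bar_arr - V_baye_bar_arr\n", "print(f'smallest difference: {diff.min():.4f}')\n", "print(f'average difference: {diff.mean():.4f}')"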
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## More details\n", "\n", "We can provide more insights by focusing on the case in which\n", "$\\pi^{*}=0.5=\\pi_{0}$." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "π_star = 0.5" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Recall that when $\\pi^*=0.5$, the frequentist decision rule sets a\n", "sample size `t_optimal` **ex ante**.\n", "\n", "For our parameter settings, we can compute its value:" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "t_optimal" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For convenience, let’s define `t_idx` as the Python array index\n", "corresponding to the `t_optimal` sample size." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "t_idx = t_optimal - 1" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Distribution of Bayesian decision rule’s times to decide\n", "\n", "By using simulations, we compute the frequency distribution of times to\n", "decide under the Bayesian decision rule and compare those times to the\n", "frequentist rule’s fixed $t$.\n", "\n", "The following Python code creates a graph that shows the frequency\n", "distribution of the Bayesian decision maker’s times to decide,\n", "conditional on distribution $q=f_{0}$ or $q= f_{1}$\n", "generating the data.\n", "\n", "The blue and red dotted lines show averages for the Bayesian decision\n", "rule, while the black dotted line shows the frequentist optimal sample\n", "size $t$.\n", "\n", "On average the Bayesian rule decides **earlier** than the frequentist\n", "rule when $q= f_0$ and **later** when $q = f_1$." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "@njit(parallel=True)\n", "def check_results(L_arr, α, β, flag, π0):\n", "\n", "    N, T = L_arr.shape\n", "\n", "    time_arr = np.empty(N)\n", "    correctness = np.empty(N)\n", "\n", "    π_arr = π0 * L_arr / (π0 * L_arr + 1 - π0)\n", "\n", "    for i in prange(N):\n", "        for t in range(T):\n", "            if (π_arr[i, t] < β) or (π_arr[i, t] > α):\n", "                time_arr[i] = t + 1\n", "                correctness[i] = (flag == 0 and π_arr[i, t] > α) or (flag == 1 and π_arr[i, t] < β)\n", "                break\n", "\n", "    return time_arr, correctness" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "time_arr0, correctness0 = check_results(L0_arr, α, β, 0, π_star)\n", "time_arr1, correctness1 = check_results(L1_arr, α, β, 1, π_star)\n", "\n", "# unconditional distribution\n", "time_arr_u = np.concatenate((time_arr0, time_arr1))\n", "correctness_u = np.concatenate((correctness0, correctness1))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "n1 = plt.hist(time_arr0, bins=range(1, 30), alpha=0.4, label='f0 generates')[0]\n", "n2 = plt.hist(time_arr1, bins=range(1, 30), alpha=0.4, label='f1 generates')[0]\n", "plt.vlines(t_optimal, 0, max(n1.max(), n2.max()), linestyle='--', label='frequentist')\n", "plt.vlines(np.mean(time_arr0), 0, max(n1.max(), n2.max()),\n", "           linestyle='--', color='b', label='E(t) under f0')\n", "plt.vlines(np.mean(time_arr1), 0, max(n1.max(), n2.max()),\n", "           linestyle='--', color='r', label='E(t) under f1')\n", "plt.legend()\n", "\n", "plt.xlabel('t')\n", "plt.ylabel('n')\n", "plt.title('Conditional frequency distribution of times')\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Later we’ll figure out how these distributions ultimately affect\n", "objective expected values under the two decision rules.\n", "\n", "To 
begin, let’s look at simulations of the Bayesian’s beliefs over time.\n", "\n", "We can easily compute the updated beliefs at any time $t$ using\n", "the one-to-one mapping from $L_{t}$ to $\\pi_{t}$ given\n", "$\\pi_0$ described in this lecture [Likelihood Ratio Processes](https://python.quantecon.org/likelihood_ratio_process.html)." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "π0_arr = π_star * L0_arr / (π_star * L0_arr + 1 - π_star)\n", "π1_arr = π_star * L1_arr / (π_star * L1_arr + 1 - π_star)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "fig, axs = plt.subplots(1, 2, figsize=(14, 4))\n", "\n", "axs[0].plot(np.arange(1, π0_arr.shape[1]+1), np.mean(π0_arr, 0), label='f0 generates')\n", "axs[0].plot(np.arange(1, π1_arr.shape[1]+1), 1 - np.mean(π1_arr, 0), label='f1 generates')\n", "axs[0].set_xlabel('t')\n", "axs[0].set_ylabel('$E(\pi_t)$ or ($1 - E(\pi_t)$)')\n", "axs[0].set_title('Expectation of beliefs after drawing t observations')\n", "axs[0].legend()\n", "\n", "axs[1].plot(np.arange(1, π0_arr.shape[1]+1), np.var(π0_arr, 0), label='f0 generates')\n", "axs[1].plot(np.arange(1, π1_arr.shape[1]+1), np.var(π1_arr, 0), label='f1 generates')\n", "axs[1].set_xlabel('t')\n", "axs[1].set_ylabel('var($\pi_t$)')\n", "axs[1].set_title('Variance of beliefs after drawing t observations')\n", "axs[1].legend()\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The above figures compare averages and variances of updated Bayesian\n", "posteriors after $t$ draws.\n", "\n", "The left graph compares $E\\left(\\pi_{t}\\right)$ under\n", "$f_{0}$ to $1-E\\left(\\pi_{t}\\right)$ under $f_{1}$:\n", "they lie on top of each other.\n", "\n", "However, as the right-hand side graph shows, there is a significant\n", "difference in variances when $t$ is small: the variance is lower\n", "under $f_{1}$.\n", "\n", "The difference in variances is the reason 
that the Bayesian decision\n", "maker waits longer to decide when $f_{1}$ generates the data.\n", "\n", "The code below plots outcomes of constructing an unconditional\n", "distribution by simply pooling the simulated data across the two\n", "possible distributions $f_0$ and $f_1$.\n", "\n", "The pooled distribution describes a sense in which on average the\n", "Bayesian decides earlier, an outcome that seems at least partly to\n", "confirm the Navy Captain’s hunch." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "n = plt.hist(time_arr_u, bins=range(1, 30), alpha=0.4, label='bayesian')[0]\n", "plt.vlines(np.mean(time_arr_u), 0, n.max(), linestyle='--',\n", "           color='b', label='bayesian E(t)')\n", "plt.vlines(t_optimal, 0, n.max(), linestyle='--', label='frequentist')\n", "plt.legend()\n", "\n", "plt.xlabel('t')\n", "plt.ylabel('n')\n", "plt.title('Unconditional distribution of times')\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Probability of making correct decisions\n", "\n", "Now we use simulations to compute the fraction of samples in which the\n", "Bayesian and the frequentist decision rules decide correctly.\n", "\n", "For the frequentist rule, the probability of making the correct decision\n", "under $f_{1}$ is the optimal probability of detection given\n", "$t$ that we defined earlier, and similarly it equals $1$\n", "minus the optimal probability of a false alarm under $f_{0}$.\n", "\n", "Below we plot these two probabilities for the frequentist rule, along\n", "with the conditional probabilities that the Bayesian rule decides before\n", "$t$ *and* that the decision is correct." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "# optimal PFA and PD of frequentist with optimal sample size\n", "V, PFA, PD = V_fre_t(t_optimal, L0_arr, L1_arr, π_star, wf)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "plt.plot([1, 20], [PD, PD], linestyle='--', label='PD: fre. chooses f1 correctly')\n", "plt.plot([1, 20], [1-PFA, 1-PFA], linestyle='--', label='1-PFA: fre. chooses f0 correctly')\n", "plt.vlines(t_optimal, 0, 1, linestyle='--', label='frequentist optimal sample size')\n", "\n", "N = time_arr0.size\n", "T_arr = np.arange(1, 21)\n", "plt.plot(T_arr, [np.sum(correctness0[time_arr0 <= t] == 1) / N for t in T_arr],\n", " label='q=f0 and baye. choose f0')\n", "plt.plot(T_arr, [np.sum(correctness1[time_arr1 <= t] == 1) / N for t in T_arr],\n", " label='q=f1 and baye. choose f1')\n", "plt.legend(loc=4)\n", "\n", "plt.xlabel('t')\n", "plt.ylabel('Probability')\n", "plt.title('Cond. probability of making correct decisions before t')\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "By averaging using $\\pi^{*}$, we also plot the unconditional\n", "distribution." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "plt.plot([1, 20], [(PD + 1 - PFA) / 2, (PD + 1 - PFA) / 2],\n", " linestyle='--', label='fre. makes correct decision')\n", "plt.vlines(t_optimal, 0, 1, linestyle='--', label='frequentist optimal sample size')\n", "\n", "N = time_arr_u.size\n", "plt.plot(T_arr, [np.sum(correctness_u[time_arr_u <= t] == 1) / N for t in T_arr],\n", " label=\"bayesian makes correct decision\")\n", "plt.legend()\n", "\n", "plt.xlabel('t')\n", "plt.ylabel('Probability')\n", "plt.title('Uncond. 
probability of making correct decisions before t')\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Distribution of likelihood ratios at frequentist’s $t$\n", "\n", "Next we use simulations to construct distributions of likelihood ratios\n", "after $t$ draws.\n", "\n", "To serve as useful reference points, we also show likelihood ratios that\n", "correspond to the Bayesian cutoffs $\\alpha$ and $\\beta$.\n", "\n", "In order to exhibit the distribution more clearly, we report logarithms\n", "of likelihood ratios.\n", "\n", "The graph below reports two distributions, one conditional on\n", "$f_0$ generating the data, the other conditional on $f_1$\n", "generating the data." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "Lα = (1 - π_star) * α / (π_star - π_star * α)\n", "Lβ = (1 - π_star) * β / (π_star - π_star * β)" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "L_min = min(L0_arr[:, t_idx].min(), L1_arr[:, t_idx].min())\n", "L_max = max(L0_arr[:, t_idx].max(), L1_arr[:, t_idx].max())\n", "bin_range = np.linspace(np.log(L_min), np.log(L_max), 50)\n", "n0 = plt.hist(np.log(L0_arr[:, t_idx]), bins=bin_range, alpha=0.4, label='f0 generates')[0]\n", "n1 = plt.hist(np.log(L1_arr[:, t_idx]), bins=bin_range, alpha=0.4, label='f1 generates')[0]\n", "\n", "plt.vlines(np.log(Lβ), 0, max(n0.max(), n1.max()), linestyle='--', color='r', label='log($L_β$)')\n", "plt.vlines(np.log(Lα), 0, max(n0.max(), n1.max()), linestyle='--', color='b', label='log($L_α$)')\n", "plt.legend()\n", "\n", "plt.xlabel('log(L)')\n", "plt.ylabel('n')\n", "plt.title('Cond. 
distribution of log likelihood ratio at frequentist t')\n", "\n", "plt.show()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The next graph plots the unconditional distribution of log likelihood\n", "ratios at the frequentist’s $t$, constructed as earlier by pooling the\n", "two conditional distributions." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "hide-output": false }, "outputs": [], "source": [ "plt.hist(np.log(np.concatenate([L0_arr[:, t_idx], L1_arr[:, t_idx]])),\n", "         bins=50, alpha=0.4, label='unconditional dist. of log(L)')\n", "plt.vlines(np.log(Lβ), 0, max(n0.max(), n1.max()), linestyle='--', color='r', label='log($L_β$)')\n", "plt.vlines(np.log(Lα), 0, max(n0.max(), n1.max()), linestyle='--', color='b', label='log($L_α$)')\n", "plt.legend()\n", "\n", "plt.xlabel('log(L)')\n", "plt.ylabel('n')\n", "plt.title('Uncond. distribution of log likelihood ratio at frequentist t')\n", "\n", "plt.show()" ] } ], "metadata": { "date": 1619590835.5565767, "filename": "navy_captain.md", "kernelspec": { "display_name": "Python", "language": "python3", "name": "python3" }, "title": "Bayesian versus Frequentist Decision Rules" }, "nbformat": 4, "nbformat_minor": 4 }