Updates for spring 2022
eacharles committed Mar 14, 2022
1 parent d7e3651 commit b5b2185
Showing 18 changed files with 752 additions and 149 deletions.
46 changes: 25 additions & 21 deletions nb/01_01_Look At This Figure.ipynb
@@ -5,8 +5,28 @@
"id": "e71470e4",
"metadata": {},
"source": [
"# Experiments design, data presentation\n",
"# Experimental design, data presentation\n",
"\n",
"### Goals:\n",
"\n",
"1. We are just getting started. The goals here are to make sure that you are comfortable with the format of the class, and to get you thinking about what we are trying to learn when we do experiments. \n",
"\n",
"### Timing\n",
"\n",
"1. Try to finish this notebook in 15-20 minutes\n",
"\n",
"### Question and Answer Template\n",
"\n",
"You can go to the link below, and do \"file\" -> \"make a copy\" to make yourself a Google Doc that you can use to fill in the answers to the questions in this week's notebooks.\n",
"\n",
"https://docs.google.com/document/d/1RTjOCCsLfoN1M18KtLr6DxOT1FBVxqsSlBNswbc9nyE/edit?usp=sharing"
]
},
{
"cell_type": "markdown",
"id": "145eb374",
"metadata": {},
"source": [
"I am a big fan of the https://xkcd.com/ web comic. \n",
"\n",
"At the end of 2020, after the first COVID vaccine trial data were released, xkcd ran this comic:"
@@ -65,8 +85,8 @@
" 2. Positive controls: ways to show that you can correctly measure \"non-zero\" or a \"positive result\". \n",
"#### 1.4 Describe how both types of controls are present and used in this study. \n",
"\n",
"(Note that it is a bit of a trick question; or at least a deliberately open-ended question. In this case what you call the positive control and what you call the negative control depends a bit about what you are trying to measure. The point of this excersize is to think through what we learn from this type of study. To help frame the discussion\n",
"lets say that a \"null result\" would mean \"the vaccine doesn't work\" and a \"positive result\" would mean \"the vaccine does work\". \n",
"Note that it is a bit of a trick question, or at least a deliberately open-ended question. In this case what you call the positive control and what you call the negative control depends a bit on what you are trying to measure. The point of this exercise is to think through what we learn from this type of study. To help frame the discussion\n",
"let's say that a \"null result\" would mean \"the vaccine doesn't work\" and a \"positive result\" would mean \"the vaccine works perfectly\". \n",
"\n",
"#### 1.5 Describe some things we might learn if we had the numbers that went into making this chart."
]
@@ -78,27 +98,11 @@
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"id": "9a67d138",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"id": "72db9974",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -112,7 +116,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.10"
"version": "3.10.0"
}
},
"nbformat": 4,
48 changes: 30 additions & 18 deletions nb/01_02_Dice_Rolls_and_Histograms.ipynb
@@ -1,12 +1,32 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "af5c1ecd",
"metadata": {},
"source": [
"# Data Presentation, making histograms of dice rolls\n",
"\n",
"### Goals:\n",
"\n",
"1. We are still just getting started. The main idea here is to make sure that you are comfortable using notebooks that run simple Python code, and that you can get a sense of what the code is doing (no need to worry about the details). \n",
"2. Also, we want to start talking about data presentation by making a few very simple graphs. Almost all of these graphs will be \"histograms\", which chart the number of times that we see different results in our data.\n",
"3. Finally, we are going to explore some very common mistakes that people make when making histograms.\n",
"\n",
"\n",
"### Timing\n",
"\n",
"1. Try to finish this notebook in 15-20 minutes\n"
]
},
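The goals cell above introduces dice-roll histograms and warns about common histogramming mistakes. As an editor's sketch of what such a cell might contain (the seed, bin choice, and variable names are illustrative, not from the notebook):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(42)
rolls = rng.integers(1, 7, size=1000)  # 1000 rolls of a fair six-sided die

# A common mistake: letting matplotlib pick bin edges, which can merge or
# split the integer outcomes. Centering one bin on each face avoids this.
bins = np.arange(0.5, 7.5, 1.0)
counts, edges, _ = plt.hist(rolls, bins=bins)
plt.xlabel("Die face")
plt.ylabel("Number of rolls")
```

With these edges there are exactly six bins, one per face, so the bar heights are the raw counts the notebook asks about.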
{
"cell_type": "code",
"execution_count": null,
"id": "643bf531",
"metadata": {},
"outputs": [],
"source": [
"# Standard setup\n",
"%matplotlib inline\n",
"import matplotlib.pyplot as plt\n",
"import numpy as np"
@@ -81,6 +101,14 @@
"print(tenRolls)"
]
},
{
"cell_type": "markdown",
"id": "5c5eceea",
"metadata": {},
"source": [
"If you aren't familiar with Jupyter notebooks or Python, make sure that you understand what happened in the two cells above, and how the output was displayed on the screen."
]
},
{
"cell_type": "markdown",
"id": "5e3034e8",
@@ -484,27 +512,11 @@
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"id": "e3541040",
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"id": "b94dafc7",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -518,7 +530,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.10"
"version": "3.10.0"
}
},
"nbformat": 4,
33 changes: 21 additions & 12 deletions nb/01_03_Hubble_Measurements.ipynb
@@ -1,5 +1,23 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Measurements: a common-sense view\n",
"\n",
"### Goals:\n",
"\n",
"1. To establish a common-sense understanding of how to interpret a set of measurements using a histogram.\n",
"2. To get practical knowledge of simple statistics, such as the mean, median, and standard deviation, by comparing them to our common-sense understanding.\n",
"3. To contemplate the grandeur of the universe and the mind-blowing fact that it is expanding.\n",
"\n",
"### Timing\n",
"\n",
"1. Try to finish this notebook in 20-25 minutes\n",
"\n"
]
},
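The goals above compare the mean, median, and standard deviation against common sense. A minimal sketch of that comparison (editor's example; the values are illustrative, not the notebook's Hubble data):

```python
import numpy as np

# Four consistent measurements plus one outlier, to show how the
# mean and median respond differently.
values = np.array([70.0, 71.5, 69.8, 70.4, 90.0])

mean = np.mean(values)        # pulled upward by the outlier
median = np.median(values)    # barely moved by the outlier
std = np.std(values, ddof=1)  # sample standard deviation
```

The outlier drags the mean well above every typical value, while the median stays put, which previews the notebook's question about which statistic best summarizes a set of measurements.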
{
"cell_type": "code",
"execution_count": null,
@@ -101,7 +119,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"These data are in the form of a table with three columns, the first column is the measured value and the next two columns are the estimated uncertainties."
"These data are in the form of a table with three columns: the first column is the measured value, and the next two columns are the estimated uncertainties. Let's have a look:"
]
},
{
@@ -323,15 +341,6 @@
"#### 10.2 What does this suggest about using the mean or the median to summarize a set of measurements? What about which statistic we might use to characterize the uncertainty?"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
" "
]
},
{
"cell_type": "code",
"execution_count": null,
@@ -342,7 +351,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -356,7 +365,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.10"
"version": "3.10.0"
}
},
"nbformat": 4,
28 changes: 26 additions & 2 deletions nb/02_01_More_Dice_Rolling.ipynb
@@ -1,5 +1,29 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "25c60130",
"metadata": {},
"source": [
"# Weighted Averages\n",
"\n",
"### Goals:\n",
"\n",
"1. To review the concept of weighted averages.\n",
"2. To understand when it makes sense to use weighted averages. \n",
"3. To understand how histograms and weighted averages are tools that can be used to summarize large data sets with a much smaller set of numbers.\n",
"\n",
"### Timing\n",
"\n",
"1. Try to finish this notebook in 30-35 minutes\n",
"\n",
"### Question and Answer Template\n",
"\n",
"You can go to the link below, and do \"file\" -> \"make a copy\" to make yourself a Google Doc that you can use to fill in the answers to the questions in this week's notebooks.\n",
"\n",
"https://docs.google.com/document/d/1ZmV0GQr0SfdIbLfKm5ibpRwBmCR8KGVTmVatiQr8sxI/edit?usp=sharing"
]
},
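The goals cell above reviews the concept of weighted averages. A minimal sketch of the formula, mean_w = sum(w_i * x_i) / sum(w_i) (editor's example; the numbers are illustrative, not from the notebook):

```python
import numpy as np

# Values x_i with weights w_i; the last value counts double.
x = np.array([2.0, 3.0, 4.0])
w = np.array([1.0, 1.0, 2.0])

weighted_mean = np.sum(w * x) / np.sum(w)
plain_mean = x.mean()  # equal weights reduce to the ordinary mean
```

Doubling the weight on the largest value pulls the weighted mean (3.25) above the ordinary mean (3.0), which is the basic behavior the notebook builds on.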
{
"cell_type": "code",
"execution_count": null,
@@ -279,7 +303,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -293,7 +317,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.10"
"version": "3.10.0"
}
},
"nbformat": 4,
61 changes: 37 additions & 24 deletions nb/02_02_Hubble_Constant_Uncertainties.ipynb
@@ -1,5 +1,23 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "f896a5b3",
"metadata": {},
"source": [
"# Weighted Averages and Measurement Uncertainties\n",
"\n",
"### Goals:\n",
"\n",
"1. To review simple statistics, the mean and standard deviation.\n",
"2. To understand a new statistic, the variance, and why it is so useful.\n",
"3. To understand why using the inverse of the variance to do weighted averages (i.e., \"inverse variance weighting\") is the standard way to combine measurements that have different uncertainties.\n",
"\n",
"### Timing\n",
"\n",
"1. Try to finish this notebook in 30-35 minutes\n"
]
},
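Goal 3 above names inverse-variance weighting. A minimal sketch of the rule, with weights w_i = 1 / sigma_i**2 (editor's example; the two measurements are illustrative, not from the notebook):

```python
import numpy as np

# Two measurements x_i with uncertainties sigma_i; the more precise
# measurement gets the larger weight.
x = np.array([70.0, 74.0])
sigma = np.array([1.0, 2.0])

w = 1.0 / sigma**2
combined = np.sum(w * x) / np.sum(w)       # lands closer to the precise measurement
combined_error = np.sqrt(1.0 / np.sum(w))  # smaller than either input uncertainty
```

The combined value sits four times closer to the precise measurement than to the imprecise one, and the combined uncertainty beats both inputs, which is why this weighting is the standard choice.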
{
"cell_type": "code",
"execution_count": null,
@@ -219,9 +237,9 @@
"id": "c640040b",
"metadata": {},
"source": [
"### Simulating a bunch of measurements of the Hubble constant\n",
"##### Simulating a bunch of measurements of the Hubble constant\n",
"\n",
"Now we are going to pretend that we sent out 10 teams of scientist, and asked each of them to do some measurements of the Hubble constant, and that all the measurments are draw from the distribution above. The groups of scientiest do a different number of measurements, but in total they have 100 measurements.\n",
"Now we are going to pretend that we sent out 5 teams of scientists, and asked each of them to do some measurements of the Hubble constant, and that all the measurements are drawn from the distribution above. The groups of scientists do a different number of measurements, but in total they have 100 measurements.\n",
"\n",
"We are then going to consider two different ways of combining their results.\n",
"\n",
@@ -236,18 +254,13 @@
"metadata": {},
"outputs": [],
"source": [
"dataSample_0 = np.random.normal(loc=H0_mean, scale=H0_std, size=20)\n",
"dataSample_1 = np.random.normal(loc=H0_mean, scale=H0_std, size=4)\n",
"dataSample_2 = np.random.normal(loc=H0_mean, scale=H0_std, size=12)\n",
"dataSample_3 = np.random.normal(loc=H0_mean, scale=H0_std, size=10)\n",
"dataSample_4 = np.random.normal(loc=H0_mean, scale=H0_std, size=16)\n",
"dataSample_5 = np.random.normal(loc=H0_mean, scale=H0_std, size=7)\n",
"dataSample_6 = np.random.normal(loc=H0_mean, scale=H0_std, size=3)\n",
"dataSample_7 = np.random.normal(loc=H0_mean, scale=H0_std, size=8)\n",
"dataSample_8 = np.random.normal(loc=H0_mean, scale=H0_std, size=11)\n",
"dataSample_9 = np.random.normal(loc=H0_mean, scale=H0_std, size=9)\n",
"dataSamples = [dataSample_0, dataSample_1, dataSample_2, dataSample_3, dataSample_4,\n",
" dataSample_5, dataSample_6, dataSample_7, dataSample_8, dataSample_9]\n",
"np.random.seed(1234)\n",
"dataSample_0 = np.random.normal(loc=H0_mean, scale=H0_std, size=50)\n",
"dataSample_1 = np.random.normal(loc=H0_mean, scale=H0_std, size=3)\n",
"dataSample_2 = np.random.normal(loc=H0_mean, scale=H0_std, size=27)\n",
"dataSample_3 = np.random.normal(loc=H0_mean, scale=H0_std, size=2)\n",
"dataSample_4 = np.random.normal(loc=H0_mean, scale=H0_std, size=18)\n",
"dataSamples = [dataSample_0, dataSample_1, dataSample_2, dataSample_3, dataSample_4]\n",
"mergedSample = np.hstack(dataSamples)"
]
},
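The cell above splits 100 simulated measurements into five unequal groups. A sketch of the step that naturally follows, each group reporting its mean with a standard error that shrinks as 1/sqrt(n) (editor's example; the seed and the H0 values are illustrative stand-ins, not the notebook's):

```python
import numpy as np

rng = np.random.default_rng(1234)
H0_mean, H0_std = 70.0, 2.5  # stand-in values for illustration

sizes = [50, 3, 27, 2, 18]  # unequal group sizes totalling 100
samples = [rng.normal(loc=H0_mean, scale=H0_std, size=n) for n in sizes]

# Each group reports its sample mean; the error on that mean is the
# sample standard deviation divided by sqrt(n), so small groups get
# large error bars.
means = np.array([s.mean() for s in samples])
errors = np.array([s.std(ddof=1) / np.sqrt(len(s)) for s in samples])
```

These per-group means and errors are exactly the inputs the notebook then combines by a straight mean versus a weighted mean.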
@@ -298,8 +311,8 @@
"metadata": {},
"outputs": [],
"source": [
"_ = plt.errorbar(means, np.arange(10), xerr=(errors), fmt=\".\")\n",
"_ = plt.xlim(68.,75.)\n",
"_ = plt.errorbar(means, np.arange(5), xerr=(errors), fmt=\".\")\n",
"_ = plt.xlim(67.,76.)\n",
"_ = plt.xlabel(\"Mean of sub-sample\")\n",
"_ = plt.ylabel(\"Group number\")"
]
@@ -367,14 +380,14 @@
"metadata": {},
"outputs": [],
"source": [
"_ = plt.errorbar(means, np.arange(10), xerr=(errors), fmt=\".\", color='k')\n",
"_ = plt.xlim(68., 75)\n",
"_ = plt.errorbar(means, np.arange(5), xerr=(errors), fmt=\".\", color='k')\n",
"_ = plt.xlim(67., 76)\n",
"_ = plt.xlabel(\"Mean of sub-sample\")\n",
"_ = plt.ylabel(\"Experiment number\")\n",
"_ = plt.errorbar(overall_mean, 4.6, xerr=overall_error, yerr=5, fmt='o', color='g', label=\"Full Sample\")\n",
"_ = plt.errorbar(straight_mean, 4.2, xerr=straight_error, yerr=5, fmt='o', color='r', label=\"Mean\")\n",
"_ = plt.errorbar(weighted_mean, 4.4, xerr=weighted_error, yerr=5, fmt='o', color='b', label=\"Weighted Mean\")\n",
"_ = plt.scatter(H0_mean, 4.8, marker='o', color='cyan', label=\"True\")\n",
"_ = plt.errorbar(overall_mean, 2.6, xerr=overall_error, yerr=2.5, fmt='o', color='g', label=\"Full Sample\")\n",
"_ = plt.errorbar(straight_mean, 2.2, xerr=straight_error, yerr=2.5, fmt='o', color='r', label=\"Mean\")\n",
"_ = plt.errorbar(weighted_mean, 2.4, xerr=weighted_error, yerr=2.5, fmt='o', color='b', label=\"Weighted Mean\")\n",
"_ = plt.scatter(H0_mean, 2.5, marker='o', color='cyan', label=\"True\")\n",
"_ = plt.legend()"
]
},
@@ -459,7 +472,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
@@ -473,7 +486,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.10"
"version": "3.10.0"
}
},
"nbformat": 4,