update day 3

UGA-IDD · Jul 17, 2024 · 824d1ed
1 parent a80a8bf
commit 824d1ed

Showing 67 changed files with 3,667 additions and 4,594 deletions.
diff --git a/_freeze/modules/Module05-DataImportExport/execute-results/html.json b/_freeze/modules/Module05-DataImportExport/execute-results/html.json
diff --git a/_freeze/modules/Module08-DataMergeReshape/execute-results/html.json b/_freeze/modules/Module08-DataMergeReshape/execute-results/html.json
diff --git a/_freeze/modules/Module10-DataVisualization/execute-results/html.json b/_freeze/modules/Module10-DataVisualization/execute-results/html.json
diff --git a/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-12-1.png b/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-12-1.png
diff --git a/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-12-2.png b/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-12-2.png
diff --git a/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-15-1.png b/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-15-1.png
diff --git a/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-15-2.png b/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-15-2.png
diff --git a/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-16-1.png b/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-16-1.png
diff --git a/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-19-1.png b/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-19-1.png
diff --git a/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-19-2.png b/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-19-2.png
diff --git a/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-22-1.png b/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-22-1.png
diff --git a/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-22-2.png b/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-22-2.png
diff --git a/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-26-1.png b/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-26-1.png
diff --git a/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-28-1.png b/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-28-1.png
diff --git a/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-31-1.png b/_freeze/modules/Module10-DataVisualization/figure-revealjs/unnamed-chunk-31-1.png
diff --git a/_freeze/modules/Module12-Iteration/execute-results/html.json b/_freeze/modules/Module12-Iteration/execute-results/html.json
diff --git a/_freeze/modules/Module12-Iteration/figure-revealjs/unnamed-chunk-32-1.png b/_freeze/modules/Module12-Iteration/figure-revealjs/unnamed-chunk-32-1.png
diff --git a/_freeze/modules/Module12-Iteration/figure-revealjs/unnamed-chunk-34-1.png b/_freeze/modules/Module12-Iteration/figure-revealjs/unnamed-chunk-34-1.png
diff --git a/_freeze/modules/Module13-Functions/execute-results/html.json b/_freeze/modules/Module13-Functions/execute-results/html.json
@@ -0,0 +1,19 @@
+{
+  "hash": "24fe9f90add9d98a5efd4fc90b80f020",
+  "result": {
+    "engine": "knitr",
+    "markdown": "---\ntitle: \"Module 13: Functions\"\nformat: \n  revealjs:\n    scrollable: true\n    smaller: true\n    toc: false\n---\n\n\n## Learning Objectives\n\nAfter Module 13, you should be able to:\n\n- Create your own function\n\n## Writing your own functions\n\nSo far, we have seen many functions (e.g., `c()`, `class()`, `mean()`, `transform()`, `aggregate()`, and many more).\n\n**Why create your own function?**\n\n1. to cut down on repetitive coding\n2. to organize code into manageable chunks\n3. to avoid running code unintentionally\n4. to use names that make sense to you\n\n## Writing your own functions\n\nHere we will write a function that multiplies some number (x) by 2:\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntimes_2 <- function(x) x*2\n```\n:::\n\nWhen you run the line of code above, you make it ready to use (no output yet!).\nLet's test it!\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntimes_2(x=10)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 20\n```\n\n\n:::\n:::\n\n\n## Writing your own functions: { }\n\nAdding the curly brackets - `{ }` - allows you to write functions spanning multiple lines:\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntimes_3 <- function(x) {\n  x*3\n}\ntimes_3(x=10)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 30\n```\n\n\n:::\n:::\n\n\n## Writing your own functions: `return`\n\nIf we want something specific for the function's output, we use `return()`. Note, if you want to return more than one object, you need to put them into a list using the `list()` function.\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntimes_4 <- function(x) {\n  output <- x * 4\n  return(list(output, x))\n}\ntimes_4(x = 10)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[[1]]\n[1] 40\n\n[[2]]\n[1] 10\n```\n\n\n:::\n:::\n\n\n\n## Function Syntax\n\nThis is a brief introduction. 
The syntax is:\n\n```\nfunctionName = function(inputs) {\n< function body >\nreturn(list(value1, value2))\n}\n```\n\nNote that to make the function available for use, you need to:\n\n1. Code/type the function\n2. Execute/run the lines of code\n\nOnly then will the function be available in the Environment pane and ready to use.\n\n## Writing your own functions: multiple arguments\n\nFunctions can take multiple arguments / inputs. Here the function has two arguments, `x` and `y`:\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntimes_2_plus_y <- function(x, y) {\n  out <- x * 2 + y\n  return(out)\n}\ntimes_2_plus_y(x = 10, y = 3)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 23\n```\n\n\n:::\n:::\n\n\n## Writing your own functions: argument defaults\n\nFunctions can have default arguments. This lets us use the function without specifying the arguments:\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntimes_2_plus_y <- function(x = 10, y = 3) {\n  out <- x * 2 + y\n  return(out)\n}\ntimes_2_plus_y()\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 23\n```\n\n\n:::\n:::\n\n\nWe got an answer because we put defaults into the function arguments.\n\n## Writing a simple function\n\nLet's write a function, `sqdif`, that:\n\n1. takes two numbers `x` and `y` with default values of 2 and 3.\n2. takes the difference\n3. squares this difference\n4. then returns the final value\n\n```\nfunctionName = function(inputs) {\n< function body >\nreturn(list(value1, value2))\n}\n```\n\n\n::: {.cell}\n\n```{.r .cell-code}\nsqdif <- function(x=2,y=3){\n     output <- (x-y)^2\n     return(output)\n}\n\nsqdif()\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 1\n```\n\n\n:::\n\n```{.r .cell-code}\nsqdif(x=10,y=5)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 25\n```\n\n\n:::\n\n```{.r .cell-code}\nsqdif(10,5)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 25\n```\n\n\n:::\n:::\n\n\n## Writing your own functions: characters\n\nFunctions can take inputs of any data type. 
For example, here is a function with characters:\n\n\n::: {.cell}\n\n```{.r .cell-code}\nloud <- function(word) {\n  output <- rep(toupper(word), 5)\n  return(output)\n}\nloud(word = \"hooray!\")\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] \"HOORAY!\" \"HOORAY!\" \"HOORAY!\" \"HOORAY!\" \"HOORAY!\"\n```\n\n\n:::\n:::\n\n\n\n## Using functions with `aggregate()`\n\nYou can apply functions easily to groups with `aggregate()`. As a reminder, we learned `aggregate()` yesterday in Module 9. We will take a quick look at the data.\n\n\n::: {.cell}\n\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\nhead(df)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n  observation_id IgG_concentration age gender     slum age_group seropos\n1           5772        0.31768953   2 Female Non slum     young       0\n2           8095        3.43682311   4 Female Non slum     young       0\n3           9784        0.30000000   4   Male Non slum     young       0\n4           9338      143.23630140   4   Male Non slum     young       1\n5           6369        0.44765343   1   Male Non slum     young       0\n6           6885        0.02527076   4   Male Non slum     young       0\n```\n\n\n:::\n:::\n\n\nThen, we used the following code to estimate the standard deviation of `IgG_concentration` for each unique combination of `age_group` and `slum` variables.  
\n\n\n::: {.cell}\n\n```{.r .cell-code}\naggregate(\n\tIgG_concentration ~ age_group + slum,\n\tdata = df,\n\tFUN = sd # standard deviation\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n  age_group     slum IgG_concentration\n1     young    Mixed         174.89797\n2    middle    Mixed         162.08188\n3       old    Mixed         150.07063\n4     young Non slum         114.68422\n5    middle Non slum         177.62113\n6       old Non slum         141.22330\n7     young     Slum          61.85705\n8    middle     Slum         202.42018\n9       old     Slum          74.75217\n```\n\n\n:::\n:::\n\n\n\n## Using functions with `aggregate()`\n\nBut let's say we want to do something different. Rather than taking the standard deviation and using a function that already exists (`sd()`), let's take the natural log of `IgG_concentration` and then get the mean.  To do this, we can create our own function and then plug it into the `FUN` argument.  \n\nStep 1 - code/type our own function\n\n::: {.cell}\n\n```{.r .cell-code}\nlog_mean_function <- function(x){\n\toutput <- mean(log(x))\n\treturn(output)\n}\n```\n:::\n\n\n</br>\n\nStep 2 - execute our function (i.e., run the lines of code), and you will now be able to see it in your Environment pane.\n\n\n::: {.cell layout-align=\"left\"}\n::: {.cell-output-display}\n![](images/log_mean_function.png){fig-align='left' width=100%}\n:::\n:::\n\n\n</br>\n\nStep 3 - use the function by plugging it into the `aggregate()` function in order to complete our task\n\n::: {.cell}\n\n```{.r .cell-code}\naggregate(\n\tIgG_concentration ~ age_group + slum,\n\tdata = df,\n\tFUN = log_mean_function\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n  age_group     slum IgG_concentration\n1     young    Mixed        0.50082888\n2    middle    Mixed        2.85916401\n3       old    Mixed        3.13971163\n4     young Non slum        0.14060433\n5    middle Non slum        2.30717077\n6       old Non slum        3.77021233\n7    
 young     Slum       -0.04611508\n8    middle     Slum        2.46490973\n9       old     Slum        3.52357989\n```\n\n\n:::\n:::\n\n\n\n## Example from Module 12\n\nIn Module 12, we used a loop to go through every country in the dataset, get the median, first and third quartiles, and range for each country, and store those summary statistics in a data frame.\n\n\n\n::: {.cell}\n\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (i in 1:length(countries)) {\n\t# Get the data for the current country only\n\tcountry_data <- subset(meas, country == countries[i])\n\t\n\t# Get the summary statistics for this country\n\tcountry_cases <- country_data$Cases\n\tcountry_quart <- quantile(\n\t\tcountry_cases, na.rm = TRUE, probs = c(0.25, 0.5, 0.75)\n\t)\n\tcountry_range <- range(country_cases, na.rm = TRUE)\n\t\n\t# Save the summary statistics into a data frame\n\tcountry_summary <- data.frame(\n\t\tcountry = countries[[i]],\n\t\tmin = country_range[[1]],\n\t\tQ1 = country_quart[[1]],\n\t\tmedian = country_quart[[2]],\n\t\tQ3 = country_quart[[3]],\n\t\tmax = country_range[[2]]\n\t)\n\t\n\t# Save the results to our container\n\tres[[i]] <- country_summary\n}\n```\n:::\n\n\n## Function instead of Loop\n\nHere we are going to set up a function that takes our data frame and outputs the median, first and third quartiles, and range of measles cases for a specified country.\n\nStep 1 - code/type our own function.  We specify two arguments: the first is our data frame and the second is one country's iso3 code.  
Notice, I included common documentation for  \n\n\n::: {.cell}\n\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\nget_country_stats <- function(df, iso3_code){\n\t\n\tcountry_data <- subset(df, iso3c == iso3_code)\n\t\n\t# Get the summary statistics for this country\n\tcountry_cases <- country_data$Cases\n\tcountry_quart <- quantile(\n\t\tcountry_cases, na.rm = TRUE, probs = c(0.25, 0.5, 0.75)\n\t)\n\tcountry_range <- range(country_cases, na.rm = TRUE)\n\t\n\tcountry_name <- unique(country_data$country)\n\t\n\tcountry_summary <- data.frame(\n\t\tcountry = country_name,\n\t\tmin = country_range[[1]],\n\t\tQ1 = country_quart[[1]],\n\t\tmedian = country_quart[[2]],\n\t\tQ3 = country_quart[[3]],\n\t\tmax = country_range[[2]]\n\t)\n\t\n\treturn(country_summary)\n}\n```\n:::\n\n\nStep 2 - execute our function (i.e., run the lines of code), and you will now be able to see it in your Environment pane.\n\n\n::: {.cell layout-align=\"left\"}\n::: {.cell-output-display}\n![](images/get_country_stats_function.png){fig-align='left' width=100%}\n:::\n:::\n\n\nStep 3 - use the function by pulling out stats for India and Pakistan\n\n::: {.cell}\n\n```{.r .cell-code}\nget_country_stats(df=meas, iso3_code=\"IND\")\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n  country  min    Q1 median      Q3    max\n1   India 3305 30813  47072 74828.5 252940\n```\n\n\n:::\n\n```{.r .cell-code}\nget_country_stats(df=meas, iso3_code=\"PAK\")\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n   country min   Q1 median      Q3   max\n1 Pakistan 386 2065   3903 13860.5 55543\n```\n\n\n:::\n:::\n\n\n\n## Summary\n\n- Simple functions take the form:\n```\nfunctionName = function(arguments) {\n\t< function body >\n\treturn(list(outputs))\n}\n```\n- We can specify argument defaults when we create the function\n\n\n## Mini Exercise\n\nCreate your own function that saves a line plot of a time series of measles cases for a specified country.\n\nStep 1. 
Determine your arguments, which are the same as in the last example.\n\nStep 2. Begin your function by subsetting the data to include only the country specified in the arguments (i.e., `country_data`). This is the same as the first line of code in the last example.\n\nStep 3. Return to Module 10 to remember how to use the `plot()` function.  Hint: you will need to specify the argument `type=\"l\"` to make it a line plot.  \n\nStep 4. Return to your function and add code to create a new plot using the `country_data` object. Note that you will need to call the `png()` function before the `plot()` function and end with `dev.off()` in order to save the file.\n\nStep 5. Use the function to generate a plot for India and Pakistan.\n\n# Mini Exercise Answer\n\n\n::: {.cell}\n\n```{.r .cell-code}\nget_time_series_plot <- function(df, iso3_code){\n\t\n\tcountry_data <- subset(df, iso3c == iso3_code)\n\t\n\tpng(filename=paste0(\"output/time_series_\", iso3_code, \".png\"))\n\tplot(country_data$time, country_data$Cases, type=\"l\", xlab=\"year\", ylab=\"Measles Cases\")\n\tdev.off()\n\t\n}\n\nget_time_series_plot(df=meas, iso3_code=\"IND\")\nget_time_series_plot(df=meas, iso3_code=\"PAK\")\n```\n:::\n",
+    "supporting": [],
+    "filters": [
+      "rmarkdown/pagebreak.lua"
+    ],
+    "includes": {
+      "include-after-body": [
+        "\n<script>\n  // htmlwidgets need to know to resize themselves when slides are shown/hidden.\n  // Fire the \"slideenter\" event (handled by htmlwidgets.js) when the current\n  // slide changes (different for each slide format).\n  (function () {\n    // dispatch for htmlwidgets\n    function fireSlideEnter() {\n      const event = window.document.createEvent(\"Event\");\n      event.initEvent(\"slideenter\", true, true);\n      window.document.dispatchEvent(event);\n    }\n\n    function fireSlideChanged(previousSlide, currentSlide) {\n      fireSlideEnter();\n\n      // dispatch for shiny\n      if (window.jQuery) {\n        if (previousSlide) {\n          window.jQuery(previousSlide).trigger(\"hidden\");\n        }\n        if (currentSlide) {\n          window.jQuery(currentSlide).trigger(\"shown\");\n        }\n      }\n    }\n\n    // hookup for slidy\n    if (window.w3c_slidy) {\n      window.w3c_slidy.add_observer(function (slide_num) {\n        // slide_num starts at position 1\n        fireSlideChanged(null, w3c_slidy.slides[slide_num - 1]);\n      });\n    }\n\n  })();\n</script>\n\n"
+      ]
+    },
+    "engineDependencies": {},
+    "preserve": {},
+    "postProcess": true
+  }
+}