Commit
1 parent b62db7e, commit a80a8bf
Showing 9 changed files with 1,668 additions and 1 deletion.
19 changes: 19 additions & 0 deletions
_freeze/modules/Module12-Function/execute-results/html.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,19 @@ | ||
{ | ||
"hash": "ec7586ee7098ff5dac0fa2452f3fda03", | ||
"result": { | ||
"engine": "knitr", | ||
"markdown": "---\ntitle: \"Module 12: Function\"\nformat: \n revealjs:\n scrollable: true\n smaller: true\n toc: false\n---\n\n\n\n## Learning Objectives\n\nAfter Module 12, you should be able to:\n\n- Create your own function\n\n## Writing your own functions\n\nSo far, we have seen many functions (e.g., `c()`, `class()`, `mean()`, `transform()`, `aggregate()`, and many more).\n\n**Why create your own function?**\n\n1. to cut down on repetitive coding\n2. to organize code into manageable chunks\n3. to avoid running code unintentionally\n4. to use names that make sense to you\n\n## Writing your own functions\n\nHere we will write a function that multiplies some number (x) by 2:\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntimes_2 <- function(x) x*2\n```\n:::\n\n\nWhen you run the line of code above, you make the function ready to use (no output yet!).\nLet's test it!\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntimes_2(x=10)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 20\n```\n\n\n:::\n:::\n\n\n\n## Writing your own functions: { }\n\nAdding the curly brackets - `{ }` - allows you to write functions spanning multiple lines:\n\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntimes_3 <- function(x) {\n x*3\n}\ntimes_3(x=10)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 30\n```\n\n\n:::\n:::\n\n\n\n## Writing your own functions: `return`\n\nIf we want something specific for the function's output, we use `return()`. Note, if you want to return more than one object, you need to put them into a list using the `list()` function.\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntimes_4 <- function(x) {\n output <- x * 4\n return(list(output, x))\n}\ntimes_4(x = 10)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[[1]]\n[1] 40\n\n[[2]]\n[1] 10\n```\n\n\n:::\n:::\n\n\n\n\n## Function Syntax\n\nThis is a brief introduction. 
The syntax is:\n\n```\nfunctionName = function(inputs) {\n< function body >\nreturn(list(value1, value2))\n}\n```\n\nNote that to make the function available for use, you need to:\n\n1. Code/type the function\n2. Execute/run the lines of code\n\nOnly then will the function be available in the Environment pane and ready to use.\n\n## Writing your own functions: multiple arguments\n\nFunctions can take multiple arguments / inputs. Here the function has two arguments, `x` and `y`:\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntimes_2_plus_y <- function(x, y) {\n out <- x * 2 + y\n return(out)\n}\ntimes_2_plus_y(x = 10, y = 3)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 23\n```\n\n\n:::\n:::\n\n\n\n## Writing your own functions: argument defaults\n\nFunctions can have default arguments. This lets us use the function without specifying the arguments.\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\ntimes_2_plus_y <- function(x = 10, y = 3) {\n out <- x * 2 + y\n return(out)\n}\ntimes_2_plus_y()\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 23\n```\n\n\n:::\n:::\n\n\n\nWe got an answer because we put defaults into the function arguments.\n\n## Writing a simple function\n\nLet's write a function, `sqdif`, that:\n\n1. takes two numbers `x` and `y` with default values of 2 and 3.\n2. takes the difference\n3. squares this difference\n4. then returns the final value\n\n```\nfunctionName = function(inputs) {\n< function body >\nreturn(list(value1, value2))\n}\n```\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nsqdif <- function(x=2,y=3){\n output <- (x-y)^2\n return(output)\n}\n\nsqdif()\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 1\n```\n\n\n:::\n\n```{.r .cell-code}\nsqdif(x=10,y=5)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 25\n```\n\n\n:::\n\n```{.r .cell-code}\nsqdif(10,5)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] 25\n```\n\n\n:::\n:::\n\n\n\n## Writing your own functions: characters\n\nFunctions can take inputs of any data type. 
For example, here is a function that takes a character input:\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nloud <- function(word) {\n output <- rep(toupper(word), 5)\n return(output)\n}\nloud(word = \"hooray!\")\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n[1] \"HOORAY!\" \"HOORAY!\" \"HOORAY!\" \"HOORAY!\" \"HOORAY!\"\n```\n\n\n:::\n:::\n\n\n\n\n## Using functions with `aggregate()`\n\nYou can apply functions easily to groups with `aggregate()`. As a reminder, we learned `aggregate()` yesterday in Module 9. We will take a quick look at the data.\n\n\n\n::: {.cell}\n\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\nhead(df)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n observation_id IgG_concentration age gender slum age_group seropos\n1 5772 0.31768953 2 Female Non slum young 0\n2 8095 3.43682311 4 Female Non slum young 0\n3 9784 0.30000000 4 Male Non slum young 0\n4 9338 143.23630140 4 Male Non slum young 1\n5 6369 0.44765343 1 Male Non slum young 0\n6 6885 0.02527076 4 Male Non slum young 0\n```\n\n\n:::\n:::\n\n\n\nThen, we used the following code to estimate the standard deviation of `IgG_concentration` for each unique combination of the `age_group` and `slum` variables. \n\n\n\n::: {.cell}\n\n```{.r .cell-code}\naggregate(\n\tIgG_concentration ~ age_group + slum,\n\tdata = df,\n\tFUN = sd # standard deviation\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n age_group slum IgG_concentration\n1 young Mixed 174.89797\n2 middle Mixed 162.08188\n3 old Mixed 150.07063\n4 young Non slum 114.68422\n5 middle Non slum 177.62113\n6 old Non slum 141.22330\n7 young Slum 61.85705\n8 middle Slum 202.42018\n9 old Slum 74.75217\n```\n\n\n:::\n:::\n\n\n\n\n## Using functions with `aggregate()`\n\nBut let's say we want to do something different. Rather than taking the standard deviation and using a function that already exists (`sd()`), let's take the natural log of `IgG_concentration` and then get the mean. 
To do this, we can create our own function and then plug it into the `FUN` argument. \n\nStep 1 - code/type our own function\n\n\n::: {.cell}\n\n```{.r .cell-code}\nlog_mean_function <- function(x){\n\toutput <- mean(log(x))\n\treturn(output)\n}\n```\n:::\n\n\n\n</br>\n\nStep 2 - execute our function (i.e., run the lines of code), and you will now be able to see it in your Environment pane.\n\n\n\n::: {.cell layout-align=\"left\"}\n::: {.cell-output-display}\n![](images/log_mean_function.png){fig-align='left' width=100%}\n:::\n:::\n\n\n\n</br>\n\nStep 3 - use the function by plugging it into the `aggregate()` function to complete our task\n\n\n::: {.cell}\n\n```{.r .cell-code}\naggregate(\n\tIgG_concentration ~ age_group + slum,\n\tdata = df,\n\tFUN = log_mean_function\n)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n age_group slum IgG_concentration\n1 young Mixed 0.50082888\n2 middle Mixed 2.85916401\n3 old Mixed 3.13971163\n4 young Non slum 0.14060433\n5 middle Non slum 2.30717077\n6 old Non slum 3.77021233\n7 young Slum -0.04611508\n8 middle Slum 2.46490973\n9 old Slum 3.52357989\n```\n\n\n:::\n:::\n\n\n\n\n## Example from the last module\n\nIn the last module, we used a loop to go through every country in the dataset, get the median, first and third quartiles, and range for each country, and store those summary statistics in a data frame.\n\n\n\n\n::: {.cell}\n\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\nfor (i in 1:length(countries)) {\n\t# Get the data for the current country only\n\tcountry_data <- subset(meas, country == countries[i])\n\t\n\t# Get the summary statistics for this country\n\tcountry_cases <- country_data$Cases\n\tcountry_quart <- quantile(\n\t\tcountry_cases, na.rm = TRUE, probs = c(0.25, 0.5, 0.75)\n\t)\n\tcountry_range <- range(country_cases, na.rm = TRUE)\n\t\n\t# Save the summary statistics into a data frame\n\tcountry_summary <- data.frame(\n\t\tcountry = countries[[i]],\n\t\tmin = country_range[[1]],\n\t\tQ1 = 
country_quart[[1]],\n\t\tmedian = country_quart[[2]],\n\t\tQ3 = country_quart[[3]],\n\t\tmax = country_range[[2]]\n\t)\n\t\n\t# Save the results to our container\n\tres[[i]] <- country_summary\n}\n```\n:::\n\n\n\n## Function instead of Loop\n\nHere we are going to set up a function that takes our data frame and outputs the median, first and third quartiles, and range of measles cases for a specified country.\n\nStep 1 - code/type our own function. We specify two arguments: the first is our data frame and the second is one country's iso3 code. Notice, I included common documentation for \n\n\n\n::: {.cell}\n\n:::\n\n::: {.cell}\n\n```{.r .cell-code}\nget_country_stats <- function(df, iso3_code){\n\t\n\t# Get the data for the specified country only\n\tcountry_data <- subset(df, iso3c == iso3_code)\n\t\n\t# Get the summary statistics for this country\n\tcountry_cases <- country_data$Cases\n\tcountry_quart <- quantile(\n\t\tcountry_cases, na.rm = TRUE, probs = c(0.25, 0.5, 0.75)\n\t)\n\tcountry_range <- range(country_cases, na.rm = TRUE)\n\t\n\t# Look up the country's name from its iso3 code\n\tcountry_name <- unique(country_data$country)\n\t\n\t# Save the summary statistics into a data frame\n\tcountry_summary <- data.frame(\n\t\tcountry = country_name,\n\t\tmin = country_range[[1]],\n\t\tQ1 = country_quart[[1]],\n\t\tmedian = country_quart[[2]],\n\t\tQ3 = country_quart[[3]],\n\t\tmax = country_range[[2]]\n\t)\n\t\n\treturn(country_summary)\n}\n```\n:::\n\n\n\nStep 2 - execute our function (i.e., run the lines of code), and you will now be able to see it in your Environment pane.\n\n\n\n::: {.cell layout-align=\"left\"}\n::: {.cell-output-display}\n![](images/get_country_stats_function.png){fig-align='left' width=100%}\n:::\n:::\n\n\n\nStep 3 - use the function by pulling out stats for India and Pakistan\n\n\n::: {.cell}\n\n```{.r .cell-code}\nget_country_stats(df=meas, iso3_code=\"IND\")\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n country min Q1 median Q3 max\n1 India 3305 30813 47072 74828.5 252940\n```\n\n\n:::\n\n```{.r .cell-code}\nget_country_stats(df=meas, iso3_code=\"PAK\")\n```\n\n::: 
{.cell-output .cell-output-stdout}\n\n```\n country min Q1 median Q3 max\n1 Pakistan 386 2065 3903 13860.5 55543\n```\n\n\n:::\n:::\n\n\n\n\n## Summary\n\n- Simple functions take the form:\n```\nfunctionName = function(arguments) {\n\t< function body >\n\treturn(list(outputs))\n}\n```\n- We can specify argument defaults when we create the function\n\n\n## Mini Exercise\n\nCreate your own function that saves a line plot of a time series of measles cases for a specified country.\n\nStep 1. Determine your arguments, which are the same as in the last example.\n\nStep 2. Begin your function by subsetting the data to include only the country specified in the arguments (i.e., `country_data`); this is the same as the first line of code in the last example.\n\nStep 3. Return to Module 10 to remember how to use the `plot()` function. Hint: you will need to specify the argument `type=\"l\"` to make it a line plot. \n\nStep 4. Return to your function and add code to create a new plot using the `country_data` object. Note that you will need to call the `png()` function before the `plot()` function and end with `dev.off()` in order to save the file.\n\nStep 5. Use the function to generate a plot for India and Pakistan.\n\n# Mini Exercise Answer\n\n\n\n::: {.cell}\n\n```{.r .cell-code}\nget_time_series_plot <- function(df, iso3_code){\n\t\n\t# Get the data for the specified country only\n\tcountry_data <- subset(df, iso3c == iso3_code)\n\t\n\t# Open a PNG file (the output/ folder must already exist), draw the line plot, then close the file to save it\n\tpng(filename=paste0(\"output/time_series_\", iso3_code, \".png\"))\n\tplot(country_data$time, country_data$Cases, type=\"l\", xlab=\"year\", ylab=\"Measles Cases\")\n\tdev.off()\n\t\n}\n\nget_time_series_plot(df=meas, iso3_code=\"IND\")\nget_time_series_plot(df=meas, iso3_code=\"PAK\")\n```\n:::\n", | ||
"supporting": [], | ||
"filters": [ | ||
"rmarkdown/pagebreak.lua" | ||
], | ||
"includes": { | ||
"include-after-body": [ | ||
"\n<script>\n // htmlwidgets need to know to resize themselves when slides are shown/hidden.\n // Fire the \"slideenter\" event (handled by htmlwidgets.js) when the current\n // slide changes (different for each slide format).\n (function () {\n // dispatch for htmlwidgets\n function fireSlideEnter() {\n const event = window.document.createEvent(\"Event\");\n event.initEvent(\"slideenter\", true, true);\n window.document.dispatchEvent(event);\n }\n\n function fireSlideChanged(previousSlide, currentSlide) {\n fireSlideEnter();\n\n // dispatch for shiny\n if (window.jQuery) {\n if (previousSlide) {\n window.jQuery(previousSlide).trigger(\"hidden\");\n }\n if (currentSlide) {\n window.jQuery(currentSlide).trigger(\"shown\");\n }\n }\n }\n\n // hookup for slidy\n if (window.w3c_slidy) {\n window.w3c_slidy.add_observer(function (slide_num) {\n // slide_num starts at position 1\n fireSlideChanged(null, w3c_slidy.slides[slide_num - 1]);\n });\n }\n\n })();\n</script>\n\n" | ||
] | ||
}, | ||
"engineDependencies": {}, | ||
"preserve": {}, | ||
"postProcess": true | ||
} | ||
} |