<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml"
lang="en" xml:lang="en">
<head>
<title>Introduction to Probability and Statistics Using R</title>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8"/>
<meta name="generator" content="Org-mode"/>
<meta name="generated" content="2011-10-05 02:34:30 EDT"/>
<meta name="author" content="G. Jay Kerns"/>
<meta name="description" content=""/>
<meta name="keywords" content=""/>
<style type="text/css">
<!--/*--><![CDATA[/*><!--*/
html { font-family: Times, serif; font-size: 12pt; }
.title { text-align: center; }
.todo { color: red; }
.done { color: green; }
.tag { background-color: #add8e6; font-weight:normal }
.target { }
.timestamp { color: #bebebe; }
.timestamp-kwd { color: #5f9ea0; }
.right {margin-left:auto; margin-right:0px; text-align:right;}
.left {margin-left:0px; margin-right:auto; text-align:left;}
.center {margin-left:auto; margin-right:auto; text-align:center;}
p.verse { margin-left: 3% }
pre {
border: 1pt solid #AEBDCC;
background-color: #F3F5F7;
padding: 5pt;
font-family: courier, monospace;
font-size: 90%;
overflow:auto;
}
table { border-collapse: collapse; }
td, th { vertical-align: top; }
th.right { text-align:center; }
th.left { text-align:center; }
th.center { text-align:center; }
td.right { text-align:right; }
td.left { text-align:left; }
td.center { text-align:center; }
dt { font-weight: bold; }
div.figure { padding: 0.5em; }
div.figure p { text-align: center; }
div.inlinetask {
padding:10px;
border:2px solid gray;
margin:10px;
background: #ffffcc;
}
textarea { overflow-x: auto; }
.linenr { font-size:smaller }
.code-highlighted {background-color:#ffff00;}
.org-info-js_info-navigation { border-style:none; }
#org-info-js_console-label { font-size:10px; font-weight:bold;
white-space:nowrap; }
.org-info-js_search-highlight {background-color:#ffff00; color:#000000;
font-weight:bold; }
/*]]>*/-->
</style>
<link rel="stylesheet" type="text/css" href="css/stylesheet.css" />
<script type="text/javascript" src="http://orgmode.org/org-info.js"></script>
<script type="text/javascript" >
<!--/*--><![CDATA[/*><!--*/
org_html_manager.set("TOC_DEPTH", "2");
org_html_manager.set("LINK_HOME", "http://ipsur.org/index.html");
org_html_manager.set("LINK_UP", "IPSUR.html ");
org_html_manager.set("LOCAL_TOC", "0");
org_html_manager.set("VIEW_BUTTONS", "0");
org_html_manager.set("MOUSE_HINT", "underline");
org_html_manager.set("FIXED_TOC", "0");
org_html_manager.set("TOC", "0");
org_html_manager.set("VIEW", "info");
org_html_manager.setup(); // activate after the parameters are set
/*]]>*///-->
</script>
<script type="text/javascript">
<!--/*--><![CDATA[/*><!--*/
function CodeHighlightOn(elem, id)
{
var target = document.getElementById(id);
if(null != target) {
elem.cacheClassElem = elem.className;
elem.cacheClassTarget = target.className;
target.className = "code-highlighted";
elem.className = "code-highlighted";
}
}
function CodeHighlightOff(elem, id)
{
var target = document.getElementById(id);
if(elem.cacheClassElem)
elem.className = elem.cacheClassElem;
if(elem.cacheClassTarget)
target.className = elem.cacheClassTarget;
}
/*]]>*///-->
</script>
<script type="text/javascript" src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML">
<!--/*--><![CDATA[/*><!--*/
MathJax.Hub.Config({
// Only one of the two following lines, depending on user settings
// First allows browser-native MathML display, second forces HTML/CSS
// config: ["MMLorHTML.js"], jax: ["input/TeX"],
jax: ["input/TeX", "output/HTML-CSS"],
extensions: ["tex2jax.js","TeX/AMSmath.js","TeX/AMSsymbols.js",
"TeX/noUndefined.js"],
tex2jax: {
inlineMath: [ ["\\(","\\)"] ],
displayMath: [ ['$$','$$'], ["\\[","\\]"], ["\\begin{displaymath}","\\end{displaymath}"] ],
skipTags: ["script","noscript","style","textarea","pre","code"],
ignoreClass: "tex2jax_ignore",
processEscapes: false,
processEnvironments: true,
preview: "TeX"
},
showProcessingMessages: true,
displayAlign: "center",
displayIndent: "2em",
"HTML-CSS": {
scale: 100,
availableFonts: ["STIX","TeX"],
preferredFont: "TeX",
webFont: "TeX",
imageFont: "TeX",
showMathMenu: true,
},
MMLorHTML: {
prefer: {
MSIE: "MML",
Firefox: "MML",
Opera: "HTML",
other: "HTML"
}
}
});
/*]]>*///-->
</script>
</head>
<body>
<div id="org-div-home-and-up" style="text-align:right;font-size:70%;white-space:nowrap;">
<a accesskey="h" href="IPSUR.html "> UP </a>
|
<a accesskey="H" href="http://ipsur.org/index.html"> HOME </a>
</div>
<div id="preamble">
</div>
<div id="content">
<h1 class="title">Introduction to Probability and Statistics Using R</h1>
<link rel="shortcut icon" href="http://ipsur.org/img/favicon.ico" />
<div id="table-of-contents">
<h2>Table of Contents</h2>
<div id="text-table-of-contents">
<ul>
<li><a href="#sec-1">1 An Introduction to Probability and Statistics</a>
<ul>
<li><a href="#sec-1-1">1.1 Probability</a></li>
<li><a href="#sec-1-2">1.2 Statistics</a></li>
<li><a href="#sec-1-3">1.3 Exercises</a></li>
</ul>
</li>
<li><a href="#cha-introduction-to-R">2 An Introduction to R</a>
<ul>
<li><a href="#sec-download-install-R">2.1 Downloading and Installing \(\mathsf{R}\)</a></li>
<li><a href="#sec-Communicating-with-R">2.2 Communicating with \(\mathsf{R}\)</a></li>
<li><a href="#sec-Basic-R-Operations">2.3 Basic \(\mathsf{R}\) Operations and Concepts</a></li>
<li><a href="#sec-Getting-Help">2.4 Getting Help</a></li>
<li><a href="#sec-External-Resources">2.5 External Resources</a></li>
<li><a href="#sec-2-6">2.6 Other Tips</a></li>
<li><a href="#sec-2-7">2.7 Exercises</a></li>
</ul>
</li>
<li><a href="#cha-Describing-Data-Distributions">3 Data Description</a>
<ul>
<li><a href="#sec-Types-of-Data">3.1 Types of Data</a></li>
<li><a href="#sec-features-of-data">3.2 Features of Data Distributions</a></li>
<li><a href="#sec-Descriptive-Statistics">3.3 Descriptive Statistics</a></li>
<li><a href="#sec-Exploratory-Data-Analysis">3.4 Exploratory Data Analysis</a></li>
<li><a href="#sec-multivariate-data">3.5 Multivariate Data and Data Frames</a></li>
<li><a href="#sec-Comparing-Data-Sets">3.6 Comparing Populations</a></li>
<li><a href="#sec-3-7">3.7 Exercises</a></li>
</ul>
</li>
<li><a href="#cha-Probability">4 Probability</a>
<ul>
<li><a href="#sec-Sample-Spaces">4.1 Sample Spaces</a></li>
<li><a href="#sec-Events">4.2 Events</a></li>
<li><a href="#sec-Interpreting-Probabilities">4.3 Model Assignment</a></li>
<li><a href="#sec-Properties-of-Probability">4.4 Properties of Probability</a></li>
<li><a href="#sec-Methods-of-Counting">4.5 Counting Methods</a></li>
<li><a href="#sec-Conditional-Probability">4.6 Conditional Probability</a></li>
<li><a href="#sec-Independent-Events">4.7 Independent Events</a></li>
<li><a href="#sec-Bayes--Rule">4.8 Bayes' Rule</a></li>
<li><a href="#sec-Random-Variables">4.9 Random Variables</a></li>
<li><a href="#sec-4-10">4.10 Exercises</a></li>
</ul>
</li>
<li><a href="#cha-Discrete-Distributions">5 Discrete Distributions</a>
<ul>
<li><a href="#sec-discrete-random-variables">5.1 Discrete Random Variables</a></li>
<li><a href="#sec-disc-uniform-dist">5.2 The Discrete Uniform Distribution</a></li>
<li><a href="#sec-binom-dist">5.3 The Binomial Distribution</a></li>
<li><a href="#sec-expectation-and-mgfs">5.4 Expectation and Moment Generating Functions</a></li>
<li><a href="#sec-empirical-distribution">5.5 The Empirical Distribution</a></li>
<li><a href="#sec-other-discrete-distributions">5.6 Other Discrete Distributions</a></li>
<li><a href="#sec-functions-discrete-rvs">5.7 Functions of Discrete Random Variables</a></li>
<li><a href="#sec-5-8">5.8 Exercises</a></li>
</ul>
</li>
<li><a href="#cha-Continuous-Distributions">6 Continuous Distributions</a>
<ul>
<li><a href="#sec-continuous-random-variables">6.1 Continuous Random Variables</a></li>
<li><a href="#sec-The-Continuous-Uniform">6.2 The Continuous Uniform Distribution</a></li>
<li><a href="#sec-The-Normal-Distribution">6.3 The Normal Distribution</a></li>
<li><a href="#sec-Functions-of-Continuous">6.4 Functions of Continuous Random Variables</a></li>
<li><a href="#sec-Other-Continuous-Distributions">6.5 Other Continuous Distributions</a></li>
<li><a href="#sec-6-6">6.6 Exercises</a></li>
</ul>
</li>
<li><a href="#cha-Multivariable-Distributions">7 Multivariate Distributions</a>
<ul>
<li><a href="#sec-Joint-Probability-Distributions">7.1 Joint and Marginal Probability Distributions</a></li>
<li><a href="#sec-Joint-and-Marginal-Expectation">7.2 Joint and Marginal Expectation</a></li>
<li><a href="#sec-Conditional-Distributions">7.3 Conditional Distributions</a></li>
<li><a href="#sec-Independent-Random-Variables">7.4 Independent Random Variables</a></li>
<li><a href="#sec-Exchangeable-Random-Variables">7.5 Exchangeable Random Variables</a></li>
<li><a href="#sec-The-Bivariate-Normal">7.6 The Bivariate Normal Distribution</a></li>
<li><a href="#sec-Transformations-Multivariate">7.7 Bivariate Transformations of Random Variables</a></li>
<li><a href="#sec-Remarks-for-the-Multivariate">7.8 Remarks for the Multivariate Case</a></li>
<li><a href="#sec-Multinomial">7.9 The Multinomial Distribution</a></li>
<li><a href="#sec-7-10">7.10 Exercises</a></li>
</ul>
</li>
<li><a href="#cha-Sampling-Distributions">8 Sampling Distributions</a>
<ul>
<li><a href="#sec-simple-random-samples">8.1 Simple Random Samples</a></li>
<li><a href="#sec-sampling-from-normal-dist">8.2 Sampling from a Normal Distribution</a></li>
<li><a href="#sec-Central-Limit-Theorem">8.3 The Central Limit Theorem</a></li>
<li><a href="#sec-Samp-Dist-Two-Samp">8.4 Sampling Distributions of Two-Sample Statistics</a></li>
<li><a href="#sec-Simulated-Sampling-Distributions">8.5 Simulated Sampling Distributions</a></li>
<li><a href="#sec-8-6">8.6 Exercises</a></li>
</ul>
</li>
<li><a href="#cha-Estimation">9 Estimation</a>
<ul>
<li><a href="#sec-Point-Estimation-1">9.1 Point Estimation</a></li>
<li><a href="#sec-Confidence-Intervals-for-Means">9.2 Confidence Intervals for Means</a></li>
<li><a href="#sec-Conf-Interv-for-Diff-Means">9.3 Confidence Intervals for Differences of Means</a></li>
<li><a href="#sec-Confidence-Intervals-Proportions">9.4 Confidence Intervals for Proportions</a></li>
<li><a href="#sec-Confidence-Intervals-for-Variances">9.5 Confidence Intervals for Variances</a></li>
<li><a href="#sec-Fitting-Distributions">9.6 Fitting Distributions</a></li>
<li><a href="#sec-Sample-Size-and-MOE">9.7 Sample Size and Margin of Error</a></li>
<li><a href="#sec-Other-Topics">9.8 Other Topics</a></li>
<li><a href="#sec-9-9">9.9 Exercises</a></li>
</ul>
</li>
<li><a href="#cha-Hypothesis-Testing">10 Hypothesis Testing</a>
<ul>
<li><a href="#sec-Introduction-Hypothesis">10.1 Introduction</a></li>
<li><a href="#sec-Tests-for-Proportions">10.2 Tests for Proportions</a></li>
<li><a href="#sec-One-Sample-Tests">10.3 One Sample Tests for Means and Variances</a></li>
<li><a href="#sec-Two-Sample-Tests-for-Means">10.4 Two-Sample Tests for Means and Variances</a></li>
<li><a href="#sec-Other-Hypothesis-Tests">10.5 Other Hypothesis Tests</a></li>
<li><a href="#sec-Analysis-of-Variance">10.6 Analysis of Variance</a></li>
<li><a href="#sec-Sample-Size-and-Power">10.7 Sample Size and Power</a></li>
<li><a href="#sec-10-8">10.8 Exercises</a></li>
</ul>
</li>
<li><a href="#cha-simple-linear-regression">11 Simple Linear Regression</a>
<ul>
<li><a href="#sec-Basic-Philosophy">11.1 Basic Philosophy</a></li>
<li><a href="#sec-SLR-Estimation">11.2 Estimation</a></li>
<li><a href="#sec-Model-Utility-SLR">11.3 Model Utility and Inference</a></li>
<li><a href="#sec-Residual-Analysis-SLR">11.4 Residual Analysis</a></li>
<li><a href="#sec-Other-Diagnostic-Tools-SLR">11.5 Other Diagnostic Tools</a></li>
<li><a href="#sec-11-6">11.6 Exercises</a></li>
</ul>
</li>
<li><a href="#cha-multiple-linear-regression">12 Multiple Linear Regression</a>
<ul>
<li><a href="#sec-The-MLR-Model">12.1 The Multiple Linear Regression Model</a></li>
<li><a href="#sec-Estimation-and-Prediction-MLR">12.2 Estimation and Prediction</a></li>
<li><a href="#sec-Model-Utility-and-MLR">12.3 Model Utility and Inference</a></li>
<li><a href="#sec-Polynomial-Regression">12.4 Polynomial Regression</a></li>
<li><a href="#sec-Interaction">12.5 Interaction</a></li>
<li><a href="#sec-Qualitative-Explanatory-Variables">12.6 Qualitative Explanatory Variables</a></li>
<li><a href="#sec-Partial-F-Statistic">12.7 Partial <i>F</i> Statistic</a></li>
<li><a href="#sec-Residual-Analysis-MLR">12.8 Residual Analysis and Diagnostic Tools</a></li>
<li><a href="#sec-Additional-Topics-MLR">12.9 Additional Topics</a></li>
<li><a href="#sec-12-10">12.10 Exercises</a></li>
</ul>
</li>
<li><a href="#cha-resampling-methods">13 Resampling Methods</a>
<ul>
<li><a href="#sec-Introduction-Resampling">13.1 Introduction</a></li>
<li><a href="#sec-Bootstrap-Standard-Errors">13.2 Bootstrap Standard Errors</a></li>
<li><a href="#sec-Bootstrap-Confidence-Intervals">13.3 Bootstrap Confidence Intervals</a></li>
<li><a href="#sec-Resampling-in-Hypothesis">13.4 Resampling in Hypothesis Tests</a></li>
<li><a href="#sec-13-5">13.5 Exercises</a></li>
</ul>
</li>
<li><a href="#cha-Nonparametric-Statistics">14 Nonparametric Statistics</a></li>
<li><a href="#cha-Categorical-Data-Analysis">15 Categorical Data Analysis</a></li>
<li><a href="#cha-Time-Series">16 Time Series</a></li>
<li><a href="#cha-R-Session-Information">17 R Session Information</a></li>
<li><a href="#cha-GNU-Free-Documentation">18 GNU Free Documentation License</a>
<ul>
<li><a href="#sec-18-1">18.1 0. PREAMBLE</a></li>
<li><a href="#sec-18-2">18.2 1. APPLICABILITY AND DEFINITIONS</a></li>
<li><a href="#sec-18-3">18.3 2. VERBATIM COPYING</a></li>
<li><a href="#sec-18-4">18.4 3. COPYING IN QUANTITY</a></li>
<li><a href="#sec-18-5">18.5 4. MODIFICATIONS</a></li>
<li><a href="#sec-18-6">18.6 5. COMBINING DOCUMENTS</a></li>
<li><a href="#sec-18-7">18.7 6. COLLECTIONS OF DOCUMENTS</a></li>
<li><a href="#sec-18-8">18.8 7. AGGREGATION WITH INDEPENDENT WORKS</a></li>
<li><a href="#sec-18-9">18.9 8. TRANSLATION</a></li>
<li><a href="#sec-18-10">18.10 9. TERMINATION</a></li>
<li><a href="#sec-18-11">18.11 10. FUTURE REVISIONS OF THIS LICENSE</a></li>
<li><a href="#sec-18-12">18.12 11. RELICENSING</a></li>
<li><a href="#sec-18-13">18.13 ADDENDUM: How to use this License for your documents</a></li>
</ul>
</li>
<li><a href="#cha-History">19 History</a></li>
<li><a href="#cha-data">20 Data</a>
<ul>
<li><a href="#sec-Data-Structures">20.1 Data Structures</a></li>
<li><a href="#sec-Importing-A-Data">20.2 Importing Data</a></li>
<li><a href="#sec-Creating-New-Data">20.3 Creating New Data Sets</a></li>
<li><a href="#sec-Editing-Data-Sets">20.4 Editing Data</a></li>
<li><a href="#sec-Exporting-a-Data">20.5 Exporting Data</a></li>
<li><a href="#sec-Reshaping-a-Data">20.6 Reshaping Data</a></li>
</ul>
</li>
<li><a href="#cha-Mathematical-Machinery">21 Mathematical Machinery</a>
<ul>
<li><a href="#sec-The-Algebra-of">21.1 Set Algebra</a></li>
<li><a href="#sec-Differential-and-Integral">21.2 Differential and Integral Calculus</a></li>
<li><a href="#sec-Sequences-and-Series">21.3 Sequences and Series</a></li>
<li><a href="#sec-The-Gamma-Function">21.4 The Gamma Function</a></li>
<li><a href="#sec-Linear-Algebra">21.5 Linear Algebra</a></li>
<li><a href="#sec-Multivariable-Calculus">21.6 Multivariable Calculus</a></li>
</ul>
</li>
<li><a href="#cha-Writing-Reports-with">22 Writing Reports with \(\mathsf{R}\)</a>
<ul>
<li><a href="#sec-What-to-Write">22.1 What to Write</a></li>
<li><a href="#sec-How-to-Write">22.2 How to Write It with R</a></li>
<li><a href="#sec-Formatting-Tables">22.3 Formatting Tables</a></li>
<li><a href="#sec-Other-Formats">22.4 Other Formats</a></li>
</ul>
</li>
<li><a href="#cha-Instructions-for-Instructors">23 Instructions for Instructors</a>
<ul>
<li><a href="#sec-Generating-This-Document">23.1 Generating This Document</a></li>
<li><a href="#sec-How-to-Use-Document">23.2 How to Use This Document</a></li>
<li><a href="#sec-Ancillary-Materials">23.3 Ancillary Materials</a></li>
<li><a href="#sec-Modifying-This-Document">23.4 Modifying This Document</a></li>
</ul>
</li>
<li><a href="#cha-RcmdrTestDrive-Story">24 <code>RcmdrTestDrive</code> Story</a>
<ul>
<li><a href="#sec-24-1">24.1 Case File: ALU-179 “Murder Madness in Toon Town”</a></li>
</ul>
</li>
</ul>
</div>
</div>
<div id="outline-container-1" class="outline-2">
<h2 id="sec-1"><span class="section-number-2">1</span> An Introduction to Probability and Statistics</h2>
<div class="outline-text-2" id="text-1">
<p>
This chapter has proved to be the hardest to write, by far. The trouble is that there is so much to say – and so many people have already said it so much better than I could. When I get something I like I will release it here.
</p>
<p>
In the meantime, there is a lot of information already available to a person with an Internet connection. I recommend starting at Wikipedia, which is not a flawless resource but covers the main ideas and links to reputable sources.
</p>
<p>
In my lectures I usually tell stories about Fisher, Galton, Gauss, Laplace, Quetelet, and the Chevalier de Méré.
</p>
</div>
<div id="outline-container-1-1" class="outline-3">
<h3 id="sec-1-1"><span class="section-number-3">1.1</span> Probability</h3>
<div class="outline-text-3" id="text-1-1">
<p>
The common folklore is that probability has been around for millennia but did not gain the attention of mathematicians until approximately 1654, when the Chevalier de Méré posed a question about how to divide a game's payoff fairly between the two players if the game had to end prematurely.
</p>
</div>
</div>
<div id="outline-container-1-2" class="outline-3">
<h3 id="sec-1-2"><span class="section-number-3">1.2</span> Statistics</h3>
<div class="outline-text-3" id="text-1-2">
<p>
Statistics concerns data: their collection, analysis, and interpretation. In this book we distinguish between two types of statistics: descriptive and inferential.
</p>
<p>
Descriptive statistics concerns the summarization of data. We have a data set and we would like to describe the data set in multiple ways. Usually this entails calculating numbers from the data, called descriptive measures, such as percentages, sums, averages, and so forth.
</p>
<p>
Inferential statistics does more. There is an inference associated with the data set, a conclusion drawn about the population from which the data originated.
</p>
<p>
I would like to mention that there are two schools of thought in statistics: frequentist and Bayesian. The difference between the schools is related to how the two groups interpret the underlying probability (see Section <a href="#sec-4-3">Interpreting Probabilities</a>). The frequentist school gained a lot of ground among statisticians due in large part to the work of Fisher, Neyman, and Pearson in the early twentieth century. That dominance lasted until inexpensive computing power became widely available; nowadays the Bayesian school is garnering attention at an increasing rate.
</p>
<p>
This book is devoted mostly to the frequentist viewpoint because that is how I was trained, with the conspicuous exception of Sections <a href="#sec-Bayes--Rule">Bayes' Rule</a> and <a href="#sec-7-3">Conditional Distributions</a>. I plan to add more Bayesian material in later editions of this book.
</p>
</div>
</div>
<div id="outline-container-1-3" class="outline-3">
<h3 id="sec-1-3"><span class="section-number-3">1.3</span> Exercises</h3>
<div class="outline-text-3" id="text-1-3">
</div>
</div>
</div>
<div id="outline-container-cha-introduction-to-R" class="outline-2">
<h2 id="cha-introduction-to-R"><a name="sec-2" id="sec-2"></a><span class="section-number-2">2</span> An Introduction to R</h2>
<div class="outline-text-2" id="text-cha-introduction-to-R">
<p>
Every \(\mathsf{R}\) book I have ever seen has had a section/chapter that is an introduction to \(\mathsf{R}\), and so does this one. The goal of this chapter is for a person to get up and running, ready for the material that follows. See Section <a href="#sec-2-5">External Resources</a> for links to other material which the reader may find useful.
</p>
<p>
<b>What do I want them to know?</b>
</p><ul>
<li>Where to find \(\mathsf{R}\) to install on a home computer, and a few comments to help with the usual hiccups that occur when installing something.
</li>
<li>Abbreviated remarks about the available options to interact with \(\mathsf{R}\).
</li>
<li>Basic operations (arithmetic, entering data, vectors) at the command prompt.
</li>
<li>How and where to find help when they get in trouble.
</li>
<li>Other little shortcuts I am usually asked about when introducing \(\mathsf{R}\).
</li>
</ul>
</div>
<div id="outline-container-sec-download-install-R" class="outline-3">
<h3 id="sec-download-install-R"><a name="sec-2-1" id="sec-2-1"></a><span class="section-number-3">2.1</span> Downloading and Installing \(\mathsf{R}\)</h3>
<div class="outline-text-3" id="text-sec-download-install-R">
<p>
The instructions for obtaining \(\mathsf{R}\) largely depend on the user's hardware and operating system. The \(\mathsf{R}\) Project has written an \(\mathsf{R}\) Installation and Administration manual with complete, precise instructions about what to do, together with all sorts of additional information. The following is just a primer to get a person started.
</p>
</div>
<div id="outline-container-2-1-1" class="outline-4">
<h4 id="sec-2-1-1"><span class="section-number-4">2.1.1</span> Installing \(\mathsf{R}\)</h4>
<div class="outline-text-4" id="text-2-1-1">
<p>
Visit one of the links below to download the latest version of \(\mathsf{R}\)
for your operating system:
</p>
<dl>
<dt>Microsoft Windows:</dt><dd><a href="http://cran.r-project.org/bin/windows/base/">http://cran.r-project.org/bin/windows/base/</a>
</dd>
<dt>MacOS:</dt><dd><a href="http://cran.r-project.org/bin/macosx/">http://cran.r-project.org/bin/macosx/</a>
</dd>
<dt>Linux:</dt><dd><a href="http://cran.r-project.org/bin/linux/">http://cran.r-project.org/bin/linux/</a>
</dd>
</dl>
<p>
On Microsoft Windows, click the <code>R-x.y.z.exe</code> installer to start installation. When it asks for "Customized startup options", specify <code>Yes</code>. In the next window, be sure to select the SDI (single document interface) option; this is useful later when we discuss three-dimensional plots with the <code>rgl</code> package \cite{rgl}.
</p>
</div>
<div id="outline-container-2-1-1-1" class="outline-5">
<h5 id="sec-2-1-1-1"><span class="section-number-5">2.1.1.1</span> Installing \(\mathsf{R}\) on a USB drive (Windows)</h5>
<div class="outline-text-5" id="text-2-1-1-1">
<p>
With this option you can use \(\mathsf{R}\) portably and without administrative privileges. There is an entry in the \(\mathsf{R}\) for Windows FAQ about this. Here is the procedure I use:
</p><ol>
<li>Download the Windows installer above and start installation as usual. When it asks <i>where</i> to install, navigate to the top-level directory of the USB drive instead of the default <code>C</code> drive.
</li>
<li>When it asks whether to modify the Windows registry, uncheck the box; we do NOT want to tamper with the registry.
</li>
<li>After installation, change the name of the folder from <code>R-x.y.z</code> to just plain \(\mathsf{R}\). (Even quicker: do this in step 1.)
</li>
<li><a href="http://ipsur.r-forge.r-project.org/book/download/R.exe">Download this shortcut</a> and move it to the top-level directory of the USB drive, right beside the \(\mathsf{R}\) folder, not inside the folder. Use the downloaded shortcut to run \(\mathsf{R}\).
</li>
</ol>
<p>
Steps 3 and 4 are not required but save you the trouble of navigating to the <code>R-x.y.z/bin</code> directory to double-click <code>Rgui.exe</code> every time you want to run the program. It is useless to create your own shortcut to <code>Rgui.exe</code>. Windows does not allow shortcuts to have relative paths; they always have a drive letter associated with them. So if you make your own shortcut and plug your USB drive into some <i>other</i> machine that happens to assign your drive a different letter, then your shortcut will no longer be pointing to the right place.
</p>
</div>
</div>
</div>
<div id="outline-container-sub-installing-loading-packages" class="outline-4">
<h4 id="sub-installing-loading-packages"><a name="sec-2-1-2" id="sec-2-1-2"></a><span class="section-number-4">2.1.2</span> Installing and Loading Add-on Packages</h4>
<div class="outline-text-4" id="text-sub-installing-loading-packages">
<p>
There are <i>base</i> packages (which come with \(\mathsf{R}\) automatically), and <i>contributed</i> packages (which must be downloaded for installation). For example, on the version of \(\mathsf{R}\) being used for this document the default base packages loaded at startup are
</p>
<pre class="src src-R">getOption(<span style="color: #8b2252;">"defaultPackages"</span>)
</pre>
<pre class="example">
[1] "datasets" "utils" "grDevices" "graphics"
[5] "stats" "methods"
</pre>
<p>
The base packages are maintained by a select group of volunteers, called \(\mathsf{R}\) Core. In addition to the base packages, there are literally thousands of additional contributed packages written by individuals all over the world. These are stored worldwide on mirrors of the Comprehensive \(\mathsf{R}\) Archive Network, or <code>CRAN</code> for short. Given an active Internet connection, anybody is free to download and install these packages and even inspect the source code.
</p>
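<p>
As a minimal sketch (not part of the original text), the base packages can be listed with <code>installed.packages</code>, whose <code>priority</code> argument filters by package priority. The exact list depends on the installed version of \(\mathsf{R}\), so no output is shown here.
</p>
<pre class="src src-R"># Names of all installed packages whose priority is "base".
rownames(installed.packages(priority = "base"))

# Check that each default startup package is one of the base packages.
all(getOption("defaultPackages") %in% rownames(installed.packages(priority = "base")))
</pre>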
<p>
To install a package named <code>foo</code>, open up \(\mathsf{R}\) and type <code>install.packages("foo")</code>. To install <code>foo</code> and additionally install all of the other packages on which <code>foo</code> depends, instead type <code>install.packages("foo", dependencies = TRUE)</code>.
</p>
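<p>
The following is a hedged sketch of a common pattern, not a prescribed procedure: install a contributed package only when it is missing. It uses the <code>foreign</code> package as a stand-in for any package name, and the mirror URL passed to <code>repos</code> is an assumption; substitute the CRAN mirror nearest you. Note that the argument of <code>install.packages</code> is spelled <code>dependencies</code>.
</p>
<pre class="src src-R"># requireNamespace() returns FALSE (without an error) when the package is absent.
if (!requireNamespace("foreign", quietly = TRUE)) {
  # Assumed mirror; any CRAN mirror URL can be used here.
  install.packages("foreign",
                   repos = "https://cloud.r-project.org",
                   dependencies = TRUE)
}
</pre>
<p>
Checking first avoids downloading a package again each time a script is run.
</p>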
<p>
The general command <code>install.packages()</code> will (on most operating systems) open a window containing a huge list of available packages; simply choose one or more to install.
</p>
<p>
No matter how many packages are installed onto the system, each one must first be loaded for use with the <code>library</code> function. For instance, the <code>foreign</code> package \cite{foreign} contains all sorts of functions needed to import data sets into \(\mathsf{R}\) from other software such as SPSS, SAS, <i>etc</i>. But none of those functions will be available until the command <code>library(foreign)</code> is issued.
</p>
<p>
Type <code>library()</code> at the command prompt (described below) to see a list of all available
packages in your library.
</p>
<p>
For complete, precise information regarding installation of \(\mathsf{R}\) and add-on packages, see the <a href="http://cran.r-project.org/manuals.html">\(\mathsf{R}\) Installation and Administration manual</a>.
</p>
</div>
</div>
</div>
<div id="outline-container-sec-Communicating-with-R" class="outline-3">
<h3 id="sec-Communicating-with-R"><a name="sec-2-2" id="sec-2-2"></a><span class="section-number-3">2.2</span> Communicating with \(\mathsf{R}\)</h3>
<div class="outline-text-3" id="text-sec-Communicating-with-R">
</div>
<div id="outline-container-2-2-1" class="outline-4">
<h4 id="sec-2-2-1"><span class="section-number-4">2.2.1</span> One line at a time</h4>
<div class="outline-text-4" id="text-2-2-1">
<p>
This is the most basic method and is the first one that beginners will use.
</p><ul>
<li>RGui (Microsoft \(\circledR\) Windows)
</li>
<li>Terminal
</li>
<li>Emacs/ESS, XEmacs
</li>
<li>JGR
</li>
</ul>
</div>
</div>
<div id="outline-container-2-2-2" class="outline-4">
<h4 id="sec-2-2-2"><span class="section-number-4">2.2.2</span> Multiple lines at a time</h4>
<div class="outline-text-4" id="text-2-2-2">
<p>
For longer programs (called <i>scripts</i>) there is too much code to write all at once at the command prompt. Furthermore, for longer scripts it is convenient to be able to modify only a certain piece of the script and run it again in \(\mathsf{R}\). Programs called <i>script editors</i> are specially designed to aid the communication and code writing process. They have all sorts of helpful features including \(\mathsf{R}\) syntax highlighting, automatic code completion, delimiter matching, and dynamic help on the \(\mathsf{R}\) functions as they are being written. Even more, they often have all of the text editing features of programs like Microsoft\(\circledR\) Word. Lastly, most script editors are fully customizable in the sense that the user can choose what colors to display, when to display them, and how to display them.
</p>
<dl>
<dt>\(\mathsf{R}\) Editor (Windows):</dt><dd>In Microsoft\(\circledR\) Windows, <code>RGui</code> has its own built-in script editor, called \(\mathsf{R}\) Editor. From the console window, select <code>File</code> \(\triangleright\) <code>New Script</code>. A script window opens, and the lines of code can be written in the window. When satisfied with the code, the user highlights all of the commands and presses <code>Ctrl+R</code>. The commands are automatically run at once in \(\mathsf{R}\) and the output is shown. To save the script for later, click <code>File</code> \(\triangleright\) <code>Save as...</code> in \(\mathsf{R}\) Editor. The script can be reopened later with <code>File</code> \(\triangleright\) <code>Open Script...</code> in <code>RGui</code>. Note that \(\mathsf{R}\) Editor does not have the fancy syntax highlighting that the others do.
</dd>
<dt>\(\mathsf{R}\) WinEdt:</dt><dd>This option is coordinated with WinEdt for LaTeX and has additional features such as code highlighting, remote sourcing, and a ton of other things. However, one first needs to download and install a shareware version of another program, WinEdt, which is only free for a while – pop-up windows will eventually appear that ask for a registration code. \(\mathsf{R}\) WinEdt is nevertheless a very fine choice if you already own WinEdt or are planning to purchase it in the near future.
</dd>
<dt>Tinn \(\mathsf{R}\) / Sciviews K:</dt><dd>This one is completely free and has all of the above-mentioned options and more. It is simple enough to use that the user can virtually begin working with it immediately after installation. But Tinn \(\mathsf{R}\) proper is only available for Microsoft\(\circledR\) Windows operating systems. If you are on MacOS or Linux, a comparable alternative is Sci-Views - Komodo Edit.
</dd>
<dt>Emacs/ESS:</dt><dd>Emacs is an all-purpose text editor. It can do absolutely anything with respect to modifying, searching, editing, and manipulating text. And if Emacs can't do it, then you can write a program that extends Emacs to do it. One such extension is called <code>ESS</code>, which stands for <i>E</i>-macs <i>S</i>-peaks <i>S</i>-tatistics. With ESS a person can speak to \(\mathsf{R}\), do all of the tricks that the other script editors offer, and much, much more. Please see the following for installation details, documentation, reference cards, and a whole lot more: <a href="http://ess.r-project.org">http://ess.r-project.org</a>.
<i>Fair warning</i>: if you want to try Emacs and if you grew up with Microsoft\(\circledR\) Windows or Macintosh, then you are going to need to relearn everything you thought you knew about computers your whole life. (Or, since Emacs is completely customizable, you can reconfigure Emacs to behave the way you want.) I have personally experienced this transformation and I will never go back.
</dd>
<dt>JGR (read “Jaguar”):</dt><dd>This one has the bells and whistles of <code>RGui</code> plus it is based on Java, so it works on multiple operating systems. It has its own script editor like \(\mathsf{R}\) Editor but with additional features such as syntax highlighting and code-completion. If you do not use Microsoft\(\circledR\) Windows (or even if you do) you definitely want to check out this one.
</dd>
<dt>Kate, Bluefish, <i>etc</i></dt><dd>There are literally dozens of other text editors available, many of them free, and each has its own (dis)advantages. I only have mentioned the ones with which I have had substantial personal experience and have enjoyed at some point. Play around, and let me know what you find.
</dd>
</dl>
</div>
</div>
<div id="outline-container-2-2-3" class="outline-4">
<h4 id="sec-2-2-3"><span class="section-number-4">2.2.3</span> Graphical User Interfaces (GUIs)</h4>
<div class="outline-text-4" id="text-2-2-3">
<p>
By the word “GUI” I mean an interface in which the user communicates with \(\mathsf{R}\) by pointing and clicking in a menu of some sort. Again, there are many, many options and I only mention ones that I have used and enjoyed. Some of the other more popular GUIs and script editors are listed at <a href="http://www.sciviews.org/_rgui/">http://www.sciviews.org/_rgui/</a>. On the left side of the screen (under <b>Projects</b>) there are several choices available.
</p>
<dl>
<dt>\(\mathsf{R}\) Commander</dt><dd>provides a point-and-click interface to many basic statistical tasks. It is called the “Commander” because every time one makes a selection from the menus, the code corresponding to the task is listed in the output window. One can take this code, copy-and-paste it to a text file, then re-run it at a later time without the \(\mathsf{R}\) Commander's assistance. It is well suited for the introductory level. <code>Rcmdr</code> also allows for user-contributed “Plugins” which are separate packages on <code>CRAN</code> that add extra functionality to the <code>Rcmdr</code> package. The plugins are typically named with the prefix <code>RcmdrPlugin</code> to make them easy to identify in the <code>CRAN</code> package list. One such plugin is the <code>RcmdrPlugin.IPSUR</code> package which accompanies this text.
</dd>
<dt>Poor Man's GUI</dt><dd>is an alternative to the <code>Rcmdr</code> which is based on GTK+ instead of Tcl/Tk. It has been a while since I used it but I remember liking it very much when I did. One thing that stood out was that the user could drag-and-drop data sets for plots. See here for more information: <a href="http://wiener.math.csi.cuny.edu/pmg/">http://wiener.math.csi.cuny.edu/pmg/</a>.
</dd>
<dt>Rattle</dt><dd>is a data mining toolkit which was designed to manage/analyze very large data sets, but it provides enough other general functionality to merit mention here. See \cite{rattle} for more information.
</dd>
<dt>Deducer</dt><dd>is relatively new and shows promise from what I have seen, but I have not actually used it in the classroom yet.
</dd>
</dl>
</div>
</div>
</div>
<div id="outline-container-sec-Basic-R-Operations" class="outline-3">
<h3 id="sec-Basic-R-Operations"><a name="sec-2-3" id="sec-2-3"></a><span class="section-number-3">2.3</span> Basic \(\mathsf{R}\) Operations and Concepts</h3>
<div class="outline-text-3" id="text-sec-Basic-R-Operations">
<p>
The \(\mathsf{R}\) developers have written an introductory document entitled “An Introduction to \(\mathsf{R}\)”. There is a sample session included which shows what basic interaction with \(\mathsf{R}\) looks like. I recommend that all new users of \(\mathsf{R}\) read that document, but bear in mind that there are concepts mentioned which will be unfamiliar to the beginner.
</p>
<p>
Below are some of the most basic operations that can be done with \(\mathsf{R}\). Almost every book about \(\mathsf{R}\) begins with a section like the one below; look around to see all sorts of things that can be done at this most basic level.
</p>
</div>
<div id="outline-container-sub-Arithmetic" class="outline-4">
<h4 id="sub-Arithmetic"><a name="sec-2-3-1" id="sec-2-3-1"></a><span class="section-number-4">2.3.1</span> Arithmetic</h4>
<div class="outline-text-4" id="text-sub-Arithmetic">
<pre class="src src-R">2 + 3 <span style="color: #b22222;"># </span><span style="color: #b22222;">add</span>
4 * 5 / 6 <span style="color: #b22222;"># </span><span style="color: #b22222;">multiply and divide</span>
7^8 <span style="color: #b22222;"># </span><span style="color: #b22222;">7 to the 8th power</span>
</pre>
<pre class="example">
[1] 5
[1] 3.333333
[1] 5764801
</pre>
<p>
Notice the comment character <code>#</code>. Anything typed after a <code>#</code> symbol is ignored by \(\mathsf{R}\). We know that \(20/6\) is a repeating decimal, but the above example shows only 7 digits. We can change the number of digits displayed with <code>options</code>:
</p>
<pre class="src src-R">options(digits = 16)
10/3 <span style="color: #b22222;"># </span><span style="color: #b22222;">see more digits</span>
sqrt(2) <span style="color: #b22222;"># </span><span style="color: #b22222;">square root</span>
exp(1) <span style="color: #b22222;"># </span><span style="color: #b22222;">Euler's number, e</span>
pi
options(digits = 7) <span style="color: #b22222;"># </span><span style="color: #b22222;">back to default</span>
</pre>
<pre class="example">
[1] 3.333333333333333
[1] 1.414213562373095
[1] 2.718281828459045
[1] 3.141592653589793
</pre>
<p>
Note that it is possible to set <code>digits</code> up to 22, but setting them over 16 is not recommended (the extra significant digits are not necessarily reliable). Above notice the <code>sqrt</code> function for square roots and the <code>exp</code> function for powers of \(\mathrm{e}\), Euler's number.
</p>
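<p>
If you only want more (or fewer) digits for a single result, it is not necessary to change the global option. The following is a small sketch of some alternatives; the variable name <code>third</code> is just for illustration.
</p>
<pre class="src src-R">print(pi, digits = 10)              # 10 significant digits, for this call only
third <- format(10/3, digits = 4)   # a character string with 4 significant digits
third
round(10/3, 2)                      # round to 2 decimal places
getOption("digits")                 # the global setting is left untouched
</pre>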
</div>
</div>
<div id="outline-container-sub-Assignment-Object-names" class="outline-4">
<h4 id="sub-Assignment-Object-names"><a name="sec-2-3-2" id="sec-2-3-2"></a><span class="section-number-4">2.3.2</span> Assignment, Object names, and Data types</h4>
<div class="outline-text-4" id="text-sub-Assignment-Object-names">
<p>
It is often convenient to assign numbers and values to variables (objects) to be used later. The proper way to assign values to a variable is with the <code><-</code> operator (with a space on either side). The <code>=</code> symbol works too, but it is recommended by the \(\mathsf{R}\) masters to reserve <code>=</code> for specifying arguments to functions (discussed later). In this book we will follow their advice and use <code><-</code> for assignment. Once a variable is assigned, its value can be printed by simply entering the variable name by itself.
</p>
<pre class="src src-R">x <span style="color: #008b8b;"><-</span> 7*41/pi <span style="color: #b22222;"># </span><span style="color: #b22222;">don't see the calculated value</span>
x <span style="color: #b22222;"># </span><span style="color: #b22222;">take a look</span>
</pre>
<pre class="example">
[1] 91.35494
</pre>
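<p>
As a quick illustration of the remarks above (the names <code>y</code> and <code>z</code> are arbitrary), both assignment operators store the same value, and wrapping an assignment in parentheses assigns and prints in one step:
</p>
<pre class="src src-R">x <- 7 * 41 / pi     # the recommended assignment operator
y = 7 * 41 / pi      # also works at the prompt, but is not used in this book
identical(x, y)      # both names hold the same value
(z <- x / 2)         # parentheses: assign and print at once
</pre>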
<p>
When choosing a variable name you can use letters, numbers, dots (<code>.</code>), or underscore (<code>_</code>) characters. You cannot use mathematical operators, a name may not begin with a number or an underscore, and a leading dot may not be followed by a number. Examples of valid names are: <code>x</code>, <code>x1</code>, <code>y.value</code>, and <code>y_hat</code>. (More precisely, the set of allowable characters in object names depends on one's particular system and locale; see An Introduction to \(\mathsf{R}\) for more discussion on this.)
</p>
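<p>
One way to check a candidate name is sketched below. It relies on the fact that <code>make.names</code> returns its input unchanged only when the input is already a syntactically valid name; the vectors <code>candidates</code> and <code>is_valid</code> are made up for this example.
</p>
<pre class="src src-R">candidates <- c("x", "x1", "y.value", "y_hat", "y-hat", ".2way", "2x")

# make.names() repairs invalid names, so a valid name comes back unchanged
is_valid <- make.names(candidates) == candidates
data.frame(name = candidates, valid = is_valid)
</pre>
<p>
The first four names pass the check; the last three fail because of an operator, a leading dot followed by a number, and a leading number, respectively.
</p>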
<p>
Objects can be of many <i>types</i>, <i>modes</i>, and <i>classes</i>. At this level, it is not necessary to investigate all of the intricacies of the respective types, but there are some with which you need to become familiar:
</p>
<dl>
<dt>integer:</dt><dd>the values \(0\), \(\pm1\), \(\pm2\), …; these are represented exactly by \(\mathsf{R}\).
</dd>
<dt>double:</dt><dd>real numbers (rational and irrational); these numbers are not represented exactly (save integers or fractions with a denominator that is a power of 2, see \cite{Venables2010}).
</dd>
<dt>character:</dt><dd>elements that are wrapped with pairs of <code>"</code> or <code>'</code>;
</dd>
<dt>logical:</dt><dd>includes <code>TRUE</code>, <code>FALSE</code>, and <code>NA</code> (which are reserved words); the <code>NA</code> stands for “not available”, <i>i.e.</i>, a missing value.
</dd>
</dl>
<p>
You can determine an object's type with the <code>typeof</code> function. In addition to the above, there is the <code>complex</code> data type:
</p>
<pre class="src src-R">sqrt(-1) <span style="color: #b22222;"># </span><span style="color: #b22222;">isn't defined</span>
sqrt(-1+0i) <span style="color: #b22222;"># </span><span style="color: #b22222;">is defined</span>
sqrt(as.complex(-1)) <span style="color: #b22222;"># </span><span style="color: #b22222;">same thing</span>
(0 + 1i)^2 <span style="color: #b22222;"># </span><span style="color: #b22222;">should be -1</span>
typeof((0 + 1i)^2)
</pre>
<pre class="example">
[1] NaN
[1] 0+1i
[1] 0+1i
[1] -1+0i
[1] "complex"
</pre>
<p>
Note that you can just type <code>(1i)^2</code> to get the same answer. The <code>NaN</code> stands for “not a number”; it is represented internally as <code>double</code>.
</p>
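<p>
The basic types described in this section can be checked with <code>typeof</code>, and the inexactness of <code>double</code> values is easy to see at the prompt. The comparisons below are a small sketch and assume the usual IEEE 754 double arithmetic found on typical platforms.
</p>
<pre class="src src-R">typeof(7L)        # the L suffix requests an integer
typeof(7)         # numbers without the suffix are stored as double
typeof("seven")   # character
typeof(NA)        # a bare NA is logical
typeof(NaN)       # double

# Doubles are stored in binary, so many decimal fractions are only approximate.
0.1 + 0.2 == 0.3                     # FALSE
isTRUE(all.equal(0.1 + 0.2, 0.3))    # TRUE: comparison within a small tolerance
0.5 + 0.25 == 0.75                   # TRUE: the denominators are powers of 2
</pre>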
</div>
</div>
<div id="outline-container-sub-Vectors" class="outline-4">
<h4 id="sub-Vectors"><a name="sec-2-3-3" id="sec-2-3-3"></a><span class="section-number-4">2.3.3</span> Vectors</h4>
<div class="outline-text-4" id="text-sub-Vectors">
<p>
All of this time we have been manipulating vectors of length 1. Now let us move to vectors with multiple entries.
</p>
</div>
<div id="outline-container-2-3-3-1" class="outline-5">
<h5 id="sec-2-3-3-1"><span class="section-number-5">2.3.3.1</span> Entering data vectors</h5>
<div class="outline-text-5" id="text-2-3-3-1">
<p>
<b>The long way:</b> If you would like to enter the data <code>74,31,95,61,76,34,23,54,96</code> into \(\mathsf{R}\), you may create a data vector with the <code>c</code> function (which is short for <i>concatenate</i>).
</p>
<pre class="src src-R">x <span style="color: #008b8b;"><-</span> c(74, 31, 95, 61, 76, 34, 23, 54, 96)
x
</pre>
<pre class="example">
[1] 74 31 95 61 76 34 23 54 96
</pre>
<p>
The elements of a vector are usually coerced by \(\mathsf{R}\) to the most general type of any of the elements, so if you do <code>c(1, "2")</code> then the result will be <code>c("1", "2")</code>.
</p>
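<p>
A few more cases of coercion are sketched below; the names <code>v1</code>, <code>v2</code>, and <code>v3</code> are only for illustration.
</p>
<pre class="src src-R">v1 <- c(1, "2")      # double and character: everything becomes character
v2 <- c(TRUE, 2L)    # logical and integer: TRUE becomes the integer 1
v3 <- c(1L, 2.5)     # integer and double: everything becomes double

typeof(v1)
typeof(v2)
typeof(v3)
</pre>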
<p>
<b>A shorter way:</b> The <code>scan</code> function is useful when the data are stored somewhere else. For instance, you may type <code>x <- scan()</code> at the command prompt and \(\mathsf{R}\) will display <code>1:</code> to indicate that it is waiting for the first data value. Type a value and press <code>Enter</code>, at which point \(\mathsf{R}\) will display <code>2:</code>, and so forth. Note that entering an empty line stops the scan. This method is especially handy when you have a column of values, say, stored in a text file or spreadsheet. You may copy and paste them all at the <code>1:</code> prompt, and \(\mathsf{R}\) will store all of the values instantly in the vector <code>x</code>.
</p>
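<p>
Since the interactive prompt cannot be shown in a script, here is a non-interactive sketch that uses the <code>text</code> argument of <code>scan</code> to read the same values from a character string:
</p>
<pre class="src src-R"># The text argument makes scan() read from a string instead of the keyboard;
# quiet = TRUE suppresses the "Read 9 items" message.
x <- scan(text = "74 31 95 61 76 34 23 54 96", quiet = TRUE)
x
length(x)
</pre>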
<p>
<b>Repeated data; regular patterns:</b> the <code>seq</code> function will generate all sorts of sequences of numbers. It has the arguments <code>from</code>, <code>to</code>, <code>by</code>, and <code>length.out</code> which can be set in concert with one another. We will do a couple of examples to show you how it works.
</p>
<pre class="src src-R">seq(from = 1, to = 5)
seq(from = 2, by = -0.1, length.out = 4)
</pre>
<pre class="example">
[1] 1 2 3 4 5
[1] 2.0 1.9 1.8 1.7
</pre>
<p>
Note that we can get the first line much more quickly with the colon operator.
</p>
<pre class="src src-R">1:5
</pre>
<pre class="example">
[1] 1 2 3 4 5
</pre>
<p>
The vector <code>LETTERS</code> has the 26 letters of the English alphabet in uppercase and <code>letters</code> has all of them in lowercase.
</p>
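<p>
A few more patterns are sketched below. The <code>rep</code> function is not discussed above; I include it here as one common way to generate repeated data. The names <code>s1</code>, <code>s2</code>, <code>r1</code>, and <code>r2</code> are arbitrary.
</p>
<pre class="src src-R">s1 <- seq(from = 0, to = 1, by = 0.25)         # fixed step size
s2 <- seq(from = 1, to = 10, length.out = 4)   # fixed number of points
r1 <- rep(c(1, 2), times = 3)                  # repeat the whole vector
r2 <- rep(c(1, 2), each = 3)                   # repeat each element in turn
s1
s2
r1
r2
</pre>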
</div>
</div>
<div id="outline-container-2-3-3-2" class="outline-5">
<h5 id="sec-2-3-3-2"><span class="section-number-5">2.3.3.2</span> Indexing data vectors</h5>
<div class="outline-text-5" id="text-2-3-3-2">
<p>
Sometimes we do not want the whole vector, but just a piece of it. We can access the intermediate parts with the <code>[]</code> operator. Observe (with <code>x</code> defined above)
</p>
<pre class="src src-R">x[1]
x[2:4]
x[c(1,3,4,8)]
x[-c(1,3,4,8)]
</pre>
<pre class="example">
[1] 74
[1] 31 95 61
[1] 74 95 61 54
[1] 31 76 34 23 96
</pre>
<p>
Notice that we used the minus sign to specify those elements that we do <i>not</i> want.
</p>
<pre class="src src-R">LETTERS[1:5]
letters[-(6:24)]
</pre>
<pre class="example">
[1] "A" "B" "C" "D" "E"
[1] "a" "b" "c" "d" "e" "y" "z"
</pre>
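<p>
Positions are not the only way to index. As a further sketch, a logical vector of the same length can be used to pick out the elements that satisfy a condition; the data vector is re-entered here so that the example stands alone, and the name <code>big</code> is arbitrary.
</p>
<pre class="src src-R">x <- c(74, 31, 95, 61, 76, 34, 23, 54, 96)
big <- x > 60      # logical vector, TRUE where the condition holds
x[big]             # keep only the elements greater than 60
which(big)         # the positions of those elements
</pre>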
</div>
</div>
</div>
<div id="outline-container-sub-Functions-and-Expressions" class="outline-4">
<h4 id="sub-Functions-and-Expressions"><a name="sec-2-3-4" id="sec-2-3-4"></a><span class="section-number-4">2.3.4</span> Functions and Expressions</h4>
<div class="outline-text-4" id="text-sub-Functions-and-Expressions">
<p>
A function takes arguments as input and returns an object as output. There are functions to do all sorts of things. We show some examples below.
</p>
<pre class="src src-R">x <span style="color: #008b8b;"><-</span> 1:5
sum(x)
length(x)
min(x)
mean(x) <span style="color: #b22222;"># </span><span style="color: #b22222;">sample mean</span>
sd(x) <span style="color: #b22222;"># </span><span style="color: #b22222;">sample standard deviation</span>
</pre>
<pre class="example">
[1] 15
[1] 5
[1] 1
[1] 3
[1] 1.581139
</pre>
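<p>
Users can write their own functions, too. The following sketch defines a hypothetical function <code>my_sd</code> that computes the sample standard deviation from its textbook formula and compares the answer to the built-in <code>sd</code>; it is meant only to illustrate arguments going in and an object coming out, not to replace <code>sd</code>.
</p>
<pre class="src src-R">my_sd <- function(x) {
  n <- length(x)                          # sample size
  sqrt(sum((x - mean(x))^2) / (n - 1))    # divide by n - 1, then take the root
}

my_sd(1:5)
all.equal(my_sd(1:5), sd(1:5))            # agrees with the built-in function
</pre>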
<p>
It will not be long before the user starts to wonder how a particular function is doing its job, and since \(\mathsf{R}\) is open-source, anybody is free to look under the hood of a function to see how things are calculated. For detailed instructions see the article ``Accessing the Sources'' by Uwe Ligges \cite{Ligges2006}. In short:
</p>
<p>
<b>Type the name of the function</b> without any parentheses or arguments. If you are lucky then the code for the entire function will be printed, right there looking at you. For instance, suppose that we would like to see how the <code>intersect</code> function works:
</p>
<pre class="src src-R">intersect
</pre>
<pre class="example">
function (x, ...)
UseMethod("intersect")
<environment: namespace:prob>
</pre>
<p>
<b>If instead</b> it shows <code>UseMethod(something)</code> then you will need to choose the <i>class</i> of the object to be passed in and next look at the <i>method</i> that will be <i>dispatched</i> to the object. For instance, typing <code>rev</code> gives
</p>
<pre class="src src-R">rev
</pre>
<pre class="example">
function (x)
UseMethod("rev")
<environment: namespace:base>
</pre>
<p>
The output is telling us that there are multiple methods associated with the <code>rev</code> function. To see what these are, type
</p>
<pre class="src src-R">methods(rev)
</pre>
<pre class="example">