forked from swcarpentry/shell-novice
-
Notifications
You must be signed in to change notification settings - Fork 0
/
instructors.html
151 lines (149 loc) · 15.4 KB
/
instructors.html
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<meta name="generator" content="pandoc">
<title>Software Carpentry: The Unix Shell</title>
<link rel="shortcut icon" type="image/x-icon" href="/favicon.ico" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<link rel="stylesheet" type="text/css" href="css/bootstrap/bootstrap.css" />
<link rel="stylesheet" type="text/css" href="css/bootstrap/bootstrap-theme.css" />
<link rel="stylesheet" type="text/css" href="css/swc.css" />
<link rel="alternate" type="application/rss+xml" title="Software Carpentry Blog" href="http://software-carpentry.org/feed.xml"/>
<meta charset="UTF-8" />
<!-- HTML5 shim, for IE6-8 support of HTML5 elements -->
<!--[if lt IE 9]>
<script src="http://html5shim.googlecode.com/svn/trunk/html5.js"></script>
<![endif]-->
</head>
<body class="lesson">
<div class="container card">
<div class="banner">
<a href="http://software-carpentry.org" title="Software Carpentry">
<img alt="Software Carpentry banner" src="img/software-carpentry-banner.png" />
</a>
</div>
<article>
<div class="row">
<div class="col-md-10 col-md-offset-1">
<a href="index.html"><h1 class="title">The Unix Shell</h1></a>
<h2 class="subtitle">Instructor’s Guide</h2>
<ul>
<li>Why do we learn to use the shell?
<ul>
<li>Allows users to automate repetitive tasks</li>
<li>And capture small data manipulation steps that are normally not recorded to make research reproducible</li>
</ul></li>
<li>The Problem
<ul>
<li>Running the same workflow on several samples can be unnecessarily labour intensive</li>
<li>Manual manipulation of data files:
<ul>
<li>is often not captured in documentation</li>
<li>is hard to reproduce</li>
<li>is hard to troubleshoot, review, or improve</li>
</ul></li>
</ul></li>
<li>The Shell
<ul>
<li>Workflows can be automated through the use of shell scripts</li>
<li>Built-in commands allow for easy data manipulation (e.g. sort, grep, etc.)</li>
<li>Every step can be captured in the shell script and allow reproducibility and easy troubleshooting</li>
</ul></li>
</ul>
<h2 id="overall">Overall</h2>
<p>Many people have questioned whether we should still teach the shell. After all, anyone who wants to rename several thousand data files can easily do so interactively in the Python interpreter, and anyone who’s doing serious data analysis is probably going to do most of their work inside the IPython Notebook or R Studio. So why teach the shell?</p>
<p>The first answer is, “Because so much else depends on it.” Installing software, configuring your default editor, and controlling remote machines frequently assume a basic familiarity with the shell, and with related ideas like standard input and output. Many tools also use its terminology (for example, the <code>%ls</code> and <code>%cd</code> magic commands in IPython).</p>
<p>The second answer is, “Because it’s an easy way to introduce some fundamental ideas about how to use computers.” As we teach people how to use the Unix shell, we teach them that they should get the computer to repeat things (via tab completion, <code>!</code> followed by a command number, and <code>for</code> loops) rather than repeating things themselves. We also teach them to take things they’ve discovered they do frequently and save them for later re-use (via shell scripts), to give things sensible names, and to write a little bit of documentation (like comment at the top of shell scripts) to make their future selves’ lives better.</p>
<p>The third answer is, “Because it enables use of many domain-specific tools and compute resources researchers cannot access otherwise.” Familiarity with the shell is very useful for remote accessing machines, using high-performance computing infrastructure, and running new specialist tools in many disciplines. We do not teach HPC or domain-specific skills here but lay the groundwork for further development of these skills. In particular, understanding the syntax of commands, flags, and help systems is useful for domain specific tools and understanding the file system (and how to navigate it) is useful for remote access.</p>
<p>Finally, and perhaps most importantly, teaching people the shell lets us teach them to think about programming in terms of function composition. In the case of the shell, this takes the form of pipelines rather than nested function calls, but the core idea of “small pieces, loosely joined” is the same.</p>
<p>All of this material can be covered in three hours as long as learners using Windows do not run into roadblocks such as:</p>
<ul>
<li>not being able to figure out where their home directory is (particularly if they’re using Cygwin);</li>
<li>not being able to run a plain text editor; and</li>
<li>the shell refusing to run scripts that include DOS line endings.</li>
</ul>
<h2 id="preparing-to-teach">Preparing to Teach</h2>
<ul>
<li>For a great general list of tips, see <a href="http://software-carpentry.org/blog/2015/03/teaching-tips.html">this swcarpentry blog post</a></li>
<li>Use the <code>data</code> directory for in-workshop exercises and live coding examples. You can clone the shell-novice directory or use the <code>Download ZIP</code> button on the right to get the entire repository; we also now provide a zip file of the <code>data</code> directory that can be downloaded on its own from the repository by right-click + save. See the “Get Ready” box on the front page of the website for more details.<br />
</li>
<li>Website: various practices have been used.
<ul>
<li>Option 1: Can give links to learners before the lesson so they can follow along, catch up, and see exercises (particularly if you’re following the lesson content without many changes).</li>
<li>Option 2: Don’t show the website to the learners during the lesson, as it can be distracting - students may read instead of listen, and having another window open is an additional cognitive load.</li>
<li>In any case, make sure to point to website as a post-workshop reference.</li>
</ul></li>
<li>Content: Unless you have a truly generous amount of time (4+ hours), it is likely that you will not cover ALL the material in this lesson in a single half-day session. Plan ahead on what you might skip, what you really want to emphasize, etc.</li>
<li>Exercises: Think in advance about how you might want to handle exercises during the lesson. How are you assigning them (website, slide, handout)? Do you want everyone to try it and then you show the solution? Have a learner show the solution? Have groups each do a different exercise and present their solutions?</li>
<li><code>reference.md</code> can be printed out and given to students as a reference, your choice.</li>
<li>Other preparation: Feel free to add your own examples or side comments, but know that it shouldn’t be necessary - the topics and commands can be taught as given on the lesson pages! If you think there is a place where the lesson is lacking, feel free to raise an <a href="https://github.com/swcarpentry/shell-novice/issues">issue</a> or submit a <a href="https://github.com/swcarpentry/shell-novice/pulls">pull request</a>.</li>
<li>Optional setup:
<ul>
<li>Run <code>tools/gen-nene.py</code> to regenerate random data files if needed (some are already in the <code>data</code> directory, so you don’t have to do this).</li>
<li>Similarly, run <code>tools/gen-sequence.py</code> to regenerate random sequence data if needed.</li>
</ul></li>
</ul>
<h2 id="teaching-notes">Teaching Notes</h2>
<ul>
<li>Time estimates:
<ul>
<li><span class="citation">@gvwilson</span>: 3 hours</li>
<li><span class="citation">@christinalk</span>: in a usual 3 hour session, I can usually cover all of files/dirs, creating things, pipes filters, and most (although not all) of the scripts/loops lesson. I have never taught all 6 modules in 3 hours.</li>
</ul></li>
<li><p>Super cool online resource! http://explainshell.com/ will dissect any shell command you type in and display help text for each piece.</p></li>
<li><p>Resources for “splitting” your shell so that recent commands remain in view: https://github.com/rgaiacs/swc-shell-split-window</p></li>
<li><p>Running a text editor from the command line can be the biggest stumbling block during the entire lesson: many will try to run the same editor as the instructor (which may leave them trapped in the awful nether hell that is Vim), or will not know how to navigate to the right directory to save their file, or will run a word processor rather than a plain text editor. The quickest way past these problems is to have more knowledgeable learners help those who need it.</p></li>
<li><p>Tab completion sounds like a small thing: it isn’t. Re-running old commands using <code>!123</code> or <code>!wc</code> isn’t a small thing either, and neither are wildcard expansion and <code>for</code> loops. Each one is an opportunity to repeat one of the big ideas of Software Carpentry: if the computer <em>can</em> repeat it, some programmer somewhere will almost certainly have built some way for the computer <em>to</em> repeat it.</p></li>
<li><p>Building up a pipeline with four or five stages, then putting it in a shell script for re-use and calling that script inside a <code>for</code> loop, is a great opportunity to show how “seven plus or minus two” connects to programming. Once we have figured out how to do something moderately complicated, we make it re-usable and give it a name so that it only takes up one slot in working memory rather than several. It is also a good opportunity to talk about exploratory programming: rather than designing a program up front, we can do a few useful things and then retroactively decide which are worth encapsulating for future re-use.</p></li>
<li><p>If everything is going well, you can drive home the point that file extensions are essentially there to help computers (and human readers) understand file content and are not a requirement of files (covered briefly in <a href="01-filedir.html">Navigating Files and Directories</a>). This can be done in the <a href="03-pipefilter.html">Pipes and Filters</a> section by showing that you can redirect standard output to a file without the .txt extension (e.g., lengths), and that the resulting file is still a perfectly usable text file. Make the point that if double-clicked in the GUI, the computer will probably ask you what you want to do.</p></li>
<li><p>We have to leave out many important things because of time constraints, including file permissions, job control, and SSH. If learners already understand the basic material, this can be covered instead using the online lessons as guidelines. These limitations also have follow-on consequences:</p></li>
<li><p>It’s hard to discuss <code>#!</code> (shebang) without first discussing permissions, which we don’t do. <code>#!</code> is also <a href="http://www.in-ulm.de/~mascheck/various/shebang/">pretty complicated</a>, so even if we did discuss permissions, we probably still wouldn’t want to discuss <code>#!</code>.</p></li>
<li><p>Installing Bash and a reasonable set of Unix commands on Windows always involves some fiddling and frustration. Please see the latest set of installation guidelines for advice, and try it out yourself <em>before</em> teaching a class.</p></li>
<li><p>On Windows machines if <code>nano</code> hasn’t been properly installed with the <a href="http://github.com/swcarpentry/windows-installer">Software Carpentry Windows Installer</a> it is possible to use <code>notepad</code> as an alternative. There will be a GUI interface and line endings are treated differently, but otherwise, for the purposes of this lesson, <code>notepad</code> and <code>nano</code> can be used almost interchangeably.</p></li>
<li><p>On Windows, it appears that:</p>
<div class="sourceCode"><pre class="sourceCode bash"><code class="sourceCode bash">$ <span class="kw">cd</span>
$ <span class="kw">cd</span> Desktop</code></pre></div>
<p>will always put someone on their desktop. Have them create the example directory for the shell exercises there so that they can find it easily and watch it evolve.</p></li>
<li><p>Stay within POSIX-compliant commands, as all the teaching materials do. Your particular shell may have extensions beyond POSIX that are not available on other machines, especially the default OSX bash and Windows bash emulators. For example, POSIX <code>ls</code> does not have an <code>--ignore=</code> or <code>-I</code> option, and POSIX <code>head</code> takes <code>-n 10</code> or <code>-10</code>, but not the long form of <code>--lines=10</code>.</p></li>
</ul>
<h2 id="windows">Windows</h2>
<p>Installing Bash and a reasonable set of Unix commands on Windows always involves some fiddling and frustration. Please see the latest set of installation guidelines for advice, and try it out yourself <em>before</em> teaching a class. Options we have explored include:</p>
<ol style="list-style-type: decimal">
<li><a href="http://msysgit.github.io/">msysGit</a> (also called “Git Bash”),</li>
<li><a href="http://www.cygwin.com/">Cygwin</a>,</li>
<li>using a desktop virtual machine, and</li>
<li>having learners connect to a remote Unix machine (typically a VM in the cloud).</li>
</ol>
<p>Cygwin was the preferred option until mid-2013, but once we started teaching Git, msysGit proved to work better. Desktop virtual machines and cloud-based VMs work well for technically sophisticated learners, and can reduce installation and configuration at the start of the workshop, but:</p>
<ol style="list-style-type: decimal">
<li>they don’t work well on underpowered machines,</li>
<li>they’re confusing for novices (because simple things like copy and paste work differently),</li>
<li>learners leave the workshop without a working environment on their operating system of choice, and</li>
<li>learners may show up without having downloaded the VM or the wireless will go down (or become congested) during the lesson.</li>
</ol>
<p>Whatever you use, please <em>test it yourself</em> on a Windows machine <em>before</em> your workshop: things may always have changed behind your back since your last workshop. And please also make use of our <a href="http://github.com/swcarpentry/windows-installer">Software Carpentry Windows Installer</a>.</p>
</div>
</div>
</article>
<div class="footer">
<a class="label swc-blue-bg" href="http://software-carpentry.org">Software Carpentry</a>
<a class="label swc-blue-bg" href="https://github.com/swcarpentry/shell-novice">Source</a>
<a class="label swc-blue-bg" href="mailto:[email protected]">Contact</a>
<a class="label swc-blue-bg" href="LICENSE.html">License</a>
</div>
</div>
<!-- Javascript placed at the end of the document so the pages load faster -->
<script src="http://software-carpentry.org/v5/js/jquery-1.9.1.min.js"></script>
<script src="css/bootstrap/bootstrap-js/bootstrap.js"></script>
<script src='https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML'></script>
<script>
(function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
(i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
})(window,document,'script','https://www.google-analytics.com/analytics.js','ga');
ga('create', 'UA-37305346-2', 'auto');
ga('send', 'pageview');
</script>
</body>
</html>