
[Review]: Introduction to deep learning #25

Open · 5 tasks done
svenvanderburg opened this issue Sep 1, 2023 · 44 comments
Labels: 3/reviewer(s)-assigned (Reviewers have been assigned; review in progress)

@svenvanderburg

Lesson Title

Introduction to deep learning

Lesson Repository URL

https://github.com/carpentries-incubator/deep-learning-intro

Lesson Website URL

https://carpentries-incubator.github.io/deep-learning-intro/

Lesson Description

This is a hands-on introduction to the first steps in Deep Learning, intended for researchers who are familiar with (non-deep) Machine Learning.

The use of Deep Learning has seen a sharp increase in popularity and applicability over the last decade. While Deep Learning can be a useful tool for researchers from a wide range of domains, taking the first steps in the world of Deep Learning can be somewhat intimidating. This introduction aims to cover the basics of Deep Learning in a practical and hands-on manner, so that upon completion, you will be able to train your first neural network and understand what next steps to take to improve the model.

We start by explaining the basic concepts of neural networks, and then go through the different steps of a Deep Learning workflow. Learners will discover how to prepare data for deep learning, how to implement a basic Deep Learning model in Python with Keras, how to monitor and troubleshoot the training process, and how to implement different layer types such as convolutional layers.

Author Usernames

@dsmits @psteinb @cpranav93 @colinsauze @CunliangGeng

Zenodo DOI

10.5281/zenodo.8308392

Differences From Existing Lessons

No response

Confirmation of Lesson Requirements

JOSE Submission Requirements

Potential Reviewers

No response

@svenvanderburg
Author

We're still running a final round of comments for the paper (see carpentries-incubator/deep-learning-intro#364); the plan is to submit the paper on the 15th of September through https://openjournals.readthedocs.io/en/jose/submitting.html#submitting-your-paper. I'm not sure how that relates to this review: are the two independent, or should we wait for this review process to finish before submitting to JOSE?

@tobyhodges
Member

Great to see this submission, @svenvanderburg. As a listed contributor, I have a conflict of interest acting as editor for this one. I am going to find another community member who can fulfill the role for this review, and will post back here when that's arranged.

To answer your question about review order: we have been asking lesson developers to submit the lesson for review here, before the JOSE review. See #11 and the related review in JOSE for an example.

@svenvanderburg
Author

@tobyhodges any update on finding an editor?

@svenvanderburg
Author

@tobyhodges ? 😇

@tobyhodges
Member

I have reached out to a potential guest editor for this review and am waiting for confirmation. Hoping to be able to follow up very soon!

@svenvanderburg
Author

> I have reached out to a potential guest editor for this review and am waiting for confirmation. Hoping to be able to follow up very soon!

Perfect, thank you for the update 🙏

@tobyhodges
Member

Good news: @brownsarahm has kindly agreed to act as Guest Editor for this review. I am extremely grateful to her for being willing to take this on.

@brownsarahm
Collaborator

brownsarahm commented Oct 31, 2023

I'll work on this in small bits, but this way it's all in one place, and the authors can work on the (very minor) accessibility issues and one small note on setup that I have checked so far.

Editor Checklist - Intro to Deep Learning

Accessibility

  • All figures are also described in image alternative text or elsewhere in the lesson body.

  • The lesson uses appropriate heading levels:

    • h2 is used for sections within a page.
    • no “jumps” are present between heading levels e.g. h2->h4.
    • no page contains more than one h1 element i.e. none of the source files include first-level headings.
  • The contrast ratio of text in all figures is at least 4.5:1.

  • the check boxes in the prereqs render poorly in the workbench, and are an accessibility flag because they're unlabeled "form elements"; these should be removed

  • the listed browser support does not match Jupyter's browser support. Notably, Windows' current default browser is a Chromium browser and therefore supported (and, on general web compatibility, better than Safari by a nontrivial margin); this should be updated so that more Windows users are included without having to install another browser

  • most table headers are empty, but this might not be a problem

  • in ep 2 there is a link to the sklearn docs with the text "here"; this is rated "suspicious" in accessibility terms. Recommend "explained in the scikit-learn docs" instead of explained "here"

  • minor syntax issue (: instead of =) causing "missing alt text" on an image in :

Content

  • The lesson teaches data and/or computational skills that could promote efficient, open, and reproducible research.
  • All exercises have solutions.
  • Opportunities for formative assessments are included and distributed throughout the lesson sufficiently to track learner progress. (We aim for at least one formative assessment every 10-15 minutes.)
  • Any data sets used in the lesson are published under a permissive open license i.e. CC0 or equivalent.

Datasets and licenses

  • penguin is CC0
  • weather is CC-BY
  • the CIFAR10 license is unclear / not easy to find / may not exist, but it is public data

other content notes:

  • the deep learning workflow exercise in ep 1: the solution is not a solution; there can be many answers here, but maybe some details of what to check for in an answer would help
  • "just" used with dropna in "clean missing values" is not Carpentries language and can read as dismissive. "ruin the training data" is also not very precise about the impact, and does not acknowledge that there are different ways to handle missing data, for reasons that can affect analyses. In line with teaching best practices, slightly broader coverage and a more intentional framing of the choice to use dropna (which is fine to do) would improve the lesson here (see the sketch after this list)
  • the exercise "reflecting on our results" is oddly formatted and has an exercise in the solution
  • the solution to "varying the dropout rate" uses a term that was introduced, in a different form, in a parenthetical in the previous episode, but is not defined as it is used
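
To illustrate the dropna point above, a minimal pandas sketch of the contrast that could be shown (the column names here are invented, not the lesson's):

```python
import pandas as pd

# Toy frame standing in for the lesson's data; column names are made up
df = pd.DataFrame({"bill_length_mm": [39.1, None, 40.3],
                   "body_mass_g": [3750.0, 3800.0, None]})

# What the lesson does: drop every row that contains a missing value
dropped = df.dropna()

# One alternative worth at least naming: impute, e.g. with the column mean.
# Which choice is appropriate depends on why the values are missing.
imputed = df.fillna(df.mean())
```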

Design

  • Learning objectives are defined for the lesson and every episode.
  • The target audience of the lesson is identified specifically and in sufficient detail.

to fix:

  • question, objectives, and keypoints all display with quotation marks ("") around them

Repository

The lesson repository includes:

  • a CC-BY or CC0 license.

  • a CODE_OF_CONDUCT.md file that links to The Carpentries Code of Conduct.

  • a list of lesson maintainers.

  • tabs to display Issues and Pull Requests for the project.

  • replace this with any further comments relating to the lesson repository.

Structure

  • Estimated times are included in every episode for teaching and completing exercises.
  • Episode lengths are appropriate for the management of cognitive load throughout the lesson.

comment:

  • all episodes are listed at 55 min, but episode 3 is over 3 hours long?

Supporting information

The lesson includes:

  • a list of required prior skills and/or knowledge.
  • setup and installation instructions.
  • a glossary of key terms or links out to definitions in an external glossary e.g. Glosario.

other setup note:

  • it says "open a terminal", but that is insufficient information for a Windows user (they have multiple terminals, and to launch Jupyter only the "Anaconda Prompt" will work by default; CMD or PowerShell typically will not)

General

  • replace this with any other comments that do not fit into any of the previous sections.

@svenvanderburg
Author

@brownsarahm any update on the progress?

@svenvanderburg
Author

@brownsarahm any update? Can you give us an indication when we can expect this to be done?

@brownsarahm
Collaborator

Hi! Sorry, a bunch of unexpected things happened last fall, and then when you sent the first check-in I was off work for the holidays. The second came while I was in a deadline crunch.

This is now back in my active queue. I should finish the pre-reviewer stuff within a week and I'm looking for reviewers starting now.

@svenvanderburg
Author

Hi @brownsarahm. Cool, thanks for the update 🤗

@brownsarahm
Collaborator

Editorial checks are done, and @tobyhodges and I are working on finding reviewers next.

The comment above has a few things for you all to look at now, but thanks for resolving all of the previously identified ones already!

@tobyhodges
Member

Thanks @brownsarahm. I have suggested a few reviewers to Sarah but if any of my fellow authors can also suggest anyone they think would be suitable, I am sure it would be helpful. (Please do not tag anyone here by their GitHub handle.)

@brownsarahm brownsarahm added the 1/editor-checks Editor is conducting initial checks on the lesson before seeking reviewers label Jan 31, 2024
@tobyhodges tobyhodges assigned brownsarahm and unassigned tobyhodges Feb 15, 2024
@brownsarahm
Collaborator

@svenvanderburg Do you have any updates in response to my final editorial check?

In particular, do you have responses to the concerns about:

  • episode lengths
  • dataset licenses

and ideally before we assign reviewers, it would be nice to resolve, but these are minor:

  • " rendering on episode metadata
  • typos breaking alt-text

@svenvanderburg
Author

svenvanderburg commented Feb 18, 2024

Hey @brownsarahm.

Sorry, I totally missed that it was final! Thanks for the extra ping @brownsarahm :)

Some answers:

Episode lengths

Indeed, episode 3 takes a bit longer than the other ones, but not twice as long. In fact, the timing for the other episodes is too optimistic: here is a PR with more realistic timing. In comparison to other lessons the episodes are a bit longer, because we want to finish the full deep learning workflow in each episode. When teaching, this is not really a problem though: the workshop is actually pretty balanced in terms of cognitive load, because in every episode we go through this deep learning workflow once, and conceptually it makes sense to have the cuts between episodes at these points. What do you think? We could maybe write this explanation in the introductory instructor notes?

Dataset licenses

If I understand correctly, the only problem is with the CIFAR-10 dataset. It is so widely used that I never realized that, judging from the official website, it doesn't actually have a license.

I found a paper that extracts the statement about citation as their license:
> Running example. In this paper, we download the CIFAR-10 dataset from its official website. Also on the CIFAR-10 website, we find the following request from the dataset creators: "Please cite (Krizhevsky et al., 2009) if you intend to use this dataset," alongside a link to the paper. We extract this as the dataset's license.

It is actually a crawled dataset, and in the paper I don't see anything about the crawled images being under an open-source license....

Anyway: do you think it is a problem that we use this dataset? It is such a central dataset in the field and so widely used. We would have to change the entire episode if we used a different dataset. We can write a comment about it in the instructor notes.

Small issues

I resolved the two small issues you referred to; the fixes will be reviewed soon by one of my colleagues. We will soon pick up any other remaining issues.

It would be good to enter the next phase of reviewing. Let us know what we can do to help this progress.

@brownsarahm
Collaborator

Timing

Thanks for the explanation. I think at one level your strategy for putting breaks in the content makes sense. I am not certain if your claim about cognitive load is true, but also not certain that it is false.

However, 3.5 hours without a break is a long time, and most instructors will not want to give a break in the middle of an episode.

Maybe an instructor note reminding about breaks? (As context: I'm a maintainer on instructor training, and we get a lot of complaints about not enough breaks there, even though we have them roughly every 90 minutes.)

Dataset License

In my understanding of Carpentries policy, a permissive license is required. Since the dataset was crawled, it was likely assembled with zero consent from the original owners of the images. I think it probably does not have the same risks as ImageNet, but that should be checked. On the other hand, this is clearly, to me, an intended use by the people who curated the images into a dataset, despite them not putting a license on it and possibly not having appropriate rights to the images either.

Whether it is okay or not is going to be up to Carpentries policy for the situation where there is no license. @tobyhodges, can you help navigate that, or let us know who else in The Carpentries should be looped in?

@tobyhodges
Member

Thanks for tagging me @brownsarahm. I need to do a bit more reading and thinking about this, and will come back soon with a full response.

@tobyhodges
Member

Thanks for your patience while I took some time to read through the relevant pages and documents, and to reflect on the most appropriate course of action. I am sorry to say that I think we should replace the dataset in the lesson.

The lack of a license file in the dataset is somewhat problematic, even though the authors clearly intend for the dataset to be re-used and usage in the lesson is within the terms stated on their website. But my biggest concerns are with the unethical way in which the data was "collected." Images were scraped and modified for the dataset without any attempt at seeking permission from the copyright owners or giving them attribution, which feels unethical to me regardless of any arguments over its legality. (I am not a lawyer but it seems like the use may fall under "fair dealing" in Canadian copyright law, where the researchers who published the dataset are based.)

In Collaborative Lesson Development Training, alongside considerations of licensing, size, and complexity, we ask lesson developers to consider the ethics of the example datasets they include in their lessons. I would like to apply the same standard to lesson reviews in The Carpentries Lab.

I acknowledge that replacing the dataset will require significant new work on the part of the authors, and perhaps I should have noticed sooner and avoided some of this inconvenience. @svenvanderburg for my part, I would like to devote some time in the coming weeks to try to make the necessary changes (as I am already one of the authors). I hope I will be able to propose some alternative datasets soon, and of course it would help to have input from others with more DL experience than I have. However, please also be aware that you can withdraw the lesson from review here if you prefer.

Finally, many thanks to @brownsarahm for catching this and looping me into the discussion.

@svenvanderburg
Author

svenvanderburg commented Feb 28, 2024

Timing

Ha, no, 3.5 hours of teaching without breaks is absurd! At the Netherlands eScience Center we usually teach in a schedule like the one for [this recently taught workshop](https://esciencecenter-digital-skills.github.io/2024-02-05-ds-dl-intro/#schedule). Never more than 90 minutes of teaching! And I think the beta pilots copied that schedule in rough lines. It doesn't matter that a break falls in the middle of an episode.

In addition, we swap instructors halfway through the episodes, which makes the teaching load lighter as well.

See carpentries-incubator/deep-learning-intro#446 for addressing this; do you agree @brownsarahm? And thanks for bringing this up, this is a great outcome of the review. Since we always use the same schedule no matter what lesson material we use, we had a blind spot here.

License

Thanks @tobyhodges for digging into the CIFAR-10 license. I agree we should change it; indeed, it goes against everything The Carpentries stands for...

So, the remaining issues to fix before the review are:

(plus some more small comments from Sarah that we will definitely pick up in the coming period, but which are not essential to address before the review)

Can you confirm this @brownsarahm ?

@brownsarahm
Collaborator

Yes, this is correct: these two issues would get it to a point where it is ready for review.

@svenvanderburg
Author

@brownsarahm The two big remaining issues (#446 and #445) are resolved! Please proceed to the next step of the review process 🚀

@brownsarahm
Collaborator

tiny, tiny errors introduced by the update:

tangent/option: the change to the Dollar Street data also opens a really great opportunity to do bias evaluation and to talk about the importance of having many dimensions of diversity in a dataset. This is definitely out of scope for the review, but this dataset was published in a NeurIPS paper showing some of this; the paper might be good to include in the outlook as a reference. (I learned about this dataset at a talk, after I raised the concern here.)

I will move forward on inviting reviewers!

@brownsarahm brownsarahm added 2/seeking-reviewers Editor is looking for reviewers to assign to this lesson and removed 1/editor-checks Editor is conducting initial checks on the lesson before seeking reviewers labels May 6, 2024
@svenvanderburg
Author

Wow, you're so sharp @brownsarahm!
We will address all three of your comments/suggestions in carpentries-incubator/deep-learning-intro#462, carpentries-incubator/deep-learning-intro#461, and carpentries-incubator/deep-learning-intro#460.

I was already planning on using the Dollar Street dataset as an example to open up a discussion on ethical AI next time we teach the lesson. I'm really happy that we use this dataset now.

Great, we're looking forward to the review!

@brownsarahm
Collaborator

@likeajumprope thank you for volunteering to review lessons for The Carpentries Lab. Please can you confirm if you are happy to review this Introduction to Deep Learning lesson?

You can read more about the lesson review process in our Reviewer Guide.

@likeajumprope

> @likeajumprope thank you for volunteering to review lessons for The Carpentries Lab. Please can you confirm if you are happy to review this Introduction to Deep Learning lesson?
>
> You can read more about the lesson review process in our Reviewer Guide.

Yes I am happy to accept the invitation for review.

@brownsarahm
Collaborator

@mike-ivs thank you for volunteering to review lessons for The Carpentries Lab. Please can you confirm if you are happy to review this Introduction to Deep Learning lesson?

You can read more about the lesson review process in our Reviewer Guide.

@mike-ivs

> @mike-ivs thank you for volunteering to review lessons for The Carpentries Lab. Please can you confirm if you are happy to review this Introduction to Deep Learning lesson?
>
> You can read more about the lesson review process in our Reviewer Guide.

Happy to review the lesson. We'll actually be teaching the beta lesson again next week!

@brownsarahm brownsarahm added 3/reviewer(s)-assigned Reviewers have been assigned; review in progress and removed 2/seeking-reviewers Editor is looking for reviewers to assign to this lesson labels May 23, 2024
@brownsarahm
Collaborator

@svenvanderburg we have moved to the next phase!

I think we expect the reviews within about 6 weeks.

@mike-ivs

mike-ivs commented Jul 5, 2024

Hi all,

I'm still working through the review. I have currently gone through all of the supplementary material (instructor notes/glossary/references/etc.), the Summary & Setup, episode 1, and episode 2. I hope to get through the remaining episodes by the end of next week.

I'm pretty happy overall with the lesson, but will wait to post the "reviewer checklist" summary until I've finished all the episodes. In the meantime I'll post my comments here so that there's something to get started with. And of course, some of the comments are suggestions/questions, so feel free to answer as you wish!

Most of my comments are clarity/cognitive-load related, which is inevitable given the topic, but I do think the lesson does a stellar job of teaching Deep Learning already!

supplementary material (i.e. bonus material, instructor notes, references, etc)

  • Overall I'm happy with the non-episode content.
  • There are just a few small typos, wording tweaks, and a url fix (pesky non-static internet!) which I'll submit in a separate pull request and link back here (PR here).

Summary and Setup

  • Good!
  • We use a cloud environment when we teach this, and downloading and uploading the datasets is always fiddly, so we use the Episode 3 instructor notes to download them in-line in Python. I think it's worth moving the download code(s) into a specific "Instructor callout" in the episodes themselves, the same way that breaks are mentioned.
  • It might be worth exposing the "dataset download codes" to the learner view as well? Of course, there are pros and cons to the "downloading on-the-fly" approach (a rough sketch follows this list).
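
Purely as a sketch of the "downloading on-the-fly" idea (the URL and filename are placeholders, not the lesson's actual links):

```python
from pathlib import Path
from urllib.request import urlretrieve

url = "https://example.org/data/weather.csv"  # placeholder URL
target = Path("data/weather.csv")
target.parent.mkdir(parents=True, exist_ok=True)

# Fetch the file only if it is not already present locally
if not target.exists():
    urlretrieve(url, str(target))
```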

Episode 1 - Introduction

Overview questions & objectives

  • I think the Episode 1 & 2 "questions & objectives" may have gotten a bit mixed up at some point.
  • Questions: "what is a neural network" should be in this episode instead of episode 2
  • Objectives: "Identify the inputs and outputs of a deep neural network." should be moved to episode 2
  • Questions and objectives could be re-ordered chronologically according to when they are covered in the episode (the other episodes do this I believe)

Figure fig/01_AI_ML_DL_differences.png

  • The alt text of fig/01_AI_ML_DL_differences.png uses the acronyms AI/ML/DL/NN. NN is the only one that isn't defined in the episode, so it is worth defining it somewhere, or just expanding out the acronyms.
  • The cited source for fig/01_AI_ML_DL_differences.png doesn't seem to line up with that specific image. I think it might belong to Intel, according to this source that cites it, the Intel Introduction to AI course Week 1 slides, and the colour palette compared to Intel's other infographic-style icons. But I'm not certain...
  • Is it worth putting in a generative AI subset within the DL circle, like this example?

Activation functions

  • Activation functions (and specifically ReLU) are first mentioned here but not really introduced.
  • Shortly after, there is an Activation Functions Challenge, but there isn't really any episode content to prepare learners for this challenge.
  • The acronym ReLU isn't defined anywhere, and the mathematical/programmatic definition is scattered through the lesson (a one-line definition is sketched after this list).
  • I think activation functions could benefit from a dedicated section, in a similar way to how the loss and optimisers are treated. This section could then provide the challenges for learners to work through. Like loss and optimisers, this dedicated section could even be deferred to later episodes to ease the upfront cognitive load.
  • Very minor point: we mention ReLU for the first time and provide it in an equation example, but the neuron example just below that equation (fig/01_neuron.png) shows a sigmoid.
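
For reference, the definition itself is a one-liner, so a dedicated section would not need much; a sketch (not the lesson's exact code):

```python
import numpy as np

def relu(x):
    """ReLU (Rectified Linear Unit): passes positive inputs through, zeroes the rest."""
    return np.maximum(0, x)

relu(np.array([-2.0, -0.5, 0.0, 1.5]))
# -> array([0. , 0. , 0. , 1.5])
```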

Neural network images

  • Figure fig/01_xor_exercise.png in this exercise is the first time we see an NN with values embedded in the image, so it might be worth including a bit more explanation to reduce the initial shock / cognitive load of understanding it.
  • Similarly, figure fig/01_deep_network.png is another, different way of looking at NN architecture and might be a bit heavy in terms of cognitive load (I appreciate DL/NNs are quite complex, full stop!)

A few hyperlinks

Episode 2 - Classification by a neural network using Keras

Overview questions & objectives

  • RECAP I think the Episode 1 & 2 "questions & objectives" may have gotten a bit mixed up at some point.
  • RECAP Questions: "what is a neural network" should be in episode 1 instead of this episode
  • RECAP Objectives: "Identify the inputs and outputs of a deep neural network." should be moved to this episode

Instructor note on the episode goal

  • this note is really good at explaining "we will go through the full workflow once, and then go through later in greater detail"!
  • Is it worth trying to emphasise this even more in the actual learner content of the lesson?

Palmer penguin links

  • quite a few of these links are now outdated (and one now points to a spam site!). (PR here).

One hot encoding

  • a question that was raised a few times when we delivered this lesson in past workshops was "why do we use 3 columns of 0/1 instead of just one column of 0/1/2 for the 3 labels?", i.e. why one-hot instead of label encoding (without learners knowing those names)
  • It might be worth elaborating on why we want to "make things more complex" via one-hot encoding, i.e. discussing the "ranking issue" of label encoding (illustrated in the sketch after this list).
  • This may slow things down due to extra detail, but this was a point where we risked our learners falling behind without a proper explanation.
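
A minimal sketch of the contrast I have in mind, using Keras' to_categorical (the lesson may use a different helper):

```python
from tensorflow.keras.utils import to_categorical

# Label encoding: a single 0/1/2 column implies an ordering and distances
# between the species that do not actually exist
labels = [0, 2, 1]

# One-hot encoding: three 0/1 columns, with no implied ranking
to_categorical(labels, num_classes=3)
# array([[1., 0., 0.],
#        [0., 0., 1.],
#        [0., 1., 0.]], dtype=float32)
```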

Random seeds

  • it might be worth emphasising when and when not to use random seeds, in case learners simply "copy" what they learnt here? (The sketch below shows the kind of distinction I mean.)
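
Something like this distinction, as a sketch assuming a TensorFlow/Keras setup:

```python
import tensorflow as tf

# In a lesson or a bug report: fix the seed so everyone sees the same numbers
tf.keras.utils.set_random_seed(42)

# In real experiments: leave the seed unset, or repeat with several seeds,
# so that reported performance does not hinge on one lucky initialisation
```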

Phrasing

  • there are just a few words/sentences that could be tweaked (PR here).

Chinstraps absent from confusion matrix

  • Do we ever explain/explore why the chinstraps are not present in the predictions in the confusion matrix? Is it the specific random seed? The train/test split has stratify/shuffle=True, so the right steps were taken to avoid this issue... (short of a bad seed)
  • It might be worth answering, so that learners can see how to tackle the "black box" nature of DL/NNs. Leaving it unanswered, or saying "a bad model", doesn't seem very satisfying or good practice (a small diagnostic sketch follows this list).
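
For example, a small diagnostic along these lines (toy labels, not the lesson's actual predictions):

```python
import numpy as np
from sklearn.metrics import confusion_matrix

y_true = np.array([0, 1, 2, 1, 0, 2])  # toy ground truth
y_pred = np.array([0, 1, 0, 1, 0, 0])  # toy predictions: class 2 never appears

print(confusion_matrix(y_true, y_pred))  # an all-zero column = never predicted
print(np.bincount(y_pred, minlength=3))  # per-class prediction counts

# Re-running training with a few different seeds would show whether the
# missing class is down to a "bad seed" or a systematic problem with the model.
```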

General comments

  • Is it worth putting acronyms+definitions in the glossary? There are a few.
  • There is already an issue raised for this, but I noticed a few places where "workshop/classes" was used instead of "lesson/episode"

Once again, it's a nice lesson :)

@svenvanderburg
Author

@mike-ivs thanks for your comments so far! Super useful. 🙏 Looking forward to the rest!

@likeajumprope

likeajumprope commented Jul 9, 2024 via email

@brownsarahm
Collaborator

Checking in @likeajumprope and @mike-ivs: could you each provide an updated ETA for your review this week (or the review itself, if you happen to be done)?

@mike-ivs

mike-ivs commented Jul 24, 2024

My apologies, life got in the way since my last post!

I've had a look at the changes so far (carpentries-incubator/deep-learning-intro#482) and am very happy with them :) I'll submit my relevant link/typo PRs shortly. (PR here).

In terms of ETAs, I aim to get the rest of the review finished by the 2nd of August at the latest, hopefully by the end of this week.

@svenvanderburg
Author

@mike-ivs no worries. Looking forward to the rest of your comments :)

@mike-ivs

As promised, here are the remaining comments. I'll post the reviewer checklist after this, along with overall comments/summary. Again, I'm very happy with the lesson, and most of my comments are aimed at improving the clarity / further reducing the cognitive load of a fairly heavy topic! (The lesson does a very good job already.)

Episode 3 - Monitor the training process

2) Identify inputs and outputs

  • we don't explicitly mention what the outputs are here, i.e. BASEL_sunshine for Day=i+1. Mentioning it might transition things nicely into the data-prepping section where we pull out the i+1 labels (something like the sketch below)
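
e.g. something in this spirit (only the column name is taken from the episode; the surrounding code is a guess):

```python
import pandas as pd

# Toy stand-in for the weather data
data = pd.DataFrame({"BASEL_sunshine": [3.1, 5.2, 0.0, 7.4]})

# The label for day i is the sunshine on day i+1
y = data["BASEL_sunshine"].shift(-1).iloc[:-1]
X = data.iloc[:-1]  # features for day i; the last day has no label
```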

4) Choose a pretrained model or start building architecture from scratch

  • small typo in "function as a it proved"
  • here we define the create_nn function
    • we pull X_data from the global scope rather than passing it in explicitly as an argument (which makes me feel uncomfortable!)
    • in the "Try to reduce the degree of overfitting" exercise we redefine create_nn, but this time pass in node1 and node2 as arguments. We could reduce repetition by doing this the first time round (roughly as sketched after this list)?
    • in the "BatchNorm" section we redefine create_nn again. Is it worth giving this a different name, for clarity / to avoid overriding the earlier model?

6) Train the model

  • we introduce batch_size "for the first time" here, but we have already introduced it in the "Intermezzo: Batch gradient descent" section. I'd reword to say something like "As we discussed earlier"

9) Refine the model

  • "Despite avoiding severe cases of overfitting," change to "In addition to avoiding severe cases of overfitting,"
  • "Instead of comparing training runs for different number of epochs, early stopping allows to simply set the number of epochs to a desired maximum value." This bit confused me... early stopping would cause different numbers of epochs, wouldn't it? The max epochs is set in the model fitting, not by the early_stopping.

Episode 4: Advanced layer types

Dropout

  • ""Let us add a dropout layer after each pooling layertowards the end of the network, that randomly drops 80% of the nodes.""
    • slight typo in "layertowards"
    • we only actually add one dropout after the final pooling in create_nn_with_dropout(), not after ever pooling like we said
    • slight inconsistency: in the "Challenge: Vary dropout rate" the solution DOES have a dropout layer after every pooling layer
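
For reference, the exercise-solution shape is roughly this (a sketch; filter counts and input shape are placeholders):

```python
from tensorflow import keras

def create_nn_with_dropout(dropout_rate=0.8):
    """Sketch with a Dropout layer after each pooling layer."""
    inputs = keras.Input(shape=(32, 32, 3))  # placeholder input shape
    x = keras.layers.Conv2D(32, (3, 3), activation="relu")(inputs)
    x = keras.layers.MaxPooling2D((2, 2))(x)
    x = keras.layers.Dropout(dropout_rate)(x)
    x = keras.layers.Conv2D(64, (3, 3), activation="relu")(x)
    x = keras.layers.MaxPooling2D((2, 2))(x)
    x = keras.layers.Dropout(dropout_rate)(x)
    x = keras.layers.Flatten()(x)
    outputs = keras.layers.Dense(10, activation="softmax")(x)
    return keras.Model(inputs, outputs)
```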

pip install keras_tuner

  • I realise it's in a potentially "watch only" section, but should we move this to the pre-lesson installation instructions? Is there a good enough reason to install part-way through? It's a potentially optional section, so I don't feel too strongly about it either way.

Episode 5: Transfer Learning

  • it might be worth giving an overview (perhaps with a "conceptual" diagram) at the start of the episode of how we'll "wrap around" an existing model in order to use it. This might iron out the cognitive load of section 4.

2) Identify inputs and outputs

  • we should at least mention them (the inputs/outputs), perhaps by simply referring to the previous episode's work and acknowledging that we'll cover them at the beginning of section 4

4) Choose a pre-trained model or start building architecture from scratch

  • the beginning of section 4 encroaches a bit on the (lightweight) sections 2 + 3.
    • i.e. the inputs are defined in section 4, and the upscale layer is, conceptually, a kind of data prep
    • it might be worth making it clearer earlier in the episode that steps 2+3 are somewhat dependent on the pretrained model. If we mention this in steps 2+3, it will help the workflow steps stick to their "conceptual lanes" a bit more (I appreciate that in the wild they merge together!). Section 4 is pretty chock-a-block with a bit of everything already, so introducing things earlier could help reduce its cognitive load.
  • referring to the top/head of the NN
    • it might not be clear what the "top/head" of an NN is when learning this: i.e. is it the first or the last layer?
    • we build our NNs from bottom (input) to top (output), but read them top to bottom in the code / Keras summary. The word "deep" also implies that the bottom is "further in". (I guess it's too late to call it TALL learning.)
    • A diagram would help clear this up by explaining where we add our own architecture wrapper around the existing model (the sketch after this list shows the same idea in code).
    • maybe a figure like this (source), maybe something even simpler
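
And to show the same "head" idea in code (a sketch only; the base model, shapes, and class count are placeholders, not necessarily what the episode uses):

```python
from tensorflow import keras

# A pretrained-style base with its original classification "head" removed;
# in practice you would pass weights="imagenet" instead of None
base = keras.applications.DenseNet121(include_top=False, weights=None,
                                      input_shape=(32, 32, 3))
base.trainable = False  # freeze the base

inputs = keras.Input(shape=(32, 32, 3))
x = base(inputs)
x = keras.layers.GlobalAveragePooling2D()(x)
outputs = keras.layers.Dense(10, activation="softmax")(x)  # our new head
model = keras.Model(inputs, outputs)
```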

Outlook

  • very minor: I'd bump the instructor note to above the paragraph it references... just in case instructors need a reminder beforehand! (They all prep beforehand, right??)

@mike-ivs

Summary

Very happy with lesson overall and I would say it is pretty much ready to graduate beyond the incubator.

Quite a few of my comments are suggestions, and mostly aimed at improving clarity / reducing cognitive load on an inescapably concept-heavy topic.

The lesson contributors+maintainers+testers have done a great job!

Reviewer Checklist

Accessibility

  • The alternative text of all figures is accurate and sufficiently detailed *.
    • Large and/or complex figures may not be described completely in the alt text of the image and instead be described elsewhere in the main body of the episode.
  • The lesson content does not make extensive use of colloquialisms, region- or culture-specific references, or idioms.
  • The lesson content does not make extensive use of contractions (“can’t” instead of “cannot”, “we’ve” instead of “we have”, etc).

* To view the alternative text of an image, we recommend using
the WAVE Web Accessibility Evaluation Tool or associated browser extensions.
You can also inspect the source HTML of the image element in the developer tools of your web browser,
or consult the source (R)Markdown file for the relevant page in the lesson repository on GitHub.
For more information about what makes good alternative text for an image,
read How to Design Great Alt Text: An Introduction,
and Writing Alt Text for Data Visualization

Content

  • The lesson content:
    • conforms to The Carpentries Code of Conduct.
    • meets the objectives defined by the authors.
    • is appropriate for the target audience identified for the lesson.
    • is accurate.
    • is descriptive and easy to understand.
    • is appropriately structured to manage cognitive load.
    • does not use dismissive language.
  • Tools used in the lesson are open source or, where tools used are closed source/proprietary, there is a good reason for this e.g. no open source alternatives are available or widely-used in the lesson domain.
  • Any example data sets used in the lesson are accessible, well-described, available under a CC0 license, and representative of data typically encountered in the domain.
  • The lesson does not make use of superfluous data sets, e.g. increasing cognitive load for learners by introducing a new data set instead of reusing another that is already present in the lesson.
  • The example tasks and narrative of the lesson are appropriate and realistic.
  • The solutions to all exercises are accurate and sufficiently explained.
  • The lesson includes exercises in a variety of formats.
  • Exercise tasks and formats are appropriate for the expected experience level of the target audience.
  • All lesson and episode objectives are assessed by exercises or another opportunity for formative assessment.
  • Exercises are designed with diagnostic power.

Design

  • Learning objectives for the lesson and its episodes are clear, descriptive, and measurable. They focus on the skills being taught and not the functions/tools e.g. “filter the rows of a data frame based on the contents of one or more columns,” rather than “use the filter function on a data frame.”
  • The target audience identified for the lesson is specific and realistic.

Supporting information

  • The list of required prior skills and/or knowledge is complete and accurate.
  • The setup and installation instructions are complete, accurate, and easy to follow.
  • No key terms are missing from the lesson glossary or are not linked to definitions in an external glossary e.g. Glosario.

@svenvanderburg
Author

Great! @mike-ivs thank you for your review! 🙏

@brownsarahm
Collaborator

Hi @likeajumprope, checking in for an ETA on your review.

@svenvanderburg
Author

Happy belated 1 year anniversary! 🎉

On the 1st of September we had our anniversary celebrating that we are now 1 year under review 🎉🙈😂

Sorry for the sarcasm ;) If we can do anything to speed up the process (also for future reviews), please let us know. Thanks again everyone for your valuable input to the lesson!

@likeajumprope

> Happy belated 1 year anniversary! 🎉
>
> On the 1st of September we had our anniversary celebrating that we are now 1 year under review 🎉🙈😂
>
> Sorry for the sarcasm ;) If we can do anything to speed up the process (also for future reviews), please let us know. Thanks again everyone for your valuable input to the lesson!

Hi, I was invited to the project at the end of May this year. I know that this is also quite a stretch, but this is free labor that I am doing in addition to my work. :) I am 3/4 done and will send a summary asap.

@tobyhodges
Member

Thank you for keeping us updated, @likeajumprope. I want to stress how much we appreciate the time and effort that you, @mike-ivs, @brownsarahm, and others are volunteering to make this review happen. We could not do it without you and I recognise that activities like this one always compete for time with many other things that often must take higher priority. Please let me or @brownsarahm know if there is anything that we can do to support you.

@svenvanderburg I understand your frustration but please remain respectful in your communications. Everyone is trying their best while faced with many competing priorities. Consider for example that delays on my part have contributed considerably more to the extended duration of the process -- and I am one of the authors, plus compensated for the time I spend on it! We cannot say the same for the guest editor or reviewers. I note also that although the process may be slower than you might like, the duration of this review has been fairly typical of what we have seen so far on the Lab and at other places that operate open peer review.

Writing as an author of the lesson, instead of focusing on the time it can take to get reviewer feedback I choose to reflect on how valuable the input we have received so far has been. For example, the editorial comments from @brownsarahm have helped us make a big improvement to the example data used in the lesson. And @mike-ivs has provided a pretty forensic analysis of how the order and emphasis of our lesson content could be adjusted to make it flow as well as possible. I am looking forward to finding out how @likeajumprope's comments will help us make the lesson even better!

@svenvanderburg
Author

Thanks @tobyhodges for your kind and mediating words. Sorry if this came across as disrespectful; it was not meant that way, but I can see how it could feel like that. Like my wife says (she is a primary school teacher): it's only funny if everyone can laugh about it. I think I should have put more emphasis on how much I appreciate everyone's valuable time spent, also yours @likeajumprope 🙏

I'm looking forward to your comments @likeajumprope :)

@svenvanderburg
Author

@likeajumprope can you update us on the progress of your review? (Where no progress is also a valid update 😉 )
