Other criteria for higher-level badges

Introduction

This document describes the criteria for badges beyond the "passing" level (the "silver" and "gold" levels). Please post an issue if you have comments on these criteria. These criteria build on the criteria for the "passing" level. Note that passing is by itself an achievement; at the time of this writing only about 10% of projects that are pursuing a badge achieve the passing level.

These higher levels are named "silver" and "gold":

Silver (formerly "passing+1") is a more stringent set of criteria than the passing level, but these criteria are still expected to be achievable by small and single-organization projects.
Gold (formerly "passing+2") is an even more stringent set of criteria than the silver level, and includes criteria that are possibly not achievable by small or single-organization projects.

Historically we called these levels "passing+1" and "passing+2" so that we could create their names as a separate process. We considered naming systems such as the LEED certification naming system of certified, silver, gold, platinum and how the Linux Foundation ranks membership (silver, gold, platinum). An alternative is the Olympic system naming (bronze, silver, gold). We chose these names in part because we could add yet another level later (presumably calling it platinum).

Note that in several cases SHOULD criteria become MUST in higher level badges, and SUGGESTED criteria at lower levels become SHOULD or MUST in higher level badges.

Silver (passing+1) criteria

You must achieve the lower (passing) badge. In addition, some SHOULD will become MUST, and some SUGGESTED will become SHOULD or MUST.

Upgrade of SHOULD and SUGGESTED

Upgrade some "passing" level SHOULD and SUGGESTED:

Upgrade: Basics

Upgrade contribution_requirements from SHOULD to MUST. "The information on how to contribute MUST include the requirements for acceptable contributions (e.g., a reference to any required coding standard)."
Upgrade report_tracker from SHOULD to MUST - "The project MUST use an issue tracker for tracking individual issues."

Note: The Linux kernel project has indicated that using an issue tracker is difficult at their scale.
Unchanged:
- floss_license_osi - "It is SUGGESTED that any required license(s) be approved by the Open Source Initiative (OSI)."
- english - "The project SHOULD include documentation in English and be able to accept bug reports and comments about code in English."

Upgrade: Change Control

Unchanged:
- repo_distributed - "It is SUGGESTED that common distributed version control software be used (e.g., git)."
- version_semver - "It is SUGGESTED that the Semantic Versioning (SemVer) format be used for releases."
- version_tags - "It is SUGGESTED that projects identify each release within their version control system. For example, it is SUGGESTED that those using git identify each release using git tags."

Upgrade: Reporting

Unchanged:
- enhancement_responses: "The project SHOULD respond to a majority of enhancement requests in the last 2-12 months (inclusive)."

Upgrade: Quality

tests_documented_added. Upgrade from SUGGESTED to MUST. "The project MUST include, in its documented instructions for change proposals, the policy that tests are to be added for major new functionality."
This is a change from, "It is SUGGESTED that this policy on adding tests be documented in the instructions for change proposals."
warnings_strict - Projects MUST be maximally strict with warnings, where practical. ^{[warnings_strict]}

Details: Some warnings cannot be effectively enabled on some projects. What is needed is evidence that the project is striving to enable warning flags where it can, so that errors are detected early.
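
As one illustration of being strict with warnings, a Python project might promote warnings to errors while its tests run, so that a new warning fails early instead of scrolling past unnoticed. This is only a minimal sketch under that assumption: `legacy_api` and `is_warning_free` are hypothetical names, and projects in compiled languages would typically rely on compiler warning flags instead.

```python
import warnings


def legacy_api():
    """Hypothetical function that emits a deprecation warning."""
    warnings.warn("legacy_api is deprecated", DeprecationWarning, stacklevel=2)
    return 42


def is_warning_free(func):
    """Return True if calling func() raises no warning of any category."""
    with warnings.catch_warnings():
        # Promote every warning to an exception for the duration of the call.
        warnings.simplefilter("error")
        try:
            func()
        except Warning:
            return False
    return True


if __name__ == "__main__":
    print(is_warning_free(lambda: 1))
    print(is_warning_free(legacy_api))
```

Where a particular warning cannot practically be fixed, a narrower filter for that one category can be added, which keeps the rest of the checks strict.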
Unchanged:
- build_common_tools - "It is SUGGESTED that common tools be used for building the software."
- build_floss_tools - "The project SHOULD be buildable using only FLOSS tools."
- test_invocation - "A test suite SHOULD be invocable in a standard way for that language."
- test_most - "It is SUGGESTED that the test suite cover most (or ideally all) the code branches, input fields, and functionality."
  
  NOTE: Statement/branch coverage is covered separately; they are increased, so we are not changing the level of this one.
- test_continuous_integration - "It is SUGGESTED that the project implement continuous integration (where new or changed code is frequently integrated into a central code repository and automated tests are run on the result)."
  
  NOTE: This is upgraded to MUST in gold, and silver adds an intermediate criterion called automated_integration_testing.

Upgrade: Security

Upgrade crypto_weaknesses from SHOULD to MUST. The default security mechanisms within the software produced by the project MUST NOT depend on cryptographic algorithms or modes with known serious weaknesses (e.g., the SHA-1 cryptographic hash algorithm or the CBC mode in SSH).
Unchanged:
- crypto_call
- crypto_pfs
- vulnerabilities_critical_fixed "Projects SHOULD fix all critical vulnerabilities rapidly after they are reported."
  
  NOTE: We'd like this to always be true, but some vulnerabilities are hard to fix, so it's difficult to mandate this. We could require active work toward a fix, and that is worth considering, but we are uncertain that would really help users. So we have left it this way.

Upgrade: Analysis

Upgrade static_analysis_common_vulnerabilities from SUGGESTED to MUST: "A project MUST use at least one static analysis tool with rules or approaches to look for common vulnerabilities in the analyzed language or environment, if there is at least one FLOSS tool that can implement this criterion in the selected language."

NOTE: We'd like all projects to use this kind of static analysis tool, but there may not be one in the chosen language, or it may only be proprietary (and some developers will therefore not use it).
Upgrade dynamic_analysis_unsafe from SUGGESTED to MUST. If the software produced by the project includes software written using a memory-unsafe language (e.g., C or C++), then at least one dynamic tool (e.g., a fuzzer or web application scanner) MUST be routinely used in combination with a mechanism to detect memory safety problems such as buffer overwrites. If the project does not produce software written in a memory-unsafe language, choose "not applicable" (N/A).

NOTE: This would mean that C/C++ projects would be required to use something like ASAN during some testing and/or fuzz testing. Consider giving links to asan/msan/tsan/ubsan and libFuzzer.
Unchanged:
- static_analysis_often - "It is SUGGESTED that static source code analysis occur on every commit or at least daily."
- dynamic_analysis - "It is SUGGESTED that at least one dynamic analysis tool be applied to any proposed major production release of the software before its release."
  
  Note: There are good arguments for increasing this at silver; clearly, using different kinds of tools can find different things. However, while these tools can find problems, they often miss many, and it's often harder to backtrack the problem to determine how to fix it. We encourage use of these tools, but that doesn't mean we should mandate them in all cases at this level. We've chosen instead to focus on requiring these kinds of tools for memory-unsafe languages, and thus upgraded dynamic_analysis_unsafe.
- dynamic_analysis_enable_assertions - "It is SUGGESTED that the software include many run-time assertions that are checked during dynamic analysis."

Basics

The project SHOULD have a legal mechanism where all developers of non-trivial amounts of project software assert that they are legally authorized to make these contributions. The most common and easily-implemented approach for doing this is by using a Developer Certificate of Origin (DCO), where users add "signed-off-by" in their commits and the project links to the DCO website. However, this MAY be implemented as a Contributor License Agreement (CLA), or other legal mechanism. ^{[dco]}

Details: The DCO is the recommended mechanism because it's easy to implement, tracked in the source code, and git directly supports a "signed-off" feature using "commit -s". To be most effective it is best if the project documentation explains what "signed-off" means for that project. A CLA is a legal agreement that defines the terms under which intellectual works have been licensed to an organization or project. A contributor assignment agreement (CAA) is a legal agreement that transfers rights in an intellectual work to another party; projects are not required to have CAAs, since having a CAA increases the risk that potential contributors will not contribute, especially if the receiver is a for-profit organization. The Apache Software Foundation CLAs (the individual contributor license and the corporate CLA) are examples of CLAs, for projects which determine that the risks of these kinds of CLAs to the project are less than their benefits.
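
A project that adopts the DCO may want to check mechanically that commits carry a sign-off. The sketch below assumes the trailer format that `git commit -s` appends ("Signed-off-by: Name <email>"); the function name `has_signoff` and the deliberately loose email pattern are illustrative choices, not part of any criterion.

```python
import re

# Trailer line appended by `git commit -s`: "Signed-off-by: Name <email>".
# The email pattern is intentionally loose; it only checks the overall shape.
SIGNOFF_RE = re.compile(
    r"^Signed-off-by: .+ <[^<>\s]+@[^<>\s]+>$",
    re.MULTILINE,
)


def has_signoff(commit_message: str) -> bool:
    """Return True if the commit message contains a Signed-off-by trailer."""
    return SIGNOFF_RE.search(commit_message) is not None


if __name__ == "__main__":
    message = "Fix parser crash\n\nSigned-off-by: Jane Doe <jane@example.org>\n"
    print(has_signoff(message))
```

A check like this could run in a commit hook or in automated testing, with the commit messages supplied by whatever tooling the project already uses.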
The project MUST adopt a code of conduct and post it in a standard location. ^{[code_of_conduct]}

Details: Projects may be able to improve the civility of their community and to set expectations about acceptable conduct by adopting a code of conduct. This can help avoid problems before they occur and make the project a more welcoming place to encourage contributions. This should focus only on behavior within the community/workplace of the project. Example codes of conduct are the Linux kernel code of conflict, the Contributor Covenant Code of Conduct, the Debian Code of Conduct, the Ubuntu Code of Conduct, the Fedora Code of Conduct, the GNOME Code Of Conduct, the KDE Community Code of Conduct, the Python Community Code of Conduct, The Ruby Community Conduct Guideline, and The Rust Code of Conduct.

Rationale: Suggested in issue#608 by Dan Kohn and in the NYC 2016 brainstorm session. The long list of examples shows that many widely-used FLOSS projects have a code of conduct (obviously there's no way to list them all!).
The project MUST clearly define and document its project governance model (the way it makes decisions, including key roles). ^{[governance]}

Details: There needs to be some well-established documented way to make decisions and resolve disputes. In small projects, this may be as simple as "the project owner and lead makes all final decisions". There are various governance models, including benevolent dictator and formal meritocracy; for more details, see Governance models. Both centralized (e.g., single-maintainer) and decentralized (e.g., group maintainers) approaches have been successfully used in projects. The governance information does not need to document the possibility of creating a project fork, since that is always possible for FLOSS projects.

Rationale: There are many different governance models used by a wide array of successful projects. Therefore, we do not believe that we should specify a particular governance model. However, we do think it is important to have a governance model, and clearly define it, so that all participants and potential participants will know how decisions will be made. This was inspired by the OW2 Open-source Maturity Model, in particular RDMP-1 and STK-1.
The project MUST clearly define and publicly document the key roles in the project and their responsibilities, including any tasks those roles must perform. It MUST be clear who has which role(s), though this might not be documented in the same way. ^{[roles_responsibilities]}

Details: The documentation for governance and roles and responsibilities may be in one place.

Rationale: Much knowledge about the project roles builds up over the years, and is not sufficiently passed down to new people. Documenting the roles can help recruit, train, and mentor new project members. Projects may choose to document the roles and responsibilities in one place, and identify who has the roles separately, so that the project doesn't need to update the role information when people change roles. The goal is to make underlying assumptions clear.

Documentation

The project MUST have a documented roadmap that describes what the project intends to do and not do for at least the next year. ^{[documentation_roadmap]}

Details: The project might not achieve the roadmap, and that's fine; the purpose of the roadmap is to help potential users and contributors understand the intended direction of the project. It need not be detailed.
The project MUST include documentation of the architecture (aka high-level design) of the software produced by the project. If the project does not produce software, select "not applicable" (N/A). ^{[documentation_architecture]}

Details: A software architecture explains a program's fundamental structures, i.e., the program's major components, the relationships among them, and the key properties of these components and relationships.

Rationale: Documenting the basic design makes it easier for potential new developers to understand its basics. This is related to know_secure_design, as well as implement_secure_design and proposed documentation_security.
The project MUST document what the user can and cannot expect in terms of security from the software produced by the project. The project MUST identify the security requirements that the software is intended to meet and an assurance case that justifies why these requirements are met. The assurance case MUST include: a description of the threat model, clear identification of trust boundaries, and evidence that common security weaknesses have been countered. ^{[documentation_security]}

Details: An assurance case is "a documented body of evidence that provides a convincing and valid argument that a specified set of critical claims regarding a system’s properties are adequately justified for a given application in a given environment" ("Software Assurance Using Structured Assurance Case Models", Thomas Rhodes et al, NIST Interagency Report 7608). Trust boundaries are boundaries where data or execution changes its level of trust, e.g., a server's boundaries in a typical web application.

Rationale: Writing the specification helps the developers think about the interface (including the API) the developers are providing, as well as letting any user or researcher know what to expect. Many sources discuss the rationale for an "assurance case". This was inspired by Security specification and facilitation of bug bounties and by the NYC 2016 brainstorming session.
The project MUST provide a "quick start" guide for new users to help them quickly do something with the software. ^{[documentation_quick_start]}

Details: The idea is to show users how to get started and make the software do anything at all. This is critically important for potential users to get started.

Rationale: This is based on a conversation with Mike Milinkovich, Executive Director of the Eclipse Foundation, about the OSS project criteria and "what is important". He believes, based on his long experience, that it is critically important that any project have some sort of "quick start" guide to help someone get started and do something with the software; this feeling of accomplishment and demonstration that it works builds understanding and confidence in the user. coreinfrastructure#645
The project MUST make an effort to keep the documentation consistent with the current version of the project results (including software produced by the project). Any known documentation defects making it inconsistent MUST be fixed. If the documentation is generally current, but erroneously includes some older information that is no longer true, just treat that as a defect, then track and fix as usual. ^{[documentation_current]}

Details: The documentation MAY include information about differences or changes between versions of the software and/or link to older versions of the documentation. The intent of this criterion is that an effort is made to keep the documentation consistent, not that the documentation must be perfect.

Rationale: It's difficult to keep documentation up-to-date, so the criterion is worded this way to make it more practical. Information on differences or changes between versions of the software helps users of older versions and users who are transitioning from older versions.
The project repository front page and/or website MUST identify and hyperlink to any achievements, including this best practices badge, within 48 hours of public recognition that the achievement has been attained. ^{[documentation_achievements]}

Details: An achievement is any set of external criteria that the project has specifically worked to meet, including some badges. This information does not need to be on the project website front page. A project using GitHub can put achievements on the repository front page by adding them to the README file.

Rationale: Users and potential co-developers need to be able to see what achievements have been attained by a project they are considering using or contributing to. This information can help them determine if they should. In addition, if projects identify their achievements, other projects will be encouraged to follow suit and also make those achievements, benefitting everyone.

Other

If the project sites (website, repository, and download URLs) store passwords for authentication of external users, the passwords MUST be stored as iterated hashes with a per-user salt by using a key stretching (iterated) algorithm (e.g., PBKDF2, Bcrypt or Scrypt). If the project sites do not store passwords for this purpose, select N/A.

Details: Note that the use of GitHub meets this criterion. This criterion only applies to passwords used for authentication of external users into the project sites. If the project sites must log in to other sites, they may need to store passwords for that purpose differently (since using an algorithm like Bcrypt would make those passwords useless). This applies criterion crypto_password_storage to the project sites, similar to sites_https.
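
For project sites that do store passwords themselves, the following is a minimal sketch of salted, iterated hashing using PBKDF2 from the Python standard library. The function names, the 16-byte salt length, and the iteration count are assumptions made for illustration; the criterion itself only requires a per-user salt and a key stretching algorithm.

```python
import hashlib
import hmac
import os

# Assumed work factor for illustration only; choose a value suited to your hardware.
DEFAULT_ITERATIONS = 600_000


def hash_password(password: str, *, iterations: int = DEFAULT_ITERATIONS):
    """Return (salt, iterations, digest) for storage alongside the user record."""
    salt = os.urandom(16)  # fresh random salt for each user
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, iterations, digest


def verify_password(password: str, salt: bytes, iterations: int, expected: bytes) -> bool:
    """Recompute the iterated hash and compare it in constant time."""
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return hmac.compare_digest(candidate, expected)


if __name__ == "__main__":
    salt, iterations, digest = hash_password("correct horse battery staple")
    print(verify_password("correct horse battery staple", salt, iterations, digest))
```

Storing the iteration count next to the salt and digest lets a site raise the work factor later without invalidating existing records.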

Accessibility and Internationalization

The project (both project sites and project results) SHOULD follow accessibility best practices so that persons with disabilities can still participate in the project and use the project results where it is reasonable to do so. ^{[accessibility_best_practices]}

Details: For web applications, see the Web Content Accessibility Guidelines (WCAG 2.0) and its supporting document Understanding WCAG 2.0; see also W3C accessibility information. For GUI applications, consider using the environment-specific accessibility guidelines (such as Gnome, KDE, XFCE, Android, iOS, Mac, and Windows). Some TUI applications (e.g. ncurses programs) can do certain things to make themselves more accessible (such as alpine's force-arrow-cursor setting). Most command-line applications are fairly accessible as-is. This criterion is often N/A, e.g., for program libraries. Here are some examples of actions to take or issues to consider:
- Provide text alternatives for any non-text content so that it can be changed into other forms people need, such as large print, braille, speech, symbols or simpler language (WCAG 2.0 guideline 1.1)
- Color is not used as the only visual means of conveying information, indicating an action, prompting a response, or distinguishing a visual element. (WCAG 2.0 guideline 1.4.1)
- The visual presentation of text and images of text has a contrast ratio of at least 4.5:1, except for large text, incidental text, and logotypes (WCAG 2.0 guideline 1.4.3)
- Make all functionality available from a keyboard (WCAG 2.0 guideline 2.1)
- A GUI or web-based project SHOULD test with at least one screen-reader on the target platform(s) (e.g. NVDA, Jaws, or WindowEyes on Windows; VoiceOver on Mac & iOS; Orca on Linux/BSD; TalkBack on Android). TUI programs MAY work to reduce overdraw to prevent redundant reading by screen-readers.
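
The 4.5:1 contrast ratio mentioned in the list above can be checked programmatically. The sketch below follows the WCAG 2.0 definitions of relative luminance and contrast ratio for sRGB colors; the function names and the use of 8-bit (0-255) RGB tuples are assumptions made for this example.

```python
def _linearize(channel: int) -> float:
    """Convert one 8-bit sRGB channel to its linear value (WCAG 2.0 definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb) -> float:
    """Relative luminance of an (R, G, B) color with 0-255 channels."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(foreground, background) -> float:
    """Contrast ratio (L1 + 0.05) / (L2 + 0.05), where L1 is the lighter color."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def meets_minimum_contrast(foreground, background, threshold: float = 4.5) -> bool:
    """True if the pair reaches the given ratio (4.5:1 by default)."""
    return contrast_ratio(foreground, background) >= threshold


if __name__ == "__main__":
    white = (255, 255, 255)
    print(round(contrast_ratio((0, 0, 0), white), 2))
    print(meets_minimum_contrast((119, 119, 119), white))
```

Note that the exceptions named in the guideline (large text, incidental text, and logotypes) still need human judgment; a check like this only covers the arithmetic.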
The software produced by the project SHOULD be internationalized to enable easy localization for the target audience's culture, region, or language. If internationalization (i18n) does not apply (e.g., the software doesn't generate text intended for end-users and doesn't sort human-readable text), select "not applicable" (N/A). ^{[internationalization]}

Details: Localization "refers to the adaptation of a product, application or document content to meet the language, cultural and other requirements of a specific target market (a locale)." Internationalization is the "design and development of a product, application or document content that enables easy localization for target audiences that vary in culture, region, or language." (See W3C's "Localization vs. Internationalization".) Software meets this criterion simply by being internationalized. No localization for another specific language is required, since once software has been internationalized it's possible for others to work on localization.

Rationale: When software is internationalized, the software can be used by far more people. By itself, that's valuable. In addition, software that can be used by far more people is more likely to lead to larger communities, which increases the likelihood of contributions and reviews.

Continuity

The project MUST be able to continue with minimal interruption if any one person is incapacitated or killed. In particular, the project MUST be able to create and close issues, accept proposed changes, and release versions of software, within a week of confirmation that an individual is incapacitated or killed. This MAY be done by ensuring someone else has any necessary keys, passwords, and legal rights to continue the project. Individuals who run a FLOSS project MAY do this by providing keys in a lockbox and a will providing any needed legal rights (e.g., for DNS names). ^{[access_continuity]}
The project SHOULD have a "bus factor" of 2 or more. ^{[bus_factor]}

Details: A "bus factor" (aka "truck factor") is the minimum number of project members that have to suddenly disappear from a project ("hit by a bus") before the project stalls due to lack of knowledgeable or competent personnel. The truck-factor tool can estimate this for projects on GitHub. For more information, see Assessing the Bus Factor of Git Repositories by Cosentino et al.
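
To make the definition concrete, here is a deliberately simplified greedy estimate. It is an assumption-laden sketch, not the method used by the truck-factor tool or by Cosentino et al.: it assumes each file has a single primary owner (the input mapping is hypothetical data a project would have to derive itself) and treats the project as stalled once more than half of the files have lost their owner.

```python
from collections import Counter


def estimate_bus_factor(file_owners: dict, threshold: float = 0.5) -> int:
    """Count how many top owners must leave before more than `threshold`
    of all files are left without their primary owner."""
    total = len(file_owners)
    if total == 0:
        return 0
    files_per_owner = Counter(file_owners.values())
    orphaned = 0
    removed = 0
    # Remove owners starting with the one who owns the most files.
    for _, count in files_per_owner.most_common():
        orphaned += count
        removed += 1
        if orphaned / total > threshold:
            break
    return removed


if __name__ == "__main__":
    owners = {"core.c": "alice", "io.c": "alice", "net.c": "alice", "docs.md": "bob"}
    print(estimate_bus_factor(owners))
```

Under these assumptions a project dominated by one person scores 1, which falls short of the suggested bus factor of 2 or more.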

Change Control

The project MUST maintain the most often used older versions of the product or provide an upgrade path to newer versions. If the upgrade path is difficult, the project MUST document how to perform the upgrade (e.g., the interfaces that have changed and detailed suggested steps to help upgrade). ^{[maintenance_or_update]}

Rationale: This was inspired by DFCT-1.2

Reporting

The project MUST give credit to the reporter(s) of all vulnerability reports resolved in the last 12 months, except for the reporter(s) who request anonymity. (N/A allowed). ^{[vulnerability_report_credit]}

Details: If there have been no vulnerabilities resolved in the last 12 months, choose "not applicable" (N/A).

Rationale: It is only fair to credit those who provide vulnerability reports. In many cases, the only reporter requirement is that they receive credit. This is also important long-term, because giving credit encourages additional reporting.

Rationale: Recommended in the NYC 2016 brainstorming session.
The project MUST have a documented process for responding to vulnerability reports. ^{[vulnerability_response_process]}

Details: This is strongly related to vulnerability_report_process, which requires that there be a documented way to report vulnerabilities. It is also related to vulnerability_report_response, which requires response to vulnerability reports within a certain time frame.

Rationale: This is inspired by Apache Project Maturity Model QU30.

Quality

Test

An automated test suite MUST be applied on each check-in to a shared repository for at least one branch. This test suite MUST produce a report on test success or failure. ^{[automated_integration_testing]}

Details: This requirement can be viewed as a subset of test_continuous_integration, but focused on just testing, without requiring continuous integration.

Rationale: This is inspired by continuous integration. Continuous integration provides much more rapid feedback on whether or not changes will cause test failures, including regressions. The term "continuous integration" (CI) is defined in Wikipedia as "merging all developer working copies to a shared mainline several times a day". Martin Fowler says that "Continuous Integration is a software development practice where members of a team integrate their work frequently, usually each person integrates at least daily - leading to multiple integrations per day. Each integration is verified by an automated build (including test) to detect integration errors as quickly as possible. Many teams find that this approach leads to significantly reduced integration problems and allows a team to develop cohesive software more rapidly." However, while merging all developer working copies at this pace can be very useful, in practice many projects do not or cannot always do this. In practice, many developers maintain at least some branches that are not merged for longer than a day.
The project MUST have a formal written policy that as major new functionality is added, tests for it MUST be added to an automated test suite. ^{[test_policy_mandated]}

Rationale: This ensures that major new functionality is tested. This is related to the criterion test_policy, but is rewritten to be stronger.
The project MUST add regression tests to an automated test suite for at least 50% of the bugs fixed within the last six months. ^{[regression_tests_added50]}

Rationale: Regression tests prevent undetected resurfacing of defects. If a defect has happened before, there is an increased likelihood that it will happen again. We only require 50% of bugs to have regression tests; not all bugs are equally likely to recur, and in some cases it is extremely difficult to build robust tests for them. Thus, there is a point of diminishing returns for adding regression tests. The 50% value could be argued as being arbitrary; however, requiring less than 50% would mean that projects could get the badge even if a majority of their bugs in the time frame would not have regression tests. Projects may, of course, choose to have much larger percentages. We choose six months, as with other requirements, so that projects that have done nothing in the past (or recorded nothing in the past) can catch up in a reasonable period of time.
The project MUST have FLOSS automated test suite(s) that provide at least 80% statement coverage if there is at least one FLOSS tool that can measure this criterion in the selected language. Many FLOSS tools are available to measure test coverage, including gcov/lcov, Blanket.js, Istanbul, and JCov. Note that meeting this criterion is not a guarantee that the test suite is thorough; instead, failing to meet this criterion is a strong indicator of a poor test suite. ^{[test_statement_coverage80]}

Rationale: Statement coverage is widely used as a test quality measure; it's often a first "starter" measure for test quality. It's well-supported, including by gcov/lcov and codecov.io. Bad test suites could also meet this requirement, but it's generally agreed that any good test suite will meet this requirement, so it provides a useful way to filter out clearly-bad test suites. After all, if your tests aren't even running many of the program's statements, you don't have very good tests. Only FLOSS test suites are considered, to ensure that the test suite can be examined and improved over time.
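
As a small worked example of the measure itself, statement coverage is the share of executable statements that the test suite actually runs. The sketch below assumes the two counts come from a coverage tool's report; the function names are hypothetical.

```python
def statement_coverage(executed: int, total: int) -> float:
    """Percentage of executable statements run by the test suite."""
    if total <= 0:
        raise ValueError("total must be a positive number of statements")
    if not 0 <= executed <= total:
        raise ValueError("executed must be between 0 and total")
    return 100.0 * executed / total


def meets_coverage_criterion(executed: int, total: int, minimum: float = 80.0) -> bool:
    """True if coverage reaches the minimum (80% for this criterion)."""
    return statement_coverage(executed, total) >= minimum


if __name__ == "__main__":
    # 45 of 60 statements executed gives 75% coverage, below the 80% minimum.
    print(statement_coverage(45, 60), meets_coverage_criterion(45, 60))
```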

A good automated test suite enables rapid response to vulnerability reports. If a vulnerability is reported to a project, the project may be able to quickly repair it, but that is not enough. A good automated test suite is necessary so the project can rapidly gain confidence that the repair doesn't break anything else so it can field the update.

It could be argued that anything less than 100% is unacceptable, but this is not a widely held belief. There are many ways to determine if a program is correct - testing is only one of them. Some conditions are hard to create during testing, and the return-on-investment to get those last few percentages is arguably not worth it. The time working to get 100% statement coverage might be much better spent on checking the results more thoroughly (which statement coverage does not measure).

The 80% suggested here is supported by various sources. One is the default configuration of codecov.io, which defines 70% and below as red, 100% as perfectly green, and anything between 70 and 100 as a range between red and green. This renders ~80% as yellow, and somewhere between ~85% and 90% it starts looking pretty green.

The paper "Minimum Acceptable Code Coverage" by Steve Cornett claims, "Code coverage of 70-80% is a reasonable goal for system test of most projects with most coverage metrics. Use a higher goal for projects specifically organized for high testability or that have high failure costs. Minimum code coverage for unit testing can be 10-20% higher than for system testing... Empirical studies of real projects found that increasing code coverage above 70-80% is time consuming and therefore leads to a relatively slow bug detection rate. Your goal should depend on the risk assessment and economics of the project... Although 100% code coverage may appear like a best possible effort, even 100% code coverage is estimated to only expose about half the faults in a system. Low code coverage indicates inadequate testing, but high code coverage guarantees nothing."

"TestCoverage" by Martin Fowler (17 April 2012) points out the problems with coverage measures. He states that "Test coverage is a useful tool for finding untested parts of a codebase. Test coverage is of little use as a numeric statement of how good your tests are... The trouble is that high coverage numbers are too easy to reach with low quality testing... If you are testing thoughtfully and well, I would expect a coverage percentage in the upper 80s or 90s. I would be suspicious of anything like 100%... Certainly low coverage numbers, say below half, are a sign of trouble. But high numbers don't necessarily mean much, and lead to ignorance-promoting dashboards."

Coding standards

The project MUST identify the specific coding style guides for the primary languages it uses, and require that contributions generally comply with them. ^{[coding_standards]}

Details: In most cases this is done by referring to some existing style guide(s), possibly listing differences. These style guides can include ways to improve readability and ways to reduce the likelihood of defects (including vulnerabilities). Many programming languages have one or more widely-used style guides. Examples of style guides include Google's style guides and SEI CERT Coding Standards.
The project MUST automatically enforce its selected coding style(s) if there is at least one FLOSS tool that can do so in the selected language(s). ^{[coding_standards_enforced]}

Details: This MAY be implemented using static analysis tool(s) and/or by forcing the code through code reformatters. In many cases the tool configuration is included in the project's repository (since different projects may choose different configurations). Projects MAY allow style exceptions (and typically will); where exceptions occur, they MUST be rare and documented in the code at their locations, so that these exceptions can be reviewed and so that tools can automatically handle them in the future. Examples of such tools include ESLint (JavaScript) and Rubocop (Ruby).

Externally-maintained components

The project MUST list external dependencies in a computer-processable way. ^{[external_dependencies]}

Details: Typically this is done using the conventions of a package manager and/or build system. Note that this helps implement installation_development_quick.

Rationale: Inspired by the NYC 2016 brainstorming session.
The project MUST either (1) make it easy to identify and update reused externally-maintained components or (2) use the standard components provided by the system or programming language. Then, if a vulnerability is found in a reused component, it will be easy to update that component. ^{[updateable_reused_components]}

Details: A typical way to meet this criterion is to use system and programming language package management systems. Many FLOSS programs are distributed with "convenience libraries" that are local copies of standard libraries (possibly forked). By itself, that's fine. However, if the program must use these local (forked) copies, then updating the "standard" libraries as a security update will leave these additional copies still vulnerable. This is especially an issue for cloud-based systems; if the cloud provider updates their "standard" libraries but the program won't use them, then the updates don't actually help. See, e.g., "Chromium: Why it isn't in Fedora yet as a proper package" by Tom Callaway.

Rationale: A very common problem is to have obsolete components with known vulnerabilities. This is OWASP Top 10 (2013) number A9 (using known vulnerable components). See also The Unfortunate Reality of Insecure Libraries. This partly deals with vendoring, where code is copied in with the express intent of only using that copied version. (These are intentional forks.) There's a risk of divergence and failure to apply security fixes both ways. For an example, see "LZ4: vendoring in the kernel" by Jonathan Corbet (LWN, February 1, 2017), based on a 2017 linux.conf.au talk by Robert Lefkowitz. Lefkowitz talked about the process of "vendoring" - the copying of code from other projects into one's own repository rather than accepting a dependency on those projects - and LZ4 in the Linux kernel.
The project SHOULD avoid using deprecated or obsolete functions and APIs where FLOSS alternatives are available in the set of technology it uses (its "technology stack") and to a supermajority of the users the project supports (so that users have ready access to the alternative). ^{[interfaces_current]}

Build

Build systems for native binaries MUST honor the relevant compiler and linker (environment) variables passed in to them (e.g., CC, CFLAGS, CXX, CXXFLAGS, and LDFLAGS) and pass them to compiler and linker invocations. A build system MAY extend them with additional flags; it MUST NOT simply replace provided values with its own. ^{[build_standard_variables]}

Details: It should be easy to enable special build features like Address Sanitizer (ASAN), or to comply with distribution hardening best practices (e.g., by easily turning on compiler flags to do so). If no native binaries are being generated, select "N/A".

Rationale: See Build system should honor CC, CFLAGS, CXX, CXXFLAGS
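As an illustration of build_standard_variables, here is a minimal Python sketch of a build helper that extends the caller's settings instead of replacing them. The function name `compiler_command` and the project flag `-Wall` are hypothetical and chosen only for this example; a real project would normally rely on its build tool's own support for these variables.

```python
import os
import shlex


def compiler_command(source, output, environ=None):
    """Build a compile command that honors CC and CFLAGS from the environment."""
    env = os.environ if environ is None else environ
    # CC may contain several words (for example a wrapper plus a compiler).
    cc = shlex.split(env.get("CC", "")) or ["cc"]
    user_flags = shlex.split(env.get("CFLAGS", ""))
    # Hypothetical project default; it is added to the caller's flags, never
    # substituted for them. User flags come last so that they take precedence.
    project_flags = ["-Wall"]
    return [*cc, *project_flags, *user_flags, "-c", source, "-o", output]


command = compiler_command(
    "main.c", "main.o", {"CC": "clang", "CFLAGS": "-g -fsanitize=address"}
)
print(command)
```

With this structure, enabling Address Sanitizer only requires setting CFLAGS (and LDFLAGS for the link step, which is handled the same way); no build file needs to be edited.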
The build and installation system SHOULD preserve debugging information if they are requested in the relevant flags (e.g., "install -s" is not used). If there is no build or installation system (e.g., typical JavaScript libraries), this is N/A. ^{[build_preserve_debug]}

Details: E.g., setting CFLAGS (C) or CXXFLAGS (C++) should create the relevant debugging information if those languages are used, and it should not be stripped during installation. Debugging information is needed for support and analysis, and is also useful for measuring the presence of hardening features in the compiled binaries.
The build system for the software produced by the project MUST NOT recursively build subdirectories if there are cross-dependencies in the subdirectories. ^{[build_non_recursive]}

Details: The project build system's internal dependency information needs to be accurate; otherwise, changes to the project may not build correctly. Incorrect builds can lead to defects (including vulnerabilities). A common mistake in large build systems is to use a "recursive build" or "recursive make", that is, a hierarchy of subdirectories containing source files, where each subdirectory is independently built. Unless each subdirectory is fully independent, this is a mistake, because the dependency information is incorrect.

Rationale: For more information, see "Recursive Make Considered Harmful" by Peter Miller (note that this incorrect approach can be used in any build system, not just make). Note that "Non-recursive Make Considered Harmful" agrees that recursive builds are bad; its argument is that for large projects you should use a tool other than make. In many cases it is better to automatically determine the dependencies, but this is not always accurate or practical, so we did not require that dependencies be automatically generated.
The project MUST be able to repeat the process of generating information from source files and get exactly the same bit-for-bit result. If no building occurs (e.g., scripting languages where the source code is used directly instead of being compiled), select "N/A". ^{[build_repeatable]}

Details: GCC and clang users may find the -frandom-seed option useful; in some cases, this can be resolved by forcing some sort order. More suggestions can be found at the reproducible build site.

Rationale: This is a step towards having a reproducible build. This criterion is much easier to meet, because it does not require that external parties be able to reproduce the results - merely that the project can. Supporting full reproducible builds requires that projects provide external parties enough information about their build environment(s), which can be harder to do - so we have split this requirement up. See the reproducible build criterion.
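The "forcing some sort order" advice for build_repeatable can be sketched in Python as follows. This is only an illustration under simplifying assumptions: the "build" is the packing of in-memory files into an uncompressed tar archive, and the helper names `build_archive` and `digest` are hypothetical.

```python
import hashlib
import io
import tarfile


def build_archive(files):
    """Pack {name: bytes} into an uncompressed tar archive with normalized metadata."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name in sorted(files):  # fixed order, independent of insertion order
            info = tarfile.TarInfo(name)
            info.size = len(files[name])
            info.mtime = 0  # fixed timestamp instead of the current time
            archive.addfile(info, io.BytesIO(files[name]))
    return buffer.getvalue()


def digest(data):
    """Return the SHA-256 digest of the archive bytes as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


first = build_archive({"b.txt": b"beta", "a.txt": b"alpha"})
second = build_archive({"a.txt": b"alpha", "b.txt": b"beta"})
print(digest(first) == digest(second))
```

The two runs receive their inputs in a different order, yet produce identical bytes because the member order and the timestamps are fixed. Comparing digests of two independent runs is a simple way for a project to check that it meets this criterion.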

Installation

The project MUST provide a way for end-users to easily install and uninstall the software produced by the project using a commonly-used convention. ^{[installation_common]}

Details: Examples include using a language-level package manager (such as npm, pip, maven, or bundler), system-level package manager (such as apt-get or dnf), "make install/uninstall" (supporting DESTDIR), a container in a standard format, or a virtual machine image in a standard format. The installation and uninstallation process (e.g., its packaging) MAY be implemented by a third party as long as it is FLOSS.
The installation system for end-users MUST honor standard conventions for selecting the location where built artifacts are written to at installation time. For example, if it installs files on a POSIX system it MUST honor the DESTDIR environment variable. If there is no installation system or no standard convention, select "N/A". ^{[installation_standard_variables]}

Rationale: This supports capturing the artifacts (e.g., for analysis) without interfering with the build or installation system due to system-wide changes. See DESTDIR honored at install time. This doesn't apply when there's no "installation" process, or when POSIX filesystems aren't supported during installation (e.g., Windows-only programs). See also Build system should honor CC, CFLAGS, CXX, CXXFLAGS.
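To make the DESTDIR convention concrete, the following Python sketch shows how an installer might compute the location it actually writes to. The function name `staged_path` is hypothetical, and the sketch assumes that installation paths are absolute POSIX paths.

```python
import os
from pathlib import PurePosixPath


def staged_path(install_path, environ=None):
    """Return where a file is actually written, honoring DESTDIR if it is set."""
    env = os.environ if environ is None else environ
    destdir = env.get("DESTDIR", "")
    target = PurePosixPath(install_path)  # assumed to be an absolute path
    if not destdir:
        return str(target)
    # DESTDIR is prepended to the final path; the final path itself is unchanged.
    return str(PurePosixPath(destdir) / target.relative_to("/"))


print(staged_path("/usr/local/bin/tool", {"DESTDIR": "/tmp/stage"}))
```

The installed layout under the staging directory mirrors the final layout, so packagers and analysis tools can capture the artifacts without touching the running system.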
The project MUST provide a way for potential developers to quickly install all the project results and support environment necessary to make changes, including the tests and test environment. This MUST be performed with a commonly-used convention. ^{[installation_development_quick]}

Details: This MAY be implemented using a generated container and/or installation script(s). External dependencies would typically be installed by invoking system and/or language package manager(s), per external_dependencies.

Rationale: Recommended in the NYC 2016 brainstorming session.

Security

The project MUST implement secure design principles (from "know_secure_design"), where applicable. ^{[implement_secure_design]}

Details: For example, the project results should have fail-safe defaults (access decisions should deny by default, and projects' installation should be secure by default). They should also have complete mediation (every access that might be limited must be checked for authority and be non-bypassable). Note that in some cases principles will conflict, in which case a choice must be made (e.g., many mechanisms can make things more complex, contravening "economy of mechanism" / keep it simple). If the project is not producing software, this may be N/A.

Rationale: This was inspired by the NYC 2016 brainstorming session.
The project results MUST check all inputs from potentially untrusted sources to ensure they are valid (a whitelist), and reject invalid inputs, if there are any restrictions on the data at all. ^{[input_validation]}

Details: The project results MUST check all inputs from potentially untrusted sources to ensure they are valid (a whitelist), and reject invalid inputs, if there are any restrictions on the data at all.
Hardening mechanisms SHOULD be used in the software produced by the project so that software defects are less likely to result in security vulnerabilities. ^{[hardening]}
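A minimal Python sketch of whitelist-style validation follows. The rule for user names (3 to 32 characters drawn from lowercase letters, digits, "_" and "-") and the name `validate_username` are hypothetical; each project has to define the valid form of its own inputs.

```python
import re

# Hypothetical rule: 3-32 characters, each a lowercase letter, digit, "_" or "-".
USERNAME_PATTERN = re.compile(r"[a-z0-9_-]{3,32}")


def validate_username(value):
    """Return the value if it matches the allowed pattern; otherwise raise ValueError."""
    if not isinstance(value, str) or USERNAME_PATTERN.fullmatch(value) is None:
        raise ValueError("invalid username")
    return value


print(validate_username("alice_01"))
```

The check describes what is allowed and rejects everything else, rather than trying to enumerate known-bad inputs. `fullmatch` is used so that the whole input, not just a prefix, must satisfy the pattern.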

Details: Hardening mechanisms may include HTTP headers like Content Security Policy (CSP), compiler flags to mitigate attacks (such as -fstack-protector), or compiler flags to eliminate undefined behavior. For our purposes least privilege is not considered a hardening mechanism (least privilege is important, but separate).

Cryptography

The project SHOULD support multiple cryptographic algorithms, so users can quickly switch if one is broken. Common symmetric key algorithms include AES, Twofish, and Serpent. Common cryptographic hash algorithm alternatives include SHA-2 (including SHA-224, SHA-256, SHA-384, and SHA-512) and SHA-3. ^{[crypto_algorithm_agility]}

Rationale: The advantage of crypto agility is that if one crypto algorithm is broken, other algorithms can be used instead. Many protocols, including TLS and IPsec, are specifically designed to support crypto agility. Some experts disagree, arguing that this negotiation can itself be a point of attack, and that people should instead simply choose and stay with one good algorithm. The problem with this position is that no one can be certain about what that "one good algorithm" is; a new attack could be found at any time. See the discussion at Remove requirement for supporting alternative crypto algorithms (crypto_alternatives)?
The project MUST support storing authentication credentials (such as passwords and dynamic tokens) and private cryptographic keys in files that are separate from other information (such as configuration files, databases, and logs), and permit users to update and replace them without code recompilation. This is N/A if the project never processes authentication credentials and private cryptographic keys. (N/A allowed). ^{[crypto_credential_agility]}
The software produced by the project SHOULD support secure protocols for all of its network communications, such as SSHv2 or later, TLS1.2 or later (HTTPS), IPsec, SFTP, and SNMPv3. Insecure protocols such as FTP, HTTP, telnet, SSLv3 or earlier, and SSHv1 SHOULD be disabled by default, and only enabled if the user specifically configures it. (N/A allowed). ^{[crypto_used_network]}
The software produced by the project SHOULD, if it supports or uses TLS, support at least TLS version 1.2. Note that the predecessor of TLS was called SSL. If the software does not use TLS, select "not applicable" (N/A). (N/A allowed). ^{[crypto_tls12]}
The software produced by the project MUST, if it supports TLS, perform TLS certificate verification by default when using TLS, including on subresources. If the software does not use TLS, select "not applicable" (N/A). (N/A allowed). ^{[crypto_certificate_verification]}

Details: Note that having incorrect TLS certificate verification is a common mistake. For more information, see "The Most Dangerous Code in the World: Validating SSL Certificates in Non-Browser Software" by Martin Georgiev et al. and "Do you trust this application?" by Michael Catanzaro.
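As a sketch of what "verification by default" can look like for a client written in Python, the standard `ssl` module provides a default context that verifies both the certificate chain and the host name. The helper name `make_client_context` is hypothetical.

```python
import ssl


def make_client_context():
    """Create a client-side TLS context that verifies certificates and host names."""
    context = ssl.create_default_context()  # loads the system's trusted CA certificates
    context.minimum_version = ssl.TLSVersion.TLSv1_2  # refuse older protocol versions
    return context


context = make_client_context()
print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

The common mistake described above usually comes from code that turns these checks off (for example, to make a test pass) and never turns them back on. Keeping the default context, and only tightening it, avoids that failure mode.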
The software produced by the project MUST, if it supports TLS, perform certificate verification before sending HTTP headers with private information (such as secure cookies). (N/A allowed). ^{[crypto_verification_private]}

Secure Release

The project MUST cryptographically sign releases of the project results intended for widespread use, and there MUST be a documented process explaining to users how they can obtain the public signing keys and verify the signature(s). The private key for these signature(s) MUST NOT be on site(s) used to directly distribute the software to the public. ^{[signed_releases]}

Details: The project results include both source code and any generated deliverables where applicable (e.g., executables, packages, and containers). Generated deliverables MAY be signed separately from source code. These MAY be implemented as signed git tags (using cryptographic digital signatures). Projects MAY provide generated results separately from tools like git, but in those cases, the separate results MUST be separately signed.

Rationale: This provides protection from compromised distribution systems. The public key must be accessible so that recipients can check the signature. The private key must not be on site(s) distributing the software to the public; that way, even if those sites are compromised, the signature cannot be altered. This is sometimes called "code signing". A common way to implement this is by using GPG to sign the code; for example, the GPG keys of every person who signs releases could be in the project README. Node.js implements this via GPG keys in the README (see Node.js Release Team), but note that in the criterion we are intentionally more general.
It is SUGGESTED that in the version control system, each important version tag (a tag that is part of a major release, minor release, or fixes publicly noted vulnerabilities) be cryptographically signed and verifiable as described in signed_releases. ^{[version_tags_signed]}

Details: See also signed_releases and version_tags.

Rationale: This was suggested by Kevin W. Wall (@kwwall) in issue #709.

Analysis

Projects MUST monitor or periodically check their external dependencies (including convenience copies) to detect known vulnerabilities, and fix exploitable vulnerabilities or verify them as unexploitable. ^{[dependency_monitoring]}

Details: This can be done using an origin analyzer / dependency checking tool such as OWASP's Dependency-Check, Sonatype's Nexus Auditor, Black Duck's Protex, Synopsys' Protecode, and Bundler-audit (for Ruby). Some package managers include mechanisms to do this. It is acceptable if the components' vulnerability cannot be exploited, but this analysis is difficult and it is sometimes easier to simply update or fix the part.

Rationale: This must be monitored or periodically checked, because new vulnerabilities are continuously being discovered.
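The core of a dependency check can be sketched in a few lines of Python. Everything in the data below is hypothetical (the package names, versions, and advisory identifier), and the sketch assumes purely numeric dotted versions; real tools such as those listed above read the project's actual dependency files and a maintained vulnerability database.

```python
def parse_version(text):
    """Convert a dotted numeric version such as "1.2.10" into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def vulnerable_dependencies(pinned, advisories):
    """List (name, version, advisory id) for dependencies older than the fixed version."""
    findings = []
    for name, version in sorted(pinned.items()):
        for advisory_id, fixed_in in advisories.get(name, []):
            if parse_version(version) < parse_version(fixed_in):
                findings.append((name, version, advisory_id))
    return findings


# Hypothetical data; a real check would read a lock file and a vulnerability feed.
pinned = {"examplelib": "1.2.3", "otherlib": "2.0.0"}
advisories = {"examplelib": [("EXAMPLE-0001", "1.2.10")]}
print(vulnerable_dependencies(pinned, advisories))
```

Versions are compared as tuples of integers so that "1.2.10" is correctly treated as newer than "1.2.3"; a plain string comparison would get this wrong. Running such a check on a schedule, not only when dependencies change, is what catches newly published vulnerabilities.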

Gold (passing+2) criteria

Achieve the lower silver (passing+1) badge.

Upgrade of SHOULD and SUGGESTED (or not)

Here we list upgrades from silver. We also identify criteria that were not upgraded, but we discussed possibly upgrading them.

Upgrade: Basics

Upgrade bus_factor from SHOULD to MUST. "The project MUST have a "bus factor" of 2 or more." ^{[bus_factor]}
Unchanged:
- dco
- accessibility
- floss_license_osi - "It is SUGGESTED that any required license(s) be approved by the Open Source Initiative (OSI)."
- english - "The project SHOULD include documentation in English and be able to accept bug reports and comments about code in English."

Upgrade: Change Control

Upgrade repo_distributed from SUGGESTED to MUST. "The project MUST use a common distributed version control software (e.g., git or mercurial)."
Unchanged:
- version_semver - "It is SUGGESTED that the Semantic Versioning (SemVer) format be used for releases."
- version_tags - "It is SUGGESTED that projects identify each release within their version control system. For example, it is SUGGESTED that those using git identify each release using git tags."

Upgrade: Reporting

Upgrade: Quality

Upgrade test_invocation from SHOULD to MUST. "A test suite MUST be invocable in a standard way for that language."
Upgrade test_continuous_integration from SUGGESTED to MUST. "A project MUST implement continuous integration, where new or changed code is frequently integrated into a central code repository and automated tests are run on the result."

Details: This criterion is merely SUGGESTED at passing level. A subset of this criterion is required for silver; see automated_integration_testing. Here, we require both the continuous check-in and its testing. In most cases this means that each developer who works full-time on the project integrates at least daily.
Unchanged:
- build_common_tools - "It is SUGGESTED that common tools be used for building the software."
- build_floss_tools - "The project SHOULD be buildable using only FLOSS tools."
- test_most - "It is SUGGESTED that the test suite cover most (or ideally all) the code branches, input fields, and functionality."
  
  NOTE: Statement/branch coverage is covered separately; they are increased, so we are not changing the level of this one.
- build_preserve_debug

Upgrade: Security

Upgrade hardening from SHOULD to MUST. "The project software MUST use hardening mechanisms so software defects are less likely to result in security vulnerabilities. If the project does not produce software, choose N/A."
Upgrade crypto_used_network from SHOULD (NOT) to MUST (NOT). "The project MUST NOT use unencrypted network communication protocols (such as HTTP and telnet) if there is an encrypted equivalent (e.g., HTTPS/TLS and SSH), unless the user specifically requests or configures it. (N/A allowed)." ^{[crypto_used_network]}
Upgrade crypto_tls12 from SHOULD to MUST. The project MUST, if it supports TLS, support at least TLS version 1.2. Note that the predecessor of TLS was called SSL. (N/A allowed). ^{[crypto_tls12]}
Unchanged:
- crypto_agility
- crypto_call - "If the project software is an application or library, and its primary purpose is not to implement cryptography, then it SHOULD only call on software specifically designed to implement cryptographic functions; it SHOULD NOT re-implement its own."
- crypto_pfs - "The project SHOULD implement perfect forward secrecy for key agreement protocols so a session key derived from a set of long-term keys cannot be compromised if one of the long-term keys is compromised in the future."
- vulnerabilities_critical_fixed - "Projects SHOULD fix all critical vulnerabilities rapidly after they are reported."
  
  NOTE: We'd like this to always be true, but some vulnerabilities are hard to fix, so it's difficult to mandate this. We could require activities to actively work to fix it, and that is worth considering - but would that really help users?

Upgrade: Analysis

Upgrade dynamic_analysis from SUGGESTED to MUST. "The project MUST apply at least one dynamic analysis tool to any proposed major production release of the software before its release."
Upgrade dynamic_analysis_enable_assertions from SUGGESTED to MUST. "The project MUST include many run-time assertions in the software it produces, and those assertions MUST be checked during dynamic analysis."

It was: "It is SUGGESTED that the software include many run-time assertions that are checked during dynamic analysis."
Unchanged:
- static_analysis_often - "It is SUGGESTED that static source code analysis occur on every commit or at least daily."

Basics

The project MUST include a copyright statement in each source file, identifying at least one relevant year and copyright holder. ^{[copyright_per_file]}

Details: This MAY be done by including the following inside a comment near the beginning of each file: "Copyright [year this project or content started] - [most recent year modified], [project founder] and the [project name] contributors."

Rationale: This isn't legally required in most jurisdictions, per the Berne Convention. For example, copyright notices have not been required in the US since 1979. On the other hand, this is not hard to add. Ben Balter's "Copyright notices for open source projects" provides some good arguments for why it should be included: "First, someone may want to use your work in ways not allowed by your license; notices help them determine who to ask for permission. Explicit notices can help you prove that you and your collaborators really are the copyright holders. They can serve to put a potential infringer on notice by providing an informal sniff test to counter the 'Oh yeah, well I didn’t know it was copyrighted' defense. For some users the copyright notice may suggest higher quality, as they expect that good software will include a notice... Git can track these things, but people may receive software outside of git or where the git history has not been retained." In addition, we have been informed by the Linux Foundation's SPDX community that having this information is extremely valuable for relicensing and for checking to determine if a copyrighted work is derived from another. While version control systems do track versioning within a project, when files are copied between projects this information is often lost. Having the copyright notice information helps those researching sources, e.g., if they wish to try to relicense something.
The project MUST include a license statement in each source file. This MAY be done by including the following inside a comment near the beginning of each file: SPDX-License-Identifier: [SPDX license expression for project]. ^{[license_per_file]}

Details: This MAY also be done by including a statement in natural language identifying the license. The project MAY also include a stable URL pointing to the license text, or the full license text. Note that the criterion license_location requires the project license be in a standard location. See this SPDX tutorial for more information about SPDX license expressions. Note the relationship with copyright_per_file, whose content would typically precede the license information.

Rationale: Files are sometimes individually copied from one project into another. Per-file license information increases the likelihood that the original license will be honored. SPDX provides a simple standard way to identify common licenses, without having to embed the full license text in each file; since this makes the criterion easier to do, we specifically mention it. Technically, the text after "SPDX-License-Identifier" is a SPDX license expression, not an identifier, but the tag "SPDX-License-Identifier" is what is used for backwards-compatibility.
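A project can check license_per_file automatically. The Python sketch below looks for the `SPDX-License-Identifier` tag near the top of a file's text; the function name, the 10-line window, and the sample header (including its year and project name) are assumptions made for this example.

```python
import re

SPDX_TAG = re.compile(r"SPDX-License-Identifier:\s*(\S.*)")


def find_license_expression(source_text, max_lines=10):
    """Return the SPDX license expression found near the top of a file, or None."""
    for line in source_text.splitlines()[:max_lines]:
        match = SPDX_TAG.search(line)
        if match:
            # Drop a trailing block-comment terminator and whitespace, if present.
            return match.group(1).rstrip("*/ \t")
    return None


# Hypothetical file header combining copyright_per_file and license_per_file.
example = (
    "# Copyright 2017, the Example Project contributors.\n"
    "# SPDX-License-Identifier: MIT\n"
    "print('hello')\n"
)
print(find_license_expression(example))
```

Running such a function over every source file in continuous integration, and failing when it returns `None`, keeps new files from being added without a license statement.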

Continuity

The project MUST have at least two unassociated significant contributors. ^{[contributors_unassociated]}

Details: Contributors are associated if they are paid to work by the same organization (as an employee or contractor) and the organization stands to benefit from the project's results. Financial grants do not count as being from the same organization if they pass through other organizations (e.g., science grants paid to different organizations from a common government or NGO source do not cause contributors to be associated). Someone is a significant contributor if they have made non-trivial contributions to the project in the past year. Examples of good indicators of a significant contributor are: written at least 1,000 lines of code, contributed 50 commits, or contributed at least 20 pages of documentation.

Rationale: This reduces the risk of non-support if a single organization stops supporting the project as FLOSS. It also reduces the risk of malicious code insertion, since there is more independence between contributors. This covers the case where "two people work for company X, but only one is paid to work on this project" (because the non-paid person could still have many of the same incentives). It also covers the case where "two people got paid working for Red Cross for a day, but Red Cross doesn't use the project".

Change Control

The project MUST clearly identify small tasks that can be performed by new or casual contributors. ^{[small_tasks]}

Details: This identification is typically done by marking selected issues in an issue tracker with one or more tags the project uses for the purpose, e.g., up-for-grabs, first-timers-only, "Small fix", microtask, or IdealFirstBug. These new tasks need not involve adding functionality; they can be improving documentation, adding test cases, or anything else that aids the project and helps the contributor understand more about the project.

Rationale: Identified small tasks make it easier for new potential contributors to become involved in a project, and projects with more contributors have an increased likelihood of continuing. Alluxio uses SMALLFIX and OWASP ZAP uses IdealFirstBug. This is related to criterion installation_development_quick.
The project MUST require two-factor authentication (2FA) for developers for changing a central repository or accessing sensitive data (such as private vulnerability reports). This 2FA mechanism MAY use mechanisms without cryptographic protection, such as SMS, though that is not recommended. ^{[require_2FA]}

Rationale: 2FA is used by Node.js and the Linux kernel projects. See "Linux Kernel Git Repositories Add 2-Factor Authentication" by Konstantin Ryabitsev and "Linux Foundation Protects Kernel Git Repositories With 2FA" by Eduard Kovacs.
The project's two-factor authentication (2FA) SHOULD use cryptographic mechanisms to prevent impersonation. Short Message Service (SMS) based 2FA, by itself, does not meet this criterion, since it is not encrypted. ^{[secure_2FA]}

Details: A 2FA mechanism that meets this criterion would be a Time-based One-Time Password (TOTP) application that automatically generates an authentication code that changes after a certain period of time. Note that GitHub supports TOTP.

Rationale: SMS is easier and lower cost for many people, but it also provides much weaker security. It has been argued that SMS isn't really 2FA at all; we permit it, because it's better than nothing, but we don't recommend it because of its weaknesses. See "So Hey You Should Stop Using Texts for Two-Factor Authentication".
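To show why TOTP counts as a cryptographic mechanism, here is a minimal Python sketch of the code generation defined in RFC 6238 (using HMAC-SHA-1 and the dynamic truncation from RFC 4226). It is meant for understanding only; projects should rely on an established 2FA service or library instead of their own implementation. The secret used is the published RFC 6238 test secret.

```python
import hashlib
import hmac
import struct


def totp(secret, unix_time, step=30, digits=6):
    """Compute an RFC 6238 time-based one-time password using HMAC-SHA-1."""
    counter = int(unix_time) // step  # number of time steps since the Unix epoch
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation as defined in RFC 4226
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


# Test secret from the RFC 6238 appendix; a real secret is random and known only
# to the server and the user's authenticator application.
print(totp(b"12345678901234567890", 59, digits=8))  # 94287082
```

The code depends on a shared secret that is never transmitted during login, so an attacker who intercepts a message channel learns nothing that helps with later logins. An SMS code, by contrast, is sent in the clear each time.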

Reporting

(No new criteria)

Quality

The project MUST have at least 50% of all proposed modifications reviewed before release by a person other than the author, to determine if it is a worthwhile modification and free of known issues which would argue against its inclusion. ^{[two_person_review]}

Rationale: Review can counter many problems. The percentage here could be changed; 100% would be great but untenable for many projects. We have selected 50%, because anything less than 50% would mean that most changes could go unreviewed. See, for example, the Linux Kernel's "Reviewer's statement of oversight". Note that the set of criteria allows people within the same organization to review each other's work; it is better to require different organizations to review each other's work, but in many situations that is not practical.
The project MUST document its code review requirements, including how code review is conducted, what must be checked, and what is required to be acceptable. ^{[code_review_standards]}

Details: See also two_person_review and contribution_requirements

Rationale: Code review is a cornerstone of quality and secure coding practices. Projects often seek new contributors but lack training and documentation to increase the number of reviewers. An increase in code reviewers lowers maintainer workload while aiding in meeting the badge requirement two_person_review. See coreinfrastructure#699 from GeorgLink.
The project MUST have a reproducible build. (N/A allowed). ^{[build_reproducible]}

Details: A reproducible build means that multiple parties can independently redo the process of generating information from source files and get exactly the same bit-for-bit result. If no building occurs (e.g., scripting languages where the source code is used directly instead of being compiled), select "N/A". In some cases, this can be resolved by forcing some sort order. JavaScript developers may consider using npm shrinkwrap and webpack OccurenceOrderPlugin. GCC and clang users may find the -frandom-seed option useful. The build environment (including the toolset) can often be defined for external parties by specifying the cryptographic hash of a specific container or virtual machine that they can use for rebuilding. The reproducible builds project has documentation on how to do this.

Rationale: If a project needs to be built but there is no working build system, then potential co-developers will not be able to easily contribute and many security analysis tools will be ineffective. Reproducible builds counter malicious attacks that generate malicious executables, by making it easy to recreate the executable to determine if the result is correct. By themselves, reproducible builds do not counter malicious compilers, but they can be extended to counter malicious compilers using processes such as diverse double-compiling (DDC).

Testing

The project MUST have FLOSS automated test suite(s) that provide at least 90% statement coverage if there is at least one FLOSS tool that can measure this criterion in the selected language. ^{[test_statement_coverage90]}

Rationale: This increases the statement coverage requirement from the previous badge level, thus requiring even more thorough testing (by this measure).
The project MUST have FLOSS automated test suite(s) that provide at least 80% branch coverage if there is at least one FLOSS tool that can measure this criterion in the selected language. ^{[test_branch_coverage80]}

Rationale: This adds another test coverage requirement, again requiring more thorough testing. A program with many one-armed "if" statements could achieve 100% statement coverage but only 50% branch coverage (if the tests only checked the "true" branches). Branch coverage is probably the second most common test coverage measure (after statement coverage), and is often added when a stricter measure of tests is used. Branch coverage is widely (but not universally) implemented.
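The difference between statement and branch coverage for a one-armed "if" can be seen in a small Python example. The function `clamp_to_zero` is hypothetical and exists only to illustrate the point.

```python
def clamp_to_zero(value):
    """Replace negative values with zero."""
    if value < 0:  # one-armed "if": there is no else clause
        value = 0
    return value


# This single test runs every statement (100% statement coverage), but the
# condition is only ever true, so one of its two branches is never taken.
assert clamp_to_zero(-5) == 0

# A second test takes the "false" branch and brings branch coverage to 100%.
assert clamp_to_zero(3) == 3
```

With only the first test, a statement coverage report looks perfect even though the behavior for non-negative input is never checked. Tools that support branch measurement (for Python, coverage.py run with its `--branch` option is one) report the missing branch.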

Security

The project MUST have performed a security review within the last 5 years. This review MUST consider the security requirements and security boundary. ^{[security_review]}

Details: This MAY be done by the project members and/or an independent evaluation. This evaluation MAY be supported by static and dynamic analysis tools, but there also must be human review to identify problems (particularly in design) that tools cannot detect.

Rationale: Security review is important, because security problems often come from subtle interactions of components. Reviewing the system as a whole can help find these problems. Ideally this would be independent, but that often requires a lot of money, and we would rather have some review than none at all. We do not require a specific level of review; this is difficult to quantify given the different environments, requirements, and sizes of various projects. Kevin Wall noted, "If passing+2 is going to be the highest back level, I'd also like to see some sort of mandatory code inspection (possibly SAST assisted), and when applicable, some sort of DAST (for APIs, probably just fuzzing), where failed tests would have to be added to the regression test suite." It's difficult to get agreement on the details of what a security review must include, but we believe that the stated criteria would be agreed on.
"The project website, repository (if accessible via the web), and download site (if separate) MUST include key hardening headers with nonpermissive values." ^{[hardened_site]}

Analysis

(No new criteria)

Improving the criteria

We are hoping to get good suggestions and feedback from the public; please contribute! Please post an issue if you have comments on these criteria.

See criteria for the main current set of criteria. You may also want to see the "background" file for more information about these criteria, and the "implementation" notes about the BadgeApp application.
