-
💭 It could be the opportunity to look around pydantic
-
Thank you for your mention, and I think you have covered my past comments. If we had an EOL dataset like this together with a perfect asset dataset, matching EOL would be very easy.
As I have not investigated purl deeply, I do not know whether we can use purl or not.
-
Absolutely lovely timing! I've been digging into component lifecycle management as part of my job, and on our side we keep circling back to the problem of matching collected estate data with EoL data. A robust way to communicate evolution events is needed. It feels a bit sad (although totally understandable) that communicating vulnerabilities is pretty well established, but communicating component evolution/lifecycle is far from it.
Only partially related to the topic: how does the endoflife.date product name get decided? I couldn't find any guidelines, and the current names don't seem to consistently match any of the identifier (CPE or PURL) parts. I see SWID has been a consideration, but it is not included, not even as optional. Is that by explicit choice?
-
FYI OASIS is in the process of defining a standard for EoL information as well. But it requires payment to participate (https://www.oasis-open.org/join-a-tc/), so I believe it's worth having this discussion here anyway.
-
Hello everyone! I'm the creator of the https://github.com/xeol-io/xeol tool, which uses endoflife.date for EOL information to match EOL software in containers. First of all, thank you tremendously for the project and for leading the efforts around EOL.
I have been thinking about an improved schema for EOL lifecycles recently and wanted to open a discussion about it. I know that we have somewhat differing use cases for this schema/data. We would like to use it programmatically, while the main use case for endoflife.date right now is displaying it on the web page. This schema is most useful for the programmatic use case, and I tend to believe the two use cases are somewhat at odds (without doing some data transformations).
For our use case, we need a highly accurate database of EOL products, so much so that using `./well-known` from vendors would not work as there would likely be too much variance in quality between vendors. I would like to work together as much as possible (for example sharing this RFC), but also understand that there may be different goals we each have in mind. In either case, hope this ignites some helpful discussion and has some ideas you can draw on.
This document builds off of some of the work from the releases.json RFC and is related to the discussion Distinct PURLs/Identifiers for each release cycle.
cc: @captn3m0 @marcwrobel and also @witchcraze as I know you have been thinking about this area for quite some time.
An Open Source EOL Schema
Aug 27, 2023
Purpose
The goal of this document is to define a schema for end-of-life (EOL) dates of software products. The data should be extremely accurate and yield low false positive rates. We believe that false positives are in fact more harmful than false negatives because they actively degrade the user's trust. This must be kept in mind. There has been some great work in this space with https://endoflife.date and we hope to build upon that.
This format is a work in progress. Please feel free to add comments.
Definitions
We must first define some critical terminology, specifically around end-of-life (EOL). In the schema there are different dates surrounding when a project is no longer supported:
- end-of-support (eos): this is the end of active support. This is when bug fixes are no longer being made, but there are still security bugs being fixed.
- end-of-life (eol): this is the end of bug fixes and security support. All support stops.
There may be other definitions that have been made historically, but these are the two we think matter.
Background
The biggest problem with creating a lifecycle schema is the sheer complexity of the different things we need to identify. We first need to define the things to which we can attach a lifecycle. Let's give some examples of each:
Since a software contains packages, you might assume that the lifecycle associated with them is always the same, but this is not the case. For example, .NET 7.0 is a software that has a lifecycle attached to it, but the packages in the .NET ecosystem hosted on NuGet may be deprecated before or after .NET 7.0 itself is EOL.
The schema must also have support for vendors in some way. A vendor may support software or packages in their ecosystem that are not their own. For example, Red Hat may support the lifecycle of `.NET 6.0` in their Application Streams for RHEL even though `.NET 6.0` is produced by Microsoft.
Project Schema
The format is a JSON structure.
Field Details
schema_version field
The `schema_version` field is used to indicate which version of the Lifecycle schema a particular lifecycle was created with. This can help consumer applications decide how to import the data for their own systems and offer some protection against future breaking changes. The value should be a string following the SemVer 2.0.0 format, with no leading "v" prefix. Clients can assume that new minor and patch versions of the schema only add new fields, without changing the meaning of old fields.
id, modified fields
The `id` field is a unique identifier for a lifecycle entry. The syntax of the `id` field follows the format LC-xxxx-xxxx-xxxx, where `x` is a letter or number from the set `23456789cfghjmpqrvwx`. We use a set of characters that aren't easily confused with others (e.g. 0 and O, 1 and I) to reduce errors in transmission and interpretation.
The `modified` field gives the time the entry was last modified as an RFC3339-formatted timestamp in UTC (ending in "Z"). We have chosen RFC3339 over ISO8601 because it is simpler to parse and has fewer possible variations.
Given two different entries claiming to describe the same `id` field, the one with the later modification time is considered authoritative. The `id` and `modified` fields are required.
published field
The `published` field gives the time the entry should be considered to have been published, as an RFC3339-formatted timestamp in UTC (ending in "Z").
withdrawn field
The `withdrawn` field gives the time the entry should be considered to have been withdrawn, as an RFC3339-formatted timestamp in UTC (ending in "Z"). If the field is missing, then the entry has not been withdrawn. Any rationale for why the lifecycle entry has been withdrawn should go into the summary text.
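As a rough illustration of the `id`, `modified`, and `withdrawn` rules above, here is a minimal Python sketch. The function names are hypothetical and not part of the proposal, the lowercase-only reading of the `id` alphabet is an assumption, and `datetime.fromisoformat` is used as a stand-in for a strict RFC3339 parser (it accepts only a subset of valid timestamps on older Python versions).

```python
import re
from datetime import datetime, timezone

# Alphabet quoted in the id description; lowercase-only is an assumption.
ID_ALPHABET = "23456789cfghjmpqrvwx"
ID_PATTERN = re.compile(rf"LC(-[{ID_ALPHABET}]{{4}}){{3}}")


def is_valid_id(entry_id: str) -> bool:
    """Check the LC-xxxx-xxxx-xxxx shape against the allowed alphabet."""
    return ID_PATTERN.fullmatch(entry_id) is not None


def parse_timestamp(value: str) -> datetime:
    """Parse a UTC timestamp that must end in "Z"."""
    if not value.endswith("Z"):
        raise ValueError(f"timestamp must be UTC and end in 'Z': {value!r}")
    return datetime.fromisoformat(value[:-1] + "+00:00")


def authoritative(a: dict, b: dict) -> dict:
    """Of two entries with the same id, return the later-modified one."""
    if a["id"] != b["id"]:
        raise ValueError("entries describe different ids")
    return max(a, b, key=lambda entry: parse_timestamp(entry["modified"]))


def is_withdrawn(entry: dict, now: datetime) -> bool:
    """A missing withdrawn field means the entry has not been withdrawn."""
    withdrawn = entry.get("withdrawn")
    return withdrawn is not None and parse_timestamp(withdrawn) <= now
```

A consumer merging two feeds could call `authoritative(old, new)` per `id` and then drop any entry for which `is_withdrawn` is true.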
entity field
The `entity` field describes the thing the lifecycle is attached to. The combination of `entity.name`, `entity.version`, `entity.type`, and `entity.vendor` must be unique across all records.
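The uniqueness rule could be checked with something like the following sketch. It assumes the four fields live under an `entity` object in each record; the helper names and the sample vendor value are hypothetical.

```python
from collections import Counter


def entity_key(record: dict) -> tuple:
    """Build the (name, version, type, vendor) tuple that must be unique."""
    entity = record["entity"]
    return (
        entity.get("name"),
        entity.get("version"),
        entity.get("type"),
        entity.get("vendor"),
    )


def find_duplicate_entities(records: list[dict]) -> list[tuple]:
    """Return every key that appears in more than one record."""
    counts = Counter(entity_key(record) for record in records)
    return [key for key, count in counts.items() if count > 1]


# Illustrative records only; the vendor value is an assumption.
records = [
    {"entity": {"name": "MongoDB Server", "version": "6.0", "type": "software", "vendor": "MongoDB"}},
    {"entity": {"name": "MongoDB Server", "version": "6.1", "type": "software", "vendor": "MongoDB"}},
    {"entity": {"name": "MongoDB Server", "version": "6.0", "type": "software", "vendor": "MongoDB"}},
]
duplicates = find_duplicate_entities(records)
```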
entity.name
The `entity` object's `name` field is a pretty name for the package or software. It should be sufficiently close to the names used within the identifiers (such as the PURL identifier), but does not need to be exact.
entity.version
The `entity` object's `version` field is the cycle version for the software or package. In the case of a software like MongoDB Server, which ties lifecycle events to the minor version of the software, it would be "6.0" or "6.1" etc. In the case of a software like Vue, which ties lifecycle events to the major version, it would be "2" or "3".
entity.type
The `entity` object's `type` field is the type of object described. It may be one of `software`, `package`, `os`, or `product`.
entity.released
The `entity` object's `released` field is when the entity was first released, as an RFC3339-formatted timestamp in UTC (ending in "Z"). This is an optional field, as finding the real release date of a software or package may be difficult in some cases.
entity.lts
The `entity` object's `lts` field is an optional boolean to describe whether the entity is a Long Term Support (LTS) release.
entity.discontinued
The `entity` object's `discontinued` field is an optional boolean to describe whether the entity is available or not. For software this means it's available via a non-archive source. For devices this would be whether it is available for sale.
references field
The optional `references` field contains a list of JSON objects describing references. This could be a link to the vendor's support page or an explanation of their versioning schema. Each object has a string field `type` specifying the type of reference and a string field `url`. The `url` is the fully qualified URL (including the scheme) linking to additional information about the entity.
The known reference `type` values are:
- `PACKAGE`: A home web page for the package, such as NuGet.
- `SOURCE`: A source page containing the package source code on GitHub or other VCS.
- `WEB`: A web page of some unspecified kind.
- `ARTICLE`: An article or blog post describing why the entity is EOL.
- `CHANGELOG`: A web page containing the project's change log information.
identification field
identification.versions
The `identification` object's `versions` field is a JSON array containing strings of the versions that are matched along with either `identification.cpes`, `identification.purls`, or `identification.custom`. This is an optional field, since some software doesn't have a version associated with it and only needs to be identified using the `identification.custom` object.
The `identification.versions` field may include a wildcard. It may be a total wildcard like "*" (see the RHEL .NET example below) or a wildcard for a specific semver version such as "7.*", in which case all versions like 7.1.2 and 7.1 are matched.
In the case where semver is not used for a software or package, a complete list of versions should be used.
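One possible reading of the wildcard rules, sketched in Python. The function names are hypothetical, and the treatment of "7.*" as a plain prefix match on "7." is an assumption; the text does not say, for instance, whether "7.*" should also match a bare "7".

```python
def version_matches(version: str, pattern: str) -> bool:
    """Match one observed version against one identification.versions entry."""
    if pattern == "*":
        # Total wildcard: every version matches.
        return True
    if pattern.endswith(".*"):
        # "7.*" becomes the prefix "7.", so 7.1 and 7.1.2 match but 70.1 does not.
        return version.startswith(pattern[:-1])
    # No wildcard: the entry is one item of a complete version list.
    return version == pattern


def matches_any(version: str, patterns: list[str]) -> bool:
    """True if the version matches at least one entry in the versions array."""
    return any(version_matches(version, pattern) for pattern in patterns)
```

A prefix comparison is used here instead of a glob library so that characters such as `[` or `?` in unusual version strings are never interpreted as pattern syntax.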
identification.cpes
The `identification` object's `cpes` field is a JSON array containing Common Platform Enumeration (CPE) strings used to identify the software. These may be CPE 2.2 or 2.3 strings. These CPE strings are used in combination with `identification.versions` to identify a software.
We must support CPE if we are to be able to identify things such as operating systems, as there is no PURL support for operating systems. SWIDs are one other possibility, but native support by the different OSes is much less reliable than CPEs.
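To show how a CPE entry might be compared with a CPE observed on a host, here is a small sketch for CPE 2.3 formatted strings only. It is deliberately simplified: it does not handle escaped colons or CPE 2.2 URIs, the function names are hypothetical, and the version component is ignored on the assumption that `identification.versions` handles version matching. The CPE strings in the usage line are illustrative.

```python
# The eleven attributes that follow the "cpe:2.3" prefix in a formatted string.
CPE23_FIELDS = (
    "part", "vendor", "product", "version", "update", "edition",
    "language", "sw_edition", "target_sw", "target_hw", "other",
)


def parse_cpe23(cpe: str) -> dict:
    """Split a CPE 2.3 formatted string into named attributes.

    Simplification: assumes no escaped colons inside attribute values.
    """
    parts = cpe.split(":")
    if len(parts) != 13 or parts[0] != "cpe" or parts[1] != "2.3":
        raise ValueError(f"not a CPE 2.3 formatted string: {cpe!r}")
    return dict(zip(CPE23_FIELDS, parts[2:]))


def same_product(entry_cpe: str, observed_cpe: str) -> bool:
    """Compare part, vendor, and product; the version is left to versions."""
    entry = parse_cpe23(entry_cpe)
    observed = parse_cpe23(observed_cpe)
    return all(entry[key] == observed[key] for key in ("part", "vendor", "product"))


# part "o" marks an operating system, the case PURL cannot cover.
is_rhel = same_product(
    "cpe:2.3:o:redhat:enterprise_linux:*:*:*:*:*:*:*:*",
    "cpe:2.3:o:redhat:enterprise_linux:8.4:*:*:*:*:*:*:*",
)
```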
identification.purls
The `identification` object's `purls` field is a JSON array containing strings following the Package URL specification that identifies the package. This PURL should include qualifiers when possible to ensure accuracy in matching. The PURLs are used in combination with `identification.versions` to identify the package.
identification.custom
The `identification` object's `custom` field is a JSON object that contains `Name` and `Vendor`, which may be used to identify a software. This is used for cases where a CPE is not available or the type is a software where PURL cannot be used. An example would be AWS EKS, which is a software, not a package, and where no CPE exists.
support field
The `support` field is a JSON array that is used to describe the end-of-life dates associated with a software or package. Vendors may provide support above their standard support levels for a fee, and in exchange the customer is provided with a longer support period.
A support object has the fields `support.level`, `support.eol` and `support.eos`. The support level describes the type of support being described, for example Standard Support or Extended Support. `support.eol` describes the date at which security updates are no longer made; this is a mandatory field along with `support.level`. And finally, `support.eos` describes when bug fixes are no longer being made; this is an optional field.
The `support.level.name` gives a pretty name to the support level from a vendor. This name should match the real name given by the vendor when possible. When no name for a support level by a vendor is given and there is only one, "Standard" will be used.
The `support.level.int` is the support level as an increasing int starting from 0. For example, RHEL has the support tiers "Standard Support" - 0, "Extended Update Support (EUS)" - 1, and "Enhanced Extended Update Support (Enhanced EUS)" - 2, with increasing support time for each level. This int may be used in consumer app configuration, such as automated scanners, to help set the level for a product without needing to understand the naming for each product.
Examples
RHEL
Red Hat Enterprise Linux is an operating system that can be identified with a CPE.
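The sketch below is not the record from the original post. It only illustrates how a consumer might read a `support` array using the RHEL tier names and ints from the support field description. All dates are placeholders, not real Red Hat dates, and treating `support.eol` and `support.eos` as RFC3339 UTC timestamps is an assumption. The helper names are hypothetical.

```python
from datetime import datetime, timezone

# Tier names and ints come from the support field description.
# Every date below is a PLACEHOLDER, not a real Red Hat date.
SUPPORT = [
    {"level": {"name": "Standard Support", "int": 0},
     "eol": "2030-01-01T00:00:00Z", "eos": "2028-01-01T00:00:00Z"},
    {"level": {"name": "Extended Update Support (EUS)", "int": 1},
     "eol": "2032-01-01T00:00:00Z"},
    {"level": {"name": "Enhanced Extended Update Support (Enhanced EUS)", "int": 2},
     "eol": "2034-01-01T00:00:00Z"},
]


def parse_utc(value: str) -> datetime:
    """Parse a timestamp ending in "Z" (assumed format for eol/eos)."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def status_at(support: list[dict], level_int: int, when: datetime) -> str:
    """Return "supported", "eos", or "eol" for one support level at a time."""
    tier = next((s for s in support if s["level"]["int"] == level_int), None)
    if tier is None:
        raise LookupError(f"no support level {level_int}")
    if when >= parse_utc(tier["eol"]):
        return "eol"  # security updates have stopped
    eos = tier.get("eos")  # optional field
    if eos is not None and when >= parse_utc(eos):
        return "eos"  # bug fixes stopped, security fixes continue
    return "supported"
```

A scanner configured with `level_int=1` would then report against the EUS dates without needing to know the vendor's tier names.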
RHEL .NET
Red Hat Enterprise Linux versions 8 & 9 have introduced "Application Streams": versions of user-space components that are delivered and updated more frequently than the core operating system packages. Each Application Stream component has a given lifecycle, either the same as the RHEL release or shorter.
In this example, we will use the .NET 7.0 software. We use the .NET 7.0 SDK package to identify the .NET 7.0 software. There are a couple unique things to note here:
Microsoft .NET
Microsoft has its own support for .NET. Please note that the PURLs here include some platform information (linux-arm64, linux-arm, linux-x64), so a new PURL should be created for each platform. We use the wildcard in the versions list to match any version like 7.0.1, 7.0.2, etc. We use the NETCore runtime package to identify .NET 7.0.
However, the Microsoft.NETCore.App.Runtime.linux-arm64 package might have its own lifecycle and need to be identified. Note that this is just an example; NuGet does not include the date when a package has been marked as deprecated, so we currently have no way to set an EOL date.
Google Kubernetes Engine
For Google Kubernetes Engine, we have neither a PURL nor a CPE to identify the software. We will identify by the product name and vendor.
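Matching on `identification.custom` could look like the sketch below. The `Name` and `Vendor` keys come from the field description; the case-insensitive comparison, the function name, and the sample values are assumptions.

```python
def matches_custom(custom: dict, name: str, vendor: str) -> bool:
    """Compare an observed product name and vendor with identification.custom.

    Case-insensitive comparison is an assumption, not part of the proposal.
    """
    return (
        custom.get("Name", "").casefold() == name.casefold()
        and custom.get("Vendor", "").casefold() == vendor.casefold()
    )


# Illustrative values only.
gke_custom = {"Name": "Google Kubernetes Engine", "Vendor": "Google"}
is_gke = matches_custom(gke_custom, "google kubernetes engine", "google")
```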