Overhaul aggregates #312
Replies: 1 comment
Metadata inconsistencies
We should either remove the "delete category" feature or make categories optional when uploading a torrent. It is a good idea to use the name if no title is provided.
InfoHash Canonical Groups
We can improve the error messages in the frontend and redirect to the torrent details of the already uploaded torrent. A 409 will make things clearer, as it reflects the "error", while a 400 Bad Request can be confusing to the end user.
Database Schema
I also agree with having different tables for the different data that we store. That way we keep all the info in case we need it, and the data is stored in a more abstract way, which makes future support for torrent v2 or other types easier.
API Response Body Structure
As mentioned in #297, a change in the format of the response will improve readability and organization.
I think those are points to consider for future features/releases, and as you mentioned, it is a good idea to implement them bit by bit with new fixes, features or releases.
Introduction
There are issues related to:
I want to start this discussion by collecting all the information we have and giving an overview of the main Index feature, which is to store torrent files. The documentation currently lacks an explanation of the whole process.
I will explain the current state and propose some changes that might make the project easier to maintain and extend in the future.
Single-step upload
Currently, there is one page for uploading torrents, and the page has a one-step form that you can submit to include a new torrent file in the index.
There are two different data pieces:

- The torrent metadata: `title`, `description`, `category`, `tags`.
- The torrent file: the `metainfo` file.

Notice that you upload the torrent metadata and the torrent file at the same time. This single-step process has some drawbacks. For example, if something in the metadata fails when the form is submitted, you have to upload the torrent again. @da2ce7 and I discussed an alternative: a two-step process where you first upload the torrent file and then the metadata. That two-step process would have other advantages I will mention later.
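To make the two-step idea concrete, here is a minimal in-memory sketch. The `Index` class and its method names are hypothetical and are not part of the current API. For brevity the sketch hashes the whole payload, whereas a real index would hash the bencoded `info` dict.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Index:
    """Hypothetical in-memory index, used only to illustrate the flow."""
    files: dict = field(default_factory=dict)     # key -> raw torrent bytes
    metadata: dict = field(default_factory=dict)  # key -> metadata fields

    def upload_file(self, raw: bytes) -> str:
        # Step 1: store the artefact and return a key that identifies it.
        # Assumption: hashing the whole payload stands in for the info-hash.
        key = hashlib.sha1(raw).hexdigest()
        self.files[key] = raw
        return key

    def set_metadata(self, key: str, **fields) -> None:
        # Step 2: attach metadata. A failure here does not affect the
        # stored file, so the client can retry without re-uploading.
        if key not in self.files:
            raise KeyError(f"unknown torrent {key}")
        if not fields.get("category"):
            raise ValueError("category is required")  # mirrors the current rule
        self.metadata[key] = fields
```

With this split, a rejected metadata request leaves the uploaded file in place, and only the second step needs to be repeated.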
Metadata inconsistencies
There are some features we should review. For example, the "category" is mandatory when you upload a torrent, but if the category is removed later, the torrent will not have any.
We also force torrents to have a title. Maybe we could just use the torrent field "name" if no title is provided.
Those changes and the two-step upload process could lower the barrier to adding torrents to the index (manually or automatically).
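A minimal sketch of the title fallback, assuming a hypothetical helper `resolve_title` that receives the optional user title and the torrent's `name` field:

```python
from typing import Optional


def resolve_title(title: Optional[str], torrent_name: str) -> str:
    """Return the user-supplied title, or fall back to the torrent `name`."""
    if title is not None and title.strip():
        return title.strip()
    # No usable title was provided: use the name stored in the torrent file.
    return torrent_name
```

Treating a whitespace-only title as missing is a design choice of this sketch, not something the discussion has settled.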
InfoHash Canonical Groups
Recently, we introduced this new concept. When you upload a torrent file, the index removes non-standard fields from the `info` dictionary, leading to a different info-hash. We store the relationship between the original info-hash and the new one. We call the final info-hash the "canonical info-hash". We also consider that we are creating a new torrent and could potentially change the creation date to the upload date.
This feature was introduced but does not have good support in the UI (labels and errors are not aligned with this change). See #290.
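The relationship between the original and the canonical info-hash can be sketched as follows. This is an illustration, not the index's implementation. The set of keys treated as "standard" (`STANDARD_INFO_KEYS`) is an assumption, and `canonical_group` is a hypothetical helper. The info-hash is the SHA-1 of the bencoded `info` dictionary, so removing a key changes the hash.

```python
import hashlib

# Assumption: which `info` keys the index keeps. The real list may differ.
STANDARD_INFO_KEYS = {b"name", b"piece length", b"pieces", b"length", b"files", b"private"}


def bencode(value) -> bytes:
    """Minimal bencoder for ints, bytes, lists and dicts with bytes keys."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Bencoded dictionaries must list their keys in sorted order.
        items = sorted(value.items())
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


def info_hash(info: dict) -> str:
    """SHA-1 of the bencoded `info` dictionary, as a hex string."""
    return hashlib.sha1(bencode(info)).hexdigest()


def canonicalize(info: dict) -> dict:
    """Drop every key that is not in the assumed standard set."""
    return {k: v for k, v in info.items() if k in STANDARD_INFO_KEYS}


def canonical_group(info: dict, groups: dict) -> str:
    """Record original -> canonical info-hash and return the canonical one."""
    original = info_hash(info)
    canonical = info_hash(canonicalize(info))
    groups[original] = canonical
    return canonical
```

Two uploads that differ only in non-standard `info` fields end up in the same group, which is the case where the UI needs a clear "already uploaded" message.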
Different Representations of the Torrent File
The app handles different data structures to hold the torrent file info during the process of adding a new torrent to the index.
`info` dict.
First, not all fields of the `Torrent` struct are persisted. There is an open issue to persist all the fields in the Torrust struct. Secondly, we change some fields. For example, the `announce` and `announce_list` fields are changed, and there is another proposal to also overwrite the "created by" field.
So we have a lot of different "torrents":
Once we upload the torrent, we lose any other information of the original one (with all the fields and without any change).
Torrent File Immutability
You can change the torrent metadata, but you cannot change the canonical torrent file after the upload. After persisting the torrent file, you can only download it or delete it.
This is important because, in the database, we mix metadata fields (mutable) with torrent file fields (immutable) in the same table, `torrust_torrents`. That might not be a problem, but conceptually I think the two-step upload makes more sense. You upload artefacts (torrent files) that can be classified/tagged later with metadata. In some cases, it might even make sense to separate both processes (bulk automatic upload and manual classification).
Database Schema
The master table `torrust_torrents` contains the ID `torrent_id` and:

- `torrent_id`, `uploader_id`, `date_uploaded`. Fields that make sense only for the index.
- `info_hash` (canonical info-hash).
- `category_id`

There are other tables for the torrent file fields that are arrays:
And finally, tables specific for metadata:
My suggestion would be to split this information (and future additions) into three separate groups of tables:

- `id`, `uploader`, `upload_date`, ...

For example, if we change the creation date for the canonical torrent file (because the info-hash changes) and we want to store the original info-hash, I would store that info in a new table, not in the current `torrust_torrents` table.
I think a clear separation between the uploaded artefact (torrent file) and the extra info would:
Regarding the DB schema, there is also an open discussion about using UUIDs instead of auto-incremental ids.
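As a rough illustration of the proposed split, the sketch below creates three hypothetical tables in an in-memory SQLite database. All table and column names (`index_torrents`, `torrent_files`, `torrent_metadata`, and their columns) are assumptions made for this example. They do not describe the current schema or an agreed target.

```python
import sqlite3

# Hypothetical schema: one table per group of information.
SCHEMA = """
CREATE TABLE index_torrents (          -- data that only makes sense for the index
    id INTEGER PRIMARY KEY,
    uploader_id INTEGER NOT NULL,
    upload_date TEXT NOT NULL
);
CREATE TABLE torrent_files (           -- immutable torrent file fields
    torrent_id INTEGER PRIMARY KEY REFERENCES index_torrents(id),
    canonical_info_hash TEXT NOT NULL UNIQUE,
    original_info_hash TEXT NOT NULL,
    name TEXT NOT NULL
);
CREATE TABLE torrent_metadata (        -- mutable classification added by users
    torrent_id INTEGER PRIMARY KEY REFERENCES index_torrents(id),
    title TEXT,
    description TEXT,
    category_id INTEGER                -- nullable: a category can be deleted later
);
"""


def create_schema() -> sqlite3.Connection:
    """Create the three hypothetical tables in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn
```

In this layout, editing a title only touches `torrent_metadata`, while the row in `torrent_files` is written once at upload time and never updated.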
API Response Body Structure
The API response body for the torrent details is:
There is also an open issue to change it.
I think we should also align the response structure with the new database schema. There should also be three parts:
Only the torrent file key in the response will be affected if we serve another type of object.
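A three-part response aligned with that split could look like the sketch below. The key names (`index`, `torrent_file`, `metadata`) and the helper `torrent_details_response` are hypothetical. They are not taken from the existing API or from the open issue.

```python
import json


def torrent_details_response(index_info: dict, torrent_file: dict, metadata: dict) -> dict:
    """Group the torrent details into three top-level parts (names are assumptions)."""
    return {
        "index": index_info,           # e.g. id, uploader, upload date
        "torrent_file": torrent_file,  # immutable fields from the uploaded artefact
        "metadata": metadata,          # mutable title, description, category, tags
    }


body = torrent_details_response(
    {"id": 1, "uploader": "admin", "upload_date": "2023-01-01"},
    {"info_hash": "aa", "name": "demo"},
    {"title": "Demo", "tags": []},
)
print(json.dumps(body, indent=2))
```

If the index later serves another type of object, only the `torrent_file` part would need a different shape.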
Conclusion
I think all those changes will:
I'm not proposing to implement them ASAP, but to consider the whole picture anytime we open a new related issue. And maybe progressively refactor to the new state.
And anyway, I think it's helpful to have an overview. I hope this discussion adds more context to the mentioned issues.
cc @da2ce7 @mario-nt