Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sanitize bookmarks at save time #1307

Draft
wants to merge 4 commits into
base: master
Choose a base branch
from

Conversation

0penBrain
Copy link

This PR aims at sanitizing bookmarks description when saved (see #1117)
Among available strategies, I used percent-encoding strings because :

  • It's available in Qt as static methods of QUrl
  • It should almost transparently convert old bookmarks to new bookmarks, preventing need to maintain legacy code. Only users that have used a '%' followed by 2+ characters in their bookmarks (should be pretty rare) will be impacted and will see some weirdness in their bookmarks (basically the `%xy' will be replaced by a random character).

Only sensitive fields are encoded:

  • Tag names in tag list
  • Bookmark names
  • Tag names in bookmark list

The 2nd commit is simply a code refactoring of the Bookmarks::save() function using range-based for loops instead of older iterator loops. Mainly aiming at increasing function understandability.
Has been placed as last commit so easily can be dropped if judged inappropriate.

@vladisslav2011
Copy link
Contributor

It may be better to just follow CSV specification when saving/loading bookmarks:
When the field contains field separator/line separator/quote, put the field into quotes.
When the field contains quote symbol(s), escape the quote symbol with another quote symbol.
Example:
abs,"rt -> "abc,""rt"

This would maintain compatibility with libreoffice/excel/database engines and other external tools, that support CSV format.

@argilo
Copy link
Member

argilo commented Oct 8, 2023

I used percent-encoding strings

It may be better to just follow CSV specification

I think we should agree on a format (and consider backwards compatibility implications) before making changes. The current format was not well designed, so I would be open to replacing it with something else.

@argilo argilo added the bug label Oct 8, 2023
@0penBrain 0penBrain marked this pull request as draft October 8, 2023 15:10
@0penBrain
Copy link
Author

PR set to Draft till there is a decision. No opinion from me. Tell me when decided and I can update the PR.

@argilo
Copy link
Member

argilo commented Oct 9, 2023

I had a look at the current format, and despite being called bookmarks.csv, it's not even close to being a standard CSV file:

  • There are two tables (tag colors, bookmarks) in the same file.
  • The delimiter is ;.
  • There are extra spaces around values.
  • Header rows are preceded by #.

So it looks like conforming to CSV would require extensive changes. Also, the tags field is a variable-length array, which is not well-suited to CSV.

URL encoding seems like a reasonable way to make a minimal change to the existing format while mostly preserving compatibility. To maximize readability for humans, we could perhaps reduce the list of encoded characters (which is quite large by default). I think only ; , and % would need to be encoded, so we could pass in an exclude list like this:

" !\"#$&'()*+/:<=>?@[\\]^`{|}"

@vladisslav2011
Copy link
Contributor

vladisslav2011 commented Oct 9, 2023

Mixing tags and bookmarks in a single file does not look like good decision...
Conforming to CSV format does not look like extensive changes to me. I have already done it and switched to using different files to store tags and bookmarks:
vladisslav2011@3975627
vladisslav2011@69c4760
vladisslav2011@941fbf1
vladisslav2011@d009d34

XML or JSON or even sqlite may fit better, but any of these would add extra build dependencies.

@argilo
Copy link
Member

argilo commented Oct 9, 2023

Mixing tags and bookmarks in a single file does not look like good decision...

Agreed, but it was a decision that was made a long time ago. I might be open to a format change in a major release though, provided that support for reading the old format remains.

XML or JSON or even sqlite may fit better, but any of these would add extra build dependencies.

JSON is included in QT Core, so I don't think that would require a new dependency. Of course, JSON is not as human-friendly, and definitely not spreadsheet-friendly.

@0penBrain
Copy link
Author

To maximize readability for humans, we could perhaps reduce the list of encoded characters (which is quite large by default). I think only ; , and % would need to be encoded, so we could pass in an exclude list like this:

" !\"#$&'()*+/:<=>?@[\\]^`{|}"

Makes sense. :) Just notice that for those using accented characters like àéêèù for example, these ones will still be converted to %-codes. 😉

@argilo
Copy link
Member

argilo commented Oct 9, 2023

True, and I suppose that too is unnecessary. Maybe a more minimal ad-hoc escaping mechanism would be appropriate.

@vladisslav2011
Copy link
Contributor

vladisslav2011 commented Oct 9, 2023

It may be enough to just replace invalid characters in tag/bookmark names (comma, semicolon) with underscore and update documentation to reflect this behavior.

Qt may output "human-frienly" JSON with QJsonDocument::Indented.

@argilo
Copy link
Member

argilo commented Oct 9, 2023

Yeah, blocking or replacing , and ; might be a reasonable stopgap until we switch to a better format, and would fully preserve compatibility.

@0penBrain
Copy link
Author

There may be several implementation for replacing :

  • Replace at bookmark save time: simple but a bit user unfriendly as the change will be visible only after leaving then restarting gqrx
  • Replace at model setData: simple too because bookmarks already use a dedicated model. Still user will see the change only when the new name is validated (as long as user is typing, it would display commas and semicolons)
  • Replace when user is typing the new name (at this time it's also possible to block/reject forbidden characters): nice but needs to add in the code a new QItemDelegate to deal with this

Any thought/preference ?

@vladisslav2011
Copy link
Contributor

vladisslav2011 commented Oct 13, 2023

My thoughts:

  1. Sanitize bookmark name when user adds a new bookmark somewhere in MainWindow::on_actionAddBookmark_triggered()
  2. Sanitize bookmark name when user changes bookmark's name inline somewhere in BookmarksTableModel::setData
  3. Sanitize tag name when user adds a new tag by connecting BookmarksTagList::itemChanged(QTableWidgetItem *item) to a slot and checking/updating item->text()

The change will be visible immediately (not while typing, of course).

@0penBrain
Copy link
Author

  1. Sanitize bookmark name when user adds a new bookmark somewhere in MainWindow::on_actionAddBookmark_triggered()
  2. Sanitize bookmark name when user changes bookmark's name inline somewhere in BookmarksTableModel::setData
  3. Sanitize tag name when user adds a new tag by connecting BookmarksTagList::itemChanged(QTableWidgetItem *item) to a slot and checking/updating item->text()

1 & 2 looks redundant to me. Only 2 is needed as it covers the case of creation and renaming. 😉

@vladisslav2011
Copy link
Contributor

2 is not enough. It does not cover creation. You may test it yourself.

@0penBrain
Copy link
Author

Well, after some thinking around this, I pushed a new version using some dedicated delegate.

Pros are that it will be flexible and easy to maintain. We can easily change the behavior or increase number of forbidden characters. Now it just totally prevent user to enter ; or , in a bookmark or tag name.

Cons are that it is handled at GUI level so if someone introduce a new widget/way that changes bookmark/tag names, it has to adopt the new delegate.

Please give your thoughts about this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants