Ensure priority sort over alpha sort
- Added extension priority map. This is an imperfect solution, and is
  not used by default with default configuration (column-based data).

  - We may want to consider a revised columnar format for a future
    version that has a bit more information than is present in the base
    file.

- Adding the sort priority and extension priority helped, but because
  the alphanumeric sort was first in `MIME::Type#priority_compare`, the
  results weren't as good as they should have been. We now sort by the
  sort priority values _first_ and the alphanumeric values _second_.

  - Stored sort priority was not respected because it depends on flags
    not kept in the base file. Added support for a binary column file
    that holds the sort priority so that it is loaded with the base data.
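
The reordering described in the second bullet can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: the hashes, their `:name` and `:priority` keys, and the sample values are hypothetical stand-ins for `MIME::Type`, `simplified`, and `__sort_priority`.

```ruby
# Hypothetical sample data: :name stands in for `simplified`, :priority for
# `__sort_priority` (a lower value sorts first).
entries = [
  {name: "text/x-yaml", priority: 40},
  {name: "application/yaml", priority: 12},
  {name: "application/x-yaml", priority: 44}
]

# Old shape: alphanumeric comparison first, priority only breaks ties.
alpha_first_compare = lambda do |a, b|
  cmp = a[:name] <=> b[:name]
  cmp.zero? ? a[:priority] <=> b[:priority] : cmp
end

# New shape: priority comparison first, alphanumeric only breaks ties.
priority_first_compare = lambda do |a, b|
  cmp = a[:priority] <=> b[:priority]
  cmp.zero? ? a[:name] <=> b[:name] : cmp
end

alpha_first = entries.sort(&alpha_first_compare).map { |e| e[:name] }
priority_first = entries.sort(&priority_first_compare).map { |e| e[:name] }
```

With the alphanumeric comparison first, the priority value only matters when two names are identical, which is why the stored priorities had little visible effect before this change.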
halostatue committed Feb 17, 2023
1 parent 8748d86 commit 2b8ae86
Showing 4 changed files with 64 additions and 9 deletions.
31 changes: 22 additions & 9 deletions lib/mime/type.rb
@@ -188,8 +188,8 @@ def <=>(other)
# consumers of mime-types. For the next major version of MIME::Types, this
# method will become #<=> and #priority_compare will be removed.
def priority_compare(other)
- if (cmp = simplified <=> other.simplified).zero?
- __sort_priority <=> other.__sort_priority
+ if (cmp = __sort_priority <=> other.__sort_priority).zero?
+ simplified <=> other.simplified
else
cmp
end
@@ -229,7 +229,7 @@ def hash

# The computed sort priority value. This is _not_ intended to be used by most
# callers.
- def __sort_priority
+ def __sort_priority # :nodoc:
@__sort_priority || update_sort_priority
end

@@ -324,17 +324,24 @@ def preferred_extension=(value) # :nodoc:
end

##
- # Optional extension priorities for this MIME type. This is a relative value
- # similar to nice(1). An explicitly set `preferred_extension` is automatically
- # given a relative priority of `-10`.
+ # Optional extension priorities for this MIME type. This is a map of
+ # extensions to relative priority values (+-20..20+) similar to +nice(1)+.
+ # Unless otherwise specified in the data, an explicitly set
+ # +preferred_extension+ is automatically given a relative priority of +-10+.
#
# :attr_reader: extension_priorities
attr_accessor :extension_priorities

##
# Returns the priority for the provided extension or extensions. If a priority
- # is not set, the default priority is 0. The range for priorities is -20..20,
- # inclusive.
+ # is not set, the default priority is +0+. The range for priorities is
+ # +-20..20+, inclusive.
+ #
+ # Obsolete MIME types have a <code>+3</code> penalty applied to their
+ # extension priority and unregistered MIME types have a <code>+2</code>
+ # penalty to their extension priority, meaning that the highest priority an
+ # obsolete, unregistered MIME type can have is +-15+. The lowest priority is
+ # always <code>+20</code>.
def extension_priority(*exts)
exts.map { |ext| get_extension_priority(ext) }.min
end
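
The penalty and clamping rule in the documentation comment above can be checked with a small sketch. `penalized_priority` and its keyword arguments are hypothetical names for illustration and are not part of the mime-types API; only the `+3`/`+2` penalties and the `-20..20` range come from the diff.

```ruby
# Hypothetical helper: apply the obsolete (+3) and unregistered (+2) penalties
# to a stored priority, then clamp the result to the -20..20 range.
def penalized_priority(stored, obsolete:, registered:)
  penalty = (obsolete ? 3 : 0) + (registered ? 0 : 2)
  (stored + penalty).clamp(-20, 20)
end

# Best possible value for an obsolete, unregistered type: -20 + 3 + 2.
best_obsolete_unregistered = penalized_priority(-20, obsolete: true, registered: false)

# A current, registered type with no stored priority keeps the default of 0.
default_priority = penalized_priority(0, obsolete: false, registered: true)

# Penalties cannot push a value past the upper bound.
worst_priority = penalized_priority(20, obsolete: true, registered: false)
```

Because the penalty is added before clamping, the upper bound stays at `20` regardless of flags, while the lower bound effectively rises for penalized types.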
@@ -650,7 +657,7 @@ def clear_extension_priority(ext)
end

def get_extension_priority(ext)
- [[-20, __extension_priorities[ext] || 0].max, 20].min
+ [[-20, (__extension_priorities[ext] || 0) + __priority_penalty].max, 20].min
end

def set_preferred_extension_priority(ext)
@@ -686,6 +693,12 @@ def update_sort_priority
extension_count = [0, 16 - extension_count].max

@__sort_priority = obsolete | registered | provisional | complete | extension_count
+ @__priority_penalty = (@obsolete ? 3 : 0) + (@registered ? 0 : 2)
end
+
+ def __priority_penalty
+ update_sort_priority if @__priority_penalty.nil?
+ @__priority_penalty
+ end

def content_type=(type_string)
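
The last hunk ORs several flag values and an extension count into one integer. A minimal sketch of that packing idea follows, under stated assumptions: the obsolete (`1 << 7`) and unregistered (`1 << 5`) bit positions match the masks used in `lib/mime/type/columnar.rb` in this commit, while the provisional and incomplete positions, and the helper name `pack_sort_priority`, are assumptions made for illustration.

```ruby
# Hypothetical helper: pack flags and an extension count into one integer so
# that a plain integer comparison orders "better" types first.
def pack_sort_priority(obsolete:, provisional:, registered:, extensions:)
  obsolete_bit = obsolete ? 1 << 7 : 0         # matches the columnar.rb mask
  provisional_bit = provisional ? 1 << 6 : 0   # assumed bit position
  unregistered_bit = registered ? 0 : 1 << 5   # matches the columnar.rb mask
  incomplete_bit = extensions.empty? ? 1 << 4 : 0 # assumed bit position

  # More extensions give a smaller low-order value, as in the diff.
  extension_count = [0, 16 - extensions.length].max

  obsolete_bit | unregistered_bit | provisional_bit | incomplete_bit | extension_count
end

current = pack_sort_priority(
  obsolete: false, provisional: false, registered: true, extensions: %w[yaml yml]
)
retired = pack_sort_priority(
  obsolete: true, provisional: false, registered: true, extensions: %w[yaml yml]
)
```

Placing the obsolete flag in the highest bit means any obsolete type compares greater than any non-obsolete one, whatever the lower bits hold.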
11 changes: 11 additions & 0 deletions lib/mime/type/columnar.rb
@@ -53,6 +53,17 @@ def encode_with(coder) # :nodoc:
super
end

+ def update_sort_priority
+ if @container.__fully_loaded?
+ super
+ else
+ obsolete = (@__sort_priority & (1 << 7)) != 0
+ registered = (@__sort_priority & (1 << 5)) == 0
+
+ @__priority_penalty = (obsolete ? 3 : 0) + (registered ? 0 : 2)
+ end
+ end
+
class << self
undef column
end
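
When the columnar container is not fully loaded, the penalty has to be recovered from the stored sort-priority byte instead of from the type's own flags. A small sketch of that decoding step, assuming only the two masks shown in the hunk above; `penalty_from_sort_priority` is a hypothetical name, not a method of the library.

```ruby
# Hypothetical helper: recover the obsolete and registered flags from a stored
# sort-priority byte and derive the extension-priority penalty from them.
def penalty_from_sort_priority(byte)
  obsolete = (byte & (1 << 7)) != 0   # bit 7 set means obsolete
  registered = (byte & (1 << 5)) == 0 # bit 5 set means unregistered
  (obsolete ? 3 : 0) + (registered ? 0 : 2)
end

no_penalty = penalty_from_sort_priority(0b0000_1110)
obsolete_only = penalty_from_sort_priority(0b1000_1110)
unregistered_only = penalty_from_sort_priority(0b0010_1110)
both = penalty_from_sort_priority(0b1010_1110)
```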
4 changes: 4 additions & 0 deletions lib/mime/types.rb
@@ -204,6 +204,10 @@ def add_type(type, quiet = false)
index_extensions!(type)
end

+ def __fully_loaded? # :nodoc:
+ true
+ end
+
private

def add_type_variant!(mime_type)
27 changes: 27 additions & 0 deletions lib/mime/types/_columnar.rb
@@ -18,6 +18,10 @@ def self.extended(obj) # :nodoc:
obj.instance_variable_set(:@__files__, Set.new)
end

+ def __fully_loaded? # :nodoc:
+ @__files__.size == 10
+ end
+
# Load the first column data file (type and extensions).
def load_base_data(path) # :nodoc:
@__root__ = path
@@ -33,6 +37,10 @@ def load_base_data(path) # :nodoc:
add(type)
end

+ each_file_byte("spri") do |type, byte|
+ type.instance_variable_set(:@__sort_priority, byte)
+ end
+
self
end

@@ -60,6 +68,25 @@ def each_file_line(name, lookup = true)
end
end

+ def each_file_byte(name)
+ LOAD_MUTEX.synchronize do
+ next if @__files__.include?(name)
+
+ i = -1
+
+ filename = File.join(@__root__, "mime.#{name}.column")
+
+ next unless File.exist?(filename)
+
+ IO.binread(filename).unpack("C*").each do |byte|
+ (type = @__mime_data__[i += 1]) || next
+ yield type, byte
+ end
+
+ @__files__ << name
+ end
+ end
+
def load_encoding
each_file_line("encoding") do |type, line|
pool ||= {}
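
The byte-per-type pairing used by `each_file_byte` can be tried in isolation. The sketch below is a simplified illustration that assumes a file with one unsigned byte per type, in the same order as the loaded type list; the type names, byte values, and file name are hypothetical, and the mutex and loaded-file bookkeeping from the diff are omitted.

```ruby
require "tmpdir"

# Hypothetical stand-ins for the loaded type list and the per-type results.
types = %w[application/yaml text/x-yaml application/x-yaml]
priorities = {}

Dir.mktmpdir do |dir|
  # Illustrative file name; four bytes are written for three types so that
  # the skip branch below is exercised.
  path = File.join(dir, "example.spri.column")
  File.binwrite(path, [14, 46, 174, 99].pack("C*"))

  # "C*" unpacks every byte as an unsigned 8-bit integer. The byte at offset i
  # belongs to the type at index i; bytes with no matching type are skipped.
  File.binread(path).unpack("C*").each_with_index do |byte, i|
    (type = types[i]) || next
    priorities[type] = byte
  end
end
```

A fixed-width binary column keeps the lookup positional, so no parsing is needed beyond unpacking, at the cost of requiring the file to stay in step with the base data file.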