Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Option groups #486

Merged
merged 12 commits into from
Sep 3, 2024
Merged

Add Option groups #486

merged 12 commits into from
Sep 3, 2024

Conversation

tbidne
Copy link
Contributor

@tbidne tbidne commented May 15, 2024

Resolves #270.

This is an alternative to #231 by @larskuhtz, for whom I am grateful for blazing the trail.


First, thanks for the great library. This is my attempt to resolve #270, as this is a feature I really want (and it seems I am not alone) 🙂.

Also, this isn't nearly as bad as the diff numbers imply; about 80% of the changes are test boilerplate.

Background

These are notes on why solving this isn't trivial. Feel free to skip if you're already familiar with the difficulties:

Click to expand

Commands

Before I jump into the design / impl, I want to first look at why command groups work so well, and why it's harder to do the same for option groups. For commands, the relevant types are:

data CommandFields a = CommandFields
  { cmdCommands :: [(String, ParserInfo a)]
  , cmdGroup :: Maybe String }

command :: String -> ParserInfo a -> Mod CommandFields a
commandGroup :: String -> Mod CommandFields a
subparser :: Mod CommandFields a -> Parser a

-- usage
subparser
  ( command "cmd1" ...
      <> command "cmd2" ...
      <> commandGroup "Some group"
  )

We can add multiple commands with command, add a group with commandGroup, and combine all of this together using the Semigroup instance. It is easy to add the concept of a group here since the basic modifier -- CommandFields -- already collects multiple commands together. So all we have to do is add an optional label and render it appropriately.

Options

Now let's look at Option.

-- Option does not show up in the signatures for user-facing functions like
-- 'option', but it is used internally i.e. part of the Parser type.
data Option a = Option
  { optMain :: OptReader a
  , optProps :: OptProperties
  }

data OptionFields a = OptionFields
  { optNames :: [OptName]
  , optCompleter :: Completer
  , optNoArgError :: String -> ParseError }

-- Focusing on 'option', but the same applies to argument and flag since they
-- all go through the Option type.
option :: ReadM a -> Mod OptionFields a -> Parser a

Unlike CommandFields, OptionFields does not take in multiple options, so we cannot easily add an optional group label. Perhaps we can simply make one? Say,

data GroupedOptions a = GroupedOptions
  { options :: [Option a]
  , optGroup :: Maybe String
  }

Unfortunately, this won't work! The fundamental problem is different options usually have different types -- unlike different commands, which must have the same type -- so we cannot, say, group an Option Int and Option String together (at least not without existential types, which would probably be terrible).

Solutions

As far as I can tell, there are really only two possibilities for a user-facing groupOption :: String -> ....

1. Add an optional group to OptionFields (and ArgumentFields, FlagFields)

This is what the previous PR did:

class HasGroup f where
  setGroup :: String -> f a -> f a
  getGroup :: f a -> Maybe String

instance HasGroup OptionFields where ...

-- usage
parseFoo =
  optional
      ( strOption
          ( long "file-log-path"
              <> metavar "PATH"
              <> help "Log file path"
              <> setGroup "Some group" -- use OptionFields' HasGroup instance
          )
      )

The advantage is that this fits well with standard optparse usage, and it is probably the most obvious way to retrofit option groups into the existing framework.

The disadvantage is that we have to individually add setGroup "Some group" to every option we want in "Some Group". It's clunky, and opens up the possibility for typos creating multiple groups. This is also inconsistent with how command groups work, as you only have to specify the group once for the latter. Unfortunately our hand is forced due to options having different types.

I assume this is the primary reason the PR was declined, perhaps along with some concern over the necessary implementation changes.

2. Add an optional group to OptProperties

This is what this PR does. Because this is at a "higher level" than Mod, our user-facing function is:

parserOptionGroup :: String -> Parser a -> Parser a

-- usage
-- every parser that constitutes someParser will be grouped together
parserOptionGroup "Some group" someParser

The disadvantage is that isn't the usual Mod semigroup pattern.

The advantage is that we only have to specify the group in one place, and all "downstream" parsers will automatically belong to the same group.

The other advantage is that the code changes are quite small. Other than the rendering logic (which is nearly identical to the first PR, thanks @larskuhtz!), the only modification here is adding the new field to OptProperties, and providing a set function.

Examples

The tests show examples of what this looks like. For a "realer" example, I tried this out on a CLI app with many options. Here is the diff, and here is the original help page (brace yourself):

Click to expand
Shrun: A tool for running shell commands concurrently.

Usage: shrun [-c|--config PATH] [-i|--init STRING] 
             [-t|--timeout (NATURAL | STRING)] [--common-log-key-hide] 
             [--command-log-poll-interval NATURAL] 
             [--command-log-read-size BYTES] [--console-log-command] 
             [--console-log-command-name-trunc NATURAL] 
             [--console-log-line-trunc (NATURAL | detect)] 
             [--console-log-strip-control (all | smart | none)] 
             [--console-log-timer-format (digital_compact | digital_full | prose_compact | prose_full)]
             [-f|--file-log (default | PATH)] 
             [--file-log-command-name-trunc NATURAL] 
             [--file-log-delete-on-success] 
             [--file-log-line-trunc (NATURAL | detect)] 
             [--file-log-mode (append | write)] 
             [--file-log-size-mode (warn BYTES | delete BYTES | nothing)] 
             [--file-log-strip-control (all | smart | none)] 
             [--notify-action (final |command | all)] 
             [--notify-system (dbus | notify-send | apple-script)] 
             [--notify-timeout (never | NATURAL)] [--default-config] 
             [-v|--version] Commands...

  Shrun runs shell commands concurrently. In addition to providing basic timing
  and logging functionality, we also provide the ability to pass in a config
  file that can be used to define aliases for commands.

  In general, each option --foo has a --no-foo variant that disables cli and
  toml configuration for that field. For example, the --no-console-log-command
  option will instruct shrun to ignore both cli --console-log-command and toml
  console-log.command, ensuring the default behavior is used (i.e. no command
  logging).

  See github.com/tbidne/shrun#README for full documentation.

Available options:
  -c,--config PATH         Path to TOML config file. If this argument is not
                           given we automatically look in the XDG config
                           directory e.g. ~/.config/shrun/config.toml. The
                           --no-config option disables --config and the
                           automatic XDG lookup.

  --no-config              Disables --config.

  -i,--init STRING         If given, init is run before each command. That is,
                           'shrun --init ". ~/.bashrc" foo bar' is equivalent to
                           'shrun ". ~/.bashrc && foo" ". ~/.bashrc && bar"'.

  --no-init                Disables --init.

  -t,--timeout (NATURAL | STRING)
                           Non-negative integer setting a timeout. Can either be
                           a raw number (interpreted as seconds), or a "time
                           string" e.g. 1d2h3m4s or 2h3s. Defaults to no
                           timeout.

  --no-timeout             Disables --timeout.

  --common-log-key-hide    By default, we display the key name from the legend
                           over the actual command that was run, if the former
                           exists. This flag instead shows the literal command.
                           Commands without keys are unaffected.

  --no-common-log-key-hide Disables --common-log-key-hide.

  --command-log-poll-interval NATURAL
                           Non-negative integer used in conjunction with
                           --console-log-command and --file-log that determines
                           how quickly we poll commands for logs, in
                           microseconds. A value of 0 is interpreted as infinite
                           i.e. limited only by the CPU. Defaults to 10,000.
                           Note that lower values will increase CPU usage. In
                           particular, 0 will max out a CPU thread.

  --no-command-log-poll-interval
                           Disables --command-log-poll-interval.

  --command-log-read-size BYTES
                           The max number of bytes in a single read when
                           streaming command logs (--console-log-command and
                           --file-log). Logs larger than --command-log-read-size
                           will be read in a subsequent read, hence broken
                           across lines. The default is '1 kb'.

  --no-command-log-read-size
                           Disables --command-log-read-size.

  --console-log-command    The default behavior is to swallow logs for the
                           commands themselves. This flag gives each command a
                           console region in which its logs will be printed.
                           Only the latest log per region is show at a given
                           time.

  --no-console-log-command Disables --console-log-command.

  --console-log-command-name-trunc NATURAL
                           Non-negative integer that limits the length of
                           commands/key-names in the console logs. Defaults to
                           no truncation.

  --no-console-log-command-name-trunc
                           Disables --console-log-command-name-trunc.

  --console-log-line-trunc (NATURAL | detect)
                           Non-negative integer that limits the length of
                           console logs. Can also be the string literal
                           'detect', to detect the terminal size automatically.
                           Defaults to no truncation. Note that "log prefixes"
                           (e.g. labels like [Success], timestamps) are counted
                           towards the total length but are never truncated.

  --no-console-log-line-trunc
                           Disables --console-log-line-trunc.

  --console-log-strip-control (all | smart | none)
                           Control characters can wreak layout havoc, thus we
                           include this option. 'all' strips all such chars.
                           'none' does nothing i.e. all chars are left
                           untouched. The default 'smart' attempts to strip only
                           the control chars that affect layout (e.g. cursor
                           movements) and leaves others unaffected (e.g.
                           colors). This has the potential to be the 'prettiest'
                           though it is possible to miss some chars. This option
                           is experimental and subject to change.

  --no-console-log-strip-control
                           Disables --console-log-strip-control.

  --console-log-timer-format (digital_compact | digital_full | prose_compact | prose_full)
                           How to format the timer. Defaults to prose_compact
                           e.g. '2 hours, 3 seconds'.

  --no-console-log-timer-format
                           Disables --console-log-timer-format.

  -f,--file-log (default | PATH)
                           If a path is supplied, all logs will additionally be
                           written to the supplied file. Furthermore, command
                           logs will be written to the file irrespective of
                           --console-log-command. Console logging is unaffected.
                           This can be useful for investigating command
                           failures. If the string 'default' is given, then we
                           write to the XDG state directory e.g.
                           ~/.local/state/shrun/shrun.log.

  --no-file-log            Disables --file-log.

  --file-log-command-name-trunc NATURAL
                           Like --console-log-command-name-trunc, but for
                           --file-logs.

  --no-file-log-command-name-trunc
                           Disables --file-log-command-name-trunc.

  --file-log-delete-on-success
                           If --file-log is active, deletes the file on a
                           successful exit. Does not delete the file if shrun
                           exited via failure.

  --no-file-log-delete-on-success
                           Disables --file-log-delete-on-success.

  --file-log-line-trunc (NATURAL | detect)
                           Like --console-log-line-trunc, but for --file-log.

  --no-file-log-line-trunc Disables --file-log-line-trunc.

  --file-log-mode (append | write)
                           Mode in which to open the log file. Defaults to
                           write.

  --no-file-log-mode       Disables --file-log-mode.

  --file-log-size-mode (warn BYTES | delete BYTES | nothing)
                           Sets a threshold for the file log size, upon which we
                           either print a warning or delete the file, if it is
                           exceeded. The BYTES should include the value and
                           units e.g. warn 10 mb, warn 5 gigabytes, delete
                           20.5B. Defaults to warning at 50 mb.

  --no-file-log-size-mode  Disables --file-log-size-mode.

  --file-log-strip-control (all | smart | none)
                           --console-log-strip-control for file logs created
                           with --file-log. Defaults to all.

  --no-file-log-strip-control
                           Disables --file-log-strip-control.

  --notify-action (final |command | all)
                           Sends notifications for various actions. 'Final'
                           sends off a notification when Shrun itself finishes
                           whereas 'command' sends off one each time a command
                           finishes. 'All' implies 'final' and 'command'.

  --no-notify-action       Disables --notify-action.

  --notify-system (dbus | notify-send | apple-script)
                           The system used for sending notifications. 'dbus' and
                           'notify-send' available on linux, whereas
                           'apple-script' is available for osx.

  --no-notify-system       Disables --notify-system.

  --notify-timeout (never | NATURAL)
                           When to timeout success notifications. Defaults to 10
                           seconds.Note that the underlying notification system
                           may not support timeouts.

  --no-notify-timeout      Disables --notify-timeout.

  --default-config         Writes a default config.toml file to stdout.

  -h,--help                Show this help text

Version: 0.9

And here is the same with the new groups:

Click to expand
Shrun: A tool for running shell commands concurrently.

Usage: shrun [-c|--config PATH] [-i|--init STRING] 
             [-t|--timeout (NATURAL | STRING)] [--common-log-key-hide] 
             [--command-log-poll-interval NATURAL] 
             [--command-log-read-size BYTES] [--console-log-command] 
             [--console-log-command-name-trunc NATURAL] 
             [--console-log-line-trunc (NATURAL | detect)] 
             [--console-log-strip-control (all | smart | none)] 
             [--console-log-timer-format (digital_compact | digital_full | prose_compact | prose_full)]
             [-f|--file-log (default | PATH)] 
             [--file-log-command-name-trunc NATURAL] 
             [--file-log-delete-on-success] 
             [--file-log-line-trunc (NATURAL | detect)] 
             [--file-log-mode (append | write)] 
             [--file-log-size-mode (warn BYTES | delete BYTES | nothing)] 
             [--file-log-strip-control (all | smart | none)] 
             [--notify-action (final |command | all)] 
             [--notify-system (dbus | notify-send | apple-script)] 
             [--notify-timeout (never | NATURAL)] [--default-config] 
             [-v|--version] Commands...

  Shrun runs shell commands concurrently. In addition to providing basic timing
  and logging functionality, we also provide the ability to pass in a config
  file that can be used to define aliases for commands.

  In general, each option --foo has a --no-foo variant that disables cli and
  toml configuration for that field. For example, the --no-console-log-command
  option will instruct shrun to ignore both cli --console-log-command and toml
  console-log.command, ensuring the default behavior is used (i.e. no command
  logging).

  See github.com/tbidne/shrun#README for full documentation.

Available options:
  -c,--config PATH         Path to TOML config file. If this argument is not
                           given we automatically look in the XDG config
                           directory e.g. ~/.config/shrun/config.toml. The
                           --no-config option disables --config and the
                           automatic XDG lookup.

  --no-config              Disables --config.

  -i,--init STRING         If given, init is run before each command. That is,
                           'shrun --init ". ~/.bashrc" foo bar' is equivalent to
                           'shrun ". ~/.bashrc && foo" ". ~/.bashrc && bar"'.

  --no-init                Disables --init.

  -t,--timeout (NATURAL | STRING)
                           Non-negative integer setting a timeout. Can either be
                           a raw number (interpreted as seconds), or a "time
                           string" e.g. 1d2h3m4s or 2h3s. Defaults to no
                           timeout.

  --no-timeout             Disables --timeout.

  --default-config         Writes a default config.toml file to stdout.

  -h,--help                Show this help text

Command Logging:
  --command-log-poll-interval NATURAL
                           Non-negative integer used in conjunction with
                           --console-log-command and --file-log that determines
                           how quickly we poll commands for logs, in
                           microseconds. A value of 0 is interpreted as infinite
                           i.e. limited only by the CPU. Defaults to 10,000.
                           Note that lower values will increase CPU usage. In
                           particular, 0 will max out a CPU thread.

  --no-command-log-poll-interval
                           Disables --command-log-poll-interval.

  --command-log-read-size BYTES
                           The max number of bytes in a single read when
                           streaming command logs (--console-log-command and
                           --file-log). Logs larger than --command-log-read-size
                           will be read in a subsequent read, hence broken
                           across lines. The default is '1 kb'.

  --no-command-log-read-size
                           Disables --command-log-read-size.


Common Logging:
  --common-log-key-hide    By default, we display the key name from the legend
                           over the actual command that was run, if the former
                           exists. This flag instead shows the literal command.
                           Commands without keys are unaffected.

  --no-common-log-key-hide Disables --common-log-key-hide.


Console Logging:
  --console-log-command    The default behavior is to swallow logs for the
                           commands themselves. This flag gives each command a
                           console region in which its logs will be printed.
                           Only the latest log per region is show at a given
                           time.

  --no-console-log-command Disables --console-log-command.

  --console-log-command-name-trunc NATURAL
                           Non-negative integer that limits the length of
                           commands/key-names in the console logs. Defaults to
                           no truncation.

  --no-console-log-command-name-trunc
                           Disables --console-log-command-name-trunc.

  --console-log-line-trunc (NATURAL | detect)
                           Non-negative integer that limits the length of
                           console logs. Can also be the string literal
                           'detect', to detect the terminal size automatically.
                           Defaults to no truncation. Note that "log prefixes"
                           (e.g. labels like [Success], timestamps) are counted
                           towards the total length but are never truncated.

  --no-console-log-line-trunc
                           Disables --console-log-line-trunc.

  --console-log-strip-control (all | smart | none)
                           Control characters can wreak layout havoc, thus we
                           include this option. 'all' strips all such chars.
                           'none' does nothing i.e. all chars are left
                           untouched. The default 'smart' attempts to strip only
                           the control chars that affect layout (e.g. cursor
                           movements) and leaves others unaffected (e.g.
                           colors). This has the potential to be the 'prettiest'
                           though it is possible to miss some chars. This option
                           is experimental and subject to change.

  --no-console-log-strip-control
                           Disables --console-log-strip-control.

  --console-log-timer-format (digital_compact | digital_full | prose_compact | prose_full)
                           How to format the timer. Defaults to prose_compact
                           e.g. '2 hours, 3 seconds'.

  --no-console-log-timer-format
                           Disables --console-log-timer-format.


File Logging:
  -f,--file-log (default | PATH)
                           If a path is supplied, all logs will additionally be
                           written to the supplied file. Furthermore, command
                           logs will be written to the file irrespective of
                           --console-log-command. Console logging is unaffected.
                           This can be useful for investigating command
                           failures. If the string 'default' is given, then we
                           write to the XDG state directory e.g.
                           ~/.local/state/shrun/shrun.log.

  --no-file-log            Disables --file-log.

  --file-log-command-name-trunc NATURAL
                           Like --console-log-command-name-trunc, but for
                           --file-logs.

  --no-file-log-command-name-trunc
                           Disables --file-log-command-name-trunc.

  --file-log-delete-on-success
                           If --file-log is active, deletes the file on a
                           successful exit. Does not delete the file if shrun
                           exited via failure.

  --no-file-log-delete-on-success
                           Disables --file-log-delete-on-success.

  --file-log-line-trunc (NATURAL | detect)
                           Like --console-log-line-trunc, but for --file-log.

  --no-file-log-line-trunc Disables --file-log-line-trunc.

  --file-log-mode (append | write)
                           Mode in which to open the log file. Defaults to
                           write.

  --no-file-log-mode       Disables --file-log-mode.

  --file-log-size-mode (warn BYTES | delete BYTES | nothing)
                           Sets a threshold for the file log size, upon which we
                           either print a warning or delete the file, if it is
                           exceeded. The BYTES should include the value and
                           units e.g. warn 10 mb, warn 5 gigabytes, delete
                           20.5B. Defaults to warning at 50 mb.

  --no-file-log-size-mode  Disables --file-log-size-mode.

  --file-log-strip-control (all | smart | none)
                           --console-log-strip-control for file logs created
                           with --file-log. Defaults to all.

  --no-file-log-strip-control
                           Disables --file-log-strip-control.


Notifications:
  --notify-action (final |command | all)
                           Sends notifications for various actions. 'Final'
                           sends off a notification when Shrun itself finishes
                           whereas 'command' sends off one each time a command
                           finishes. 'All' implies 'final' and 'command'.

  --no-notify-action       Disables --notify-action.

  --notify-system (dbus | notify-send | apple-script)
                           The system used for sending notifications. 'dbus' and
                           'notify-send' available on linux, whereas
                           'apple-script' is available for osx.

  --no-notify-system       Disables --notify-system.

  --notify-timeout (never | NATURAL)
                           When to timeout success notifications. Defaults to 10
                           seconds.Note that the underlying notification system
                           may not support timeouts.

  --no-notify-timeout      Disables --notify-timeout.


Version: 0.9

Remarks

  • There is a question of what to do in the nested case e.g.

    parserOptionGroup "General group" $
      (,)
        <$> parserA
        <*> parserOptionGroup "B group" parserB

    Currently, nested groups are concatenated i.e. General group.B group.

    Click to expand outdated description I chose to always override the group, so in this case both `parserA` and `parserB` will be part of `General Group`. I judged this simpler, but we could choose instead to only overwrite the group when it is `Nothing`. This would render `parserA` in `General Group` and `parserB` in `B group` (flat; rendering is never nested).

    See the comment on optPropertiesGroup in Builder.hs.

  • Another question concerns duplicate groups e.g.

    a <- parserOptionGroup "Group A" parserA
    b <- parserOptionGroup "Group B" parserB
    b2 <- parserOptionGroup "Group B" parserB2
    c <- parserOptionGroup "Group A" parserC

    We treat these the same as command groups, that is, duplicate groups are only merged when they are consecutive. The above example will render three groups, Group A, Group B, Group A.

    Click to expand outdated description I chose to make groups unique i.e. both parserA and parserC will go in the same `Group A`.

    See the comment on sortGroupFst in Internal.hs.

  • I'm not very familiar with optparse's internals, so please do let me know if I've missed anything obvious. I'm not married to this design, but I would really like to make some progress here.

Thanks!

@tbidne tbidne force-pushed the parser-groups branch 2 times, most recently from 752442b to 94f61bc Compare May 15, 2024 10:55
@tbidne
Copy link
Contributor Author

tbidne commented May 15, 2024

Decided to see how this might help out the linked fourmolu issue: fourmolu/fourmolu#18

One-line change 🙂:

diff --git a/app/Main.hs b/app/Main.hs
index cd53a8a..2a8943f 100644
--- a/app/Main.hs
+++ b/app/Main.hs
@@ -469,7 +469,7 @@ sourceTypeParser =
     ]
 
 printerOptsParser :: Parser PrinterOptsPartial
-printerOptsParser = parsePrinterOptsCLI mkOption
+printerOptsParser = parserOptionGroup "Printer options" $ parsePrinterOptsCLI mkOption
   where
     mkOption name helpText placeholder =
       option (Just <$> eitherReader parsePrinterOptType) . mconcat $
Click to expand help page
Usage: fourmolu [-v|--version] [--manual-exts] [--print-defaults] 
                [-i | (-m|--mode MODE)] [-q|--quiet] [-o|--ghc-opt OPT] 
                [-f|--fixity FIXITY] [-r|--reexport REEXPORT] 
                [-p|--package PACKAGE] [-u|--unsafe] [-d|--debug] 
                [-c|--check-idempotence] [--color WHEN] [--start-line START] 
                [--end-line END] [--indentation INT] [--column-limit OPTION] 
                [--function-arrows OPTION] [--comma-style OPTION] 
                [--import-export-style OPTION] [--indent-wheres BOOL] 
                [--record-brace-space BOOL] [--newlines-between-decls INT] 
                [--haddock-style OPTION] [--haddock-style-module OPTION] 
                [--let-style OPTION] [--in-style OPTION] 
                [--single-constraint-parens OPTION] 
                [--single-deriving-parens OPTION] [--unicode OPTION] 
                [--respectful BOOL] [--no-cabal] [--stdin-input-file ARG] 
                [-t|--source-type TYPE] [FILE]

Available options:
  -h,--help                Show this help text
  -v,--version             Print version of the program
  --manual-exts            Display extensions that need to be enabled manually
  --print-defaults         Print default configuration options that can be used
                           in fourmolu.yaml
  -i                       A shortcut for --mode inplace
  -m,--mode MODE           Mode of operation: 'stdout' (the default), 'inplace',
                           or 'check'
  -q,--quiet               Make output quieter
  -o,--ghc-opt OPT         GHC options to enable (e.g. language extensions)
  -f,--fixity FIXITY       Fixity declaration to use (an override)
  -r,--reexport REEXPORT   Module re-export that Fourmolu should know about
  -p,--package PACKAGE     Explicitly specified dependency (for operator
                           fixity/precedence only)
  -u,--unsafe              Do formatting faster but without automatic detection
                           of defects
  -d,--debug               Output information useful for debugging
  -c,--check-idempotence   Fail if formatting is not idempotent
  --color WHEN             Colorize the output; WHEN can be 'never', 'always',
                           or 'auto' (the default)
  --start-line START       Start line of the region to format (starts from 1)
  --end-line END           End line of the region to format (inclusive)
  --no-cabal               Do not extract default-extensions and dependencies
                           from .cabal files
  --stdin-input-file ARG   Path which will be used to find the .cabal file when
                           using input from stdin
  -t,--source-type TYPE    Set the type of source; TYPE can be 'module', 'sig',
                           or 'auto' (the default)
  FILE                     Haskell source files to format or stdin (the default)

Printer options:
  --indentation INT        Number of spaces per indentation step (default: 4)
  --column-limit OPTION    Max line length for automatic line breaking (default:
                           none)
  --function-arrows OPTION Styling of arrows in type signatures (choices:
                           "trailing", "leading", or "leading-args") (default:
                           trailing)
  --comma-style OPTION     How to place commas in multi-line lists, records,
                           etc. (choices: "leading" or "trailing") (default:
                           leading)
  --import-export-style OPTION
                           Styling of import/export lists (choices: "leading",
                           "trailing", or "diff-friendly") (default:
                           diff-friendly)
  --indent-wheres BOOL     Whether to full-indent or half-indent 'where'
                           bindings past the preceding body (default: false)
  --record-brace-space BOOL
                           Whether to leave a space before an opening record
                           brace (default: false)
  --newlines-between-decls INT
                           Number of spaces between top-level declarations
                           (default: 1)
  --haddock-style OPTION   How to print Haddock comments (choices:
                           "single-line", "multi-line", or "multi-line-compact")
                           (default: multi-line)
  --haddock-style-module OPTION
                           How to print module docstring (default: same as
                           'haddock-style')
  --let-style OPTION       Styling of let blocks (choices: "auto", "inline",
                           "newline", or "mixed") (default: auto)
  --in-style OPTION        How to align the 'in' keyword with respect to the
                           'let' keyword (choices: "left-align", "right-align",
                           or "no-space") (default: right-align)
  --single-constraint-parens OPTION
                           Whether to put parentheses around a single constraint
                           (choices: "auto", "always", or "never") (default:
                           always)
  --single-deriving-parens OPTION
                           Whether to put parentheses around a single deriving
                           class (choices: "auto", "always", or "never")
                           (default: always)
  --unicode OPTION         Output Unicode syntax (choices: "detect", "always",
                           or "never") (default: never)
  --respectful BOOL        Give the programmer more choice on where to insert
                           blank lines (default: true)

CC @georgefst

@tbidne tbidne force-pushed the parser-groups branch 3 times, most recently from 06431d5 to 37ac35e Compare May 16, 2024 00:05
@HuwCampbell
Copy link
Collaborator

HuwCampbell commented May 16, 2024

Hi, thanks for talking the time to write this and justify your approach so well.

You've pretty much nailed the reasons as to why this is a tricky thing for this API. The way things stand we currently have no functions apart from the core type classes which act over the Parser type and modify it. So fmap and <*> work, but that's about it, everything else is a modifier when building a point Parser.

Now this needn't be a strict requirement, as, for example, Parsec comes with a <?> operator for adding documentation to parsers. But it's a nice principal to keep in mind as it means that the library stays composable.

That said, this PR looks like a pretty reasonable solution on first read through. While the function does need to do a tree traversal, which strikes me as a little odd, it seems to be a relatively simple way to achieve the behaviour the issue is asking for.

I'll have a proper look and think when I get some spare cycles.

With your choice of nested cases I would probably go the other way, with inner cases taking precedence, otherwise things may not compose as well, or one could actually nest the groups as well. But that might be overkill.

The other thing to note is that it will look different to command groups again, which would be a bit of an annoying inconsistency.

Cheers.

@tbidne
Copy link
Contributor Author

tbidne commented May 16, 2024

Hi, thanks for the quick response.

  1. Re: nested groups. Sure, having the inner group take precedence ("do not overwrite Just") is reasonable. I don't personally have any plans for nested groups, so either way is fine with me.

    Actually composing nested groups and rendering them as such is an interesting idea. Truth be told, I didn't pursue it since I wanted the simplest thing that worked, and the rendering logic for nesting could be tricky. But I can see if it's not too hard.

  2. I'm glad you brought up the inconsistency with command groups, since I was just about to comment about that. Right now there are (at least?) two things command groups do differently:

    1. Command groups are not alphabetically sorted.

    2. Duplicate command groups are not merged; they are all listed as created.

    I sorta like the current merging logic, but that's really just my aesthetic preferences, and it is probably a good idea to prefer consistency with command groups where possible. Alphabetically sorting is probably too prescriptive in any case, since I don't believe optparse does this anywhere else. I made a commit here exploring this. I can bring those changes to this PR if you agree that command group consistency is a good idea.

  3. My only other thought at the moment concerns Global options:. Specifically, do they require any special treatment? So far I haven't given it any thought at all i.e. they've inherited the same treatment as Available options:. My gut instinct is they should be treated the same, but I don't think I've actually rendered them for any of my apps before, so I wanted to explicitly ask.

@HuwCampbell
Copy link
Collaborator

HuwCampbell commented May 16, 2024

Using a nested group would just be a List instead of a Maybe, so it would be easy to explore.

It's been a while since I implemented command groups, but I thought they were actually merged as well, though I would be happy to remove that logic as I think it's redundant.

@tbidne
Copy link
Contributor Author

tbidne commented May 16, 2024

Edit: I think I figured out what you mean. Consecutive command groups are merged i.e.

hsubparser ( commandGroup "Group 1" ... )
  <|> hsubparser ( commandGroup "Group 1" ... )

This will indeed be merged. What I currently have here allows for merging all (including non-consecutive) identical groups regardless of the order. It would be the equivalent of the following, for command groups:

-- the two "Group 1"s will be merged
hsubparser ( commandGroup "Group 1" ... )
  <|> hsubparser ( commandGroup "Group 2" ... )
  <|> hsubparser ( commandGroup "Group 1" ... )

It does this by sorting first, then grouping. I like it, but that's probably too opinionated, and a departure from how optparse otherwise works.


Expand outdated comment

Here's an example of what happens to commands:

parseCommand =
      hsubparser
        ( command "list 2" (info (pure List) $ progDesc "Lists elements")
        )
        <|> hsubparser
        ( command "list" (info (pure List) $ progDesc "Lists elements")
            <> command "print" (info (pure Print) $ progDesc "Prints table")
            <> commandGroup "Info commands"
        )
        <|> hsubparser
          ( command "delete" (info (pure Delete) $ progDesc "Deletes elements")
          )
        <|> hsubparser
          ( command "query" (info (pure Query) $ progDesc "Runs a query")
              <> commandGroup "Query commands"
          )
Available commands:
  list 2                   Lists elements

Info commands
  list                     Lists elements
  print                    Prints table

Available commands:
  delete                   Deletes elements

Query commands
  query                    Runs a query

@tbidne

This comment was marked as outdated.

@tbidne
Copy link
Contributor Author

tbidne commented May 16, 2024

Apologies for the moving target, but I think the conversation has possibly become hard to follow due to separate threads (my fault!), so I've pushed some of the changes to this branch, to make for an easier review. In particular, the commit changes the following:

  • Duplicate groups are only merged when they are consecutive. If they are not consecutive, then they repeat. This is how command groups work.
  • No sorting.
  • Nested groups are concatenated and displayed as Group 1.Group 2, as discussed above. This seems quite reasonable to me (up to formatting changes).

Eventually I'd like to make my added tests more robust, so I'll probably redo those, but for now I'm happy with the way this works, so I'll wait for more feedback before changing anything.

@HuwCampbell
Copy link
Collaborator

So I'll probably merge this in the future; I just need to find some time to do a bit of tyre kicking and make amendments.

@tbidne
Copy link
Contributor Author

tbidne commented Jul 10, 2024

Great, thanks! I will likely get to the test updates I mentioned this weekend. Please let me know if there is anything I can do to help.

@tbidne tbidne force-pushed the parser-groups branch 3 times, most recently from 30a5b05 to cef907d Compare July 21, 2024 23:47
@tbidne
Copy link
Contributor Author

tbidne commented Jul 21, 2024

I made the test changes I mentioned earlier. In particular, I added a test that shows the new parser groups have the same duplicate semantics as command groups. I also squashed the commits together. With that, I am satisfied until there is feedback.

Some notes for review:

  • I added the tree traversal helper functions to Options.Applicative.Internal. These could instead go in Options.Applicative.Builder where they are used, if you do not want them exposed at all.
  • I took the liberty of bumping the version to 0.19.0.0, updating the changelog, and adding the corresponding @since tags. The only breaking change is adding a new field to OptProperties AFAICT.
  • I decided not to automatically add colons to the end of parser groups. This is how command groups behave.

@tbidne
Copy link
Contributor Author

tbidne commented Jul 22, 2024

Thanks for removing those imports @HuwCampbell (leftovers from a previous patch). Your CI is running into a problem with haskell-ci. See here for a workaround and links to more discussion (I am happy to make the same change here, if you'd like).

@HuwCampbell
Copy link
Collaborator

HuwCampbell commented Jul 22, 2024

Yeah I noticed... thanks for the link, should save me a bit.

@HuwCampbell
Copy link
Collaborator

Yeah, put up a separate PR for it if you like and rebase on to it.

@HuwCampbell
Copy link
Collaborator

I've done it.

tbidne and others added 2 commits July 22, 2024 19:39
Adds the ability to group options together, similar to command
groups. The parser groups semantics are:

1. Consecutive duplicate groups are merged.
2. Non-consecutive duplicate groups are __not__ merged.
3. Nested groups are concatenated together e.g. if "Group B" is parsed
   within "Group A", then B's options will render under
   "Group A.Group B".

1 and 2 is the same behavior as Command groups.
@tbidne
Copy link
Contributor Author

tbidne commented Jul 22, 2024

Ah I forgot to add the new test to extra-source-files. Just pushed up the fix.

HuwCampbell and others added 3 commits July 23, 2024 09:31
The (<<+>>) operator was swapped for (<</>>) in c1bf93b to fix pcapriotti#360. This was
mistakenly reverted when implementing the new rendering logic for option
groups. This commit re-establishes the fix.
@tbidne
Copy link
Contributor Author

tbidne commented Jul 23, 2024

Fixed CI (passed on my repo).

@HuwCampbell
Copy link
Collaborator

HuwCampbell commented Jul 28, 2024

Ta. I'm patching up a few small things, just a bit time poor.

I think nesting instead of dot separating would look nicer - it would be less repetitive and feel like scoping is respected.

@tbidne
Copy link
Contributor Author

tbidne commented Jul 29, 2024

I don't disagree that scoping is more natural, but do you have an idea of what this should look like? My thoughts were something like this (notice System Options and Double nested):

parser_group.nested - a test for optparse-applicative

Usage: parser_group_nested --hello TARGET [--file-log-path PATH] [--poll]
                           --double-nested STR --timeout INT 
                           [--file-log-verbosity INT] [-q|--quiet]
                           (-v|--verbosity ARG) Command

  Nested parser groups

Available options:
  --hello TARGET           Target for the greeting

Logging:
  --file-log-path PATH     Log file path

  System Options:
    --poll                   Whether to poll

    Double nested:
      --double-nested STR      Some nested option

  System Options:
    --timeout INT            Whether to time out

Logging:
  --file-log-verbosity INT File log verbosity

Available options:
  -q,--quiet               Whether to be quiet
  -v,--verbosity ARG       Console verbosity
  -h,--help                Show this help text

My implementation is quite naive. It replaces the option groups' [String] with essentially Maybe (Int, String). The int index tracks the nest level.

diff --git a/src/Options/Applicative/Builder.hs b/src/Options/Applicative/Builder.hs
index 0f78281..7ccecf7 100644
--- a/src/Options/Applicative/Builder.hs
+++ b/src/Options/Applicative/Builder.hs
@@ -390,9 +390,11 @@ option r m = mkParser d g rdr
 --
 -- will render as "Group Outer.Group Inner".
 optPropertiesGroup :: String -> OptProperties -> OptProperties
-optPropertiesGroup g o = o { propGroup = OptGroup (g : gs) }
+optPropertiesGroup g o = o { propGroup = Just newGroup }
   where
-    OptGroup gs = propGroup o
+    newGroup = case propGroup o of
+      Nothing -> OptGroup 0 g
+      Just (OptGroup i innerGroup) -> OptGroup (i + 1) innerGroup
 
 -- | Prepends a group per 'optPropertiesGroup'.
 optionGroup :: String -> Option a -> Option a
diff --git a/src/Options/Applicative/Types.hs b/src/Options/Applicative/Types.hs
index 9693db5..2619824 100644
--- a/src/Options/Applicative/Types.hs
+++ b/src/Options/Applicative/Types.hs
@@ -151,16 +151,9 @@ data OptVisibility
 -- | Groups for optionals. Can be multiple in the case of nested groups.
 --
 -- @since 0.19.0.0
-newtype OptGroup = OptGroup [String]
+data OptGroup = OptGroup Int String
   deriving (Eq, Show)
 
-instance Semigroup OptGroup where
-  OptGroup xs <> OptGroup ys = OptGroup (xs ++ ys)
-
-instance Monoid OptGroup where
-  mempty = OptGroup []
-  mappend = (<>)
-
 -- | Specification for an individual parser option.
 data OptProperties = OptProperties
   { propVisibility :: OptVisibility       -- ^ whether this flag is shown in the brief description
@@ -169,7 +162,7 @@ data OptProperties = OptProperties
   , propShowDefault :: Maybe String       -- ^ what to show in the help text as the default
   , propShowGlobal :: Bool                -- ^ whether the option is presented in global options text
   , propDescMod :: Maybe ( Doc -> Doc )   -- ^ a function to run over the brief description
-  , propGroup :: OptGroup
+  , propGroup :: Maybe OptGroup
     -- ^ optional (nested) group
     --
     -- @since 0.19.0.0

A consequence of this is that the help text is no longer globally aligned. Probably it is possible to have the same alignment, but this is what I was referencing earlier when I thought the rendering logic might be tricky. At least it was not clear to me what the output should look like, nor necessarily how to achieve it. But perhaps I'm missing something obvious.

@HuwCampbell
Copy link
Collaborator

HuwCampbell commented Jul 30, 2024

Rule of thumb is to be as polite as possible, so some regrouping to make things
nice is ok.

For you program I would do this:

parser_group.nested - a test for optparse-applicative

Usage: parser_group_nested --hello TARGET [--file-log-path PATH] [--poll]
                           --double-nested STR --timeout INT 
                           [--file-log-verbosity INT] [-q|--quiet]
                           (-v|--verbosity ARG) Command

  Nested parser groups

Available options:
  --hello TARGET           Target for the greeting
  -q,--quiet               Whether to be quiet
  -v,--verbosity ARG       Console verbosity
  -h,--help                Show this help text

Logging:
  --file-log-path PATH     Log file path
  --file-log-verbosity INT File log verbosity

- System Options:
    --poll                 Whether to poll
    --timeout INT          Whether to time out

  - Double nested:
      --double-nested STR  Some nested option

The "algorithm" is relatively simple:

  • Group the options by the "head" user defined grouping, pulling nothing defined (i.e., "Available Options") to the top, and maintaining the order user defined groups where possible
  • If the same name is used twice, pull it up to the predefined group, even if there's a gap.
  • Recursively apply.

I don't think this sort of nesting will happen too much in the wild, but we should make it nice to look at.

@tbidne
Copy link
Contributor Author

tbidne commented Jul 30, 2024

I do in fact like the consolidation, but note that this is different from how command groups operate. I.e. command groups are only combined when they are consecutive. For those that are split, they are rendered separately. E.g. the following (note that Info is split, but Update is consecutive):

parseCommand =
  hsubparser
    ( command "list" (info (pure List) $ progDesc "Lists elements")
        <> commandGroup "Info commands"
    )
    <|> hsubparser
      ( command "delete" (info (pure Delete) $ progDesc "Deletes elements")
          <> commandGroup "Update commands"
      )
    <|> hsubparser
      ( command "insert" (info (pure Insert) $ progDesc "Inserts elements")
          <> commandGroup "Update commands"
      )
    <|> hsubparser
      ( command "query" (info (pure Query) $ progDesc "Runs a query")
      )
    <|> hsubparser
    ( command "print" (info (pure Print) $ progDesc "Prints table")
        <> commandGroup "Info commands"
    )

is rendered as:

Info commands
  list                     Lists elements

Update commands
  delete                   Deletes elements
  insert                   Inserts elements

Available commands:
  query                    Runs a query

Info commands
  print                    Prints table

@HuwCampbell
Copy link
Collaborator

I'm ok with improving the rendering of command groups.

tbidne and others added 4 commits August 8, 2024 20:35
Fixup help text.
Move worker function to where it's used.
Remove Applicative do from test suite.
Small tidies.
fst' (_, (x, _)) = x

addIdx :: [(a, b)] -> [(Int, (a, b))]
addIdx = zip [1 ..]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Asymptotics look good, nice job.

@HuwCampbell
Copy link
Collaborator

This is coming together very well, I'm happy. I'll spend a bit more time reviewing over the weekend.

Is there anything else you would like to add or comment on before I merge?

@tbidne
Copy link
Contributor Author

tbidne commented Aug 8, 2024

Thanks for the quick review! The docs may need some updating in light of the change in representation. I had also planned to clean up the commit history a bit, but I can leave it as-is if you'd prefer to keep it or change it yourself.

With that out of the way, there is one weird edge case. Consider something like:

parserOptionGroup "Group A" $ parserOptionGroup "Group B" ...

i.e. Group A contains no top-level options, just another subgroup. We might expect this to render as

Group A
- Group B
    -- opt-one
    ...

But in fact we will have

- Group B
    -- opt-one

That is, we will increase Group B's indention level (as we should), but Group A will not appear anywhere in the output. This is due to the representation choice of OptGroup !Int (Maybe String). The function parserOptionGroup merely increases the index when it receives a nested group, but the parameter is dropped.

On the one hand, it would be nice for this to "do the right thing". On the other hand, it seems pretty pathological to me (who would want this?), and tbh, I'm not sure how we'd do the rendering correctly even if we wanted to.

Suppose we always preserve all group names i.e. we instead have OptGroup ["Group A", "Group B"], where the options are grouped under the last member of the list. The rendering logic would somehow have to distinguish between:

  1. Need to print Group A because Group A is otherwise empty, thus hasn't previously been printed.
  2. Only print Group B because Group A did have top-level options and was therefore already printed.

But how would we know? The only thing I can think of is using something like Set to track which parent groups have already been printed. Of course we don't have access to containers, and this seems like it could be messy anyway.

Tbh it's hard for me to care too much about this due to the obscurity and difficulty. But if you'd like to try to address this and have ideas, I'm open to suggestions.


Edit:

I think I can solve this issue using the crude idea above i.e. keep track of the printed groups in a list. It's ugly and O(n^2) (because it uses List.elem rather than Set.member), but I think it can be made to work. At least it only affects people who nest groups like this (i.e. probably no one). Your call if it's worth it.

It is possible that someone defines a "parent group" that only contains
child group(s) i.e. it does not contain any top-level options. This is
a problem, because the parent name will not be attached to any
options, thus will not be printed.

To fix this, we instead represent the Option Groups as a hierarchy
list of group titles e.g. ["Parent 1", "Parent 2", "Child group"].

Then, when rendering, we keep track of which group titles have already
been printed. If some group has parent titles that have not yet been
printed, we prepend their titles to the normal output.
@tbidne
Copy link
Contributor Author

tbidne commented Aug 12, 2024

Just pushed up a fix to the issue I described (and also updated some documentation that needed it anyway). Please let me know what you think.


Edit 2: Actually, I retract Edit 1. The current branch appears to handle this just fine. I don't love the nested logic, as I think it's a bit tricky. But it handles all the weird, pathological cases I've thrown at it so far.

Edit 1: Thought of a pathological example that this currently does not handle 🙃: Duplicate nested group. If you write parserOptionGroup "A" $ parserOptionGroup "A", then the nested A will be displayed, but the parent will not. Note this only affects immediate parents. Grandparents can have the same name i.e. parserOptionGroup "A" $ parserOptionGroup "B" $ parserOptionGroup "A" will render as expected.

@HuwCampbell HuwCampbell merged commit 2ea40a9 into pcapriotti:master Sep 3, 2024
@HuwCampbell
Copy link
Collaborator

I've merged and will go for a release this month.

Thanks for contributing and great work.

@tbidne
Copy link
Contributor Author

tbidne commented Sep 3, 2024

Thanks for merging and all your help!

@tbidne tbidne deleted the parser-groups branch September 3, 2024 22:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Option groups, à la command groups
2 participants