Issue #586 follow-up: Too little indentation if named arguments come after a newline #596

Open

wants to merge 5 commits into master
Conversation

danielzuncke (Contributor)
From what I can tell, this was the problem: inside the newline() function, when the new indentLevel is set, named function arguments didn't have their own case in the if-else chain that handles an "identifier", ":" token pair. In the example from the issue thread, this selected a case where indentLevel gets set to the same level as the line with the previous { token.

Fix: named function arguments now have their own case, which works the same way as the struct-initializer one: get the indentLevel from the previous tok!"(" and add 1.
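The rule in the fix can be sketched as follows (a hypothetical Python model for illustration only; dfmt's real token and indentation machinery is different): to indent a named argument, walk back to the previous "(" and use that token's line indent plus one.

```python
# Hypothetical model of the new named-argument indentation case.
# Token values as strings and the indent_of_line callback are
# assumptions; the real code operates on dfmt's Token type.

def named_arg_indent(tokens, colon_index, indent_of_line):
    """Indent for a named argument: indent of the previous "(" plus one."""
    i = colon_index
    while i > 0 and tokens[i] != "(":
        i -= 1
    return indent_of_line(i) + 1

tokens = ["foo", "(", "\n", "bar", ":", "1", ")"]
# Assume the line containing "(" sits at indent level 0.
print(named_arg_indent(tokens, 4, lambda i: 0))  # 1
```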

Added a helper function to keep the if-condition a bit shorter and lazy. While the helper can return an out-of-bounds index, the surrounding if block guarantees that it won't here.
The helper is repurposed from peekImplementation.

Updated existing tests, and added a test with --keep_line_breaks true to match the code provided in the issue thread.

Indentation is wrong if named arguments come after newline.
Add test against bad indentation if named args are after newline.
(Currently) only testing for the indentation problem after a newline.
{
    size_t i = index + n;
    while (n != 0 && i < tokens.length && tokens[i].type == tok!"comment")
        i = n > 0 ? i + 1 : i - 1;
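The loop above can be ported to a short Python sketch to show what it does (token values as plain strings are an assumption; the real code compares dfmt Token types):

```python
def next_non_comment(tokens, index, n):
    """Return the index of the next non-comment token, starting n steps
    away from index. Like the D helper, it may run past the end of the
    list; in D, i is an unsigned size_t, so a negative step wraps around
    instead of going below zero -- the 0 <= i guard stands in for that."""
    i = index + n
    while n != 0 and 0 <= i < len(tokens) and tokens[i] == "comment":
        i = i + 1 if n > 0 else i - 1
    return i

tokens = ["identifier", "comment", "comment", ":"]
print(next_non_comment(tokens, 0, 1))  # 3: skips both comments
```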
Member

If you run this on tokens == [tok!"...", tok!"comment"] with start index = 0 and n = 1, it will return 2 and cause a range error above at your call site (in general, whenever the token stream ends with comments and you start searching over them).

Starting at 0 and searching in the negative direction will immediately go out of bounds and return size_t.max, or a little below it.

I think you should make the function always return size_t.max for non-found tokens (and document it that way); otherwise out-of-bounds results are too hard to check for and require knowledge of tokens.length, which shouldn't be necessary for an API like this.

At your call site above, make sure you then check that the return value is not size_t.max. Right now it looks like that branch could easily cause a range error if the code ends in comments. (Remember that the array access also runs when it's not a named argument; it is itself how the code checks whether this is a named argument, so the surrounding syntax may be arbitrary here.) That said, it's not easy to trigger: I haven't been able to produce it with a few tests so far. Just to defend against range errors, I still want you to test the return value of a function that can generate one.
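Both failure modes described here can be reproduced with a small model that emulates size_t's modular arithmetic (a Python sketch, not dfmt code):

```python
SIZE_T_MAX = 2**64 - 1

def next_non_comment(tokens, index, n):
    # size_t arithmetic: index + n wraps modulo 2**64
    i = (index + n) % (2**64)
    while n != 0 and i < len(tokens) and tokens[i] == "comment":
        i = (i + 1 if n > 0 else i - 1) % (2**64)
    return i

# Trailing comment, forward search: returns len(tokens), out of range.
print(next_non_comment(["...", "comment"], 0, 1))   # 2

# Backward search from index 0: wraps to size_t.max immediately.
print(next_non_comment(["...", "comment"], 0, -1))  # 18446744073709551615
```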

danielzuncke (Contributor, Author)

Sorry for the slow response

Don't get me wrong, I don't want to argue and will implement it however you like, but I think you might have missed something (or maybe I have, it's been a while). Please consider point 1 regarding the range error (i.e. the out-of-bounds access) and point 2 regarding the return value of the new function:


1.
The caller is at line 1807, with surrounding code like this (line 1794 being the important part):

if (hasCurrent)
{
  if (...)
  else if (currentIs(tok!"identifier") && peekIs(tok!":")) // Line 1794,
  {                                                        // peekIs ignores comments when invoked like this.
    if (...)
    else if (canFind(astInformation.namedArgumentColonLocations, tokens[nextNonComment(1)].index)) // Line 1807
  }
}

The if-condition on line 1794, which must be true for the block in which tokens[nextNonComment(1)] runs, guarantees that the tokens array looks at least something like the block below. So your example (if I understand it correctly), where the function is invoked with only trailing comments in the array, isn't possible:

[..., tok!"identifier", {arbitrary number of tok!"comment"}, tok!":", ...]
      ^current                                               ^guaranteed to exist by line 1794,
                                                              found by nextNonComment(1)

I assume the range error you see as a possibility is that one; otherwise, iirc, tokens[nextNonComment(1)].index is the position in the file, and canFind is simply true or false, so I don't see the need for additional access checks (or a reasonable way to implement any; peekIs finding a non-comment should be the appropriate check).

For the same reason I think this is a moot point: while the tok!":" may not be related to named arguments, it only needs to exist for the array access not to throw:

(remember the array access is called also when it's not a named argument, since it itself is how it checks if it is a named argument or not, so the syntax may be arbitrary here) - however it's not that easy to get, I haven't been able to do it with a few tests right now yet. Just to defend against range errors I want you to test for it with a function that can generate one anyway.

If I am not mistaken, there is currently no way to write a test case that fails with an out-of-bounds error on tokens[nextNonComment(1)], even when writing the tokens array manually and then invoking the surrounding function.
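The claimed precondition can be checked mechanically: for any number of comments between the identifier and the colon, the helper lands on the colon and stays in bounds. A Python sketch of that argument (hypothetical token strings standing in for dfmt tokens):

```python
def next_non_comment(tokens, index, n):
    # Python stand-in for the D helper's skip-comments loop.
    i = index + n
    while n != 0 and 0 <= i < len(tokens) and tokens[i] == "comment":
        i = i + 1 if n > 0 else i - 1
    return i

# Shapes guaranteed by the line-1794 check: identifier, comments*, ":".
for n_comments in range(4):
    tokens = ["identifier"] + ["comment"] * n_comments + [":"]
    idx = next_non_comment(tokens, 0, 1)
    assert idx < len(tokens) and tokens[idx] == ":"
print("in bounds for all shapes")
```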


2.
I changed the return value of nextNonComment() for the not-found case, but I am not convinced this is a good thing: the function is a private member of struct TokenFormatter, so anyone calling it also has access to tokens.length (I don't see how out-of-bounds results would be hard to check for when that's accessible).
With the explicit return of size_t.max (for a variable expected to hold an index), I can see how it encourages testing for returnedIndex == size_t.max instead of returnedIndex < tokens.length, which I assume can easily have much worse consequences if another error is at play.
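As a minimal sketch of that concern: an index that is out of bounds without being the sentinel passes the == size_t.max test but fails the bounds test, so the sentinel check is the weaker of the two (hypothetical names, not dfmt code):

```python
SIZE_T_MAX = 2**64 - 1
tokens_len = 4

def passes_sentinel_check(idx):
    return idx != SIZE_T_MAX

def passes_bounds_check(idx):
    return idx < tokens_len

# idx == 4 is out of bounds but not the sentinel: the sentinel check
# lets it through, while the bounds check rejects it.
assert passes_sentinel_check(4) and not passes_bounds_check(4)
# The sentinel value itself fails both checks.
assert not passes_sentinel_check(SIZE_T_MAX) and not passes_bounds_check(SIZE_T_MAX)
```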
