std: do not modify iterator state in TokenIterator.peek() #20427

gpanders · 2024-06-26T16:23:45Z

A common property of "peek()" functions is that they provide the next element in the iterator without modifying the internal iterator state. This is true of many other peek() functions in the Zig stdlib but is not true of TokenIterator.peek(), which advances the iterator's index to the beginning of the next token.

This can result in surprising behavior if a user of this function expects peek() to not modify the internal iterator state, especially since the doc comment of the function says that it "Does not advance to the next token".

A similar PR was made recently but closed by the author before it could be discussed: #20379

A common property of "peek()" functions is that they provide the next element in the iterator without modifying the internal iterator state. This is true of many other peek() functions in the Zig stdlib but is not true of TokenIterator.peek(), which advances the iterator's index to the beginning of the next token. This can result in surprising behavior if a user of this function expects peek() to not modify the internal iterator state, especially since the doc comment of the function says that it "Does not advance to the next token".

squeek502 · 2024-06-26T23:31:18Z

One kinda nice property of the existing implementation is that there is a reduction of redundant work done when calling peek and then next.

For example, if we have the string:

aaaaaaaaaaaaaaaaaaaaaaaaaaaaaZZ

and use tokenizeScalar(str, 'a'), then:

calling peek would advance the index to just before the ZZ, and return the ZZ slice
calling next afterwards would just need to re-read ZZ and return it

With peek not advancing index, then:

calling peek would iterate over all the a's and return the ZZ
calling next would again iterate over all the a's and return the ZZ

See the original PR that implemented peek for the very similar justification used at the time: #11925 (comment)

gpanders · 2024-06-27T13:19:29Z

While my preference is for peek() to not modify the iterator state, I wouldn't object to the status quo so long as it's clearly documented that self.index is modified (the existing doc comment is a bit misleading, or at least ambiguous).

andrewrk · 2024-07-03T19:36:20Z

The problem statement here seems weak. The API upholds its guarantees. In what context was this discovered and determined to be problematic?

gpanders · 2024-07-03T22:04:31Z

The context was looking at the following segment of code:

var iter = std.mem.tokenizeScalar(u8, line, ' ');
var ws_start: usize = 0;
while (iter.peek() != null) {
    const whitespace = line[ws_start..iter.index];
    const word = iter.next().?;
    ws_start = iter.index;
    // the rest of the code which operates on `word` and `whitespace`
}

When I encountered this code I was confused how it worked at all based on the assumption that peek() does not modify the iterator state (if that were true, ws_start would always equal iter.index and whitespace would always be an empty slice). It was not until I looked at the implementation of TokenIterator.peek() that I found that index was modified.

(The above code has since been rewritten to avoid this confusion in either case, but this is what prompted the initial question and PR.)

The API upholds its guarantees

Strictly speaking this is true, though explicitly stating that the index is/can be modified might help to avoid future confusion. If a functional change is unwanted I can repurpose this PR into a documentation change.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

std: do not modify iterator state in TokenIterator.peek() #20427

std: do not modify iterator state in TokenIterator.peek() #20427

gpanders commented Jun 26, 2024

squeek502 commented Jun 26, 2024

gpanders commented Jun 27, 2024

andrewrk commented Jul 3, 2024

gpanders commented Jul 3, 2024

std: do not modify iterator state in TokenIterator.peek() #20427

Are you sure you want to change the base?

std: do not modify iterator state in TokenIterator.peek() #20427

Conversation

gpanders commented Jun 26, 2024

squeek502 commented Jun 26, 2024

gpanders commented Jun 27, 2024

andrewrk commented Jul 3, 2024

gpanders commented Jul 3, 2024