Julep: Support universal newlines and make it default for Text IO #19785
Comments
Inserting |
I forgot to say that adding When reading It is also useful to read the original Python PEP https://www.python.org/dev/peps/pep-0278/ |
I'd be fine with making any line-oriented functions more permissive in what they considered to be a line ending. But I think I/O should generally be binary-faithful. I wasn't much of a fan of #14073, but parsing of Julia source files does normalize line endings from |
I think code should be written in a way that there is no need to call Of course using the Python way is not the only solution, but I think it works very well. For instance I recently had a bug in Weave.jl (JunoLab/Weave.jl#72) related to this, which I fixed using Line-oriented functions do work with |
I just opened #19815 . It would not be fixed with this proposal, but shows I'm not the only one who forgot to deal with |
After giving it some thought I agree that fixing line-oriented functions as @tkelman suggested is a good solution. Then
If this is OK I can try to implement it and make a pull request. |
+1, but I'd start with a PR to change Regarding whether to include the newline character(s), could you check what popular languages do (in particular recent ones like Rust, Swift and Go)? We could always add an argument to choose the behavior. |
Rust and Swift seem to drop newlines by default, Python doesn't. https://doc.rust-lang.org/std/io/trait.BufRead.html#method.lines It'd be good to add dropping the character as an option to Julia |
OK, thanks for checking. Changing the default would make sense to me. It should be possible by first adding the new argument and deprecating the one-argument method in 0.6, and then changing its meaning in 1.0. Maybe it would be accepted if you can make a PR shortly. |
@StefanKarpinski any thoughts on this? I can try to implement the changes to I would be fine with keeping the current default (as I'm used to it in Python anyway) and just adding the variant that removes the newlines as an option. |
Would it be OK to implement

    function readline(s::IO)
        linefeeds = ['\n', '\r', '\u85', '\u0B', '\u0c', '\u2028', '\u2029']
        out = IOBuffer()
        while !eof(s)
            c = read(s, Char)
            write(out, c)
            if c in linefeeds
                if c == '\r' && !eof(s) && Base.peek(s) == 0x0a
                    write(out, read(s, Char))
                end
                break
            end
        end
        return String(take!(out))
    end

I'm not sure if options to chomp.(readlines(file)) To get rid of EOL characters. |
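The chomp.(readlines(file)) idiom mentioned above can be tried on an in-memory buffer. This is only a minimal sketch and assumes Julia 1.x, where readlines drops line endings by default and accepts a keep keyword. The thread predates that keyword, so keep=true here stands in for the older behavior of returning each line with its terminator.

```julia
# Assumption: Julia 1.x, where `keep=true` returns each line with its terminator.
io = IOBuffer("first\nsecond\r\nthird")
raw = readlines(io; keep=true)

# Broadcast chomp over the lines; it strips one trailing "\n" or "\r\n" per line.
stripped = chomp.(raw)

println(raw)
println(stripped)
```

Note that chomp does not remove a lone '\r', so this idiom alone does not cover old Mac-style line endings.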
Anyway if you want to change the default, you'll have to first provide a version with an argument, to allow for a non-breaking deprecation period. So you may as well start with that. Then seeing how much code would need to be changed in Base (and whether it would simplify code or not) would be an interesting data point to make a decision. The version using |
How about?

    function readline(s::IO, chomp = true; nl2lf=false)
        nl2lf && (chomp = false)
        linefeeds = ['\n', '\r', '\u85', '\u0B', '\u0c', '\u2028', '\u2029']
        out = IOBuffer()
        while !eof(s)
            c = read(s, Char)
            if c in linefeeds
                if c == '\r' && !eof(s) && Base.peek(s) == 0x0a
                    !(nl2lf || chomp) && write(out, c)
                    c = read(s, Char)
                    !chomp && write(out, c)
                else
                    nl2lf && (c = '\n')
                    !chomp && write(out, c)
                end
                break
            else
                write(out, c)
            end
        end
        return String(take!(out))
    end

If Did I understand correctly that if we want to make

    function readline(s::IO)
        readline(s, false)
    end

and add the appropriate deprecation warning? |
I guess we could have a |
For now I/O streams are assumed to contain UTF-8 text, since they have no encoding information attached to them anyway. When passing a custom string type to Custom string types are free to implement their own |
I have added support for The code that detects line ends also works for most encodings as they share the same control characters. |
I think it would be useful to have universal newlines mode for open similar to Python. A short description of the mode for open from Python 3.6 docs https://docs.python.org/3.6/library/functions.html#open .
Currently this needs to be handled manually in Julia, which I think is a bit annoying and easy enough to forget.
So I suggest the following:
open as a flag to mode or a new keyword argument
readstring, readlines, eachline
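To illustrate what handling this manually can involve, here is a small sketch. It assumes Julia 1.x (the Pair form of replace and the default of readlines dropping line endings) and normalizes only "\r\n" and lone "\r", not the other Unicode line terminators discussed in the comments.

```julia
text = "one\r\ntwo\rthree\n"

# Line-oriented functions split on "\n" only, so the lone "\r" stays inside a line.
lines_raw = readlines(IOBuffer(text))

# Manual normalization: turn "\r\n" and lone "\r" into "\n" before splitting.
normalized = replace(text, r"\r\n?" => "\n")
lines_norm = readlines(IOBuffer(normalized))

println(lines_raw)
println(lines_norm)
```

A universal newlines mode would make the second step unnecessary for text input.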