Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read worksheets without formatting #50

Open
hhoeflin opened this issue Mar 5, 2015 · 6 comments
Open

Read worksheets without formatting #50

hhoeflin opened this issue Mar 5, 2015 · 6 comments

Comments

@hhoeflin
Copy link

hhoeflin commented Mar 5, 2015

Hi,

When reading columns and converting to character - the formatting is used. This can lead to undesired inaccuracies. As this is the default behaviour, I wanted to request a switch when reading worksheet that leads to ignoring of all formats (even when type conversion from numeric to character is done).

Also - the help could more prominently note that when numeric values are coerced to character - this can lead to rounding issues.

Thanks!

@juliettefabre
Copy link

Hello,

My issue seems related to this one but a little bit different: when importing my spreadsheet, some of my datetimes are automatically converted in a wrong way, and I don't manage to reverse it after the import.

The first column of my spreadsheet contains datetimes in UTC. The datetimes causing problems are initially these ones in the spreadsheet:

2016-03-27 00:00:00
2016-03-27 00:30:00
2016-03-27 01:00:00
2016-03-27 01:30:00
2016-03-27 02:00:00
2016-03-27 02:30:00
2016-03-27 03:00:00
2016-03-27 03:30:00
2016-03-27 04:00:00

They correspond to the summer time change, but they are provided in UTC.

After the Excel file import, these lines have been transformed into:

"2016-03-27 00:00:00 CET"  
"2016-03-27 00:30:00 CET" 
"2016-03-27 01:00:00 CET"  
"2016-03-27 01:30:00 CET" 
"2016-03-27 03:00:00 CEST" 
"2016-03-27 03:30:00 CEST" 
"2016-03-27 03:00:00 CEST" 
"2016-03-27 03:30:00 CEST"
"2016-03-27 04:00:00 CEST" 

So the 2 original dates:

2016-03-27 02:00:00
2016-03-27 02:30:00

are transformed into:

"2016-03-27 03:00:00 CEST" 
"2016-03-27 03:30:00 CEST" 

I don't get the logic, and don't manage neither to reverse it by using format etc ..
The only solution I see would to tell the library to not convert the initial datetime, or to read it as simple UTC?
Did I miss something?

Thanks a lot,
Juliette

@juliettefabre
Copy link

juliettefabre commented Feb 1, 2019

Hello,
I didn't manage to solve this problem, I still don't understand how this conversion works, and why only 2 datetimes are modified .. Could anyone have a look at this?
Thanks,
Juliette

@RolandASc
Copy link
Member

hi @juliettefabre - yes, to reduce our effort here, could you please provide some more information, such as the versions of R, Excel, XLConnect and the actual code that you are executing that produces this behavior? Providing the output of sessionInfo() is a good place to start. Thanks

@juliettefabre
Copy link

juliettefabre commented Feb 1, 2019

Thanks,

Yes here is the result of sessionInfo():

R version 3.4.4 (2018-03-15)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS: /usr/lib/libblas/libblas.so.3.6.0
LAPACK: /usr/lib/lapack/liblapack.so.3.6.0

locale:
 [1] LC_CTYPE=fr_FR.UTF-8          LC_NUMERIC=C                  LC_TIME=fr_FR.UTF-8           LC_COLLATE=fr_FR.UTF-8        LC_MONETARY=fr_FR.UTF-8      
 [6] LC_MESSAGES=fr_FR.UTF-8       LC_PAPER=fr_FR.UTF-8          LC_NAME=fr_FR.UTF-8           LC_ADDRESS=fr_FR.UTF-8        LC_TELEPHONE=fr_FR.UTF-8     
[11] LC_MEASUREMENT=fr_FR.UTF-8    LC_IDENTIFICATION=fr_FR.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] sp_1.2-5             XLConnect_0.2-15     XLConnectJars_0.2-15 RPostgreSQL_0.4      DBI_0.3.1           

loaded via a namespace (and not attached):
 [1] Rcpp_1.0.0       lattice_0.20-35  XML_3.98-1.16    geonapi_0.2-0    geometa_0.4-0    grid_3.4.4       R6_2.2.2         plyr_1.8.4       gtable_0.1.2    
[10] scales_0.5.0     httr_1.3.1       ggplot2_2.2.1    pillar_1.2.3     rlang_0.2.1      lazyeval_0.2.1   tools_3.4.4      munsell_0.4.2    yaml_2.2.0      
[19] compiler_3.4.4   colorspace_1.2-4 rJava_0.9-8      openssl_1.0.2    tibble_1.4.2    

The Excel files read by my script are based on templates that I previously generated with XLConnect, but actually the problem is the same if I import a really basic Excel file (test.xls) made with LibreOffice (Microsoft Excel 97-2003 format) with only one column and these values:

date
2016-03-27 00:00
2016-03-27 00:30
2016-03-27 01:00
2016-03-27 01:30
2016-03-27 02:00
2016-03-27 02:30
2016-03-27 03:00
2016-03-27 03:30

Then in R:

wb <- loadWorkbook('test.xls')
data <- readWorksheet(wb, sheet = 1)
data
                 date
1 2016-03-27 00:00:00
2 2016-03-27 00:30:00
3 2016-03-27 01:00:00
4 2016-03-27 01:30:00
5 2016-03-27 03:00:00
6 2016-03-27 03:30:00
7 2016-03-27 03:00:00
8 2016-03-27 03:30:00

Juliette

@RolandASc
Copy link
Member

Ok thanks - so I can replicate this.

The part that isn't clear to me though, is how can you specify the dates in Excel to be UTC?

Because it's reading the times without any offset, so I'd say they're treated as CET/CEST to begin with, and then 2 or 2:30 are just kind of invalid and become 3 and 3:30.

I can set the time-zone in R with e.g. Sys.setenv("TZ" = "UTC").

Then, consequently, I get the below, which I think is correct logically speaking (I added two more timestamps at the bottom):

                  date
1  2016-03-26 23:00:00
2  2016-03-26 23:30:00
3  2016-03-27 00:00:00
4  2016-03-27 00:30:00
5  2016-03-27 01:00:00
6  2016-03-27 01:30:00
7  2016-03-27 01:00:00
8  2016-03-27 01:30:00
9  2016-03-27 02:00:00
10 2016-03-27 02:30:00

So in Switzerland I am +1 compared to UTC in winter, and in summer (DST) I am +2.
(It's still treating my Excel input according to my locale.)

@juliettefabre
Copy link

Thanks a lot Roland, I'm so sorry, I just needed to add Sys.setenv("TZ" = "UTC") to import and keep the dates in UTC.
Thanks again,
Juliette

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants