Retrieving data with fromUrl gets wrong data from Alibaba OSS #196

Closed
AshleySetter opened this issue Feb 2, 2021 · 3 comments · Fixed by #198


AshleySetter commented Feb 2, 2021

I am using the fromUrl method to retrieve an overview of a Cloud Optimised GeoTIFF (COG) from Alibaba OSS. The method works well for GeoTIFFs hosted on Amazon S3, but for GeoTIFFs hosted on Alibaba OSS it retrieves incorrect data and emits the following warning: RangeError: Offset is outside the bounds of the DataView.

I suspect that the issue may lie in https://github.com/geotiffjs/geotiff.js/blob/master/src/source.js#L257 and may be because the fetch request is reading the wrong bytes or getting some additional bytes from Alibaba.
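One way to test this suspicion independently of geotiff.js is to send a single Range request to each URL and compare what comes back. The following is a minimal diagnostic sketch, not part of geotiff.js: the names classifyRangeResponse and probeRange are hypothetical, and it assumes a runtime with a global fetch (for example Node.js 18 or later).

```javascript
// Classify a response to "Range: bytes=<start>-<end>" (both ends inclusive).
// Hypothetical helper for debugging; not part of geotiff.js.
function classifyRangeResponse(status, contentRange, bodyLength, start, end) {
  // 200 means the server ignored the Range header and sent the whole file.
  if (status === 200) return "full-file";
  // Anything other than 206 Partial Content is treated as a failure here.
  if (status !== 206) return "error";
  const match = /^bytes (\d+)-(\d+)\/(\d+|\*)$/.exec(contentRange || "");
  if (!match) return "missing-content-range";
  const gotStart = Number(match[1]);
  const gotEnd = Number(match[2]);
  const expectedLength = end - start + 1;
  // Strict comparison: a server that legitimately truncates the range at the
  // end of the file is also reported as a mismatch by this simple check.
  if (gotStart !== start || gotEnd !== end || bodyLength !== expectedLength) {
    return "mismatch";
  }
  return "ok";
}

// Send one Range request and classify the result (assumes global fetch).
async function probeRange(url, start, end) {
  const response = await fetch(url, {
    headers: { Range: `bytes=${start}-${end}` },
  });
  const body = await response.arrayBuffer();
  return classifyRangeResponse(
    response.status,
    response.headers.get("content-range"),
    body.byteLength,
    start,
    end,
  );
}
```

Calling probeRange with the same byte range against the Amazon and Alibaba URLs should show whether one server returns a different status, Content-Range header, or body length than the other.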

Below is the code I use to retrieve the GeoTIFF overview data from two public URLs hosting the same COG, one on Amazon S3 and one on Alibaba OSS, and save it to a CSV file. I've set resX and resY to 1 so that it retrieves the lowest-resolution overview, which is only 63x63, but the issue occurs for all overviews. I have also added the Python code I use to visualise the retrieved data, which shows the differences and may help with debugging, and attached the plotted images of the two sets of retrieved data.

The first image, from Amazon, contains the values 0, 1 and 5 and is the correct data.

const geotiff = require("geotiff");
const log = console.log;
const fs = require("fs");

const amazon_url = "https://image-test-bucket-ash.s3.us-east-2.amazonaws.com/tile_1_2.tif";

(async() => {
  const example = await geotiff.fromUrl(amazon_url);
  log("example: ", example);
  const image = await example.getImage();
  log(image);
  const box = image.getBoundingBox(); // [minX, minY, maxX, maxY]
  const boxWidth = Math.abs(box[2] - box[0]);
  const boxHeight = Math.abs(box[3] - box[1]);
  log("bounding box: ", box);
  const data = await example.readRasters({
    bbox: box,
    resX: 1,
    resY: 1,
    fillValue: 0, // value to use for parts of image with no-data
  });
  log(data);
  const stringData = data[0].join(", ");
  fs.writeFile("data_amazon.csv", stringData, (err) => {
    if (err) return log(err);
    log("written amazon data to file");
  });
})();


const alibaba_url = "https://debug-image-bucket.oss-eu-west-1.aliyuncs.com/tile_1_2.tif";

(async() => {
  const example = await geotiff.fromUrl(alibaba_url);
  log("example: ", example);
  const image = await example.getImage();
  log(image);
  const box = image.getBoundingBox(); // [minX, minY, maxX, maxY]
  const boxWidth = Math.abs(box[2] - box[0]);
  const boxHeight = Math.abs(box[3] - box[1]);
  log("bounding box: ", box);
  const data = await example.readRasters({
    bbox: box,
    resX: 1,
    resY: 1,
    fillValue: 0, // value to use for parts of image with no-data
  });
  log(data);
  const stringData = data[0].join(", ");
  fs.writeFile("data_alibaba.csv", stringData, (err) => {
    if (err) return log(err);
    log("written alibaba data to file");
  });
})();

import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
import numpy as np

data = np.genfromtxt("data_amazon.csv", delimiter=", ")
shape = [int(len(data)**0.5), int(len(data)**0.5)]
geotiff = data.reshape(shape)
im = plt.imshow(geotiff)

values = np.unique(data)
# get the colors of the values, according to the 
# colormap used by imshow
colors = [ im.cmap(im.norm(value)) for value in values]
# create a patch (proxy artist) for every color 
patches = [ mpatches.Patch(color=colors[i], label="Level {l}".format(l=values[i]) ) for i in range(len(values)) ]
# put those patches as legend handles into the legend
plt.legend(handles=patches, bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0. )
plt.show()

data = np.genfromtxt("data_alibaba.csv", delimiter=", ")
shape = [int(len(data)**0.5), int(len(data)**0.5)]
geotiff = data.reshape(shape)
im = plt.imshow(geotiff)

values = np.unique(data)
# get the colors of the values, according to the 
# colormap used by imshow
colors = [ im.cmap(im.norm(value)) for value in values]
# create a patch (proxy artist) for every color 
patches = [ mpatches.Patch(color=colors[i], label="Level {l}".format(l=values[i]) ) for i in range(len(values)) ]
# put those patches as legend handles into the legend
plt.legend(handles=patches, bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0. )
plt.show()

[Image: amazon_geotiff]
[Image: alibaba_geotiff]

AshleySetter changed the title from "Retrieving data with fromUrl get wrong bytes from Alibaba OSS" to "Retrieving data with fromUrl gets wrong data from Alibaba OSS" on Feb 2, 2021
constantinius (Member) commented

@AshleySetter

Is this issue now fixed by the PR from @aloisklink (#198)? This issue was closed automatically, so if you still have this problem, please reopen it.

aloisklink (Contributor) commented

Hi all, posting this here in case anybody else is having the same issue and wants the workaround.

I'm getting a similar issue with AliyunOSS in v1.0.1:

Error: Server responded with full file
      at s.fetchSlice (node_modules/geotiff/dist-node/geotiff.js:48:1874)
      at processTicksAndRejections (internal/process/task_queues.js:97:5)
      at async Promise.all (index 0)
      at async s.fetch (node_modules/geotiff/dist-node/geotiff.js:48:580)
      at async .../node_modules/geotiff/dist-node/geotiff.js:38:2281

I'm fairly certain that this is the same underlying issue, i.e. Aliyun OSS has a broken API for HTTP Range requests.

Luckily, there's a way to get around that until Aliyun OSS fixes their API.

When using fromUrl(url), you can pass the option allowFullFile: true to disable this error check, for example:

const { fromUrl } = require("geotiff");

async function main() {
    const tiff = await fromUrl(
        "https://example.oss-eu-west-1.aliyuncs.com/example.tiff",
        {allowFullFile: true},  // accept a full-file response instead of throwing
    );
    const image = await tiff.getImage();
    const data = await image.readRasters();
    return data;
}

main().catch(console.error);
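To illustrate what tolerating a full-file response means in principle, here is a minimal sketch of how a client can recover the requested bytes when a server answers a Range request with 200 and the whole file. This is an assumption-laden illustration, not geotiff.js's actual implementation, and sliceRequestedRange is a hypothetical name.

```javascript
// Return the bytes for an inclusive range [start, end] from a response body.
// Hypothetical helper; geotiff.js may handle this case differently.
function sliceRequestedRange(status, body, start, end) {
  if (status === 206) {
    // Partial Content: the body is assumed to already be the requested range.
    return body;
  }
  if (status === 200) {
    // Full file: cut the requested range out of it.
    // ArrayBuffer.prototype.slice takes an exclusive end index.
    return body.slice(start, end + 1);
  }
  throw new Error(`Unexpected HTTP status ${status}`);
}
```

The trade-off is that every such response downloads the entire file, which gives up the main benefit of a COG, so this only makes sense as a workaround.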

Hope this helps anyone in the future having the same issue :)

hongfaqiu commented

It seems Aliyun OSS still hasn't solved this problem; HTTP Range requests still fail intermittently.
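If the failures really are intermittent, one option is to try normal Range requests first and fall back to allowFullFile only when a read throws. The sketch below is a hypothetical helper (readWithFullFileFallback is not part of geotiff.js); the open function is passed in so that fromUrl can be supplied by the caller, and it assumes the failure surfaces as a thrown error rather than as silently wrong data.

```javascript
// Try a read using Range requests; on any error, reopen with
// allowFullFile: true and try once more. Hypothetical helper.
//   open: function (url, options) returning a promise of a tiff, e.g. fromUrl
//   read: async function (tiff) that performs the actual reads
async function readWithFullFileFallback(open, url, read) {
  try {
    return await read(await open(url));
  } catch (err) {
    // A second failure is not caught and propagates to the caller.
    return read(await open(url, { allowFullFile: true }));
  }
}

// Example usage (assumes the geotiff package is installed):
// const { fromUrl } = require("geotiff");
// const data = await readWithFullFileFallback(fromUrl, url, async (tiff) => {
//   const image = await tiff.getImage();
//   return image.readRasters();
// });
```

Note that the fallback retries on any error, including unrelated network failures, so a stricter version might inspect the error message before retrying.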
