Merging deployments: new functionality proposed by user #310

Open
damianooldoni opened this issue Jun 12, 2024 · 6 comments

@damianooldoni
Member

In a project-related private INBO repo, @MartijnUH made the following feature request:

@damianooldoni is it possible to provide a function in the camtrapdp or camtraptor package that merges deployments into one deployment?

While preparing a report, I noticed that for Meerdaal (GMU8) some deployments are split incorrectly, while the images actually belong to one and the same deployment. For an example, take a look at the attached CSV.

@jimcasaer, @LynnPallemaerts if this is a problem that occurs in Agouti, then let's see how this can be avoided in the future.

This is a manipulation of a Camtrap DP, so I think we should implement it in camtrapdp and import/re-export it in camtraptor. @peterdesmet: how do you see it?

The fact that these deployments were split because of a possible bug in Agouti changes the urgency of the issue, of course, but not its subject. Users are probably interested in manipulating a Camtrap DP by merging multiple deployments.

duplicate_deployments.csv

@jimcasaer
Collaborator

I think we should distinguish between:

  • the interest of users in merging deployments (for whatever reason)
  • a function that joins two parts of a deployment that were split in Agouti but in reality belong to the same deployment. This is really a fix for an Agouti problem.

@damianooldoni @peterdesmet another question that keeps coming up is the possibility of joining two exports of different projects (two Camtrap DP packages) into one before running analyses on the combined dataset. I know this was mentioned before, but I have no idea where we stand on this functionality.

@peterdesmet
Member

  1. I agree with @jimcasaer that this functionality is better implemented in Agouti. That is where the source data are managed.
  2. There is currently a solution to this with dplyr and camtrapdp, but it's a bit cumbersome. You can:
       1. Get the deployments()
       2. Use dplyr to join the two deployments (give them the same deploymentID, combine the dates, remove the duplicate)
       3. Assign the result with deployments()<-
       4. The tricky part is that you also need to update the deploymentID in media() and observations()
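
A minimal sketch of the steps above, using dplyr on toy tibbles rather than a real dataset. The helper `merge_deployments()`, the deployment identifiers and the dates are all hypothetical and made up for illustration; it is not part of camtrapdp. With a real package, the three tables would come from deployments(), media() and observations(), and the results would be assigned back with the corresponding assignment functions.

```r
library(dplyr)

# Toy stand-ins for the tables returned by deployments(), media() and
# observations(). All values are invented for illustration.
deployments <- tibble(
  deploymentID = c("dep1_part1", "dep1_part2", "dep2"),
  deploymentStart = as.POSIXct(
    c("2024-01-01 10:00:00", "2024-01-15 10:00:00", "2024-02-01 09:00:00"),
    tz = "UTC"
  ),
  deploymentEnd = as.POSIXct(
    c("2024-01-15 09:00:00", "2024-02-01 08:00:00", "2024-03-01 09:00:00"),
    tz = "UTC"
  )
)
media <- tibble(
  mediaID = c("m1", "m2", "m3"),
  deploymentID = c("dep1_part1", "dep1_part2", "dep2")
)
observations <- tibble(
  observationID = c("o1", "o2", "o3"),
  deploymentID = c("dep1_part1", "dep1_part2", "dep2")
)

# Hypothetical helper (not part of camtrapdp): merge the deployments listed in
# `ids` into a single deployment that keeps the identifier `keep`.
merge_deployments <- function(deployments, media, observations, ids,
                              keep = ids[1]) {
  parts <- filter(deployments, deploymentID %in% ids)
  # One row for the merged deployment: other columns are taken from the first
  # part, the period spans the earliest start to the latest end.
  merged <- parts %>%
    slice(1) %>%
    mutate(
      deploymentID = keep,
      deploymentStart = min(parts$deploymentStart),
      deploymentEnd = max(parts$deploymentEnd)
    )
  # Point every reference to a merged part at the kept identifier.
  relabel <- function(tbl) {
    mutate(tbl, deploymentID = replace(deploymentID, deploymentID %in% ids, keep))
  }
  list(
    deployments = deployments %>%
      filter(!deploymentID %in% ids) %>%
      bind_rows(merged) %>%
      arrange(deploymentStart),
    media = relabel(media),
    observations = relabel(observations)
  )
}

result <- merge_deployments(
  deployments, media, observations,
  ids = c("dep1_part1", "dep1_part2")
)
```

Taking the non-date columns from the first part is an assumption of this sketch; if the parts differ in other fields (for example camera settings), those would need to be reconciled by hand.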

I'm not entirely convinced we need a dedicated function for this right now... I wonder how big the use case is.

  1. @jimcasaer merging datasets is on our todo list and described here: Support combining Camtrap DP into one, i.e. support multiple projects tdwg/camtrap-dp#380. It is a bit involved, but we would like to tackle it. Note that there currently is a workaround:
       1. Get deployments() from the two datasets
       2. bind_rows() them, assuming the identifiers are unique across both datasets, which they are for Agouti
       3. Assign the result with deployments()<- to one of the two datasets
       4. Repeat for media() and observations()
       5. Don't worry about updating the metadata.
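
The workaround relies on identifiers being unique across both datasets. A minimal sketch of how that assumption could be checked before stacking the tables is shown below. The helper `combine_tables()` and the toy data are hypothetical; with real datasets the inputs would be the tables from deployments(), media() and observations() of each package.

```r
library(dplyr)

# Hypothetical helper (not part of camtrapdp): stack the same table from two
# datasets, but stop if an identifier occurs in both.
combine_tables <- function(a, b, id_col) {
  overlap <- intersect(a[[id_col]], b[[id_col]])
  if (length(overlap) > 0) {
    stop(
      "Identifiers occur in both datasets: ",
      paste(overlap, collapse = ", ")
    )
  }
  bind_rows(a, b)
}

# Toy stand-ins for deployments() of two datasets; values are invented.
deployments_a <- tibble(
  deploymentID = c("a1", "a2"),
  locationName = c("site A1", "site A2")
)
deployments_b <- tibble(
  deploymentID = "b1",
  locationName = "site B1"
)

combined <- combine_tables(deployments_a, deployments_b, "deploymentID")
```

The same call would be repeated for the media and observations tables (with `mediaID` and `observationID` as `id_col`) before assigning the results to one of the two datasets.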

@MartijnUH
Collaborator

I agree. Let's not add this functionality to the camtrapdp package.
The reason for writing this function was to temporarily solve the issue of split deployments until the issue is addressed within Agouti. @peterdesmet @jimcasaer has either of you already flagged this issue to the Agouti developers? If not, I will do this. Who should I contact in that case?

@peterdesmet
Member

I have created an issue in the Agouti GitLab repository (not public).

@peterdesmet
Member

@MartijnUH, Yorick's answer:

Typically users rename files so that all images of one deployment can be uploaded together. If they don't, and upload subfolders of the same deployment as separate folders in Agouti, they have two options: merge those afterwards in R or re-upload the deployment after renaming.

A merge deployment feature would be possible but may be quite a bit of work for something that does not happen very often. We could also try to solve this at the uploading phase by adding some logic to allow uploading several files with the same filename.

Thoughts?

@damianooldoni
Member Author

@MartijnUH, @peterdesmet: any news about this? May I close this issue?
