IPTCParser now takes the raw bytes of an IPTCRecord element into account #4

Open
wants to merge 2 commits into
base: trunk

Conversation

ola-github

The IPTCRecord supports a String value and raw bytes. When an IPTCRecord is written into the file, the raw bytes are ignored and only the string value is used. This value is encoded in charset 8859-1. There is currently no way to use an encoding other than 8859-1.

The given change now uses the raw bytes if they exist. If not, it falls back to the previous strategy (string value encoded as 8859-1).
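The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: `RecordSketch` and `RecordDataSketch` are hypothetical stand-ins for the library's record type and writer, and only the method name `getRawBytes()` is taken from the discussion.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical stand-in for the record type discussed in this pull request. */
final class RecordSketch {
    private final String value;
    private final byte[] rawBytes; // may be null when the caller supplied no bytes

    RecordSketch(String value, byte[] rawBytes) {
        this.value = value;
        this.rawBytes = rawBytes;
    }

    String getValue() { return value; }

    byte[] getRawBytes() { return rawBytes; }
}

/** Hypothetical helper that picks the bytes to write for one record. */
final class RecordDataSketch {
    static byte[] recordData(RecordSketch element) {
        final byte[] raw = element.getRawBytes();
        if (raw != null && raw.length > 0) {
            // Caller-supplied bytes win, so any encoding can be stored.
            return raw;
        }
        // Previous strategy: encode the string value as ISO-8859-1.
        return element.getValue().getBytes(StandardCharsets.ISO_8859_1);
    }
}
```

With this shape, a caller who wants UTF-8 would encode the string themselves and pass the resulting bytes in, while existing callers keep the ISO-8859-1 behavior.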

My context: I am currently working on handling IIM and XMP metadata in images, and I need the capability to use an encoding other than 8859-1 in IIM. I would be happy to hear your opinions on the proposal.

…unt. If it contains bytes then these bytes will be written into the file. Otherwise the string will be written using encoding iso8859-1. This allows to store values encoded to charsets different from iso8859-1
@mgmechanics
Contributor

Hi Oliver,
Thank you for your patch!
I'm collecting all the pending patches for Apache Commons-Imaging and applying them at http://sourceforge.net/p/albonubes/
I hope you'll forgive me for using your patches in a child project. Maybe you could give it a try?
Michael

@kinow
Member

kinow commented Dec 23, 2017

I think from your description you have a valid use case for handling the encoding yourself, instead of letting the library use ISO-8859-1 only.

I started to rebase the branch locally, when I realized that the IptcRecord class actually lost the #getRawBytes() method. So we can't really merge the code any longer.

Not sure why it changed, but it looks like we will have to either add it back or find an alternative solution. Sorry @ola-github. Let me know if you have any idea how to update the pull request.

* own encoding of fields.
*/
final byte[] recordData;
if( element.getRawBytes() != null && element.getRawBytes().length > 0 ) {
Member

Ah, I had a better look using the git history, and getRawBytes returned the bytes returned from a getBytes("ISO-8859-1") call.

So the only way getRawBytes would return null would be if there was an error encoding the String into ISO-8859-1, which would also happen in the other branch of the if statement here.

So I believe the change wouldn't help you, as the value would already have been encoded as ISO-8859-1 before you got the bytes.
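To illustrate the point, here is a small sketch of what happens when "raw" bytes are derived from the string via ISO-8859-1. The class and method names are hypothetical, and the sketch uses the `Charset` overload of `String.getBytes` rather than the `getBytes("ISO-8859-1")` call seen in the git history. Characters outside Latin-1 are replaced during encoding, so the original text cannot be recovered from those bytes.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical demo: bytes derived from a String via ISO-8859-1 are already lossy. */
final class Latin1LossSketch {
    /** Mimics raw bytes computed from the string value, as described above. */
    static byte[] derivedRawBytes(String value) {
        // Unmappable characters are replaced with the charset's replacement byte.
        return value.getBytes(StandardCharsets.ISO_8859_1);
    }

    /** Decodes the derived bytes again to show what survived the encoding step. */
    static String roundTrip(String value) {
        return new String(derivedRawBytes(value), StandardCharsets.ISO_8859_1);
    }
}
```

A value such as `"caf\u00e9"` survives the round trip because every character exists in Latin-1, while a value containing the euro sign (`"\u20ac"`) does not. To support other encodings, the bytes would have to come from the caller rather than from the string.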

@kinow kinow added the someday label Jul 5, 2021
@garydgregory
Member

This needs a test.

4 participants