IPTCParser now takes the raw bytes of an IPTCRecord element into account #4
…unt. If it contains bytes, these bytes will be written into the file. Otherwise the string will be written using ISO-8859-1 encoding. This makes it possible to store values encoded in charsets other than ISO-8859-1.
Hi Oliver,
I think from your description you have a valid use case for handling the encoding yourself, instead of letting the library use ISO-8859-1 only. I started to rebase the branch locally, when I realized that the … Not sure why it changed, but it looks like we will have to either add it back or find an alternative solution. Sorry @ola-github. Let me know if you have any idea how to update the pull request.
 * own encoding of fields.
 */
final byte[] recordData;
if( element.getRawBytes() != null && element.getRawBytes().length > 0 ) {
Ah, I had a better look using the git history, and getRawBytes returned the bytes returned from a getBytes("ISO-8859-1") call. So the only way that getRawBytes would return null would be if there was an error encoding the String into ISO-8859-1, which would also happen in the other branch of the if statement here.
So I believe the change wouldn't help you, as we would still be using ISO-8859-1 before you got the bytes.
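To illustrate the point, here is a minimal sketch (not library code) of what happens to a value once it has passed through ISO-8859-1. The class and method names are hypothetical. It relies only on the documented behaviour of `String.getBytes(Charset)`, which replaces unmappable characters with the charset's replacement byte instead of failing.

```java
import java.nio.charset.StandardCharsets;

public class Latin1RoundTrip {

    // Hypothetical helper: encodes a value the way the ISO-8859-1 branch would.
    static byte[] toLatin1(String value) {
        // Characters outside ISO-8859-1 are silently replaced with '?'.
        return value.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Returns true only if the value is unchanged after encoding and decoding.
    static boolean survivesLatin1(String value) {
        String decoded = new String(toLatin1(value), StandardCharsets.ISO_8859_1);
        return decoded.equals(value);
    }

    public static void main(String[] args) {
        // "Zürich": every character exists in ISO-8859-1.
        System.out.println(survivesLatin1("Z\u00fcrich"));
        // "東京": neither character exists in ISO-8859-1.
        System.out.println(survivesLatin1("\u6771\u4eac"));
    }
}
```

If the raw bytes are derived from the string this way, any character outside ISO-8859-1 is already lost before the writer sees the bytes, so choosing between the two branches makes no difference for such values.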
This needs a test.
The IPTCRecord supports a String value and raw bytes. When an IPTCRecord is written into the file, the raw bytes are ignored and only the string value is used. This value is encoded in ISO-8859-1. There is currently no way to use an encoding other than ISO-8859-1.
The given change now uses the raw bytes if they exist. If not, it falls back to the previous strategy (string value encoded as ISO-8859-1).
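A minimal sketch of the fallback described above, for discussion only. `Element` is a hypothetical, simplified stand-in for IPTCRecord, and `recordData` is an illustrative helper name. Only the `getRawBytes()` check mirrors the condition in the diff. The sketch also assumes the raw bytes are supplied by the caller rather than derived from the string.

```java
import java.nio.charset.StandardCharsets;

public class RecordDataSketch {

    // Hypothetical stand-in for IPTCRecord: a string value plus optional raw bytes.
    static final class Element {
        private final String value;
        private final byte[] rawBytes;

        Element(String value, byte[] rawBytes) {
            this.value = value;
            this.rawBytes = rawBytes;
        }

        String getValue() { return value; }
        byte[] getRawBytes() { return rawBytes; }
    }

    // Chooses the bytes to write: caller-supplied raw bytes win,
    // otherwise the string value is encoded as ISO-8859-1.
    static byte[] recordData(Element element) {
        byte[] raw = element.getRawBytes();
        if (raw != null && raw.length > 0) {
            return raw;
        }
        return element.getValue().getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Caller encodes the value as UTF-8 and passes the bytes along.
        byte[] utf8 = "\u6771\u4eac".getBytes(StandardCharsets.UTF_8);
        System.out.println(recordData(new Element("\u6771\u4eac", utf8)).length);
        // No raw bytes: falls back to ISO-8859-1 encoding of the string.
        System.out.println(recordData(new Element("abc", null)).length);
    }
}
```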
My context: I am currently working on the handling of IIM and XMP metadata in images, and I need the capability to use an encoding other than ISO-8859-1 in IIM. I would be happy to hear your opinions on the proposal.