IPTCParser now takes the raw bytes of an IPTCRecord element into account #4

ola-github · 2015-01-15T14:30:49Z

The IPTCRecord supports a String value and raw bytes. When an IPTCRecord is written into the file, the raw bytes are ignored and only the string value is used. This value is encoded in charset ISO-8859-1. There is currently no way to use an encoding other than ISO-8859-1.

The proposed change uses the raw bytes if they exist. If not, it falls back to the previous strategy (the string value encoded as ISO-8859-1).
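As a rough illustration of that fallback, here is a minimal sketch. `RecordSketch` and `bytesToWrite` are hypothetical stand-ins invented for this example, not the library's classes; only the strategy (raw bytes first, ISO-8859-1 string otherwise) is taken from this pull request.

```java
import java.nio.charset.StandardCharsets;

public class RawBytesFallbackSketch {

    /** Hypothetical stand-in for a record with a string value and optional raw bytes. */
    static final class RecordSketch {
        final String value;
        final byte[] rawBytes; // may be null or empty

        RecordSketch(String value, byte[] rawBytes) {
            this.value = value;
            this.rawBytes = rawBytes;
        }
    }

    /** Prefer caller-supplied raw bytes; otherwise encode the string as ISO-8859-1. */
    static byte[] bytesToWrite(RecordSketch element) {
        if (element.rawBytes != null && element.rawBytes.length > 0) {
            return element.rawBytes;
        }
        return element.value.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the source ASCII-only; the text contains non-Latin-1 characters.
        String caption = "\u0141\u00f3d\u017a";
        RecordSketch preEncoded = new RecordSketch(caption, caption.getBytes(StandardCharsets.UTF_8));
        RecordSketch stringOnly = new RecordSketch("Caption", null);

        System.out.println("pre-encoded record: " + bytesToWrite(preEncoded).length + " bytes");
        System.out.println("string-only record: " + bytesToWrite(stringOnly).length + " bytes");
    }
}
```

With this shape, a caller who needs UTF-8 (or any other charset) encodes the value themselves and passes the result as raw bytes, while existing callers that only set the string see no change in behaviour.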

My context: I am currently working on the handling of IIM and XMP metadata in images, and I need to be able to use an encoding other than ISO-8859-1 in IIM. I would be happy to hear your opinions on the proposal.

…unt. If it contains bytes then these bytes will be written into the file. Otherwise the string will be written using encoding iso8859-1. This allows to store values encoded to charsets different from iso8859-1

mgmechanics · 2015-04-03T15:03:43Z

Hi Oliver,
Thank you for your patch!
I'm collecting all the waiting patches for Apache Commons-Imaging and employ them at http://sourceforge.net/p/albonubes/
I hope you will forgive me for using your patches in a child project. Maybe you could give it a try?
Michael

kinow · 2017-12-23T11:28:31Z

I think from your description you have a valid use case for handling the encoding yourself, instead of letting the library use ISO-8859-1 only.

I started to rebase the branch locally, when I realized that the IptcRecord class actually lost the #getRawBytes() method. So we can't really merge the code any longer.

Not sure why it changed, but it looks like we will have to either add it back or find an alternative solution. Sorry, @ola-github. Let me know if you have any idea how to update the pull request.

kinow · 2019-11-10T06:30:33Z

src/main/java/org/apache/commons/imaging/formats/jpeg/iptc/IptcParser.java

+                 * own encoding of fields.
+                 */
+                final byte[] recordData;
+                if( element.getRawBytes() != null && element.getRawBytes().length > 0 ) {


Ah, I had a better look using the git history, and getRawBytes returned the bytes returned from a getBytes("ISO-8859-1") call.

So the only way getRawBytes would return null would be if there was an error encoding the String into ISO-8859-1, which would also happen in the other branch of the if statement here.

So I believe the change wouldn't help you, as we would be still using ISO-8859-1 before you got the bytes.
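To make the concern concrete, here is a small sketch (not code from the library; `Latin1LossSketch` and `roundTripLatin1` are names made up for this example). It shows that bytes obtained by encoding a String as ISO-8859-1 cannot carry characters outside that charset, because `String.getBytes(Charset)` substitutes the charset's replacement byte for anything it cannot map.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Latin1LossSketch {

    /** Encodes to ISO-8859-1 and decodes back, mimicking bytes derived from the string value. */
    static String roundTripLatin1(String value) {
        byte[] encoded = value.getBytes(StandardCharsets.ISO_8859_1);
        return new String(encoded, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String latin1Safe = "caf\u00e9";            // every character exists in ISO-8859-1
        String notLatin1 = "\u0141\u00f3d\u017a";   // first and last characters do not

        System.out.println("Latin-1 text survives: " + latin1Safe.equals(roundTripLatin1(latin1Safe)));
        System.out.println("Other text survives:   " + notLatin1.equals(roundTripLatin1(notLatin1)));
        System.out.println("Bytes written: " + Arrays.toString(notLatin1.getBytes(StandardCharsets.ISO_8859_1)));
    }
}
```

So for the raw-bytes path to be useful, the bytes would have to come from the caller in the target encoding rather than be derived from the String via ISO-8859-1.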

garydgregory · 2023-12-23T16:35:48Z

This needs a test.

IPTCParser now takes the raw bytes of an IPTCRecord element into acco…

30d14f4

…unt. If it contains bytes then these bytes will be written into the file. Otherwise the string will be written using encoding iso8859-1. This allows to store values encoded to charsets different from iso8859-1

ola-github mentioned this pull request Jan 15, 2015

Keep eye on Pull Request on commons-imaging dpa-gmbh/metadata-mapper#10

Closed

#4: typo - element.getRawBytes() instead of element.value.getBytes()

27ed8eb

kinow reviewed Nov 10, 2019


kinow added the someday label Jul 5, 2021
