Add example usage with WebCodecs AudioDecoder or VideoDecoder? #22

Open
JohnWeisz opened this issue Jun 18, 2022 · 3 comments

Comments

JohnWeisz (Contributor) commented Jun 18, 2022

Hi, thanks for your work on this great library; it's working perfectly for parsing audio files.

Do you have an example of how to use it with the WebCodecs API to decode frames? My use case is decoding an audio file into raw LPCM samples. In particular, here's what's not clear to me:

  • How do duration and totalDuration of a CodecFrame correlate to the timestamp and duration of an EncodedAudioChunk?
  • How do I determine whether a CodecFrame is a keyframe?
  • Is it CodecFrame.data that gets passed as the data to an EncodedAudioChunk? Should it be passed as-is, or should only the portion the Uint8Array view references be copied?

Thanks in advance; I'll add a reply if I figure it out myself in the meantime.

JohnWeisz (Contributor, Author) commented Jun 18, 2022

I figured it out in the meantime. It's actually quite simple all things considered, but there are a few gotchas:

  • EncodedAudioChunk expects microseconds, so the durations and timestamps reported by CodecParser (which are in milliseconds) need to be multiplied by 1000
  • CodecFrame.data seems to use a "shared" ArrayBuffer across all frames/chunks, so the slice referenced by the Uint8Array must be cloned, and the clone passed to EncodedAudioChunk
  • At least for mpeg, timestamp can apparently be either 0 or the totalDuration reported by the CodecFrame; both seem to work

In a nutshell, it's something like:

for (const frame of parser.parseAll(file)) {
    // CodecFrame.data views a shared ArrayBuffer, so copy the referenced bytes first
    const frameBufferCopy = new ArrayBuffer(frame.data.length);
    new Uint8Array(frameBufferCopy).set(frame.data);

    const audioChunk = new EncodedAudioChunk({
        data: frameBufferCopy,
        timestamp: 0, // or frame.totalDuration * 1000 (also needs the ms-to-µs conversion)
        type: "key", // didn't figure this out, but making each chunk a keyframe seems to cause no issues
        duration: frame.duration * 1000 // milliseconds to microseconds
    });

    // queue 'audioChunk' for decoding...
}
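
For reference, here's a minimal sketch of the decoding side that the comment above refers to. The codec string, sample rate, and channel count are assumptions; in practice they should come from the parsed frame headers:

const decoder = new AudioDecoder({
    output: (audioData) => {
        // Decoded LPCM arrives as an AudioData object; copy each channel (plane)
        // out before closing it. Decoders typically emit planar f32 samples.
        for (let channel = 0; channel < audioData.numberOfChannels; channel++) {
            const pcm = new Float32Array(audioData.numberOfFrames);
            audioData.copyTo(pcm, { planeIndex: channel });
            // ...collect 'pcm' for this channel...
        }
        audioData.close();
    },
    error: (e) => console.error(e)
});

decoder.configure({
    codec: "mp3",        // assumption: mpeg input; "opus", "flac", etc. otherwise
    sampleRate: 44100,   // assumption: read this from the parsed frame header
    numberOfChannels: 2  // assumption: read this from the parsed frame header
});

decoder.decode(audioChunk); // once per chunk built in the loop above
decoder.flush();            // returns a promise that resolves once all queued chunks are decoded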

eshaz (Owner) commented Jun 18, 2022

Thanks for posting this; it's a good example. I'll update the docs to explain the shared nature of the data property.

One note on keyframes:

All of the codecs supported in this library (aac, mpeg, opus, vorbis, and flac) don't use keyframes. Based on my understanding, keyframes don't apply to audio compression at all; they're a structure used only in video compression. It's interesting that the WebCodecs API requires this for audio data. Maybe it's used somewhere else to sync audio and video data together?

JohnWeisz (Contributor, Author) commented Jun 19, 2022

@eshaz Might be a simple API limitation in this case. I found that:

  1. marking the first EncodedAudioChunk as a "keyframe", and
  2. retrying decoding as a "keyframe" (once) in case of an error

... together get decoding to work great; a rough sketch of what I mean follows below. Performance is pretty good as well: in my case it's currently very slightly behind AudioContext.decodeAudioData, but there is probably plenty of room for optimization to match or even surpass it.
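
Roughly, the retry looks something like this (a hypothetical helper, not part of codec-parser; it relies on decode() throwing synchronously when the decoder expects a key chunk):

function queueChunk(decoder, init, isFirst) {
    // Mark only the first chunk as a keyframe; everything else goes in as "delta".
    try {
        decoder.decode(new EncodedAudioChunk({ ...init, type: isFirst ? "key" : "delta" }));
    } catch (e) {
        // Retry exactly once with the chunk re-marked as a keyframe.
        decoder.decode(new EncodedAudioChunk({ ...init, type: "key" }));
    }
}

Here 'init' holds the data, timestamp, and duration built as in the earlier snippet.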
