-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
SF2 Support #76
SF2 Support #76
Conversation
Hey just FYI, there's this cool thing called Sapling by facebook https://sapling-scm.com/ Main downside is it replaces git locally, so you'd need to clone a new repository. But once you clone it, you can just Up to you if you wanna use it, but IMO it's really fun and I can't go back to normal git anymore If you aren't familiar with diff stacking, here's a video explaining what it is (using a different tool called graphite) https://www.youtube.com/watch?v=I88z3zX3lMY |
I'll take a look but probably some time later, I already had a pretty rough time learning git properly lol |
Btw @arduano I think I am done with this PR, I just need your input for the changes I mentioned in my first message.:
I am not sure what else to do so I will ignore it. The instrument/preset loading is pretty fast anyway, what takes a lot of time is the resampling.
Even though the quality has improved, loading soundfonts that need resampling takes a pretty long time. Although if we revert to the old resampler the quality loss is extremely noticeable on low sample rate SFs. Also see below
I stopped noticing this indexing issue after tinkering with the values of the sinc resampler but I would still like to hear your take on this "live resampling" idea (also about the performance issues I mentioned above).
I also a bit unsure about the implementation of this. Mostly regarding performance because I am searching the voice buffers on every new addition. I'll probably do some benchmarks but if you have some time take a look at this too. |
You didn't change the actual voice code right, or am I missing something? Regarding live resampling, it would be a massive performance hit. Currently we just do nearest neighbor, but even linear interpolation takes like a 4x hit or something big like that. The goal is to pre-interpolate as much as possible into bery high sample rates (4x or 8x the output sample rate iirc) so when pitch bends happen, the nearest neighbor interpolation becomes higher quality |
core/src/channel/voice_buffer.rs
Outdated
@@ -143,7 +143,9 @@ impl VoiceBuffer { | |||
} | |||
|
|||
if let Some(max_voices) = max_voices { | |||
if self.options.fade_out_killing { | |||
if self.buffer.iter().filter(|v| v.id == id).count() > max_voices { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait in what context is this condition even possible, what's your layer count? And how many voices do your SF2 spawners create per voice stack?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can debug this by using a soundfont that has 2 voices per key press (for example XP-80 fantasy) and set the layer count to 1. Before this change, two voices with the same ID would be added and ignored. So it would constantly try to reduce the voices to 1 while ignoring two voices and this would end up in an infinite loop, freezing the synth.
What I am skeptical about is the implementation for this fix because it might be slow when the buffers have a lot of voices. I'll think if I can come up with a better solution though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion, It's probably better to replace Iterator with ExactSizeIterator (a rust builtin trait), and then do conditions based on that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my opinion, It's probably better to replace Iterator with ExactSizeIterator (a rust builtin trait), and then do conditions based on that
I am guessing here you mean to use ExactSizeIterator for the voices parameter of the function. I tried this and it will need a lot more changes in the code so I added a counter of the added voices since we iterate over them anyway. If you meant something else let me know.
No I don't think I changed any voice code. Alright so should I keep the |
Btw @arduano could you review this when you find some time so I can make any further changes and merge I am also planning to do some other things (in another PR) that will probably conflict with this if it's not merged first |
Hey yeah sorry I must've missed the emails for your original comments |
Addressed the comments |
I made the changes. If there is anything else let me know.
|
To compensate for the lack of interpolation on playback, I increased the sample multiplier for soundfont loading, so a 48000 audio sample would become 192000 or something like that. This greatly reduces the static hissing noise during pitch bends (btw only pitch bends are affected). Of course, the cost is that more memory is required. |
Yeah, it is slower compared to your resampler. I will do some benchmarks before merging but I'd rather keep it anyway because it improves the quality a lot.
From what I can tell the samples are resampled to the stream's sample rate. Where is that part of code you are talking about? Also is there anything else I should change before merging? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also just make sure midis with pitch bends don't hiss. That's the only concern with all this re-sampling
Actually nevermind, the resampling seems to be faster with Here are the results of some test runs I did for a few soundfonts. I had it load each soundfont with 4 different sample rates so it would do all kinds of resampling. Btw I did wait for my laptop to completely cool down before running each test just in case. CFaz III (WAV, 44.1kHz)master
sf2
Amethyst 2.0 (FLAC, 48kHz)master
sf2
JV1080 (WAV, 44.1kHz)masterAs SFZ
As SF2N/A sf2As SFZ
As SF2
Aside from these I also ran the Also about the hissing, I personally didn't notice any difference between |
Everything looks good to me yeah, just overall thank you for implementing then #1 biggest feature request in xsynth lol Feel free to merge whenever you're happy with it |
No problem lol, I do enjoy working on this :) |
Ok I have to go for real now lol |
Right now it is in a working state but there are plenty of things to be added, as well as optimized.
Also the checks are probably going to fail due to MTK having some build errors, for which I have made a PR on its repo (arduano/midi-toolkit-rs#14).
To-Do
Notable changes
rubato
, because the current implementation had very low quality results on low sample rate samples. (*1 ->) This still needs to be change/fixed though as the looping indexes are usually wrong. One solution I thought of is to not resample anything when loading the soundfont and use the speed multiplier in the spawner settings (essentially resampling live) but this has low quality downsampling especially using nearest neighbor (default).Preview
(*2 ->) Maybe some SIMD on the generator checking and loading but idk how to do that or even if it's possible
(*3 ->) Currently all samples are mono, so SF2s with stereo samples will use 2 voices per key hit. The SF2 spec has some options for linking samples with their stereo pair but it is way more complicated than it seems