You need to me careful about benchmarking to find performance problems after the fact. You can get stuck in a local maxima where there is no particular cost center buts it’s all just slow.
If performance specifically is a goal there should probably at least be a theory of how it will be achieved and then that can be refined with benchmarks and profiling.
It does. I use this port