I read the Armory3d architecture page, and dug through the code, and I’m amazed by this Haxe based technology stack.
However, there are some sticky and intertwined issues that affect the performance of data exchange across the boundary between C-code (native or WASM) and Haxe code.
Similar issues occur in thunking to other managed languages, but in other engines (UE4, Blender, Source, etc) the core of the engine is written in the low-level language. Having the core of Armory3d / Iron / Kha written in the high-level language complicates efficient integration of low-level modules.
This is especially important for a cloth simulator, because it needs to see all meshes and acceleration structures (aka BVH/kdtree) because it needs to do collision detection for all collision targets during the cloth solve. This is also stuff that effects haxebullet physics.
After my digging, I’ve decided I don’t think it’s practical to use a C cloth simulator with Armory3d, because of the data-marshalling issues… and I don’t think it’s practical to expect good multi-threading out of Armory3d, because the Krom target doesn’t support it, and the CPP/native target has a very very poor garbage collector.
Probably it would be more practical to convert my cloth simulator to Haxe+GPU compute. This might be a decent route for me, because the way Armory3d works with Blender scene files is extremely advantageous for my application - because it will be a very long time before the UE4 editor is as good as Blender.
Here is a summary of the detailed facts that lead me to this conclusion. I welcome any corrections or thoughts…
(1) Armor3d / Iron / Kha allocates the scene data structures in Haxe, which turns into Javascript GC heap objects in the Krom target. Mesh data is allocated by Kha as a Javascript “Int16Array”…
(2) Inside Krom/V8, WASM code (aka Emscripten compiled C code), can only see data allocated inside the WASM heap. This means to see any of the above scene or javascript mesh arrays, they need to be copied into the WASM heap, causing performance cost and memory bloat. This makes a C/Emscripten compiled cloth simulator pretty impractical.
For details, read Most performant way to pass data JS/WASM context · Issue #1231 · WebAssembly/design
(3) Integrating a C cloth simulator into Krom as a native C library could avoid copying mesh buffers, but there is still a big impedance mismatch, because it will have to use V8 APIs to traverse the scene data that lives in Javascript GC objects.
(4) CPU side collision detection also requires seeing animated mesh results, and mesh acceleration structures. Something needs to get the skinned mesh results into CPU memory (whether computed on the CPU or GPU), and update CPU memory acceleration structures (like a BVH). If this data lives in Javascript/GC memory, then it’ll have the same issues as described above to be seen by C/native or C/Emscripten.
(5) As a result…it seems most viable with Armory3d to write a cloth simulator in Haxe, so it has most efficient access to the data. However, unfortunately, Armory3d on Krom is single threaded, because Haxe/Krom compiles to Javascript with GC heap objects, and the Javascript V8 GC heap is single threaded. This would mean the cloth simulator would also be single-threaded, which is pretty painful. (Likewise, single-threaded CPU mesh animation skinning would be pretty painful)
(6) For Haxe code to multi-thread with shared heap objects on Krom, it would need to stop using the Javascript GC heap. This would basically mean running the Haxe CPP target on WASM, and relying on hxcpp’s inferor garbage collector, instead of the excellent (but single threaded) V8 garbage collector. My understanding is that this is not currently possible, as hxcpp currently does not generate Emscripten compatible code.
(7) The Haxe cpp target running native can support threads, and can share data buffers by reference with C, but relies on it’s very poor internal garbage collector. It would still have the same Haxe data thunking issues as C/native/Krom.