The recent “The Evolution of PhysX” article showed how performance has improved across the various PhysX SDK versions. However, one interesting case remained outside its coverage: performance scaling in multithreaded environments.
It is known that, while PhysX SDK 2.8 has rather limited multi-threading capabilities (mostly working on a per-scene or per-compartment basis), PhysX SDK 3.x can distribute various tasks across worker threads much more effectively, and thus offers better support for multi-core CPUs.
But how well does multi-threading actually work in PhysX 3 (we’ll take the latest 3.3 version)? Using the same PEEL (Physics Engine Evaluation Lab) tool to record the performance metrics, we will try to shed some light on this question.
Scene #1 – random dynamic primitives in a box
A static container filled with 256 random primitives (spheres, boxes, capsules).
Scene #2 – random falling convexes
1728 convexes (12×12×12 formation) falling on a plane, forming a pile. Each convex is randomly chosen from 14 predefined objects of varying complexity.
Scene #3 – convexes falling on triangle mesh
4096 convexes (64×64 formation) colliding against a tessellated triangle mesh (743,616 triangles in total).
Scene #4 – stacking test with boxes
10 medium-sized box stacks (each with a base 10 boxes wide).
Scene #5 – spherical joints net
A net of 40×40 spheres connected with spherical joints, colliding against a static object.
Scene #6 – 256 dynamic ragdolls
256 ragdolls, each composed of 19 bones connected with hinge joints.
As you can see, multi-threading in PhysX SDK 3.3 is indeed functional and fairly effective. It shows significant performance improvements for convex-convex collisions (2x faster on average, 3 threads vs. a single thread) and stacking (1.88x faster), and smaller but noticeable gains for collisions between primitives (1.5x faster) and joint calculations (1.2x faster).
As a downside, additional worker threads increase the memory footprint of the scene.
We have also found that Scene Queries (such as raycasts and sweep tests) show the same performance regardless of the number of threads.
In any case, the improved multi-threading capabilities of PhysX 3.x make it even more consistent and future-proof, especially when compared with the previous generation of the PhysX physics engine.
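To put the speedup figures quoted above into perspective, they can be converted into parallel efficiency (speedup divided by thread count) and into a rough estimate of how much of the per-frame work runs in parallel. The sketch below is only an illustration: it assumes Amdahl's law as a simplified model, which ignores synchronization overhead and memory bandwidth limits, and the function names are made up for this example rather than taken from PEEL or PhysX.

```python
def parallel_efficiency(speedup, threads):
    """Fraction of ideal linear scaling that was actually achieved."""
    return speedup / threads


def amdahl_parallel_fraction(speedup, threads):
    """Solve Amdahl's law, S = 1 / ((1 - p) + p / N), for the parallel share p.

    This is a simplified model (an assumption of this sketch), not something
    measured by the benchmark itself.
    """
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / threads)


# Speedups reported in the article: 3 worker threads vs. a single thread.
REPORTED_SPEEDUPS = {
    "convex-convex collisions": 2.0,
    "stacking": 1.88,
    "primitive collisions": 1.5,
    "joints": 1.2,
}
THREADS = 3


def summarize(speedups=REPORTED_SPEEDUPS, threads=THREADS):
    """Return {test name: (efficiency, estimated parallel fraction)}."""
    return {
        name: (
            parallel_efficiency(s, threads),
            amdahl_parallel_fraction(s, threads),
        )
        for name, s in speedups.items()
    }


if __name__ == "__main__":
    for name, (eff, frac) in summarize().items():
        print(f"{name:26s} efficiency {eff:.0%}, parallel share ~{frac:.2f}")
```

Under this model, a 2x speedup on 3 threads corresponds to roughly 75% of the work being parallelizable, while the 1.2x result for joints corresponds to about 25%, which would cap the gain at around 1.33x no matter how many threads are added.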
Appendix #1:
SDK 2.8.4 settings: default, 1 thread. SDK 3.3 settings: default, SAP broadphase, legacy contact generation, 1 to 3 threads.
System: i7 2600K CPU, GTX 580 GPU, 8 GB RAM, Win 7 x64