Thursday, June 21, 2012

Another Random Observation

memcpy () is fast!  Really, really fast.

One of the things I need to do at runtime is copy small-to-moderately-sized chunks of data around.  Experimenting with SIMD-optimized copy functions and then testing them under heavy load revealed a Great Big Dirty Secret: for the majority of use cases, plain-boring-old standard memcpy is actually the optimal solution.

With memcpy I can shift 10gb of data in about 6 milliseconds.  A SIMD version needs almost 4 times as long - 23 milliseconds.  Varying the pattern (the number of calls, the size of the data copied in each call, etc) makes memcpy slow down somewhat, but it never gets much slower than 17 milliseconds.  The SIMD version is at least consistent - it hovers around the 21 to 23 millisecond mark no matter what you throw at it.
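For anyone who wants to try a comparison like this themselves, here is a minimal sketch of what such a test could look like.  It is not the code behind the numbers above: copy_simd, copy_std and time_copy are hypothetical names made up for illustration, the SIMD routine is just a plain unaligned SSE2 loop (with a byte-loop fallback if SSE2 isn't available), and timing with clock () is an assumption about how you might measure it.

```c
#include <stddef.h>
#include <string.h>
#include <time.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical SIMD copy: moves 16 bytes per iteration using unaligned
   SSE2 loads and stores, then copies any remaining tail byte by byte.
   Without SSE2 it degrades to the byte loop alone. */
static void copy_simd(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
#if defined(__SSE2__)
    while (n >= 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)s);
        _mm_storeu_si128((__m128i *)d, v);
        s += 16;
        d += 16;
        n -= 16;
    }
#endif
    while (n--)
        *d++ = *s++;
}

/* Thin wrapper so memcpy matches the same function-pointer type. */
static void copy_std(void *dst, const void *src, size_t n)
{
    memcpy(dst, src, n);
}

typedef void (*copy_fn)(void *, const void *, size_t);

/* Runs `calls` copies of `chunk` bytes each and returns the elapsed
   CPU time in milliseconds as reported by clock(). */
static double time_copy(copy_fn fn, void *dst, const void *src,
                        size_t chunk, size_t calls)
{
    clock_t start = clock();
    for (size_t i = 0; i < calls; i++)
        fn(dst, src, chunk);
    clock_t end = clock();
    return 1000.0 * (double)(end - start) / CLOCKS_PER_SEC;
}
```

Call time_copy once with copy_std and once with copy_simd, then vary chunk and calls to change the copy pattern.  Be aware that an optimizing compiler may drop copies whose results are never read, so read something back from the destination buffer after each run, and treat any numbers as specific to your own machine and compiler.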

For kicks I ported the copy routines from Doom 3, but they turned out even slower.

I guess that this kind of thing may have been relevant back in 2003/2004, but in 2012 the rules have obviously changed (AGAIN!).  If I were still targeting older machines from that timeframe I might reconsider and retest on something representative; because I'm on D3D11, and therefore not targeting them, I'm not even going to bother.

memcpy it is - yayyyy! memcpy!
