FastDXT
FastDXT is a fast implementation of a DXT compressor that delivers real-time compression speed for HD and 4K content. It combines a smart algorithm with an implementation based on multimedia instructions (SSE2). It offers fully optimized DXT1 code, partially optimized DXT5 code, and partially optimized DXT5 code that uses the YCoCg color space. The DXT5-YCoCg code gives a reduced error at the expense of a smaller compression factor (3, versus 6 for DXT1). The code is also multi-threaded (for now, 1, 2, or 4 threads for single-, dual-, or quad-core CPUs).
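The compression factors of 6 and 3 follow from the block sizes of the formats: DXT1 stores each 4x4 pixel block in 8 bytes, and DXT5 (including the YCoCg variant) stores it in 16 bytes. The sketch below works through that arithmetic, assuming 24-bit RGB input, which is the case that yields 6 and 3 (against 32-bit RGBA input the same block sizes give 8 and 4). The helper names are illustrative and are not part of FastDXT.

```cpp
#include <cassert>
#include <cstddef>

// Bytes per compressed 4x4 block, fixed by the DXT formats themselves.
const std::size_t kDxt1BlockBytes = 8;
const std::size_t kDxt5BlockBytes = 16;  // also applies to DXT5-YCoCg

// Hypothetical helper (not a FastDXT function): compressed size in bytes
// of an image whose width and height are multiples of 4.
std::size_t CompressedBytes(int width, int height, std::size_t blockBytes) {
    assert(width > 0 && height > 0 && width % 4 == 0 && height % 4 == 0);
    const std::size_t blocks =
        (static_cast<std::size_t>(width) / 4) * (static_cast<std::size_t>(height) / 4);
    return blocks * blockBytes;
}

// Hypothetical helper: ratio of uncompressed size to compressed size.
double CompressionRatio(int width, int height, int bytesPerPixel,
                        std::size_t blockBytes) {
    const double raw = static_cast<double>(width) * height * bytesPerPixel;
    return raw / static_cast<double>(CompressedBytes(width, height, blockBytes));
}
```

For a 1920x1080 frame, `CompressionRatio(1920, 1080, 3, kDxt1BlockBytes)` evaluates to 6 and the same call with `kDxt5BlockBytes` evaluates to 3.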
See Wikipedia to understand how DXT works.
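As a short illustration of the format itself, here is a minimal sketch of decoding one DXT1 block. A block is 8 bytes: two 16-bit RGB565 endpoint colors (little-endian) followed by 32 bits of 2-bit indices, one per pixel of the 4x4 block. This is a decoder written for explanation only; it is not taken from FastDXT, the names `Rgb`, `Expand565`, `Mix`, and `DecodeDxt1Block` are hypothetical, and the exact rounding of interpolated colors varies between decoders.

```cpp
#include <cstdint>

struct Rgb { std::uint8_t r, g, b; };

// Expand a 16-bit RGB565 value to 8 bits per channel by bit replication.
static Rgb Expand565(std::uint16_t c) {
    const unsigned r5 = (c >> 11) & 0x1Fu, g6 = (c >> 5) & 0x3Fu, b5 = c & 0x1Fu;
    Rgb out;
    out.r = static_cast<std::uint8_t>((r5 << 3) | (r5 >> 2));
    out.g = static_cast<std::uint8_t>((g6 << 2) | (g6 >> 4));
    out.b = static_cast<std::uint8_t>((b5 << 3) | (b5 >> 2));
    return out;
}

// Weighted average (wa*a + wb*b) / den, per channel, with integer division.
static Rgb Mix(Rgb a, Rgb b, int wa, int wb, int den) {
    Rgb out;
    out.r = static_cast<std::uint8_t>((wa * a.r + wb * b.r) / den);
    out.g = static_cast<std::uint8_t>((wa * a.g + wb * b.g) / den);
    out.b = static_cast<std::uint8_t>((wa * a.b + wb * b.b) / den);
    return out;
}

// Decode one 8-byte DXT1 block into 16 pixels in row-major order.
void DecodeDxt1Block(const std::uint8_t block[8], Rgb out[16]) {
    const std::uint16_t c0 = static_cast<std::uint16_t>(block[0] | (block[1] << 8));
    const std::uint16_t c1 = static_cast<std::uint16_t>(block[2] | (block[3] << 8));
    Rgb palette[4];
    palette[0] = Expand565(c0);
    palette[1] = Expand565(c1);
    if (c0 > c1) {
        // Four-color mode: two interpolated colors at 1/3 and 2/3.
        palette[2] = Mix(palette[0], palette[1], 2, 1, 3);
        palette[3] = Mix(palette[0], palette[1], 1, 2, 3);
    } else {
        // Three-color mode: one midpoint; index 3 is transparent black
        // (alpha is omitted in this sketch).
        palette[2] = Mix(palette[0], palette[1], 1, 1, 2);
        palette[3].r = palette[3].g = palette[3].b = 0;
    }
    const std::uint32_t bits = static_cast<std::uint32_t>(block[4]) |
                               (static_cast<std::uint32_t>(block[5]) << 8) |
                               (static_cast<std::uint32_t>(block[6]) << 16) |
                               (static_cast<std::uint32_t>(block[7]) << 24);
    for (int i = 0; i < 16; ++i) {
        out[i] = palette[(bits >> (2 * i)) & 3u];  // pixel i uses bits 2i and 2i+1
    }
}
```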
Check the paper: Real-Time Compression For High-Resolution Content (authors: L. Renambot, B. Jeong, J. Leigh, Proceedings of the Access Grid Retreat 2007, Chicago, IL)
The source code is made available under the terms of the GNU General Public License.
Building this source requires a compiler that supports SSE2 intrinsics and the following libraries: SDL and GLEW (for display only). A makefile is included for Unix systems (Linux and OS X). A Visual Studio 2005 project file is included for Windows.
Access to the Subversion repository might be provided in the future.
The package is available after filling out the form (it helps us justify the work to our funding agencies):
How to use the DXT code: for now, read the sample code (example.cpp, 2dxt.cpp, ...)
By default the code uses SSE2 intrinsics to get the fastest speed (enabled by the DXT_INTR #define). For a C-only implementation, remove the #define from the Makefile and/or the dxt.h file.
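To illustrate how a single #define can switch between an intrinsics path and a plain C path, here is a minimal sketch. It is not FastDXT's code: the function `MinOf16` is hypothetical, and the additional `__SSE2__` / `_M_X64` checks are an assumption added so that the sketch still compiles on targets without SSE2. A per-block minimum (or maximum) of this kind is a typical building block in real-time DXT compressors.

```cpp
#include <cstdint>

#define DXT_INTR  // remove this line to force the plain C path

#if defined(DXT_INTR) && (defined(__SSE2__) || defined(_M_X64))
#include <emmintrin.h>  // SSE2 intrinsics

// Smallest of 16 unsigned bytes, SSE2 path: fold the register in halves.
std::uint8_t MinOf16(const std::uint8_t* p) {
    __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(p));
    v = _mm_min_epu8(v, _mm_srli_si128(v, 8));  // 16 candidates -> 8
    v = _mm_min_epu8(v, _mm_srli_si128(v, 4));  // 8 -> 4
    v = _mm_min_epu8(v, _mm_srli_si128(v, 2));  // 4 -> 2
    v = _mm_min_epu8(v, _mm_srli_si128(v, 1));  // 2 -> 1, result in byte 0
    return static_cast<std::uint8_t>(_mm_cvtsi128_si32(v) & 0xFF);
}
#else
// Smallest of 16 unsigned bytes, portable C path.
std::uint8_t MinOf16(const std::uint8_t* p) {
    std::uint8_t m = p[0];
    for (int i = 1; i < 16; ++i) {
        if (p[i] < m) m = p[i];
    }
    return m;
}
#endif
```

Both paths return the same value, so callers do not need to know which one was compiled in.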
    // DXT formats
    #define FORMAT_DXT1      1
    #define FORMAT_DXT5      2
    #define FORMAT_DXT5YCOCG 3

    // Main compression function, multi-threaded (1, 2, or 4 threads)
    int CompressDXT(const byte *in, byte *out, int width, int height, int format, int numthreads);

    // Compress to DXT1 format
    void CompressImageDXT1( const byte *inBuf, byte *outBuf, int width, int height, int &outputBytes );

    // Compress to DXT5 format
    void CompressImageDXT5( const byte *inBuf, byte *outBuf, int width, int height, int &outputBytes );

    // Compress to DXT5 format, first convert to YCoCg color space
    void CompressImageDXT5YCoCg( const byte *inBuf, byte *outBuf, int width, int height, int &outputBytes );

    // Compute error between two images (uncompressed, one image flipped)
    double ComputeError( const byte *original, const byte *dxt, int width, int height);
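The DXT5-YCoCg variant converts each pixel to the YCoCg color space before compression. The sketch below shows the standard YCoCg transform (Y = R/4 + G/2 + B/4, Co = R/2 - B/2, Cg = -R/4 + G/2 - B/4) in 8-bit integer form, with Co and Cg biased by 128 so they fit in an unsigned byte. The rounding, the bias handling, and the names `YCoCg8`, `Rgb8`, `RgbToYCoCg`, and `YCoCgToRgb` are assumptions made for this illustration; FastDXT's own conversion and channel layout may differ.

```cpp
#include <cstdint>

struct Rgb8   { std::uint8_t r, g, b; };
struct YCoCg8 { std::uint8_t y, co, cg; };

// Clamp an int to the 0..255 range of an unsigned byte.
static std::uint8_t ClampByte(int v) {
    return static_cast<std::uint8_t>(v < 0 ? 0 : (v > 255 ? 255 : v));
}

// Forward transform. The +512 term is the 128 bias scaled by 4; it also
// keeps the value non-negative before the shift. The +2 rounds to nearest.
YCoCg8 RgbToYCoCg(std::uint8_t r, std::uint8_t g, std::uint8_t b) {
    YCoCg8 out;
    out.y  = ClampByte(( r + 2 * g + b + 2) >> 2);
    out.co = ClampByte(( 2 * r - 2 * b + 512 + 2) >> 2);
    out.cg = ClampByte((-r + 2 * g - b + 512 + 2) >> 2);
    return out;
}

// Inverse transform: R = Y + Co - Cg, G = Y + Cg, B = Y - Co - Cg.
Rgb8 YCoCgToRgb(YCoCg8 c) {
    const int co = c.co - 128;
    const int cg = c.cg - 128;
    Rgb8 out;
    out.r = ClampByte(c.y + co - cg);
    out.g = ClampByte(c.y + cg);
    out.b = ClampByte(c.y - co - cg);
    return out;
}
```

In 8-bit arithmetic the round trip is not exactly lossless: a saturated color can come back off by one or two levels per channel, while neutral grays map to Co = Cg = 128 and are recovered exactly.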
    #include <malloc.h>   // memalign
    #include <cstdio>     // printf
    #include <cstring>    // memset
    #include "libdxt.h"

    byte *in;
    byte *out;
    int nbbytes, numthreads;

    // width and height are the image dimensions in pixels (4 bytes per input pixel)

    // Allocate input buffer, 16-byte aligned
    in = (byte*)memalign(16, width*height*4);
    memset(in, 0, width*height*4);

    // Allocate output buffer, 16-byte aligned (DXT1 output is 1/8 of the input size)
    out = (byte*)memalign(16, width*height*4/8);
    memset(out, 0, width*height*4/8);

    numthreads = 2;
    nbbytes = CompressDXT(in, out, width, height, FORMAT_DXT1, numthreads);
    printf("Compressed %d bytes to %d bytes\n", width*height*4, nbbytes);
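The example above relies on memalign, which is not available on every platform (Visual C++, for instance, does not provide it). A minimal sketch of a portable 16-byte aligned allocation follows, assuming posix_memalign on Unix-like systems and _aligned_malloc on Windows. The helper names `AllocAligned16` and `FreeAligned16` are hypothetical and are not part of FastDXT.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdlib.h>   // posix_memalign, free
#if defined(_WIN32)
#include <malloc.h>   // _aligned_malloc, _aligned_free
#endif

// Hypothetical helper: allocate `size` bytes aligned to a 16-byte boundary,
// as required by aligned SSE2 loads and stores. Returns nullptr on failure.
void* AllocAligned16(std::size_t size) {
#if defined(_WIN32)
    return _aligned_malloc(size, 16);
#else
    void* p = nullptr;
    if (posix_memalign(&p, 16, size) != 0) {
        return nullptr;
    }
    return p;
#endif
}

// Release memory obtained from AllocAligned16.
void FreeAligned16(void* p) {
#if defined(_WIN32)
    _aligned_free(p);
#else
    free(p);
#endif
}
```

With these helpers, the input buffer would be allocated as `AllocAligned16(width*height*4)` and released with `FreeAligned16` once compression is done.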
The FastDXT package includes a small suite of tools that can be used to learn how to integrate it into your own code.
For reference:
Here are some performance numbers:
OpenGL DXT texture compression: texture_compression_dxt1.txt
Great paper describing how to optimize DXT to achieve real-time performance: J.M.P. van Waveren, Real-Time DXT Compression, May 20th 2006, Id Software, Inc.
Real-Time Compression For High-Resolution Content (authors: L. Renambot, B. Jeong, J. Leigh, Proceedings of the Access Grid Retreat 2007, Chicago, IL)
FastDXT requires an OpenGL 2.0 implementation with the extensions GL_ARB_vertex_program, GL_ARB_fragment_program, and GL_ARB_texture_compression to display compressed DXT5+YCoCg textures correctly.
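An application can verify these extensions at startup. The sketch below checks whether a name appears as a complete token in a space-separated extension string, such as the one returned by `glGetString(GL_EXTENSIONS)` once an OpenGL context is current. The function `HasExtension` is hypothetical and not part of FastDXT; it takes the string as a parameter so it can be tried without a context. A plain substring search is not enough, since one extension name can be a prefix of another.

```cpp
#include <cstddef>
#include <cstring>

// True if `name` appears in `extensions` as a whole, space-delimited token.
bool HasExtension(const char* extensions, const char* name) {
    if (extensions == nullptr || name == nullptr || *name == '\0') {
        return false;
    }
    const std::size_t len = std::strlen(name);
    const char* p = extensions;
    while ((p = std::strstr(p, name)) != nullptr) {
        const bool startOk = (p == extensions) || (p[-1] == ' ');
        const bool endOk   = (p[len] == ' ') || (p[len] == '\0');
        if (startOk && endOk) {
            return true;
        }
        p += len;  // skip this partial match and keep searching
    }
    return false;
}
```

A caller would test each of the three names listed above and fall back to DXT1 or plain DXT5 display if any of them is missing.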
renambot (at) uic.edu