This is a patched version of zlib, modified to use
Pentium-Pro-optimized assembly code in the deflation algorithm. The
files changed/added by this patch are:

README.686
match.S

The speedup that this patch provides varies, depending on whether the
compiler used to build the original version of zlib falls afoul of the
PPro's speed traps. My own tests show a speedup of around 10-20% at
the default compression level, and 20-30% using -9, against a version
compiled using gcc 2.7.2.3. Your mileage may vary.

Note that this code has been tailored for the PPro/PII in particular,
and will not perform particularly well on a Pentium.

If you are using an assembler other than GNU as, you will have to
translate match.S to use your assembler's syntax. (Have fun.)

Brian Raiter
breadbox@muppetlabs.com
April, 1998


Added for zlib 1.1.3:

The patches come from
http://www.muppetlabs.com/~breadbox/software/assembly.html

To compile zlib with this asm file, copy match.S to the zlib directory,
then do:

CFLAGS="-O3 -DASMV" ./configure
make OBJA=match.o


Update:

I've been ignoring these assembly routines for years, believing that
gcc's generated code had caught up with them sometime around gcc 2.95
and the major rearchitecting of the Pentium 4. However, I recently
learned that, despite what I believed, this code still has some life
in it. On the Pentium 4 and AMD64 chips, it continues to run about 8%
faster than the code produced by gcc 4.1.

In acknowledgement of its continuing usefulness, I've altered the
license to match that of the rest of zlib. Share and Enjoy!

Brian Raiter
breadbox@muppetlabs.com
April, 2007