visually, a 2x2 matrix here is about as good as 4x4, at least to my eye,
but on low-end gpus it can halve the cost. watching the gpu clock on my old
i5-4200u, it drops to 340-410MHz with 2x2 (no dithering is 320-360MHz),
versus 630-660MHz with the original 4x4 code.
the 4x4 is still there, #ifdef'ed out. perhaps i can bring it back as a
high-quality dither option, but 2x2 i think is good enough.
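
for anyone wondering what the 2x2 vs 4x4 switch looks like, here is a rough
sketch in plain c. it is not the actual code from this change: it assumes
ordered (bayer) dithering with the standard bayer matrices, and the names
DITHER_HQ and dither_quantize are made up for illustration.

```c
/* Hypothetical compile-time switch: define DITHER_HQ to get the 4x4 matrix. */
#ifdef DITHER_HQ
#define DITHER_N 4
static const int bayer[4][4] = {
    { 0,  8,  2, 10},
    {12,  4, 14,  6},
    { 3, 11,  1,  9},
    {15,  7, 13,  5},
};
#else
#define DITHER_N 2
static const int bayer[2][2] = {
    {0, 2},
    {3, 1},
};
#endif

/*
 * Quantize an 8-bit value (0..255) to `levels` output levels (levels >= 2)
 * at pixel (x, y). Returns an index in 0..levels-1.
 * (Illustrative helper, not taken from the original code.)
 */
int dither_quantize(int value, unsigned x, unsigned y, int levels)
{
    int cells  = DITHER_N * DITHER_N;
    int scaled = value * (levels - 1);
    int base   = scaled / 255;   /* lower of the two candidate levels */
    int frac   = scaled % 255;   /* distance to the upper level, in 1/255 steps */
    int t      = bayer[y % DITHER_N][x % DITHER_N];

    /* Round up when frac/255 > (t + 0.5)/cells, kept in integer math. */
    if (frac * cells * 2 > (2 * t + 1) * 255)
        base++;
    return base;
}
```

with the default 2x2 build and levels = 2, a mid-grey of 128 turns on two of
the four pixels in each 2x2 block, and 64 turns on one. the smaller matrix
only gives 4 distinct thresholds instead of 16, which is the quality being
traded away.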