LUT probably couldn't fit in CPU cache anyway. TODO: Consider using separate LUTs per channel (size 32 * 3 * 3 instead of std.math.maxInt(u15)).