We Made CUDA Optimization Suck Less