Parallel Histogram Computation with CUDA