Diffusion language models are super data learners