CS336: Language Modeling from Scratch