Application of Machine Learning to Computational Plasma Physics and Reduced-Order Plasma Modeling

Crux, a Precise Verifier for Rust and Other Languages

It's Not Easy Being Green: On the Energy Efficiency of Programming Languages

DeepSeek: Advancing theorem proving in LLMs through large-scale synthetic data

Addition is all you need for energy efficient language models

Running LLMs with 3.3M Context Tokens on a Single GPU

A Mathematical Model of Package Management Systems

Why do random forests work? They are self-regularizing adaptive smoothers

Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces

LLMD: A Large Language Model for Interpreting Longitudinal Medical Records

Machine learning and information theory concepts towards an AI Mathematician

ARIA: An Open Multimodal Native Mixture-of-Experts Model

Gödel Agent: A self-referential agent framework for recursive self-improvement

Sample what you can't compress; image auto-encoders without GANs

Numerical Precision Affects Mathematical Reasoning Capabilities of LLMs

Were RNNs all we needed?

Brightness of the Qianfan Satellites

Route Planning in Transportation Networks (2015)

Skip Hash: A fast ordered map via software transactional memory

A novel channel contention mechanism for improving Wi-Fi's reliability

Meissonic, High-Resolution Text-to-Image Synthesis on consumer graphics cards

Understanding the Limitations of Mathematical Reasoning in LLMs

Efficient and Effective Model Extraction

Addition Is All You Need for Energy-Efficient Language Models

Decoding the Language of Othering by Russia-Ukraine War Bloggers

The Role of Anchor Tokens in Self-Attention Networks

Differential Transformer

Grokking at the edge of linear separability

Interpreting CLIP with Sparse Linear Concept Embeddings (SpLiCE)

Sorbet: A neuromorphic hardware-compatible transformer-based spiking model
