"Shallow safety alignment," a weakness in large language models, allows users to bypass guardrails and elicit instructions for malicious uses, such as hacking government databases and stealing from charities…

A Hidden Weakness

Changing Directions
