Top Stories

LLMs consistently pick resumes they generate over ones by humans or other models

175 points · arxiv.org

A new paper finds that when LLMs are used as resume screeners, they exhibit a strong self-preference bias — consistently ranking resumes they themselves generated above those written by humans or other models. As candidates increasingly use AI to polish their applications and recruiters increasingly use AI to filter them, this creates an awkward feedback loop where the “winning” resumes are the ones that match the screener’s own writing style. The HN thread is digging into whether this is true bias or just a stylistic prior, and what it means for fairness in automated hiring.


DeepSeek V4 – almost on the frontier, a fraction of the price

341 points · simonwillison.net

Simon Willison’s hands-on with DeepSeek V4 argues the model is now within striking distance of the closed frontier labs at a dramatically lower API cost. The takeaway HN is wrestling with: if open-weight Chinese models keep matching frontier capability at 10–20% of the price, the economics of US labs that train and serve premium models start to look very different. Lots of discussion about benchmark contamination, real-world coding performance, and what this means for the GPU-spend arms race.


Open Design: use your coding agent as a design engine

104 points · github.com/nexu-io

A new open-source project that flips the design-to-code workflow: instead of going from Figma to code, you let your coding agent be the design tool, generating components and iterating on them through prompts and live previews. It taps into a larger trend HN has been chewing on for months — that AI-native dev tools may eventually eat traditional design software entirely. Comments are split between “this is the future” and “designers exist for a reason.”


Why does it take so long to release black-fan versions?

531 points · noctua.at

Noctua, the cult-favorite PC fan maker, finally explains why black versions of its iconic brown-and-tan fans take years to ship. The answer is a deep dive into supply chain pain: getting black anti-vibration mounts that don’t squeak, ensuring color consistency across batches, and not breaking acoustic performance with new materials. HN loves it because it’s a master class in why “just paint it black” is never just “paint it black,” and why building quality hardware is genuinely hard.


TI-84 Evo

529 points · ti.com

Texas Instruments has refreshed the TI-84 line with the “Evo”: a faster CPU, a color screen, and (notably) USB-C. The HN thread is a glorious mix of nostalgia, outrage at the persistent ~$120 price tag, and confusion that an essentially 1996-era product is still mandatory in US classrooms in 2026. The 430-comment debate is really about why TI’s lock on classroom and standardized-test calculators hasn’t been broken by a phone app or a $20 alternative.


Ask.com has closed

382 points · ask.com

The search engine that started as Ask Jeeves in 1996 has shut down, ending a 30-year run that survived Google, the dot-com bust, and several pivots into Q&A. HN is using the moment to reflect on the strange afterlives of early-web brands and the broader collapse of the long tail of search — when LLMs answer questions directly, “search engines that aren’t Google” don’t have much of a niche left.


Uber wants to turn its drivers into a sensor grid for AV companies

16 points · techcrunch.com

Uber is reportedly pitching autonomous vehicle companies on a new data product: equip its millions of human drivers with cameras and sensors, then sell the resulting real-world driving data to AV teams hungry for edge cases. It’s a clever play — Uber gets a high-margin data business that doesn’t require it to win the AV race itself, and AV companies get scale they can’t match in-house. The HN debate is mostly about driver consent, passenger privacy, and whether this finally gives Uber a moat.


Show HN: Mljar Studio – local AI data analyst that saves analysis as notebooks

48 points · mljar.com

A new local-first AI data analyst tool that runs on your machine, talks to your data through natural language, and outputs everything as Jupyter notebooks rather than a black-box chat. The pitch resonates with HN’s growing skepticism of cloud-only AI tooling: you keep the data, you keep the artifacts, and you can audit and re-run every analysis. Good discussion in the comments about how this competes with Hex, Mode, and Claude/ChatGPT’s code interpreter modes.


Refusal in language models is mediated by a single direction

29 points · arxiv.org

An interpretability paper that’s getting a second wind: researchers find that refusal behavior in the open chat models they tested is mediated by a single direction in activation space, meaning you can ablate that direction with a small intervention and get versions of safety-tuned models that no longer refuse. It’s a striking result for both safety and security, and the HN comments are split on whether this means alignment is more brittle than people think, or simply that “refusal” is a thin behavioral layer rather than a reflection of deep model values.
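
For readers wondering what "ablating a single direction" looks like in practice, here is a minimal sketch. It assumes the direction is estimated as the difference between mean activations on harmful and harmless prompts, then projected out of an activation vector. The function names and toy three-dimensional vectors are illustrative assumptions, not the paper's code; a real intervention would operate on a model's internal activations rather than hand-written lists.

```python
import math


def mean_vector(rows):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def unit(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in v]


def refusal_direction(harmful_acts, harmless_acts):
    """Difference-of-means estimate of the direction (an assumed recipe)."""
    mu_harmful = mean_vector(harmful_acts)
    mu_harmless = mean_vector(harmless_acts)
    return unit([a - b for a, b in zip(mu_harmful, mu_harmless)])


def ablate(x, r_hat):
    """Remove the component of x along the unit vector r_hat."""
    coeff = sum(a * b for a, b in zip(x, r_hat))
    return [a - coeff * b for a, b in zip(x, r_hat)]


# Toy activations: the two groups differ only along the first axis.
harmful = [[2.0, 1.0, 0.0], [4.0, 1.0, 0.0]]
harmless = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]

r_hat = refusal_direction(harmful, harmless)
x = [5.0, 2.0, -1.0]
x_ablated = ablate(x, r_hat)

# After ablation, the activation has no component left along r_hat.
residual = sum(a * b for a, b in zip(x_ablated, r_hat))
```

The point of the sketch is how small the intervention is: one vector subtraction per activation, with everything orthogonal to the direction left untouched.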


New research suggests people can communicate and practice skills while dreaming

405 points · newyorker.com

A New Yorker piece on lucid dreaming research showing that people can answer questions, do simple math, and even practice motor skills while in REM sleep — with measurable carryover to waking performance. The HN thread is a mix of skepticism about effect sizes and excitement about the implications for learning, therapy, and (inevitably) productivity hacking. Whether or not it scales, it’s a fun reminder of how little we still understand about sleep.