2026-05-26

Decoding the Critique Mechanism in Large Reasoning Models 83

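Tags: Reasoning models, Test-time scaling, Representation engineering
Source: arXiv Machine Learning | Read the original

[Summary]
This study shows that large reasoning models possess an "implicit critique ability" and identifies an interpretable "critique vector". Steering along this vector in latent space markedly improves the model's error correction and test-time scaling performance without adding training cost.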

Fast-dDrive: Efficient Block-Diffusion VLM for Autonomous Driving 83

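Tags: Autonomous driving, VLA models, Inference optimization
Source: arXiv Computation and Language | Read the original

[Summary]
Fast-dDrive proposes a block-diffusion VLA model for autonomous driving. Using structural-scaffold speculative decoding and low-overhead test-time scaling, it achieves a 12x throughput gain while maintaining SOTA planning accuracy, moving VLA models closer to real-time in-vehicle deployment.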

OPPO: Bayesian Value Recursion for Token-Level Credit Assignment in LLM Reasoning 83

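Tags: Reinforcement learning, LLM reasoning, Credit assignment, GRPO
Source: arXiv Artificial Intelligence | Read the original

[Summary]
OPPO proposes a Bayesian value recursion method to address token-level credit assignment in GRPO-style algorithms for LLM reasoning. It estimates the probability of success at every step without a value network and significantly outperforms GRPO on math and code benchmarks.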

ChainFlow-VLA: Causal Flow Planning with Vision-Language Models 83

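Tags: Autonomous driving, Multimodal LLMs, Trajectory planning
Source: arXiv Artificial Intelligence | Read the original

[Summary]
ChainFlow-VLA unifies autoregressive causal generation with diffusion-based global optimization and uses VLM semantic priors for trajectory refinement. It scores 94.85 on the NAVSIM v1 autonomous driving benchmark, a SOTA result that reaches human level.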

When Do LLMs Reason? A Dynamical Systems View via Entropy Phase Transitions 83

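Tags: LLMs, Inference optimization, Chain-of-thought
Source: arXiv Artificial Intelligence | Read the original

[Summary]
This study examines LLM reasoning from a dynamical systems perspective and finds that early decoding entropy dynamics predict whether CoT will be effective. Building on this, it proposes EDRM, a training-free routing framework that adaptively selects the reasoning strategy, improving accuracy while cutting token consumption by 41-55%.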

Sparse Autoencoders Map Brain-LLM Alignment onto Cortical Semantic Topography 83

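Tags: Interpretability, Sparse autoencoders, Cognitive neuroscience, LLMs
Source: arXiv Artificial Intelligence | Read the original

[Summary]
This study uses sparse autoencoders (SAEs) to decompose internal LLM features, giving the first mechanistic account of the alignment between LLMs and human brain responses to language. It shows that SAE features map precisely onto the semantic topography of the cortex.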

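Apple reportedly rebuilding next-generation Siri with a custom 1.2T-parameter Google model 82

Tags: LLMs, Intelligent assistants, On-device AI
Source: AI HOT Picks | Read the original

[Summary]
Apple is reportedly using a custom 1.2T-parameter Google model to rebuild the next-generation Siri and power its core features, with simple queries handled locally on the device. The move shows how major players are approaching device-cloud collaboration and the deployment of very large models.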

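TrapDoor supply-chain attack: AI assistants become a new attack surface 82

Tags: AI security, AI agents, Supply-chain security
Source: AI HOT Picks | Read the original

[Summary]
A new supply-chain attack called "TrapDoor" uses malicious configuration files to induce AI assistants such as Claude Code and Cursor to execute malicious commands. This is the first security incident in which AI assistants served as the attack surface.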

Distill to Think, Foresee to Act: Cognitive-Physical Reinforcement Learning for Autonomous Driving 82

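Tags: Autonomous driving, Reinforcement learning, World models, Knowledge distillation
Source: arXiv Machine Learning | Read the original

[Summary]
Proposes CoPhy, a reinforcement learning framework for autonomous driving. It distills VLM knowledge into a BEV encoder to retain cognitive ability, builds an autoregressive BEV world model to predict the future, and optimizes the policy with GRPO and a dual reward, reaching SOTA on the NAVSIM benchmark.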

DCC: Data-Centric Compilation of Machine Learning Kernels for Processing-In-Memory Architectures 82

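Tags: Compiler optimization, Processing-in-memory, Inference acceleration
Source: arXiv Machine Learning | Read the original

[Summary]
The researchers introduce DCC, the first data-centric ML compiler for processing-in-memory (PIM) architectures. By co-optimizing data rearrangement and compute code, it speeds up inference for large models such as LLaMA-2 by 4.52x on average relative to GPUs. It is now open source.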

AGZO: Activation-Guided Zeroth-Order Optimization for LLM Fine-Tuning 82

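Tags: LLMs, Fine-tuning optimization, Zeroth-order optimization
Source: arXiv Machine Learning | Read the original

[Summary]
Proposes AGZO, a zeroth-order optimization algorithm that dynamically extracts an activation-guided low-rank subspace during the forward pass and restricts perturbations to it. This significantly improves memory-constrained LLM fine-tuning and narrows the gap to first-order fine-tuning.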

Energy-Guided Generative Modeling for Low-Energy Molecular Structure Discovery 82

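Tags: AI4S, Generative models, Molecular conformations
Source: arXiv Machine Learning | Read the original

[Summary]
Proposes EnFlow, the first energy-guided generative framework, which couples flow-based conformer generation with explicit energy-landscape modeling. On the GEOM dataset it generates low-energy molecular conformations efficiently with only 1-2 sampling steps and accurately identifies ground states.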

Mechanistic Interpretability of EEG Foundation Models via Sparse Autoencoders 82

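Tags: Interpretability, Sparse autoencoders, Medical AI, Foundation models
Source: arXiv Machine Learning | Read the original

[Summary]
This study applies sparse autoencoders to three EEG foundation models and extracts sparse feature dictionaries. Through concept control and spectral decoding it enables physiologically interpretable interventions on clinical features, improving the interpretability and trustworthiness of medical AI.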

SCRIPT: Scalable Diffusion Policy with Multi-stage Training for Language-driven Physics-Based Humanoid Control 82

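Tags: Embodied AI, Robotics, Diffusion models, Reinforcement learning
Source: arXiv Machine Learning | Read the original

[Summary]
Proposes SCRIPT, a framework that combines a joint action-state-text diffusion Transformer with multi-stage training, significantly improving control accuracy and motion quality for language-driven, physics-based humanoid control.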

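How Hard is it to Rig a Benchmark? A Social Choice Analysis of Leaderboard Robustness 82

Tags: Benchmarks, LLMs, Robustness analysis
Source: arXiv Machine Learning | Read the original

[Summary]
This study treats leaderboard gaming by LLMs as election manipulation and uses social choice theory to analyze the robustness of benchmark leaderboards. Empirically, mean win rate proves to be the evaluation metric that is hardest to manipulate.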
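
For readers unfamiliar with the metric, here is a minimal Python sketch of one common way to compute a mean win rate: for each model, the fraction of (other model, benchmark) pairs on which it scores strictly higher. The digest does not give the paper's exact definition, so the function name `mean_win_rate`, the tie handling, and the toy scores are all assumptions made for illustration.

```python
def mean_win_rate(scores):
    """Return {model: win rate} for scores given as {model: [score per benchmark]}.

    Assumptions (not taken from the paper): higher scores are better, every
    model has a score on every benchmark, and a tie does not count as a win.
    """
    models = list(scores)
    n_benchmarks = len(scores[models[0]])
    rates = {}
    for model in models:
        wins = 0
        comparisons = 0
        for other in models:
            if other == model:
                continue
            # Head-to-head comparison on each benchmark.
            for b in range(n_benchmarks):
                comparisons += 1
                if scores[model][b] > scores[other][b]:
                    wins += 1
        rates[model] = wins / comparisons
    return rates


# Hypothetical leaderboard: three models, three benchmarks.
toy_scores = {
    "A": [0.90, 0.50, 0.70],
    "B": [0.80, 0.60, 0.60],
    "C": [0.70, 0.40, 0.65],
}
rates = mean_win_rate(toy_scores)
ranking = sorted(rates, key=rates.get, reverse=True)
```

On these toy scores, model A wins 5 of its 6 pairwise comparisons, B wins 3, and C wins 1, so the ranking is A, B, C. Because the metric depends only on pairwise orderings and not on score margins, inflating a single benchmark score by a large amount changes a model's rate only through the comparisons it flips.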

Unextractable Protocol Models: Collaborative Training and Inference without Weight Materialization 82

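Tags: Decentralized training, Model security, Collaborative inference, LLMs
Source: arXiv Machine Learning | Read the original

[Summary]
Proposes Unextractable Protocol Models (UPM), which introduce time-varying random invertible transformations into decentralized collaborative training and inference so that participants cannot extract the full model weights. This protects model assets in decentralized AI at very low overhead.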

The physics of AI weather models 82

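Tags: Weather foundation models, Interpretability, Representation learning
Source: arXiv Machine Learning | Read the original

[Summary]
This study investigates the physical mechanisms of AI weather models. Analyzing GraphCast and Aurora, it finds that models with different architectures represent the atmosphere in similar ways, and proposes that the models simulate the motion of atmospheric particles by following a gradient flow in a high-dimensional latent space.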

Contrastive Distribution Matching for Amortized Sequential Monte Carlo in Discrete Diffusion 82

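Tags: Diffusion models, Inference optimization, LLM alignment, Bioinformatics
Source: arXiv Machine Learning | Read the original

[Summary]
To address the low sampling efficiency of discrete diffusion models, proposes the Contrastive Distribution Matching (CDM) framework, which learns a parameterized twist function to amortize the cost of SMC inference. At very low compute overhead it significantly improves text generation, biological sequence design, and diffusion-LLM alignment.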

FuRA: Full-Rank Parameter-Efficient Fine-Tuning with Spectral Preconditioning 82

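Tags: Parameter-efficient fine-tuning, LLMs, Model training
Source: arXiv Machine Learning | Read the original

[Summary]
Proposes FuRA, a full-rank parameter-efficient fine-tuning method based on spectral preconditioning and block tensor-train decomposition. With memory use and speed comparable to LoRA, it outperforms both Full FT and LoRA on LLM and VLM fine-tuning tasks.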

Convex Optimization for Alignment and Preference Learning on a Single GPU 82

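Tags: LLMs, Alignment, Fine-tuning optimization
Source: arXiv Machine Learning | Read the original

[Summary]
The paper proposes COALA, the first algorithm to apply convex optimization to LLM preference fine-tuning. It needs no reference model, runs efficiently on a single GPU, and uses only 17.6% of DPO's compute, substantially reducing the memory and time cost of alignment training.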
