Paper sources | 14B video model at 19.5 FPS on a single GPU

Top picks

$V_1$: Unifying Generation and Self-Verification for Parallel Reasoners score 12
Institution: Berkeley; featured in HF Daily Papers; HF popularity: 12 upvotes (+3); code available; keywords (3): scaling, code generation, reasoning
CubeComposer: Spatio-Temporal Autoregressive 4K 360° Video Generation from Perspective Video score 11
Featured in HF Daily Papers; HF popularity: 11 upvotes (+3); code available; top-venue acceptance: CVPR
Helios: Real Real-Time Long Video Generation Model score 10
Featured in HF Daily Papers; HF popularity: 132 upvotes (+4); code available; keywords (2): quantization, real-time
T2S-Bench & Structure-of-Thought: Benchmarking and Prompting Comprehensive Text-to-Structure Reasoning score 10
Featured in HF Daily Papers; HF popularity: 105 upvotes (+4); code available; keywords (2): fine-tuning, reasoning
MOOSE-Star: Unlocking Tractable Training for Scientific Discovery by Breaking the Complexity Barrier score 13
Institution: Cornell; featured in HF Daily Papers; HF popularity: 75 upvotes (+4); code available; keywords (2): scaling, reasoning
ArtHOI: Articulated Human-Object Interaction Synthesis by 4D Reconstruction from Video Priors score 9
Featured in HF Daily Papers; HF popularity: 19 upvotes (+3); code available; keywords (1): reasoning
Phi-4-reasoning-vision-15B Technical Report score 9
Featured in HF Daily Papers; HF popularity: 15 upvotes (+3); code available; keywords (2): reasoning, data curation
InfinityStory: Unlimited Video Generation with World Consistency and Character-Aware Shot Transitions score 8
Institution: Allen Institute; featured in HF Daily Papers; HF popularity: 7 upvotes (+2)
Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory score 7
Featured in HF Daily Papers; HF popularity: 11 upvotes (+3); keywords (2): scaling, reasoning
RIVER: A Real-Time Interaction Benchmark for Video LLMs score 7
Featured in HF Daily Papers; HF popularity: 4 upvotes (+1); code available; keywords (1): real-time

Also worth noting

MEM: Multi-Scale Embodied Memory for Vision Language Action Models score 4
Institution: MIT; keywords (1): embodied
NuMuon: Nuclear-Norm-Constrained Muon for Compressible LLM Training score 4
Institution: Amazon; keywords (3): compression, deployment, pretraining
Parallax to Align Them All: An OmniParallax Attention Mechanism for Distributed Multi-View Image Compression score 4
Keywords (2): compression, coding; top-venue acceptance: CVPR
A Rubric-Supervised Critic from Sparse Real-World Outcomes score 4
Institution: Carnegie Mellon; keywords (3): scaling, coding, data curation
UniRain: Unified Image Deraining with RAG-based Dataset Distillation and Multi-objective Reweighted Optimization score 4
Keywords (3): distillation, MoE, RAG; top-venue acceptance: CVPR
Discriminative Perception via Anchored Description for Reasoning Segmentation score 4
Keywords (1): reasoning; top-venue acceptance: CVPR
EgoPoseFormer v2: Accurate Egocentric Human Motion Estimation for AR/VR score 4
Keywords (2): distillation, latency; top-venue acceptance: CVPR
When Sensors Fail: Temporal Sequence Models for Robust PPO under Sensor Drift score 4
Keywords (3): PPO, state space, reasoning; top-venue acceptance: ICLR
MAGE: Meta-Reinforcement Learning for Language Agents toward Strategic Exploration and Exploitation score 3
Institution: Tsinghua
QD-PCQA: Quality-Aware Domain Adaptation for Point Cloud Quality Assessment score 3
Top-venue acceptance: CVPR
TAP: A Token-Adaptive Predictor Framework for Training-Free Diffusion Acceleration score 3
Top-venue acceptance: CVPR
Fine-grained Image Aesthetic Assessment: Learning Discriminative Scores from Relative Ranks score 3
Top-venue acceptance: CVPR
BD-Merging: Bias-Aware Dynamic Model Merging with Evidence-Guided Contrastive Learning score 3
Top-venue acceptance: CVPR
STEM Faculty Perspectives on Generative AI in Higher Education score 3
Top-venue acceptance: AAAI
NOVA3R: Non-pixel-aligned Visual Transformer for Amodal 3D Reconstruction score 3
Top-venue acceptance: ICLR