AI Research Brief
120B on One GPU, and 40% of Video Benchmarks Are Guessable
11 selected from 76 papers
Featured
ClawsBench: Evaluating Capability and Safety of LLM Productivity Agents in Simulated Workspaces
score 11
Institution: Apple; featured in HF Daily Papers; HF popularity: 16 upvotes (+3); code available
Video-MME-v2: Towards the Next Stage in Benchmarks for Comprehensive Video Understanding
score 10
Featured in HF Daily Papers; HF popularity: 201 upvotes (+4); code available; keywords (2): reasoning, leaderboard
Watch Before You Answer: Learning from Visually Grounded Post-Training
score 10
Featured in HF Daily Papers; HF popularity: 26 upvotes (+4); code available; keywords (4): post-training, reasoning, vision-language, data curation
MegaTrain: Full Precision Training of 100B+ Parameter Large Language Models on a Single GPU
score 10
Featured in HF Daily Papers; HF popularity: 25 upvotes (+4); code available; keywords (1): throughput
General Multimodal Protein Design Enables DNA-Encoding of Chemistry
score 10
Featured in HF Daily Papers; HF popularity: 21 upvotes (+4); code available; keywords (1): scaling
MedGemma 1.5 Technical Report
score 6
Featured in HF Daily Papers; HF popularity: 9 upvotes (+2); keywords (1): reasoning
Also Worth Noting
Improving Sparse Memory Finetuning
score 4
Institution: Carnegie Mellon; keywords (2): finetuning, open-source
PaperOrchestra: A Multi-Agent Framework for Automated AI Research Paper Writing
score 3
Institution: Yale
Memory Dial: A Training Framework for Controllable Memorization in Language Models
score 3
Accepted at top venue: ACL
Multilingual Language Models Encode Script Over Linguistic Structure
score 3
Accepted at top venue: ACL
XMark: Reliable Multi-Bit Watermarking for LLM-Generated Texts
score 3
Accepted at top venue: ACL