Publication: Dialectics of Alignment: Harnessing Unsafe Knowledge for Dynamic Safety Routing. M. Hashemzadeh, Jerry Huang, Minseon Kim, Marc-Alexandre Côté, Sarath Chandar. arXiv | May 2026
Publication: Test-Time Learning with an Evolving Library. Weijia Xu, Alessandro Sordoni, Chandan Singh, Zelalem Gero, Michel Galley, Xingdi Yuan, Jianfeng Gao. arXiv | May 2026
Publication: Confident in a Confidence Score: Investigating the Sensitivity of Confidence Scores to Supervised Fine-Tuning. Lorenzo Jaime Flores, Cesare Spinoso di-Piano, Jackie Cheung. arXiv | April 2026
Publication: Trade-offs in Ensembling, Merging and Routing Among Parameter-Efficient Experts. Sanae Lotfi, Lucas Caccia, Alessandro Sordoni, Jordan Ash, Miroslav Dudík. arXiv | March 2026
Publication: Effect of Document Packing on the Latent Multi-Hop Reasoning Capabilities of Large Language Models. Gabriele Prato, Shagun Sodhani, Alessandro Sordoni, Sarath Chandar. arXiv | December 2025
Publication: Learning to Solve Complex Problems via Dataset Decomposition. Wanru Zhao, Lucas Caccia, Zhengyan Shi, Minseon Kim, Xingdi Yuan, Weijia Xu, Marc-Alexandre Côté, Alessandro Sordoni. NeurIPS 2025 | July 2025
Publication: Dehumanizing Machines: Mitigating Anthropomorphic Behaviors in Text Generation Systems. Myra Cheng, Su Lin Blodgett, Alicia DeVrio, Lisa Egede, Alexandra Olteanu. Meeting of the Association for Computational Linguistics | July 2025. SAC Highlight
Publication: Understanding and Meeting Practitioner Needs When Measuring Representational Harms Caused by LLM-Based Systems. Emma Harvey, Emily Sheng, Su Lin Blodgett, Alex Chouldechova, Jean Garcia-Gathright, Alexandra Olteanu, Hanna Wallach. Findings of the Association for Computational Linguistics: ACL 2025 | July 2025
Publication: Rigor in AI: Doing Rigorous AI Work Requires a Broader, Responsible AI-Informed Conception of Rigor. Alexandra Olteanu, Su Lin Blodgett, Agathe Balayn, Angelina Wang, Fernando Diaz, Flavio du Pin Calmon, Margaret Mitchell, Michael Ekstrand, Reuben Binns, Solon Barocas. NeurIPS 2025 | June 2025
Publication: Measuring Machine Learning Harms from Stereotypes: Requires Understanding Who is Being Harmed by Which Errors in What Ways. Angelina Wang, Xuechunzi Bai, Solon Barocas, Su Lin Blodgett. 2025 ACM Conference on Fairness, Accountability, and Transparency | June 2025