Journal Club: Advanced Machine Learning

The Research Seminar: Advanced Machine Learning is a weekly reading group focused on discussing current research papers and results in machine learning. The target audience is active researchers in the field (postdocs, PhD students, and advanced master's students) who want to discuss and stay up to date with recent developments.

Contact Peter Lippmann (peter.lippmann [at] iwr.uni-heidelberg.de) for further details.

Next Seminar: 29.04.2024 in INF 205, SR 4.300 starting at 1:00pm
Paper to be discussed:

A Practical Method for Constructing Equivariant Multilayer Perceptrons for Arbitrary Matrix Groups
Marc Finzi, Max Welling, Andrew Gordon Wilson
https://proceedings.mlr.press/v139/finzi21a.html

Recently discussed papers:

22.04.24
Position Paper: Bayesian Deep Learning in the Age of Large-Scale AI
Theodore Papamarkou, Maria Skoularidou, Konstantina Palla et al.
https://arxiv.org/abs/2402.00809

15.04.24
Self-Rewarding Language Models
Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho, Xian Li, Sainbayar Sukhbaatar, Jing Xu, Jason Weston
https://arxiv.org/abs/2401.10020

25.03.24
Learning in High Dimension Always Amounts to Extrapolation
Randall Balestriero, Jérôme Pesenti, and Yann LeCun
https://arxiv.org/abs/2110.09485

18.03.24
Deep Double Descent: Where Bigger Models and More Data Hurt
Preetum Nakkiran, Gal Kaplun, Yamini Bansal, Tristan Yang, Boaz Barak, Ilya Sutskever
https://arxiv.org/abs/1912.02292

11.03.24
Deep Networks Always Grok and Here is Why
Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk
https://arxiv.org/abs/2402.15555

04.03.24
Lie Point Symmetry and Physics-Informed Networks
Tara Akhound-Sadegh, Laurence Perreault-Levasseur, Johannes Brandstetter, Max Welling, Siamak Ravanbakhsh
https://proceedings.neurips.cc/paper_files/paper/2023/hash/8493c860bec41705f7743d5764301b94-Abstract-Conference.html

26.02.24
Solving olympiad geometry without human demonstrations
Trieu H. Trinh, Yuhuai Wu, Quoc V. Le, He He & Thang Luong
https://www.nature.com/articles/s41586-023-06747-5

19.02.24
EquiformerV2: Improved Equivariant Transformer for Scaling to Higher-Degree Representations
Yi-Lun Liao, Brandon Wood, Abhishek Das, Tess Smidt
https://arxiv.org/abs/2306.12059

12.02.24
TensorNet: Cartesian Tensor Representations for Efficient Learning of Molecular Potentials
Guillem Simeon, Gianni de Fabritiis
https://arxiv.org/abs/2306.06482

05.02.24
A foundation model for atomistic materials chemistry
Ilyes Batatia, Philipp Benner, Yuan Chiang et al.
https://arxiv.org/abs/2401.00096

29.01.24
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
Evan Hubinger, Carson Denison, Jesse Mu, Mike Lambert et al.
https://arxiv.org/abs/2401.05566

22.01.24
Learning Distributions on Manifolds with Free-form Flows
Peter Sorrenson, Felix Draxler, Armand Rousselot, Sander Hummerich, Ullrich Köthe
https://arxiv.org/abs/2312.09852

13.12.23
Deep Nets Don’t Learn via Memorization
David Krueger, Nicolas Ballas, Stanislaw Jastrzebski, Devansh Arpit et al.
https://openreview.net/forum?id=rJv6ZgHYg

04.12.23
Progress measures for grokking via mechanistic interpretability
Neel Nanda, Lawrence Chan, Tom Lieberum, Jess Smith, Jacob Steinhardt
https://arxiv.org/abs/2301.05217

27.11.23
GraphCast: Learning skillful medium-range global weather forecasting
Remi Lam, Alvaro Sanchez-Gonzalez, Matthew Willson et al.
https://arxiv.org/abs/2212.12794

13.11.23
Transformers are efficient hierarchical chemical graph learners
Zihan Pengmei, Zimu Li, Chih-chan Tien, Risi Kondor, Aaron R. Dinner
https://arxiv.org/abs/2310.01704

06.11.23
Free-form Flows: Make Any Architecture a Normalizing Flow
Felix Draxler, Peter Sorrenson, Armand Rousselot, Lea Zimmermann, Ullrich Köthe
https://arxiv.org/abs/2310.16624

30.10.23
Nougat: Neural Optical Understanding for Academic Documents
Lukas Blecher, Guillem Cucurull, Thomas Scialom, Robert Stojnic
https://arxiv.org/abs/2308.13418

23.10.23
White-Box Transformers via Sparse Rate Reduction
Yaodong Yu, Sam Buchanan, Druv Pai, Tianzhe Chu, Ziyang Wu, Shengbang Tong, Benjamin D. Haeffele, Yi Ma
https://arxiv.org/abs/2306.01129

16.10.23
Emergence of Segmentation with Minimalistic White-Box Transformers
Yaodong Yu, Tianzhe Chu, Shengbang Tong, Ziyang Wu, Druv Pai, Sam Buchanan, Yi Ma
https://arxiv.org/abs/2308.16271

09.10.23
Large Language Models as Optimizers
Chengrun Yang, Xuezhi Wang, Yifeng Lu, Hanxiao Liu, Quoc V. Le, Denny Zhou, Xinyun Chen
https://arxiv.org/abs/2309.03409

02.10.23
PointLLM: Empowering Large Language Models to Understand Point Clouds
Runsen Xu, Xiaolong Wang, Tai Wang, Yilun Chen, Jiangmiao Pang, Dahua Lin
https://arxiv.org/abs/2308.16911

25.09.23
Loss of Plasticity in Deep Continual Learning
Shibhansh Dohare, J. Fernando Hernandez-Garcia, Parash Rahman, Richard S. Sutton, A. Rupam Mahmood
https://arxiv.org/abs/2306.13812

18.09.23
Tuning Computer Vision Models With Task Rewards
André Susano Pinto, Alexander Kolesnikov, Yuge Shi, Lucas Beyer, Xiaohua Zhai
https://openreview.net/forum?id=zzOooeAqtT

11.09.23
Deep Learning on Implicit Neural Representations of Shapes
Luca De Luigi, Adriano Cardace, Riccardo Spezialetti, Pierluigi Zama Ramirez, Samuele Salti, Luigi Di Stefano
https://arxiv.org/abs/2302.05438

28.08.23
Equivariant Diffusion for Molecule Generation in 3D
Emiel Hoogeboom, Victor Garcia Satorras, Clément Vignac, Max Welling
https://arxiv.org/abs/2203.17003

21.08.23
Fourier Neural Operator for Parametric Partial Differential Equations
Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, Anima Anandkumar
https://arxiv.org/abs/2010.08895

14.08.23
Equivariant Architectures for Learning in Deep Weight Spaces
Aviv Navon, Aviv Shamsian, Idan Achituve, Ethan Fetaya, Gal Chechik, Haggai Maron
https://openreview.net/forum?id=SCU1xlr9Y4

07.08.23
VectorAdam for Rotation Equivariant Geometry Optimization
Selena Ling, Nicholas Sharp, Alec Jacobson
https://openreview.net/forum?id=df1g_KeEjQ

31.07.23
Adding Conditional Control to Text-to-Image Diffusion Models
Lvmin Zhang and Maneesh Agrawala
https://arxiv.org/pdf/2302.05543.pdf

24.07.23
DeepSea: An efficient deep learning model for single-cell segmentation and tracking of time-lapse microscopy images
Abolfazl Zargari et al.
https://www.biorxiv.org/content/10.1101/2021.03.10.434806v2.abstract

17.07.23
Track Anything: Segment Anything Meets Videos
Jinyu Yang, Mingqi Gao, Zhe Li, Shang Gao, Fangjing Wang, Feng Zheng
https://arxiv.org/abs/2304.11968

10.07.23
Trans-Dimensional Generative Modeling via Jump Diffusion Models
Andrew Campbell, William Harvey, Christian Weilbach, Valentin De Bortoli, Tom Rainforth, Arnaud Doucet
https://arxiv.org/abs/2305.16261

03.07.23
Scaling Transformer to 1M tokens and beyond with RMT
Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev
https://arxiv.org/abs/2304.11062

26.06.23
Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold
Xingang Pan, Ayush Tewari, Thomas Leimkühler, Lingjie Liu, Abhimitra Meka, Christian Theobalt
https://arxiv.org/abs/2305.10973

19.06.23
Seeing is Believing: Brain-Inspired Modular Training for Mechanistic Interpretability
Ziming Liu, Eric Gan, Max Tegmark
https://arxiv.org/abs/2305.08746

12.06.23
Supervised Training of Conditional Monge Maps
Charlotte Bunne, Andreas Krause, Marco Cuturi
https://arxiv.org/abs/2206.14262

05.06.23
Causal Reasoning and Large Language Models: Opening a New Frontier for Causality
Emre Kıcıman, Robert Ness, Amit Sharma, Chenhao Tan
https://arxiv.org/abs/2305.00050

22.05.23
How Attentive are Graph Attention Networks?
Shaked Brody, Uri Alon, Eran Yahav
https://arxiv.org/abs/2105.14491

15.05.23
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan et al.
https://arxiv.org/abs/2303.12712

08.05.23
Discrete Variational Autoencoders
Jason Tyler Rolfe
https://arxiv.org/abs/1609.02200

24.04.23
Is the Performance of My Deep Network Too Good to Be True? A Direct Approach to Estimating the Bayes Error in Binary Classification
Takashi Ishida, Ikko Yamane, Nontawat Charoenphakdee, Gang Niu, Masashi Sugiyama
https://openreview.net/forum?id=FZdJQgy05rz

17.04.23
Image as Set of Points
Xu Ma, Yuqian Zhou, Huan Wang, Can Qin, Bin Sun, Chang Liu, Yun Fu
https://openreview.net/forum?id=awnvqZja69

03.04.23
Advancing mathematics by guiding human intuition with AI
Davies, A., Veličković, P., Buesing, L., Blackwell, S., Zheng, D., Tomašev, N., … & Kohli, P
https://www.nature.com/articles/s41586-021-04086-x

27.03.23
DreamFusion: Text-to-3D using 2D Diffusion
Ben Poole, Ajay Jain, Jonathan T. Barron, Ben Mildenhall
https://openreview.net/forum?id=FjNys5c7VyY

20.03.23
Flow Matching for Generative Modeling
Yaron Lipman, Ricky T. Q. Chen, Heli Ben-Hamu, Maximilian Nickel, Matt Le
https://arxiv.org/pdf/2210.02747.pdf