Scientific AI Hamprecht Lab, IWR, Heidelberg University

Journal Club: Advanced Machine Learning

The Research Seminar: Advanced Machine Learning is a weekly reading group that discusses current research papers and results in machine learning. The target audience is active researchers in the field (postdocs, PhD students, and advanced master's students) who want to discuss and stay up to date with recent developments.

Contact Peter Lippmann (peter.lippmann [at] iwr.uni-heidelberg.de) for further details.

Next Seminar: 18.11.2024 in INF 205, SR 4.300 starting at 11:00am
Paper to be discussed:

Does equivariance matter at scale?
Johann Brehmer, Sönke Behrends, Pim de Haan, Taco Cohen
https://arxiv.org/abs/2410.23179


Recently discussed papers:

11.11.2024
Let’s Verify Step by Step
Hunter Lightman, Vineet Kosaraju, Yura Burda et al.
https://arxiv.org/abs/2305.20050

04.11.2024
SELA: Tree-Search Enhanced LLM Agents for Automated Machine Learning
Yizhou Chi, Yizhang Lin, Sirui Hong et al.
https://arxiv.org/abs/2410.17238v1

28.10.2024
Homomorphism Counts for Graph Neural Networks: All About That Basis
Emily Jin, Michael Bronstein, İsmail İlkan Ceylan, Matthias Lanzinger
https://arxiv.org/abs/2402.08595

14.10.2024
Smooth, exact rotational symmetrization for deep learning on point clouds
Sergey Pozdnyakov and Michele Ceriotti
https://proceedings.neurips.cc/paper_files/paper/2023/hash/fb4a7e3522363907b26a86cc5be627ac-Abstract-Conference.html

07.10.2024
Position: Topological Deep Learning is the New Frontier for Relational Learning
Theodore Papamarkou, Tolga Birdal, Michael M. Bronstein et al.
https://openreview.net/forum?id=Nl3RG5XWAt

23.09.2024
Unifying O(3) equivariant neural networks design with tensor-network formalism
Zimu Li, Zihan Pengmei, Han Zheng, Erik Thiede, Junyu Liu, Risi Kondor
https://iopscience.iop.org/article/10.1088/2632-2153/ad4a04/pdf

16.09.2024
Memorization Through the Lens of Curvature of Loss Function Around Samples
Isha Garg, Deepak Ravikumar, Kaushik Roy
https://proceedings.mlr.press/v235/garg24a.html

09.09.2024
Weisfeiler-Leman at the margin: When more expressivity matters
Billy Joe Franks, Christopher Morris, Ameya Velingker, Floris Geerts
https://proceedings.mlr.press/v235/franks24a.html

02.09.2024
No “Zero-Shot” Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance
Vishaal Udandarao, Ameya Prabhu, Adhiraj Ghosh, Yash Sharma, Philip H.S. Torr, Adel Bibi, Samuel Albanie, Matthias Bethge
https://arxiv.org/abs/2404.04125

26.08.2024
The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning
Micah Goldblum, Marc Finzi, Keefer Rowan, Andrew Gordon Wilson
https://arxiv.org/abs/2304.05366

19.08.2024
Scaling rectified flow transformers for high-resolution image synthesis
P Esser, S Kulal, A Blattmann, R Entezari et al.
https://openreview.net/pdf?id=FPnUhsQJ5B

12.08.2024
No “Zero-Shot” Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance
Vishaal Udandarao, Ameya Prabhu, Adhiraj Ghosh, Yash Sharma, Philip H.S. Torr, Adel Bibi, Samuel Albanie, Matthias Bethge
https://arxiv.org/abs/2404.04125

05.08.2024
Discovering Symmetry Breaking in Physical Systems with Relaxed Group Convolution
R Wang, E Hofgard, H Gao, R Walters, T Smidt
https://openreview.net/pdf?id=59oXyDTLJv

30.07.2024
Tensor Frames – How To Make Any Message Passing Network Equivariant
Peter Lippmann, Gerrit Gerhartz, Roman Remme, Fred A. Hamprecht
https://arxiv.org/abs/2405.15389

15.07.2024
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Yixin Liu, Kai Zhang, Yuan Li et al.
https://arxiv.org/abs/2402.17177

08.07.2024
Accurate structure prediction of biomolecular interactions with AlphaFold 3
Josh Abramson, Jonas Adler, Jack Dunger et al.
https://www.nature.com/articles/s41586-024-07487-w

01.07.2024
A Hitchhiker’s Guide to Geometric GNNs for 3D Atomic Systems
Alexandre Duval, Simon V. Mathis, Chaitanya K. Joshi, Victor Schmidt, Santiago Miret, Fragkiskos D. Malliaros, Taco Cohen, Pietro Liò, Yoshua Bengio, Michael Bronstein
https://arxiv.org/abs/2312.07511

24.06.2024
Why do tree-based models still outperform deep learning on typical tabular data?
Leo Grinsztajn, Edouard Oyallon, Gael Varoquaux
https://proceedings.neurips.cc/paper_files/paper/2022/hash/0378c7692da36807bdec87ab043cdadc-Abstract-Datasets_and_Benchmarks.html

17.06.2024
The Free Energy Principle made simpler but not too simple
Karl Friston, Lancelot Da Costa, Noor Sajid, Conor Heins, Kai Ueltzhöffer, Grigorios A. Pavliotis, Thomas Parr
https://www.sciencedirect.com/science/article/pii/S037015732300203X

10.06.2024
Nonlocal Machine-Learned Exchange Functional for Molecules and Solids
Kyle Bystrom, Boris Kozinsky
https://arxiv.org/abs/2303.00682

03.06.2024
xLSTM: Extended Long Short-Term Memory
Maximilian Beck, Korbinian Pöppel, Markus Spanring et al.
https://arxiv.org/abs/2405.04517

28.05.2024
Highly accurate protein structure prediction with AlphaFold
John Jumper, Richard Evans, Alexander Pritzel et al.
https://www.nature.com/articles/s41586-021-03819-2

13.05.2024
KAN: Kolmogorov-Arnold Networks
Ziming Liu, Yixuan Wang, Sachin Vaidya, Fabian Ruehle, James Halverson, Marin Soljacic, Thomas Y. Hou, Max Tegmark
https://arxiv.org/abs/2404.19756

06.05.2024
Categorical Deep Learning: An Algebraic Theory of Architectures
Bruno Gavranović, Paul Lessard, Andrew Dudzik, Tamara von Glehn, João G. M. Araújo, Petar Veličković
https://arxiv.org/pdf/2402.15332

29.04.2024
A Practical Method for Constructing Equivariant Multilayer Perceptrons for Arbitrary Matrix Groups
Marc Finzi, Max Welling, Andrew Gordon Wilson
https://proceedings.mlr.press/v139/finzi21a.html

22.04.24
Position Paper: Bayesian Deep Learning in the Age of Large-Scale AI
Theodore Papamarkou, Maria Skoularidou, Konstantina Palla et al.
https://arxiv.org/abs/2402.00809

15.04.24
Self-Rewarding Language Models
Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho, Xian Li, Sainbayar Sukhbaatar, Jing Xu, Jason Weston
https://arxiv.org/abs/2401.10020

25.03.24
Learning in High Dimension Always Amounts to Extrapolation
Randall Balestriero, Jérôme Pesenti, and Yann LeCun
https://arxiv.org/abs/2110.09485

18.03.24
Deep Double Descent: Where Bigger Models and More Data Hurt
Preetum Nakkiran, Gal Kaplun, Yamini Bansal, Tristan Yang, Boaz Barak, Ilya Sutskever
https://arxiv.org/abs/1912.02292

11.03.24
Deep Networks Always Grok and Here is Why
Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk
https://arxiv.org/abs/2402.15555

04.03.24
Lie Point Symmetry and Physics-Informed Networks
Tara Akhound-Sadegh, Laurence Perreault-Levasseur, Johannes Brandstetter, Max Welling, Siamak Ravanbakhsh
https://proceedings.neurips.cc/paper_files/paper/2023/hash/8493c860bec41705f7743d5764301b94-Abstract-Conference.html

26.02.2024
Solving olympiad geometry without human demonstrations
Trieu H. Trinh, Yuhuai Wu, Quoc V. Le, He He & Thang Luong
https://www.nature.com/articles/s41586-023-06747-5

19.02.24
EquiformerV2: Improved Equivariant Transformer for Scaling to Higher-Degree Representations
Yi-Lun Liao, Brandon Wood, Abhishek Das, Tess Smidt
https://arxiv.org/abs/2306.12059

12.02.24
TensorNet: Cartesian Tensor Representations for Efficient Learning of Molecular Potentials
Guillem Simeon, Gianni de Fabritiis
https://arxiv.org/abs/2306.06482

05.02.24
A foundation model for atomistic materials chemistry
Ilyes Batatia, Philipp Benner, Yuan Chiang et al.
https://arxiv.org/abs/2401.00096

29.01.24
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
Evan Hubinger, Carson Denison, Jesse Mu, Mike Lambert et al.
https://arxiv.org/abs/2401.05566

22.01.24
Learning Distributions on Manifolds with Free-form Flows
Peter Sorrenson, Felix Draxler, Armand Rousselot, Sander Hummerich, Ullrich Köthe
https://arxiv.org/abs/2312.09852

13.12.23
Deep Nets Don’t Learn via Memorization
David Krueger, Nicolas Ballas, Stanislaw Jastrzebski, Devansh Arpit et al.
https://openreview.net/forum?id=rJv6ZgHYg

04.12.23
Progress measures for grokking via mechanistic interpretability
Neel Nanda, Lawrence Chan, Tom Lieberum, Jess Smith, Jacob Steinhardt
https://arxiv.org/abs/2301.05217

27.11.23
GraphCast: Learning skillful medium-range global weather forecasting
Remi Lam, Alvaro Sanchez-Gonzalez, Matthew Willson et al.
https://arxiv.org/abs/2212.12794

13.11.23
Transformers are efficient hierarchical chemical graph learners
Zihan Pengmei, Zimu Li, Chih-chan Tien, Risi Kondor, Aaron R. Dinner
https://arxiv.org/abs/2310.01704

06.11.23
Free-form Flows: Make Any Architecture a Normalizing Flow
Felix Draxler, Peter Sorrenson, Armand Rousselot, Lea Zimmermann, Ullrich Köthe
https://arxiv.org/abs/2310.16624

30.10.23
Nougat: Neural Optical Understanding for Academic Documents
Lukas Blecher, Guillem Cucurull, Thomas Scialom, Robert Stojnic
https://arxiv.org/abs/2308.13418

23.10.23
White-Box Transformers via Sparse Rate Reduction
Yaodong Yu, Sam Buchanan, Druv Pai, Tianzhe Chu, Ziyang Wu, Shengbang Tong, Benjamin D. Haeffele, Yi Ma
https://arxiv.org/abs/2306.01129

16.10.23
Emergence of Segmentation with Minimalistic White-Box Transformers
Yaodong Yu, Tianzhe Chu, Shengbang Tong, Ziyang Wu, Druv Pai, Sam Buchanan, Yi Ma
https://arxiv.org/abs/2308.16271

09.10.23
Large Language Models as Optimizers
Chengrun Yang, Xuezhi Wang, Yifeng Lu, Hanxiao Liu, Quoc V. Le, Denny Zhou, Xinyun Chen
https://arxiv.org/abs/2309.03409

02.10.23
PointLLM: Empowering Large Language Models to Understand Point Clouds
Runsen Xu, Xiaolong Wang, Tai Wang, Yilun Chen, Jiangmiao Pang, Dahua Lin
https://arxiv.org/abs/2308.16911

25.09.23
Loss of Plasticity in Deep Continual Learning
Shibhansh Dohare, J. Fernando Hernandez-Garcia, Parash Rahman, Richard S. Sutton, A. Rupam Mahmood
https://arxiv.org/abs/2306.13812

18.09.23
Tuning Computer Vision Models With Task Rewards
André Susano Pinto, Alexander Kolesnikov, Yuge Shi, Lucas Beyer, Xiaohua Zhai
https://openreview.net/forum?id=zzOooeAqtT

11.09.23
Deep Learning on Implicit Neural Representations of Shapes
Luca De Luigi, Adriano Cardace, Riccardo Spezialetti, Pierluigi Zama Ramirez, Samuele Salti, Luigi Di Stefano
https://arxiv.org/abs/2302.05438

28.08.23
Equivariant Diffusion for Molecule Generation in 3D
Emiel Hoogeboom, Victor Garcia Satorras, Clément Vignac, Max Welling
https://arxiv.org/abs/2203.17003

21.08.23
Fourier Neural Operator for Parametric Partial Differential Equations
Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, Anima Anandkumar
https://arxiv.org/abs/2010.08895

14.08.23
Equivariant Architectures for Learning in Deep Weight Spaces
Aviv Navon, Aviv Shamsian, Idan Achituve, Ethan Fetaya, Gal Chechik, Haggai Maron
https://openreview.net/forum?id=SCU1xlr9Y4

07.08.23
VectorAdam for Rotation Equivariant Geometry Optimization
Selena Ling, Nicholas Sharp, Alec Jacobson
https://openreview.net/forum?id=df1g_KeEjQ

31.07.23
Adding Conditional Control to Text-to-Image Diffusion Models
Lvmin Zhang and Maneesh Agrawala
https://arxiv.org/pdf/2302.05543.pdf

24.07.23
DeepSea: An efficient deep learning model for single-cell segmentation and tracking of time-lapse microscopy images
Abolfazl Zargari et al.
https://www.biorxiv.org/content/10.1101/2021.03.10.434806v2.abstract

17.07.23
Track Anything: Segment Anything Meets Videos
Jinyu Yang, Mingqi Gao, Zhe Li, Shang Gao, Fangjing Wang, Feng Zheng
https://arxiv.org/abs/2304.11968

10.07.23
Trans-Dimensional Generative Modeling via Jump Diffusion Models
Andrew Campbell, William Harvey, Christian Weilbach, Valentin De Bortoli, Tom Rainforth, Arnaud Doucet
https://arxiv.org/abs/2305.16261

03.07.23
Scaling Transformer to 1M tokens and beyond with RMT
Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev
https://arxiv.org/abs/2304.11062

26.06.23
Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold
Xingang Pan, Ayush Tewari, Thomas Leimkühler, Lingjie Liu, Abhimitra Meka, Christian Theobalt
https://arxiv.org/abs/2305.10973

19.06.23
Seeing is Believing: Brain-Inspired Modular Training for Mechanistic Interpretability
Ziming Liu, Eric Gan, Max Tegmark
https://arxiv.org/abs/2305.08746

12.06.23
Supervised Training of Conditional Monge Maps
Charlotte Bunne, Andreas Krause, Marco Cuturi
https://arxiv.org/abs/2206.14262

05.06.23
Causal Reasoning and Large Language Models: Opening a New Frontier for Causality
Emre Kıcıman, Robert Ness, Amit Sharma, Chenhao Tan
https://arxiv.org/abs/2305.00050

22.05.23
How Attentive are Graph Attention Networks?
Shaked Brody, Uri Alon, Eran Yahav
https://arxiv.org/abs/2105.14491

15.05.23
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan et al.
https://arxiv.org/abs/2303.12712

08.05.23
Discrete Variational Autoencoders
Jason Tyler Rolfe
https://arxiv.org/abs/1609.02200

24.04.23
Is the Performance of My Deep Network Too Good to Be True? A Direct Approach to Estimating the Bayes Error in Binary Classification
Takashi Ishida, Ikko Yamane, Nontawat Charoenphakdee, Gang Niu, Masashi Sugiyama
https://openreview.net/forum?id=FZdJQgy05rz

17.04.23
Image as Set of Points
Xu Ma, Yuqian Zhou, Huan Wang, Can Qin, Bin Sun, Chang Liu, Yun Fu
https://openreview.net/forum?id=awnvqZja69

03.04.23
Advancing mathematics by guiding human intuition with AI
Davies, A., Veličković, P., Buesing, L., Blackwell, S., Zheng, D., Tomašev, N., … & Kohli, P
https://www.nature.com/articles/s41586-021-04086-x

27.03.23
DreamFusion: Text-to-3D using 2D Diffusion
Ben Poole, Ajay Jain, Jonathan T. Barron, Ben Mildenhall
https://openreview.net/forum?id=FjNys5c7VyY

20.03.23
Flow Matching for Generative Modeling
Yaron Lipman, Ricky T. Q. Chen, Heli Ben-Hamu, Maximilian Nickel, Matt Le
https://arxiv.org/pdf/2210.02747.pdf