Journal Club: Advanced Machine Learning
The Research Seminar: Advanced Machine Learning is a weekly reading group that discusses current research papers and results in machine learning. It is aimed at active researchers in the field (postdocs, PhD students, and advanced master's students) who want to discuss recent developments and stay up to date.
Contact Peter Lippmann (peter.lippmann [at] iwr.uni-heidelberg.de) for further details.
Next Seminar: 18.11.2024 in INF 205, SR 4.300 starting at 11:00am
Paper to be discussed:
Does equivariance matter at scale?
Johann Brehmer, Sönke Behrends, Pim de Haan, Taco Cohen
https://arxiv.org/abs/2410.23179
Recently discussed papers:
11.11.2024
Let’s Verify Step by Step
Hunter Lightman, Vineet Kosaraju, Yura Burda et al.
https://arxiv.org/abs/2305.20050
04.11.2024
SELA: Tree-Search Enhanced LLM Agents for Automated Machine Learning
Yizhou Chi, Yizhang Lin, Sirui Hong et al.
https://arxiv.org/abs/2410.17238v1
28.10.2024
Homomorphism Counts for Graph Neural Networks: All About That Basis
Emily Jin, Michael Bronstein, İsmail İlkan Ceylan, Matthias Lanzinger
https://arxiv.org/abs/2402.08595
14.10.2024
Smooth, exact rotational symmetrization for deep learning on point clouds
Sergey Pozdnyakov and Michele Ceriotti
https://proceedings.neurips.cc/paper_files/paper/2023/hash/fb4a7e3522363907b26a86cc5be627ac-Abstract-Conference.html
07.10.2024
Position: Topological Deep Learning is the New Frontier for Relational Learning
Theodore Papamarkou, Tolga Birdal, Michael M. Bronstein et al.
https://openreview.net/forum?id=Nl3RG5XWAt
23.09.2024
Unifying O(3) equivariant neural networks design with tensor-network formalism
Zimu Li, Zihan Pengmei, Han Zheng, Erik Thiede, Junyu Liu, Risi Kondor
https://iopscience.iop.org/article/10.1088/2632-2153/ad4a04/pdf
16.09.2024
Memorization Through the Lens of Curvature of Loss Function Around Samples
Isha Garg, Deepak Ravikumar, Kaushik Roy
https://proceedings.mlr.press/v235/garg24a.html
09.09.2024
Weisfeiler-Leman at the margin: When more expressivity matters
Billy Joe Franks, Christopher Morris, Ameya Velingker, Floris Geerts
https://proceedings.mlr.press/v235/franks24a.html
02.09.2024
No “Zero-Shot” Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance
Vishaal Udandarao, Ameya Prabhu, Adhiraj Ghosh, Yash Sharma, Philip H.S. Torr, Adel Bibi, Samuel Albanie, Matthias Bethge
https://arxiv.org/abs/2404.04125
26.08.2024
The No Free Lunch Theorem, Kolmogorov Complexity, and the Role of Inductive Biases in Machine Learning
Micah Goldblum, Marc Finzi, Keefer Rowan, Andrew Gordon Wilson
https://arxiv.org/abs/2304.05366
19.08.2024
Scaling rectified flow transformers for high-resolution image synthesis
P Esser, S Kulal, A Blattmann, R Entezari et al.
https://openreview.net/pdf?id=FPnUhsQJ5B
12.08.2024
No “Zero-Shot” Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance
Vishaal Udandarao, Ameya Prabhu, Adhiraj Ghosh, Yash Sharma, Philip H.S. Torr, Adel Bibi, Samuel Albanie, Matthias Bethge
https://arxiv.org/abs/2404.04125
05.08.2024
Discovering Symmetry Breaking in Physical Systems with Relaxed Group Convolution
R Wang, E Hofgard, H Gao, R Walters, T Smidt
https://openreview.net/pdf?id=59oXyDTLJv
30.07.2024
Tensor Frames – How To Make Any Message Passing Network Equivariant
Peter Lippmann, Gerrit Gerhartz, Roman Remme, Fred A. Hamprecht
https://arxiv.org/abs/2405.15389
15.07.2024
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Yixin Liu, Kai Zhang, Yuan Li et al.
https://arxiv.org/abs/2402.17177
08.07.2024
Accurate structure prediction of biomolecular interactions with AlphaFold 3
Josh Abramson, Jonas Adler, Jack Dunger et al.
https://www.nature.com/articles/s41586-024-07487-w
01.07.2024
A Hitchhiker’s Guide to Geometric GNNs for 3D Atomic Systems
Alexandre Duval, Simon V. Mathis, Chaitanya K. Joshi, Victor Schmidt, Santiago Miret, Fragkiskos D. Malliaros, Taco Cohen, Pietro Liò, Yoshua Bengio, Michael Bronstein
https://arxiv.org/abs/2312.07511
24.06.2024
Why do tree-based models still outperform deep learning on typical tabular data?
Leo Grinsztajn, Edouard Oyallon, Gael Varoquaux
https://proceedings.neurips.cc/paper_files/paper/2022/hash/0378c7692da36807bdec87ab043cdadc-Abstract-Datasets_and_Benchmarks.html
17.06.2024
The Free Energy Principle made simpler but not too simple
Karl Friston, Lancelot Da Costa, Noor Sajid, Conor Heins, Kai Ueltzhöffer, Grigorios A. Pavliotis, Thomas Parr
https://www.sciencedirect.com/science/article/pii/S037015732300203X
10.06.2024
Nonlocal Machine-Learned Exchange Functional for Molecules and Solids
Kyle Bystrom, Boris Kozinsky
https://arxiv.org/abs/2303.00682
03.06.2024
xLSTM: Extended Long Short-Term Memory
Maximilian Beck, Korbinian Pöppel, Markus Spanring et al.
https://arxiv.org/abs/2405.04517
28.05.2024
Highly accurate protein structure prediction with AlphaFold
John Jumper, Richard Evans, Alexander Pritzel et al.
https://www.nature.com/articles/s41586-021-03819-2
13.05.2024
KAN: Kolmogorov-Arnold Networks
Ziming Liu, Yixuan Wang, Sachin Vaidya, Fabian Ruehle, James Halverson, Marin Soljacic, Thomas Y. Hou, Max Tegmark
https://arxiv.org/abs/2404.19756
06.05.2024
Categorical Deep Learning: An Algebraic Theory of Architectures
Bruno Gavranović, Paul Lessard, Andrew Dudzik, Tamara von Glehn, João G. M. Araújo, Petar Veličković
https://arxiv.org/pdf/2402.15332
29.04.2024
A Practical Method for Constructing Equivariant Multilayer Perceptrons for Arbitrary Matrix Groups
Marc Finzi, Max Welling, Andrew Gordon Wilson
https://proceedings.mlr.press/v139/finzi21a.html
22.04.24
Position Paper: Bayesian Deep Learning in the Age of Large-Scale AI
Theodore Papamarkou, Maria Skoularidou, Konstantina Palla et al.
https://arxiv.org/abs/2402.00809
15.04.24
Self-Rewarding Language Models
Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho, Xian Li, Sainbayar Sukhbaatar, Jing Xu, Jason Weston
https://arxiv.org/abs/2401.10020
25.03.24
Learning in High Dimension Always Amounts to Extrapolation
Randall Balestriero, Jérôme Pesenti, and Yann LeCun
https://arxiv.org/abs/2110.09485
18.03.24
Deep Double Descent: Where Bigger Models and More Data Hurt
Preetum Nakkiran, Gal Kaplun, Yamini Bansal, Tristan Yang, Boaz Barak, Ilya Sutskever
https://arxiv.org/abs/1912.02292
11.03.24
Deep Networks Always Grok and Here is Why
Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk
https://arxiv.org/abs/2402.15555
04.03.24
Lie Point Symmetry and Physics-Informed Networks
Tara Akhound-Sadegh, Laurence Perreault-Levasseur, Johannes Brandstetter, Max Welling, Siamak Ravanbakhsh
https://proceedings.neurips.cc/paper_files/paper/2023/hash/8493c860bec41705f7743d5764301b94-Abstract-Conference.html
26.02.2024
Solving olympiad geometry without human demonstrations
Trieu H. Trinh, Yuhuai Wu, Quoc V. Le, He He & Thang Luong
https://www.nature.com/articles/s41586-023-06747-5
19.02.24
EquiformerV2: Improved Equivariant Transformer for Scaling to Higher-Degree Representations
Yi-Lun Liao, Brandon Wood, Abhishek Das, Tess Smidt
https://arxiv.org/abs/2306.12059
12.02.24
TensorNet: Cartesian Tensor Representations for Efficient Learning of Molecular Potentials
Guillem Simeon, Gianni de Fabritiis
https://arxiv.org/abs/2306.06482
05.02.24
A foundation model for atomistic materials chemistry
Ilyes Batatia, Philipp Benner, Yuan Chiang et al.
https://arxiv.org/abs/2401.00096
29.01.24
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
Evan Hubinger, Carson Denison, Jesse Mu, Mike Lambert et al.
https://arxiv.org/abs/2401.05566
22.01.24
Learning Distributions on Manifolds with Free-form Flows
Peter Sorrenson, Felix Draxler, Armand Rousselot, Sander Hummerich, Ullrich Köthe
https://arxiv.org/abs/2312.09852
13.12.23
Deep Nets Don’t Learn via Memorization
David Krueger, Nicolas Ballas, Stanislaw Jastrzebski, Devansh Arpit et al.
https://openreview.net/forum?id=rJv6ZgHYg
04.12.23
Progress measures for grokking via mechanistic interpretability
Neel Nanda, Lawrence Chan, Tom Lieberum, Jess Smith, Jacob Steinhardt
https://arxiv.org/abs/2301.05217
27.11.23
GraphCast: Learning skillful medium-range global weather forecasting
Remi Lam, Alvaro Sanchez-Gonzalez, Matthew Willson et al.
https://arxiv.org/abs/2212.12794
13.11.23
Transformers are efficient hierarchical chemical graph learners
Zihan Pengmei, Zimu Li, Chih-chan Tien, Risi Kondor, Aaron R. Dinner
https://arxiv.org/abs/2310.01704
06.11.23
Free-form Flows: Make Any Architecture a Normalizing Flow
Felix Draxler, Peter Sorrenson, Armand Rousselot, Lea Zimmermann, Ullrich Köthe
https://arxiv.org/abs/2310.16624
30.10.23
Nougat: Neural Optical Understanding for Academic Documents
Lukas Blecher, Guillem Cucurull, Thomas Scialom, Robert Stojnic
https://arxiv.org/abs/2308.13418
23.10.23
White-Box Transformers via Sparse Rate Reduction
Yaodong Yu, Sam Buchanan, Druv Pai, Tianzhe Chu, Ziyang Wu, Shengbang Tong, Benjamin D. Haeffele, Yi Ma
https://arxiv.org/abs/2306.01129
16.10.23
Emergence of Segmentation with Minimalistic White-Box Transformers
Yaodong Yu, Tianzhe Chu, Shengbang Tong, Ziyang Wu, Druv Pai, Sam Buchanan, Yi Ma
https://arxiv.org/abs/2308.16271
09.10.23
Large Language Models as Optimizers
Chengrun Yang, Xuezhi Wang, Yifeng Lu, Hanxiao Liu, Quoc V. Le, Denny Zhou, Xinyun Chen
https://arxiv.org/abs/2309.03409
02.10.23
PointLLM: Empowering Large Language Models to Understand Point Clouds
Runsen Xu, Xiaolong Wang, Tai Wang, Yilun Chen, Jiangmiao Pang, Dahua Lin
https://arxiv.org/abs/2308.16911
25.09.23
Loss of Plasticity in Deep Continual Learning
Shibhansh Dohare, J. Fernando Hernandez-Garcia, Parash Rahman, Richard S. Sutton, A. Rupam Mahmood
https://arxiv.org/abs/2306.13812
18.09.23
Tuning Computer Vision Models With Task Rewards
André Susano Pinto, Alexander Kolesnikov, Yuge Shi, Lucas Beyer, Xiaohua Zhai
https://openreview.net/forum?id=zzOooeAqtT
11.09.23
Deep Learning on Implicit Neural Representations of Shapes
Luca De Luigi, Adriano Cardace, Riccardo Spezialetti, Pierluigi Zama Ramirez, Samuele Salti, Luigi Di Stefano
https://arxiv.org/abs/2302.05438
28.08.23
Equivariant Diffusion for Molecule Generation in 3D
Emiel Hoogeboom, Victor Garcia Satorras, Clément Vignac, Max Welling
https://arxiv.org/abs/2203.17003
21.08.23
Fourier Neural Operator for Parametric Partial Differential Equations
Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, Anima Anandkumar
https://arxiv.org/abs/2010.08895
14.08.23
Equivariant Architectures for Learning in Deep Weight Spaces
Aviv Navon, Aviv Shamsian, Idan Achituve, Ethan Fetaya, Gal Chechik, Haggai Maron
https://openreview.net/forum?id=SCU1xlr9Y4
07.08.23
VectorAdam for Rotation Equivariant Geometry Optimization
Selena Ling, Nicholas Sharp, Alec Jacobson
https://openreview.net/forum?id=df1g_KeEjQ
31.07.23
Adding Conditional Control to Text-to-Image Diffusion Models
Lvmin Zhang and Maneesh Agrawala
https://arxiv.org/pdf/2302.05543.pdf
24.07.23
DeepSea: An efficient deep learning model for single-cell segmentation and tracking of time-lapse microscopy images
Abolfazl Zargari et al.
https://www.biorxiv.org/content/10.1101/2021.03.10.434806v2.abstract
17.07.23
Track Anything: Segment Anything Meets Videos
Jinyu Yang, Mingqi Gao, Zhe Li, Shang Gao, Fangjing Wang, Feng Zheng
https://arxiv.org/abs/2304.11968
10.07.23
Trans-Dimensional Generative Modeling via Jump Diffusion Models
Andrew Campbell, William Harvey, Christian Weilbach, Valentin De Bortoli, Tom Rainforth, Arnaud Doucet
https://arxiv.org/abs/2305.16261
03.07.23
Scaling Transformer to 1M tokens and beyond with RMT
Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev
https://arxiv.org/abs/2304.11062
26.06.23
Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold
Xingang Pan, Ayush Tewari, Thomas Leimkühler, Lingjie Liu, Abhimitra Meka, Christian Theobalt
https://arxiv.org/abs/2305.10973
19.06.23
Seeing is Believing: Brain-Inspired Modular Training for Mechanistic Interpretability
Ziming Liu, Eric Gan, Max Tegmark
https://arxiv.org/abs/2305.08746
12.06.23
Supervised Training of Conditional Monge Maps
Charlotte Bunne, Andreas Krause, Marco Cuturi
https://arxiv.org/abs/2206.14262
05.06.23
Causal Reasoning and Large Language Models: Opening a New Frontier for Causality
Emre Kıcıman, Robert Ness, Amit Sharma, Chenhao Tan
https://arxiv.org/abs/2305.00050
22.05.23
How Attentive are Graph Attention Networks?
Shaked Brody, Uri Alon, Eran Yahav
https://arxiv.org/abs/2105.14491
15.05.23
Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan et al.
https://arxiv.org/abs/2303.12712
08.05.23
Discrete Variational Autoencoders
Jason Tyler Rolfe
https://arxiv.org/abs/1609.02200
24.04.23
Is the Performance of My Deep Network Too Good to Be True? A Direct Approach to Estimating the Bayes Error in Binary Classification
Takashi Ishida, Ikko Yamane, Nontawat Charoenphakdee, Gang Niu, Masashi Sugiyama
https://openreview.net/forum?id=FZdJQgy05rz
17.04.23
Image as Set of Points
Xu Ma, Yuqian Zhou, Huan Wang, Can Qin, Bin Sun, Chang Liu, Yun Fu
https://openreview.net/forum?id=awnvqZja69
03.04.23
Advancing mathematics by guiding human intuition with AI
Davies, A., Veličković, P., Buesing, L., Blackwell, S., Zheng, D., Tomašev, N., … & Kohli, P
https://www.nature.com/articles/s41586-021-04086-x
27.03.23
DreamFusion: Text-to-3D using 2D Diffusion
Ben Poole, Ajay Jain, Jonathan T. Barron, Ben Mildenhall
https://openreview.net/forum?id=FjNys5c7VyY
20.03.23
Flow Matching for Generative Modeling
Yaron Lipman, Ricky T. Q. Chen, Heli Ben-Hamu, Maximilian Nickel, Matt Le
https://arxiv.org/pdf/2210.02747.pdf