Journal Club: Advanced Machine Learning

The Research Seminar: Advanced Machine Learning is a weekly reading group focused on discussing current research papers and results in machine learning. The target audience is active researchers in the field (postdocs, PhD students, and advanced master's students) who want to discuss and stay up to date with recent developments.

Contact Peter Lippmann (peter.lippmann [at] for further details.

Next Seminar: 15.07.2024 in INF 205, SR 4.300 starting at 3:33pm
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models
Yixin Liu, Kai Zhang, Yuan Li et al.

Recently discussed papers:

Accurate structure prediction of biomolecular interactions with AlphaFold 3
Josh Abramson, Jonas Adler, Jack Dunger et al.

A Hitchhiker’s Guide to Geometric GNNs for 3D Atomic Systems
Alexandre Duval, Simon V. Mathis, Chaitanya K. Joshi, Victor Schmidt, Santiago Miret, Fragkiskos D. Malliaros, Taco Cohen, Pietro Liò, Yoshua Bengio, Michael Bronstein

Why do tree-based models still outperform deep learning on typical tabular data?
Leo Grinsztajn, Edouard Oyallon, Gael Varoquaux

The Free Energy Principle made simpler but not too simple
Karl Friston, Lancelot Da Costa, Noor Sajid, Conor Heins, Kai Ueltzhöffer, Grigorios A. Pavliotis, Thomas Parr

Nonlocal Machine-Learned Exchange Functional for Molecules and Solids
Kyle Bystrom, Boris Kozinsky

xLSTM: Extended Long Short-Term Memory
Maximilian Beck, Korbinian Pöppel, Markus Spanring et al.

Highly accurate protein structure prediction with AlphaFold
John Jumper, Richard Evans, Alexander Pritzel et al.

KAN: Kolmogorov-Arnold Networks
Ziming Liu, Yixuan Wang, Sachin Vaidya, Fabian Ruehle, James Halverson, Marin Soljacic, Thomas Y. Hou, Max Tegmark

Categorical Deep Learning: An Algebraic Theory of Architectures
Bruno Gavranović, Paul Lessard, Andrew Dudzik, Tamara von Glehn, João G. M. Araújo, Petar Veličković

A Practical Method for Constructing Equivariant Multilayer Perceptrons for Arbitrary Matrix Groups
Marc Finzi, Max Welling, Andrew Gordon Wilson

Position Paper: Bayesian Deep Learning in the Age of Large-Scale AI
Theodore Papamarkou, Maria Skoularidou, Konstantina Palla et al.

Self-Rewarding Language Models
Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho, Xian Li, Sainbayar Sukhbaatar, Jing Xu, Jason Weston

Learning in High Dimension Always Amounts to Extrapolation
Randall Balestriero, Jérôme Pesenti, Yann LeCun

Deep Double Descent: Where Bigger Models and More Data Hurt
Preetum Nakkiran, Gal Kaplun, Yamini Bansal, Tristan Yang, Boaz Barak, Ilya Sutskever

Deep Networks Always Grok and Here is Why
Ahmed Imtiaz Humayun, Randall Balestriero, Richard Baraniuk

Lie Point Symmetry and Physics-Informed Networks
Tara Akhound-Sadegh, Laurence Perreault-Levasseur, Johannes Brandstetter, Max Welling, Siamak Ravanbakhsh

Solving olympiad geometry without human demonstrations
Trieu H. Trinh, Yuhuai Wu, Quoc V. Le, He He, Thang Luong

EquiformerV2: Improved Equivariant Transformer for Scaling to Higher-Degree Representations
Yi-Lun Liao, Brandon Wood, Abhishek Das, Tess Smidt

TensorNet: Cartesian Tensor Representations for Efficient Learning of Molecular Potentials
Guillem Simeon, Gianni de Fabritiis

A foundation model for atomistic materials chemistry
Ilyes Batatia, Philipp Benner, Yuan Chiang et al.

29.01.24 Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training
Evan Hubinger, Carson Denison, Jesse Mu, Mike Lambert et al.

Learning Distributions on Manifolds with Free-form Flows
Peter Sorrenson, Felix Draxler, Armand Rousselot, Sander Hummerich, Ullrich Köthe

Deep Nets Don’t Learn via Memorization
David Krueger, Nicolas Ballas, Stanislaw Jastrzebski, Devansh Arpit et al.

Progress measures for grokking via mechanistic interpretability
Neel Nanda, Lawrence Chan, Tom Lieberum, Jess Smith, Jacob Steinhardt

GraphCast: Learning skillful medium-range global weather forecasting
Remi Lam, Alvaro Sanchez-Gonzalez, Matthew Willson et al.

Transformers are efficient hierarchical chemical graph learners
Zihan Pengmei, Zimu Li, Chih-chan Tien, Risi Kondor, Aaron R. Dinner

Free-form Flows: Make Any Architecture a Normalizing Flow
Felix Draxler, Peter Sorrenson, Armand Rousselot, Lea Zimmermann, Ullrich Köthe

Nougat: Neural Optical Understanding for Academic Documents
Lukas Blecher, Guillem Cucurull, Thomas Scialom, Robert Stojnic

White-Box Transformers via Sparse Rate Reduction
Yaodong Yu, Sam Buchanan, Druv Pai, Tianzhe Chu, Ziyang Wu, Shengbang Tong, Benjamin D. Haeffele, Yi Ma

Emergence of Segmentation with Minimalistic White-Box Transformers
Yaodong Yu, Tianzhe Chu, Shengbang Tong, Ziyang Wu, Druv Pai, Sam Buchanan, Yi Ma

Large Language Models as Optimizers
Chengrun Yang, Xuezhi Wang, Yifeng Lu, Hanxiao Liu, Quoc V. Le, Denny Zhou, Xinyun Chen

PointLLM: Empowering Large Language Models to Understand Point Clouds
Runsen Xu, Xiaolong Wang, Tai Wang, Yilun Chen, Jiangmiao Pang, Dahua Lin

Loss of Plasticity in Deep Continual Learning
Shibhansh Dohare, J. Fernando Hernandez-Garcia, Parash Rahman, Richard S. Sutton, A. Rupam Mahmood

Tuning Computer Vision Models With Task Rewards
André Susano Pinto, Alexander Kolesnikov, Yuge Shi, Lucas Beyer, Xiaohua Zhai

Deep Learning on Implicit Neural Representations of Shapes
Luca De Luigi, Adriano Cardace, Riccardo Spezialetti, Pierluigi Zama Ramirez, Samuele Salti, Luigi Di Stefano

Equivariant Diffusion for Molecule Generation in 3D
Emiel Hoogeboom, Victor Garcia Satorras, Clément Vignac, Max Welling

Fourier Neural Operator for Parametric Partial Differential Equations
Zongyi Li, Nikola Kovachki, Kamyar Azizzadenesheli, Burigede Liu, Kaushik Bhattacharya, Andrew Stuart, Anima Anandkumar

Equivariant Architectures for Learning in Deep Weight Spaces
Aviv Navon, Aviv Shamsian, Idan Achituve, Ethan Fetaya, Gal Chechik, Haggai Maron

VectorAdam for Rotation Equivariant Geometry Optimization
Selena Ling, Nicholas Sharp, Alec Jacobson

Adding Conditional Control to Text-to-Image Diffusion Models
Lvmin Zhang and Maneesh Agrawala

DeepSea: An efficient deep learning model for single-cell segmentation and tracking of time-lapse microscopy images
Abolfazl Zargari et al.

Track Anything: Segment Anything Meets Videos
Jinyu Yang, Mingqi Gao, Zhe Li, Shang Gao, Fangjing Wang, Feng Zheng

Trans-Dimensional Generative Modeling via Jump Diffusion Models
Andrew Campbell, William Harvey, Christian Weilbach, Valentin De Bortoli, Tom Rainforth, Arnaud Doucet

Scaling Transformer to 1M tokens and beyond with RMT
Aydar Bulatov, Yuri Kuratov, Mikhail S. Burtsev

Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold
Xingang Pan, Ayush Tewari, Thomas Leimkühler, Lingjie Liu, Abhimitra Meka, Christian Theobalt

Seeing is Believing: Brain-Inspired Modular Training for Mechanistic Interpretability
Ziming Liu, Eric Gan, Max Tegmark

Supervised Training of Conditional Monge Maps
Charlotte Bunne, Andreas Krause, Marco Cuturi

Causal Reasoning and Large Language Models: Opening a New Frontier for Causality
Emre Kıcıman, Robert Ness, Amit Sharma, Chenhao Tan

How Attentive are Graph Attention Networks?
Shaked Brody, Uri Alon, Eran Yahav

15.05.23 Sparks of Artificial General Intelligence: Early experiments with GPT-4
Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan et al.

08.05.23 Discrete Variational Autoencoders
Jason Tyler Rolfe

Is the Performance of My Deep Network Too Good to Be True? A Direct Approach to Estimating the Bayes Error in Binary Classification
Takashi Ishida, Ikko Yamane, Nontawat Charoenphakdee, Gang Niu, Masashi Sugiyama

Image as Set of Points
Xu Ma, Yuqian Zhou, Huan Wang, Can Qin, Bin Sun, Chang Liu, Yun Fu

Advancing mathematics by guiding human intuition with AI
Davies, A., Veličković, P., Buesing, L., Blackwell, S., Zheng, D., Tomašev, N., … & Kohli, P.

DreamFusion: Text-to-3D using 2D Diffusion
Ben Poole, Ajay Jain, Jonathan T. Barron, Ben Mildenhall

20.03.23 Flow Matching for Generative Modeling
Yaron Lipman, Ricky T. Q. Chen, Heli Ben-Hamu, Maximilian Nickel, Matt Le