Tag Archives: machine learning

Layer-Wise Linear Mode Connectivity

We presented our work on layer-wise linear mode connectivity at ICLR 2024, led by Linara Adilova, with Maksym Andriushchenko, Michael Kamp, Asja Fischer, and Martin Jaggi.

We know that linear mode connectivity doesn’t hold for two independently trained models. But what about *layer-wise* LMC? Well, it is very different!

We investigate layer-wise averaging and discover that, for multiple networks, tasks, and setups, averaging only one layer does not affect the performance! This is in line with research showing that re-initializing individual layers does not change accuracy.
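
To make "averaging only one layer" concrete, here is a minimal sketch. It assumes each model is stored as a dictionary from layer name to weight array. The layer names, shapes, and the helper `interpolate_layer` are made up for illustration and are not taken from the paper's code.

```python
import numpy as np

def interpolate_layer(params_a, params_b, layer, alpha=0.5):
    """Copy params_a and interpolate only `layer` toward params_b.

    alpha=0.5 gives the plain average of that single layer;
    all other layers keep the weights of model A.
    """
    mixed = {name: w.copy() for name, w in params_a.items()}
    mixed[layer] = (1.0 - alpha) * params_a[layer] + alpha * params_b[layer]
    return mixed

# Two stand-in "independently trained" models (random weights, hypothetical shapes).
rng = np.random.default_rng(0)
shapes = {"layer1": (8, 4), "layer2": (8, 8), "layer3": (2, 8)}
model_a = {k: rng.normal(size=s) for k, s in shapes.items()}
model_b = {k: rng.normal(size=s) for k, s in shapes.items()}

# Average only the middle layer; evaluating `mixed` on held-out data
# would then give the layer-wise barrier for that layer.
mixed = interpolate_layer(model_a, model_b, "layer2", alpha=0.5)
```

Sweeping `alpha` from 0 to 1 and repeating this for every layer name would trace out one interpolation curve per layer.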

Nevertheless, is there a critical number of layers that must be averaged to reach a high-loss point? It turns out that barrier-prone layers are concentrated in the middle of a model.

Is there a way to gain more insight into this phenomenon? Let’s see what it looks like for a minimalistic example: a deep linear network. Ultimately, a linear network is convex with respect to any of its layer cuts.
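
A small numerical check can illustrate this. The sketch below reads a "layer cut" as varying one layer while all other layers stay fixed, which is our interpretation here. It uses a three-layer linear network with a mean squared error loss on random data. All sizes and variable names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 50))   # inputs: 5 features, 50 samples
Y = rng.normal(size=(3, 50))   # targets: 3 outputs, 50 samples

# Fixed first and last layers, two candidate middle layers.
W1 = rng.normal(size=(6, 5))
W3 = rng.normal(size=(3, 6))
W2_a = rng.normal(size=(6, 6))
W2_b = rng.normal(size=(6, 6))

def loss(W2):
    """MSE of the deep linear network W3 @ W2 @ W1 as a function of W2 only."""
    pred = W3 @ W2 @ W1 @ X
    return float(np.mean((pred - Y) ** 2))

# Loss along the straight line between the two middle layers.
alphas = np.linspace(0.0, 1.0, 11)
path = [loss((1 - a) * W2_a + a * W2_b) for a in alphas]

# Convexity means the path never rises above the chord between its endpoints.
chord = [(1 - a) * path[0] + a * path[-1] for a in alphas]
convex_along_cut = all(
    p <= c + 1e-9 * max(1.0, abs(c)) for p, c in zip(path, chord)
)
```

With the other layers fixed, the prediction is linear in `W2` and the squared error is convex, so interpolating a single layer cannot produce a barrier in this toy setting.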

Can robustness explain this property? That is, do all neural networks have a particular robustness to weight changes that allows them to compensate for modifications of a single layer? For some layers the answer is yes: it is indeed much harder to reach a high loss for a more robust model.

It also means that we cannot treat random directions as uniformly representative of the loss surface: our experiment shows that particular subspaces are more stable than others. In particular, single-layer subspaces have a different tolerance to noise!

Nothing but Regrets – Federated Causal Discovery

Discovering causal relationships enables us to build more reliable, robust, and ultimately trustworthy models. It requires large amounts of observational data, though. In healthcare, for most diseases the amount of available data is large, but this data is scattered over thousands of hospitals worldwide. Since this data in most cases mustn’t be pooled for privacy reasons, we need a way to learn a structural causal model in a federated fashion.

At this year’s AISTATS, my co-authors Osman Mian, David Kaltenpoth, Jilles Vreeken, and I presented the paper “Nothing but Regrets – Privacy-Preserving Federated Causal Discovery”, in which we show that you can discover causal relationships by sharing only regret values with a server: the server sends a candidate causal model to each client, and the clients reply with how much worse single-edge extensions of this global model are compared to the original global model. From this information alone, the server can compute the best extension of the current global model.
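
One round of this exchange could look roughly like the toy sketch below. It is not the method from the paper: the BIC-like `local_score`, the aggregation by summing regrets, and all function names are assumptions made for illustration. Acyclicity checks and any privacy mechanism are left out.

```python
import numpy as np

def local_score(data, edges):
    """Hypothetical BIC-like cost of a linear model with these edges (lower is better)."""
    n, d = data.shape
    cost = 0.0
    for child in range(d):
        parents = [p for p, c in edges if c == child]
        y = data[:, child]
        # Regress the child on its parents plus an intercept.
        if parents:
            A = np.column_stack([data[:, parents], np.ones(n)])
        else:
            A = np.ones((n, 1))
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        cost += 0.5 * n * np.log(resid.var() + 1e-12) + 0.5 * len(parents) * np.log(n)
    return cost

def client_regrets(data, edges, candidates):
    """Client side: how much worse each single-edge extension is than the current model."""
    base = local_score(data, edges)
    return {e: local_score(data, edges | {e}) - base for e in candidates}

def best_extension(regrets_per_client):
    """Server side: sum the regrets (an assumed aggregation) and pick the smallest total."""
    total = {}
    for regrets in regrets_per_client:
        for edge, r in regrets.items():
            total[edge] = total.get(edge, 0.0) + r
    return min(total, key=total.get)

def make_client_data(rng, n=200):
    """Synthetic data in which X0 causes X1 and X2 is independent."""
    x0 = rng.normal(size=n)
    x1 = 2.0 * x0 + rng.normal(size=n)
    x2 = rng.normal(size=n)
    return np.column_stack([x0, x1, x2])

rng = np.random.default_rng(0)
clients = [make_client_data(rng) for _ in range(2)]

current_model = set()                      # global model: no edges yet
candidates = [(0, 1), (0, 2), (1, 2)]      # (parent, child) extensions to test

replies = [client_regrets(d, current_model, candidates) for d in clients]
chosen = best_extension(replies)
```

Only the regret dictionaries leave the clients in this sketch. The raw data stay local, and the server would repeat the round with `current_model | {chosen}` until no extension has a negative total regret.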

In practice, the environments at the local clients are not the same. We should expect local differences that could be modeled by interventions into the global causal structure. In our AAAI paper “Information-Theoretic Causal Discovery and Intervention Detection over Multiple Environments” we have shown how to discover a global causal structure as well as local interventions in a centralized setting. Our current goal is to combine these two works to provide an approach to federated causal discovery from heterogeneous environments.