Group and Shuffle: Efficient Structured Orthogonal Parametrization (NeurIPS 2024)

  1. Mikhail Gorbunov
  2. Nikolay Yudin
  3. Vera Soboleva
  4. Aibek Alanov
  5. Alexey Naumov
  6. Maxim Rakhuba

The increasing size of neural networks has led to a growing demand for efficient fine-tuning methods. Recently, an orthogonal fine-tuning paradigm was introduced that adapts the weights of a pretrained model by multiplying them with orthogonal matrices. In this paper, we introduce a new class of structured matrices that unifies and generalizes the structured classes from previous works. We examine the properties of this class and build a structured orthogonal parametrization on top of it. We then use this parametrization to modify the orthogonal fine-tuning framework, improving both parameter and computational efficiency. We empirically validate our method across several domains, including the adaptation of text-to-image diffusion models and downstream task fine-tuning in language modeling. Additionally, we adapt our construction to orthogonal convolutions and conduct experiments with 1-Lipschitz neural networks.
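To make the idea concrete, below is a minimal PyTorch sketch of the general pattern the abstract describes: a structured orthogonal matrix assembled from block-diagonal orthogonal factors interleaved with a fixed "shuffle" permutation, then used to adapt a pretrained weight as W' = QW. This is an illustrative assumption about the construction, not the paper's exact parametrization; all function names and shapes here are hypothetical.

```python
# Minimal sketch (not the authors' code): a "block-diagonal, shuffle,
# block-diagonal" orthogonal product. The two-factor structure and all
# names below are illustrative assumptions.
import torch

def random_orthogonal(n: int) -> torch.Tensor:
    # The Q factor of a QR decomposition of a Gaussian matrix is orthogonal.
    q, _ = torch.linalg.qr(torch.randn(n, n))
    return q

def block_diag_orthogonal(num_blocks: int, block_size: int) -> torch.Tensor:
    # Block-diagonal matrix with small orthogonal diagonal blocks: it is
    # orthogonal and costs num_blocks * block_size**2 parameters instead
    # of (num_blocks * block_size)**2 for a dense orthogonal matrix.
    return torch.block_diag(
        *[random_orthogonal(block_size) for _ in range(num_blocks)]
    )

def shuffle_indices(num_blocks: int, block_size: int) -> torch.Tensor:
    # "Perfect shuffle" permutation mixing coordinates across blocks, so
    # the composed matrix is not merely block-diagonal.
    n = num_blocks * block_size
    return torch.arange(n).reshape(num_blocks, block_size).t().reshape(-1)

num_blocks, block_size = 4, 8                  # n = 32
n = num_blocks * block_size
L = block_diag_orthogonal(num_blocks, block_size)
R = block_diag_orthogonal(num_blocks, block_size)
perm = shuffle_indices(num_blocks, block_size)

# Q = L @ P @ R: a product of orthogonal factors is orthogonal. Indexing
# the columns of L by `perm` applies the permutation matrix P on the right.
Q = L[:, perm] @ R
assert torch.allclose(Q @ Q.T, torch.eye(n), atol=1e-5)

# Orthogonal fine-tuning idea: adapt a pretrained weight W as W' = Q @ W,
# which preserves the singular values of W.
W = torch.randn(n, 64)                         # stand-in pretrained weight
W_adapted = Q @ W
```

In an actual fine-tuning setup the small blocks would be trainable and kept orthogonal throughout optimization, e.g. via a Cayley transform or `torch.nn.utils.parametrizations.orthogonal`; the fixed random blocks above are only for demonstrating the structure.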