
Neural attentive circuits

Martin Weiss, Nasim Rahaman, Francesco Locatello, Christopher J. Pal, Yoshua Bengio, Bernhard Schölkopf, Nicolas Ballas and Li Erran Li

Poster (2022)


Abstract

Recent work has seen the development of general-purpose neural architectures that can be trained to perform tasks across diverse data modalities. General-purpose models typically make few assumptions about the underlying data structure and are known to perform well in the large-data regime. At the same time, there has been growing interest in modular neural architectures that represent the data using sparsely interacting modules. These models can be more robust out-of-distribution, computationally efficient, and capable of sample-efficient adaptation to new data. However, they tend to make domain-specific assumptions about the data, and present challenges in how module behavior (i.e., parameterization) and connectivity (i.e., their layout) can be jointly learned. In this work, we introduce a general-purpose, yet modular neural architecture called Neural Attentive Circuits (NACs) that jointly learns the parameterization and a sparse connectivity of neural modules without using domain knowledge. NACs are best understood as the combination of two systems that are jointly trained end-to-end: one that determines the module configuration and the other that executes it on an input. We demonstrate qualitatively that NACs learn diverse and meaningful module configurations on the Natural Language and Visual Reasoning for Real (NLVR2) dataset without additional supervision. Quantitatively, we show that by incorporating modularity in this way, NACs improve upon a strong non-modular baseline in terms of low-shot adaptation on the CIFAR and Caltech-UCSD Birds (CUB) datasets by about 10 percent, and OOD robustness on Tiny ImageNet-R by about 2.5 percent. Further, we find that NACs can achieve an 8x speedup at inference time while losing less than 3 percent performance. Finally, we find NACs to yield competitive results on diverse data modalities spanning point-cloud classification, symbolic processing and text classification from ASCII bytes, thereby confirming their general-purpose nature.
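The abstract describes NACs as two jointly trained subsystems: one that proposes a module configuration (parameterization and sparse connectivity) and one that executes that configuration on the input. As a purely illustrative, hypothetical sketch of that idea (not the authors' implementation; the module count, dimensions, top-k sparsification rule and classification head are all assumptions made here for illustration), the two parts could be arranged roughly as follows in PyTorch:

```python
import torch
import torch.nn as nn


class NACSketch(nn.Module):
    """Hypothetical sketch of the two-system idea from the abstract:
    a 'configurator' (learned per-module codes whose similarities induce a
    sparse connectivity) and an 'executor' (modules that read from each other
    through that connectivity and update their own states)."""

    def __init__(self, num_modules=8, dim=64, k=3, num_classes=10):
        super().__init__()
        self.k = k
        # Configurator: one learned code vector per module; their pairwise
        # similarities define which modules communicate.
        self.codes = nn.Parameter(torch.randn(num_modules, dim))
        # Executor: projections shared across modules, conditioned on the codes.
        self.read = nn.Linear(dim, dim)
        self.update = nn.Sequential(nn.Linear(2 * dim, dim), nn.ReLU(), nn.Linear(dim, dim))
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):
        # x: (batch, num_modules, dim) -- input features already routed to modules.
        # 1) Configuration: attention scores between module codes, kept sparse
        #    by retaining only the top-k incoming connections per module.
        scores = self.codes @ self.codes.t() / self.codes.shape[-1] ** 0.5
        topk = torch.topk(scores, k=self.k, dim=-1)
        masked = torch.full_like(scores, float("-inf")).scatter(-1, topk.indices, topk.values)
        connectivity = torch.softmax(masked, dim=-1)  # (M, M); each row sums to 1 over k entries
        # 2) Execution: each module reads a sparse mixture of the others' states,
        #    then updates its own state conditioned on its code.
        read = torch.einsum("ij,bjd->bid", connectivity, self.read(x))
        codes = self.codes.unsqueeze(0).expand(x.shape[0], -1, -1)
        state = self.update(torch.cat([x + read, codes], dim=-1))
        return self.head(state.mean(dim=1))  # pool module states -> class logits


# Example: a batch of 4 inputs, each split into 8 module slots of width 64.
logits = NACSketch()(torch.randn(4, 8, 64))  # -> shape (4, 10)
```

Because both the codes and the executor weights receive gradients from the task loss, the connectivity and the module parameterization are learned jointly end-to-end, which is the property the abstract emphasizes; the top-k masking here simply stands in for whatever sparsification the paper actually uses.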

Keywords

Supplementary material:
Department: Department of Computer Engineering and Software Engineering
Funding agencies: NSERC, IVADO, BMBF
Grant number: FKZ: 01IS18039B
PolyPublie URL: https://publications.polymtl.ca/54584/
Conference name: 36th Conference on Neural Information Processing Systems (NeurIPS 2022)
Conference location: New Orleans, LA, USA
Conference date(s): 2022-11-28 to 2022-12-09
Publisher: Neural Information Processing Systems Foundation
Official URL: https://nips.cc/virtual/2022/poster/53181
Deposit date: August 29, 2023, 16:52
Last modified: January 21, 2026, 11:12
Cite in APA 7: Weiss, M., Rahaman, N., Locatello, F., Pal, C. J., Bengio, Y., Schölkopf, B., Ballas, N., & Li, L. E. (November 2022). Neural attentive circuits [Poster]. 36th Conference on Neural Information Processing Systems (NeurIPS 2022), New Orleans, LA, USA. https://nips.cc/virtual/2022/poster/53181

