Generalisation with Generative Models

Christopher Beckham

Doctoral thesis (2024)

Open access document in PolyPublie

Open access to the full text of this document
Terms of use: All rights reserved

Abstract (translated from French)

First and foremost, I would like to express my deepest gratitude to Christopher Pal. I am forever indebted to you, in every good sense of the word. I still cannot fathom the statistical anomaly that led to our meeting while you were on sabbatical at the University of Waikato and I was in the final year of my undergraduate degree. At the time, I knew nothing about deep neural networks, MILA or Yoshua Bengio, and my only other plan was to try to win an ultra-prestigious Rhodes-style scholarship to do my PhD in bioinformatics at the University of Cambridge. Since I did not get it, I was saved by your offer to become your master’s student in Montréal. Although my first years in Montréal were among the hardest of my life, I came out of them much stronger and more resilient.

I would like to thank the Institut de valorisation des données (IVADO) for generously awarding me its PhD excellence scholarship. By extension, I also thank Yoshua Bengio for writing me a reference letter, and his secretary Julie Mongeau for making sure that the letter was sent in time. I also thank Yoshua Bengio for the interesting discussions we had during my PhD. I very gratefully thank David Vasquez, Catherine Martin and Valérie Bécaert of ServiceNow Research (formerly Element AI), who allowed me to intern with them for a ridiculously long time, specifically from mid-2019 to the end of 2022. Some (or most) of these opportunities were granted through Mitacs Accelerate scholarships, and I am therefore also very grateful to them for providing so many opportunities to conduct academic research in an industry setting.

I thank Patrick Murphy and Stéphane Turbide (also under a Mitacs scholarship) for allowing me to be a visiting researcher at Maket Technologies, as well as Anima Anandkumar and Kamyar Azzizzadenesheli for my brief stint at NVIDIA, where I worked on ‘neural operators’ for climate analysis. I also thank all of my co-authors (and there are many!). I would also like to give a special mention to Sina Honari, with whom I worked for years before starting my PhD. I really appreciate your attention to detail and your rigour, which have no doubt rubbed off on me.

I thank my closest circle of friends: Eugène Vorontsov, Florence Martin and Wendy Yu, as well as their zoo of pets: Mina ‘Croquette’ Capuccina, Cleo Potato, Thunder McFunface, Easy ‘Peasy L.’ Squeezy, Minxie Chouette, and Pepper Pig. I thank Vaughan Chittock, the owner of Pub Saint Pierre, which is probably the closest I get to being in New Zealand while in Montréal. To all the ‘MILA old-timers’ (the list is too long to enumerate), you know who you are, and I will always remember the good old days. Thank you to Marianne Chivi for helping me come to terms with my childhood and post-traumatic stress. To quote Robin Williams in Good Will Hunting, ‘it’s not your fault.’ I have no parents or extended family to thank for the writing of this thesis. That being said, I hope I never have to write two hundred pages of anything ever again in my life.

Abstract

This thesis explores important topics pertaining to generalisation within the context of generative models. Generative models are an extremely broad class of models concerned with learning the underlying distribution of the data and how it was generated. Because of this, they find ample utility in many research topics involving improving the generalisation capabilities of other kinds of models. This is a thesis by publication, in which I present three publications as well as a journal manuscript that is currently in review. In chronological order:

• The first article is motivated by generalisation with respect to compositionality. Any ground truth data distribution can be thought of as comprising a large collection of ground truth latent variables, and therefore the number of possible examples scales exponentially in the number of these variables. Because of this, many latent combinations may have zero probability under the empirical distribution of the data but not under the ground truth. Intuitively, however, we would want the model to be able to generate such unseen combinations of factors if desired. In this work, we explore an adversarial form of a deterministic autoencoder where latent codes are stochastically mixed and trained to be indistinguishable from real latent codes when decoded back into data space. We present qualitatively appealing results as well as evaluate the autoencoder’s performance as a representation learner on downstream classification tasks.

• The second article explores the learning of 3D latent volumes under a weakly supervised setting, with the goal of testing its efficacy on difficult tasks involving mental rotations. Indeed, we take inspiration from the mental rotation literature in psychology and propose a new version of the CLEVR visual question answering dataset where the goal is for the model to answer a question from the point of view of another camera. We find that the use of question-conditioned neural modules which can perform rigid transformations on an inferred 3D latent volume of the input performs significantly better than networks operating in 2D. While this publication does not directly concern generative models, contrastive learning has a strong theoretical connection to density ratio estimation, which is a way to estimate probability density functions.

• The third article explores how to best fine-tune a generative model to perform few-shot data augmentation: that is, given a source domain with many labels and a target domain with few labels, how can we best adapt it to augment data conditioned on the target labels? Unlike most works, the synthetic images generated from such a model are intended to be used in a downstream task, which in this case is classification over the target classes. In this work we lay out a rigorous training and evaluation scheme which explores many different fine-tuning methods for a GAN. While this work largely highlights the difficulties of such a task, we also propose two interesting novelties which are simple to implement: an incremental fine-tuning trick as well as a semi-supervised variant of the adversarial loss which doesn’t require any additional parameters to be added to the discriminator.

• Lastly, I conclude with a journal article which proposes a rigorous evaluation framework for model-based optimisation (MBO). In MBO, we wish to learn a generative model which can sample real-world designs conditioned on the value of a black box reward function, which describes a real-world process that is expensive to compute. Since we want to find the best designs, we need such a model to extrapolate and be conditioned on rewards greater than those seen in the training set. However, evaluation is significantly less straightforward than in typical empirical risk minimisation due to distribution shift and the need to create our own test set, which does not have labels and requires the expensive ground truth to evaluate. To alleviate this issue, we propose finding validation metrics which correlate well with the ground truth, in the hope that these can serve as cheap proxies for evaluation.
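
As a rough illustration of the last point, the sketch below ranks candidate validation metrics by how well they track the ground truth reward across several trained models. It is not the thesis's evaluation code: the choice of Spearman rank correlation, the metric names `metric_a` and `metric_b`, and all of the numbers are assumptions made purely for illustration.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(xs), ranks(ys))


# Hypothetical data: the expensive ground truth reward obtained by five
# trained models, and two cheap candidate metrics computed for the same models.
ground_truth = [0.10, 0.35, 0.40, 0.70, 0.90]
candidate_metrics = {
    "metric_a": [1.0, 2.5, 2.0, 4.0, 5.5],
    "metric_b": [3.0, 1.0, 4.0, 2.0, 2.5],
}

# Score each candidate by its rank correlation with the ground truth
# and keep the one that tracks it most closely.
scores = {
    name: spearman(values, ground_truth)
    for name, values in candidate_metrics.items()
}
best = max(scores, key=scores.get)
```

Under these assumptions, a metric with a high rank correlation could be used for model selection without querying the expensive ground truth for every candidate model; a rank-based measure is used here only because model selection depends on ordering rather than on absolute values.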

Department:	Department of Computer Engineering and Software Engineering
Program:	Computer Engineering
Supervisor(s):	Christopher J. Pal
PolyPublie URL:	https://publications.polymtl.ca/57730/
University/School:	Polytechnique Montréal
Deposit date:	21 August 2024 15:12
Last modified:	25 August 2025 04:00

Cite in APA 7:	Beckham, C. (2024). Generalisation with Generative Models [Doctoral thesis, Polytechnique Montréal]. PolyPublie. https://publications.polymtl.ca/57730/
