Research on Task Discovery for Transfer Learning in Deep Neural Networks

Arda Akdemir

Research on Task Discovery for Transfer Learning in Deep Neural Networks

Arda Akdemir

Abstract Paper Share

Student Research Workshop SRW Paper

Session 1B: Jul 6 (06:00-07:00 GMT)

Session 12A: Jul 8 (08:00-09:00 GMT)

Abstract: Deep neural network based machine learning models are shown to perform poorly on unseen or out-of-domain examples by numerous recent studies. Transfer learning aims to avoid overfitting and to improve generalizability by leveraging the information obtained from multiple tasks. Yet, the benefits of transfer learning depend largely on task selection and finding the right method of sharing. In this thesis, we hypothesize that current deep neural network based transfer learning models do not achieve their fullest potential for various tasks and there are still many task combinations that will benefit from transfer learning that are not considered by the current models. To this end, we started our research by implementing a novel multi-task learner with relaxed annotated data requirements and obtained a performance improvement on two NLP tasks. We will further devise models to tackle tasks from multiple areas of machine learning, such as Bioinformatics and Computer Vision, in addition to NLP.

You can open the pre-recorded video in a separate window.

NOTE: The SlidesLive video may display a random order of the authors. The correct author list is shown at the top of this webpage.

Research on Task Discovery for Transfer Learning in Deep Neural Networks

Arda Akdemir

Similar Papers

In Neural Machine Translation, What Does Transfer Learning Transfer?

Alham Fikri Aji, Nikolay Bogoychev, Kenneth Heafield, Rico Sennrich,

Intermediate-Task Transfer Learning with Pretrained Language Models: When and Why Does It Work?

Yada Pruksachatkun, Jason Phang, Haokun Liu, Phu Mon Htut, Xiaoyi Zhang, Richard Yuanzhe Pang, Clara Vania, Katharina Kann, Samuel R. Bowman,

The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents

Kurt Shuster, Da JU, Stephen Roller, Emily Dinan, Y-Lan Boureau, Jason Weston,

jiant: A Software Toolkit for Research on General-Purpose Text Understanding Models

Yada Pruksachatkun, Phil Yeres, Haokun Liu, Jason Phang, Phu Mon Htut, Alex Wang, Ian Tenney, Samuel R. Bowman,