[Corpora-List] Research Internship on Controlled Text Generation at Naver Labs Europe

Germán Kruszewski germank at gmail.com
Tue Mar 29 20:32:42 CEST 2022

Research internship position at NAVER LABS Europe (Grenoble, France) on Energy-Based Models for Controlled Text Generation

Start date: June 2022
Duration: 5-6 months

DESCRIPTION

Large language models can now be used to generate highly fluent texts. However, the generated utterances can fall short in other important respects: they may lack semantic consistency, be unfaithful to the facts, or contain toxic or socially biased content.

Our team has developed several effective solutions on that front [1,2,3,4] exploiting the expressive power of Energy-Based Models in defining constraints over generative models. However, certain challenges remain: (1) How can we quickly adapt to changing control conditions without the need for model retraining? (2) Can we exploit these techniques to improve on hard-to-quantify features, such as safety, unbiasedness, textual coherence, or matching the human intention? (3) Can we improve training speed/robustness, for example, by leveraging techniques from RL?

We are looking for a motivated intern to help us develop techniques and algorithms addressing these challenges. Experiments will be conducted on selected text generation tasks using state-of-the-art pre-trained language models.

The successful candidate should be enrolled in a graduate program, at the Master's or (preferably) PhD level.

The intern will work in a team consisting of Hady Elsahar, Marc Dymetman, Germán Kruszewski, and Jos Rozen.

Publication of this internship's results in major conferences/journals will be strongly encouraged.

REQUIRED SKILLS
- Strong programming skills
- Relevant experience with training Deep Learning models for NLP
- Strong mathematical skills
- Ability to communicate research

OPTIONAL SKILLS
- Knowledge of MCMC sampling techniques and/or Reinforcement Learning
- Publications at peer-reviewed AI conferences

REFERENCES
[1] Khalifa et al., A Distributional Approach to Controlled Text Generation. In ICLR 2021.
[2] Eikema et al., Sampling from Energy-Based Models with Quality/Efficiency Trade-offs. In CtrlGen at NeurIPS 2021.
[3] Korbak et al., Energy-Based Models for Code Generation under Compilability Constraints. In NLP4Prog at ACL 2021.
[4] Korbak et al., Controlling Conditional Language Models with Distributional Policy Gradients. In CtrlGen at NeurIPS 2021.

APPLICATION INSTRUCTIONS

Please note that applicants must be registered students at a university or other academic institution and that this establishment will need to sign an 'Internship Convention' with NAVER LABS Europe before the student is accepted.

You can apply for this position online at https://europe.naverlabs.com/job/energy-based-models-for-controlled-text-generation-internship-2/. Don't forget to upload your CV and cover letter before you submit. Incomplete applications will not be accepted.

ABOUT NAVER LABS

NAVER is the #1 Internet portal in Korea with activities that span a wide range of businesses including search, commerce, content, financial and cloud platforms.

NAVER LABS, co-located in Korea and France, is the organization dedicated to preparing NAVER’s future. NAVER LABS Europe is located in a spectacular setting in Grenoble, in the heart of the French Alps. Scientists at NAVER LABS Europe are empowered to pursue long-term research problems that, if successful, can have significant impact and transform NAVER. We take our ideas as far as research can take them to create the best technology of its kind. Active participation in the academic community and collaborations with world-class public research groups are among the important tools we use to achieve these goals. Teamwork, focus and persistence are important values for us.

NAVER LABS Europe is an equal opportunity employer.

For more information and application see https://europe.naverlabs.com/job/energy-based-models-for-controlled-text-generation-internship-2/
