Artem Zholus

I am a second-year PhD student at Mila and Polytechnique Montréal, supervised by Prof. Sarath Chandar. I am also a visiting researcher at AI @ Meta, advised by Mido Assran. My ultimate research goal is to build adaptive and autonomous agents that solve open-ended tasks. To make progress toward this goal, I work with Language and World Models, Structured Communication, and (Distributed) Reinforcement Learning.

👉 What I mean by each of these ideas:

  • Language and World Models. Learning large models that forecast the future (e.g., via next-token prediction), thereby learning causality, among other things. These models lay the foundation for autonomy (e.g., through self-interaction and self-learning), adaptiveness (e.g., via continual adaptation), and generality.
  • (Distributed) Reinforcement Learning. I subscribe to the Reward Hypothesis, but I believe that, in the end, only the data the agent trains on matters. Thus, to build truly capable agents we need powerful sampling and training loops that work at both small and large scales.
  • Structured Communication. To me, the most exciting thing about current foundation models is how easily they can be combined (in parameter space, activation space, or even token space) with each other or with themselves. By “connecting” models in a certain way, we can get them to generalize to new settings or even enable new types of learning; RLHF is a good example. I am not limiting myself to a specific type of learning: I call this “Structured Communication,” meaning any information exchange between models in which the models do not change how they exchange information or how they use it. This also encompasses gradient-free learning enabled by LLMs.

My current research project focuses on scaling world models. My past research covered Model-based Reinforcement Learning via World Models, interactive learning of embodied agents, and boosting the in-context memory of model-based RL agents. I also spent some time in the ML industry as an ML Engineer, doing drug discovery with RL and Language Models.

Previously, I was a student researcher at Google DeepMind with Ross Goroshin. I also did two internships at EPFL: at the LIONS lab (RL theory) under Prof. Volkan Cevher and at VILAB (multimodal representation learning) under Prof. Amir Zamir. I obtained my Master's degree at MIPT, studying AI, ML, and Cognitive Sciences and working at the CDS lab under Prof. Aleksandr Panov on task generalization in model-based reinforcement learning. I received my BSc degree from ITMO University, majoring in Computer Science and Applied Mathematics.

CV  /  Email  /  GitHub  /  Google Scholar  /  Twitter  /  LinkedIn


First-Author Papers


BindGPT: A Scalable Framework for 3D Molecular Design via Language Modeling and Reinforcement Learning


Artem Zholus, Maksim Kuznetsov, Roman Schutski, Rim Shayakhmetov, Daniil Polykovskiy, Sarath Chandar, Alex Zhavoronkov
Under review, June 2024
website / arxiv / huggingface /

BindGPT is a new framework for building drug discovery models that leverages compute-efficient pretraining, supervised finetuning, prompting, reinforcement learning, and tool use of LMs. This allows BindGPT to build a single pretrained model that achieves state-of-the-art performance in 3D molecule generation, 3D conformer generation, and pocket-conditioned 3D molecule generation, posing them as downstream tasks, whereas previous methods build task-specialized models without task-transfer abilities.


Mastering Memory Tasks with World Models


Mohammad Reza Samsami*, Artem Zholus*, Janarthanan Rajendran, Sarath Chandar
(oral, top 1.2%) ICLR, 2024
website / arxiv / openreview / code /

New state-of-the-art performance across a diverse set of memory-intensive reinforcement learning domains: bsuite (tabular, low-dimensional), POPGym (tabular, high-dimensional), and Memory Maze (3D, embodied, high-dimensional, long-term). Importantly, we reach super-human performance in Memory Maze!


IGLU Gridworld: Simple and Fast Environment for Embodied Dialog Agents


Artem Zholus, Alexey Skrynnik, Shrestha Mohanty, Zoya Volovikova, Julia Kiseleva, Arthur Szlam, Marc-Alexandre Côté, Aleksandr I. Panov
Embodied AI workshop @ CVPR, 2022
arxiv / code / slides /

A lightweight reinforcement learning environment for building embodied agents with language context, tasked with building 3D structures in a Minecraft-like world.


IGLU 2022: Interactive Grounded Language Understanding in a Collaborative Environment at NeurIPS 2022


Julia Kiseleva*, Alexey Skrynnik*, Artem Zholus*, Shrestha Mohanty*, Negar Arabzadeh*, Marc-Alexandre Côté*, Mohammad Aliannejadi, Milagro Teruel, Ziming Li, Mikhail Burtsev, Maartje ter Hoeve, Zoya Volovikova, Aleksandr Panov, Yuxuan Sun, Kavya Srinet, Arthur Szlam, Ahmed Awadallah
NeurIPS, Competition Track, 2022
website / arxiv / code /

An AI competition where the goal is to follow a language instruction with context while embodied in a 3D blocks world (RL track) and to ask a clarifying question in case of ambiguity (NLP track).


Factorized World Models for Learning Causal Relationships


Artem Zholus, Yaroslav Ivchenkov, and Aleksandr Panov
OSC workshop, ICLR, 2022
arxiv / code /

An RL agent that generalizes behavior to unseen tasks by learning a structured world model and constraining task-specific information.


Design and source code from Jon Barron's website