Skip to main content

Summary of Reinforcement Learning Frameworks

There are many reinforcement learning frameworks out there, but since relatively few people use them, documentation tends to be lacking.

In this article, I have put together a rough summary of the pros and cons of each framework.

I hope this serves as a helpful reference when choosing a framework.

What Is a Reinforcement Learning Framework?

There is no strict definition, but here we consider something a reinforcement learning framework if it satisfies at least one of the following:

  • Pre-implemented RL agent algorithms (e.g., TRPO, A3C, ...) are ready to use
  • Training progress can be easily visualized during learning (e.g., Tensorboard support)
  • You can add custom environments or agents for training
  • Training code is provided in an abstracted form so you don't have to implement the details yourself

In short, it is a library that makes at least one of the following easier: model and training implementation, training execution, or visualization.

Caveats

While convenient...

Once you hit a bug, it can be very hard to recover.

  • Since the user base is small, documentation tends to be scarce
  • The framework implementations themselves tend to be complex

These are notable drawbacks. For example, the OpenAI Spinning Up Document states:

RL libraries frequently make choices for abstraction that are good for code reuse between algorithms, but which are unnecessary if you're only writing a single algorithm or supporting a single use case.

The more feature-rich and abstract a framework becomes, the more complex its internal implementation gets, so it is recommended to consider:

  • Do you actually need a framework?
  • How much do you want the framework to handle for you?

when choosing a framework.

Viable Frameworks

RL Coach

A framework made by Intel that supports Tensorflow and MXnet.

Pros

  1. A very large number of algorithms are already implemented rl_coach.png

  2. Supports many training environments, including OpenAI Gym (https://github.com/NervanaSystems/coach#supported-environments)

  3. A preset feature allows you to create files containing only hyperparameters to pass to the model By writing multiple presets, you can try different hyperparameters without modifying the code

  4. You can display agent behavior in real time during training (https://nervanasystems.github.io/coach/usage.html#rendering-the-environment))

  5. The dashboard feature enables detailed visualization of training results

    • In multi-agent training, you can display all agents' training results on a single graph
    • When running multiple experiments with the same algorithm, it can automatically display a graph showing the average of those learning curves  -> This is useful for evaluating model performance when training is unstable

Cons

  1. The overall implementation is complex, making it difficult to add your own agents
  2. Distributing agents is harder than with RLlib (described below), and requires AWS S3
  3. The aforementioned dashboard is not user-friendly

RLlib

A framework that supports all deep learning frameworks and excels at agent distribution. Implementation of agents is somewhat easier with Tensorflow or PyTorch.

Pros

  1. You can set up multi-agent training just by implementing individual workers

    • The Trainer handles the "worker trains -> parameter update" process in multi-agent training
    • Multi-agent training is possible even with deep learning frameworks that don't natively support it
  2. Automatic hyperparameter tuning using distributed computing (tune library)

    • State-of-the-art hyperparameter tuning algorithms such as Population Based Training are also available
  3. Training environments can be added by writing them in OpenAI Gym format

  4. Tensorboard can be used for visualization

Cons

  1. The internal implementation is complex, making it difficult to add your own agents
  2. You need to learn the unique distributed processing notation of the Ray library used internally

dopamine

A reinforcement learning framework from Google (though not an official Google project).

Pros

  1. Records detailed logs during training (learning curves, replay buffers, model parameters) in a memory-efficient manner using gin
  2. Replay videos of agent behavior can be easily created

Cons

  1. Only 4 agent algorithms are implemented (DQN, Rainbow, C51, IQN)
  2. Insufficient documentation
  3. The only supported environment is Atari, and since there is no documentation on how to add environments, adding your own is quite laborious
  4. There are only 4 contributors, and the majority of the code was written by one of them, which raises concerns about the project's future

(Update: There have been no commits since December of last year.)

Stable Baseline

A fork of OpenAI Baselines that supports Tensorflow. More of a collection of model implementations than a framework.

Pros

  1. The API is modeled after scikit-learn, making it easy to use
  2. Basic models are already implemented stable_baseline
  3. The agent algorithm implementations are well-commented and easy to understand, making them easy to modify
  4. Multi-process execution is also supported

Cons

  1. Only environments supported by OpenAI Gym are supported, so you need to write your own wrapper for other environments
  2. You need to write code to run experiments

Frameworks That Seem Less Useful

Tensorforce

A framework for Tensorflow, as the name suggests.

Pros

Properly modularized, making it easy to study and write code.

Cons

  1. Not many agent algorithms are implemented, so you need to write your own
  2. The volume and depth of documentation is somewhat lacking

Keras-RL

A framework for Keras, as the name suggests.

Pros

There is so little documentation that it is not even clear what the advantages are.

Cons

  1. The documentation is unbelievably sparse. Truly sparse. (https://keras-rl.readthedocs.io/en/latest/core))
  2. Not many agent algorithms are implemented, so you need to write your own

References