-
Imitation Learning with PWIL
An exercise on the Acrobot Swingup task
-
Image Regression Lessons Learned (in JAX)
An exercise in image processing
-
Vanilla Policy Gradient in JAX
A simple implementation using Acme
-
A Bit on Baselines
Reducing variance and taking names
-
Continuous Log Likelihood
A quick and dirty derivation