PPO Config
Browser-based implementation of Proximal Policy Optimization (PPO), trained with TensorFlow.js.
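The panel names PPO but does not show the loss being optimized. For orientation, here is a minimal sketch of the standard PPO clipped surrogate loss in plain TypeScript. The function name, its parameters, and the default clip range of 0.2 are assumptions for illustration and are not taken from this page; the demo itself presumably computes something similar on TensorFlow.js tensors rather than on arrays.

```typescript
// Sketch of the standard PPO clipped surrogate loss, written with plain
// arrays for clarity. All names and the 0.2 default are illustrative
// assumptions, not values read from this page.
function ppoClipLoss(
  newLogProbs: number[],
  oldLogProbs: number[],
  advantages: number[],
  clipEpsilon: number = 0.2,
): number {
  if (
    advantages.length === 0 ||
    newLogProbs.length !== advantages.length ||
    oldLogProbs.length !== advantages.length
  ) {
    throw new Error("inputs must be non-empty and of equal length");
  }
  let total = 0;
  for (let i = 0; i < advantages.length; i++) {
    // Probability ratio pi_new(a|s) / pi_old(a|s), from log-probabilities.
    const ratio = Math.exp(newLogProbs[i] - oldLogProbs[i]);
    // Keep the ratio inside [1 - eps, 1 + eps].
    const clipped = Math.min(Math.max(ratio, 1 - clipEpsilon), 1 + clipEpsilon);
    // Take the more pessimistic of the unclipped and clipped terms.
    total += Math.min(ratio * advantages[i], clipped * advantages[i]);
  }
  // Negate the mean so that minimizing the loss maximizes the objective.
  return -total / advantages.length;
}
```

The clipping removes the incentive to move the new policy far from the one that collected the data: once the ratio leaves the clip range in the direction the advantage favors, the term stops improving.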
CartPole-v1
IDLE
▶ Train
Global Step: 0
Updates: 0
Goal: Balance the pole. +1 reward per step.
Waiting for data...
Avg Reward (Last 10): 0.0
Best Reward: -
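The two reward readouts above could be maintained along the following lines. This is a sketch under stated assumptions: the class name `RewardStats`, its methods, and the choice to report 0 before any episode finishes are hypothetical, chosen to match the initial `0.0` and `-` values shown, and are not taken from the page's source.

```typescript
// Hypothetical tracker for a rolling average over the most recent
// episodes and the best episode reward seen so far.
class RewardStats {
  private windowSize: number;
  private recent: number[] = [];
  private best: number | null = null;

  constructor(windowSize: number = 10) {
    this.windowSize = windowSize;
  }

  // Record the total reward of one finished episode.
  record(episodeReward: number): void {
    this.recent.push(episodeReward);
    if (this.recent.length > this.windowSize) {
      this.recent.shift(); // drop the oldest episode
    }
    if (this.best === null || episodeReward > this.best) {
      this.best = episodeReward;
    }
  }

  // Mean over the episodes currently in the window; 0 when empty.
  average(): number {
    if (this.recent.length === 0) return 0;
    const sum = this.recent.reduce((a, b) => a + b, 0);
    return sum / this.recent.length;
  }

  // Best episode reward so far, or null before the first episode.
  bestReward(): number | null {
    return this.best;
  }
}
```

A display layer could render a `null` best reward as `-`. Since each CartPole step yields +1, an episode's total reward here equals the number of steps the pole stayed balanced.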
Ready to start training...