Whether to offload optimizer to cpu for saving more GPU ram during training. Note that turn on offload_optimizer would further slow down training.